Parsing ascii file

Discussion in 'Python' started by diablo, Jun 17, 2004.

  1. diablo

    diablo Guest

    Hello ,

    I have a file that contains the following data (example) and does NOT have
    any line feeds:

    11 22 33 44 55 66 77 88 99 00 aa bb cc
    dd ....to 128th byte 11 22 33 44 55 66 77 88 99
    00 aa bb cc dd .... and so on

    record 1 starts at 0 and finishes at 128, record 2 starts at 129 and
    finishes at 256 and so on. There can be as many as 5000 records per file. I
    would like to parse the file and retrieve the value of the field at bytes 64-65
    and perform an arithmetic operation on it (sum them all up).

    Can I do this with python?

    If I were to use awk, it would look something like this:

    cat <filename> | fold -w 128 | awk '{ SUM = SUM + substr($0,64,2) } END { print SUM }'


    Regards
    Dean
     
    diablo, Jun 17, 2004
    #1

  2. Peter Otten

    Peter Otten Guest

    Is it an ASCII or a binary file? I'm not entirely sure from your description.
    In the following I assume binary data, but it should be easy to modify the
    value() function if those two bytes are ASCII digits.

    import struct, sys
    from itertools import imap

    def fold(instream, width=80):
        # Yield successive fixed-width chunks until the stream is exhausted.
        while 1:
            line = instream.read(width)
            if not line: break
            yield line

    def value(line, start=64): # may be an "off by one" bug
        # For two ASCII digits instead of binary data, use:
        # return int(line[start:start+2])
        return struct.unpack("h", line[start:start+2])[0]

    if __name__ == "__main__":
        try:
            filename = sys.argv[1]
        except IndexError:
            instream = sys.stdin
        else:
            # Binary mode, since binary data is assumed.
            instream = open(filename, "rb")

        print sum(imap(value, fold(instream, 128)))
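
On Python 3, where `imap`, `file()` and the `print` statement no longer exist, the same fold-and-sum idea could be sketched roughly as below. The names `binary_value`, `ascii_value` and `sum_field` are hypothetical. The little-endian `"<h"` format and the 0-based offset 64 are assumptions, since the question settles neither. Incomplete trailing records are simply skipped.

```python
import struct

RECORD_SIZE = 128  # bytes per record, taken from the question
FIELD_START = 64   # 0-based offset of the field (assumption; may be off by one)
FIELD_WIDTH = 2    # the field spans two bytes

def fold(instream, width=RECORD_SIZE):
    """Yield successive chunks of at most `width` bytes from a binary stream."""
    while True:
        chunk = instream.read(width)
        if not chunk:
            break
        yield chunk

def binary_value(record, start=FIELD_START):
    """Read the field as a signed 16-bit little-endian integer (assumed layout)."""
    return struct.unpack_from("<h", record, start)[0]

def ascii_value(record, start=FIELD_START):
    """Read the field as two ASCII decimal digits."""
    return int(record[start:start + FIELD_WIDTH].decode("ascii"))

def sum_field(instream, value=binary_value, width=RECORD_SIZE):
    """Sum the field over every complete record in a binary stream."""
    return sum(value(rec) for rec in fold(instream, width) if len(rec) == width)
```

The stream has to be opened in binary mode, for example `open(filename, "rb")` or `sys.stdin.buffer`. Passing `value=ascii_value` switches to the ASCII-digit reading of the field.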

    Peter
     
    Peter Otten, Jun 17, 2004
    #2

  3. Eddie Corns

    Eddie Corns Guest

    You can use stdin.read(128) to get consecutive records and slicing to extract
    the fields. Something like:

    from sys import stdin
    total = 0
    while True:
        record = stdin.read(128)
        if not record: break
        # Two characters, matching awk's 1-based substr($0,64,2)
        total += int(record[63:65])
    print total

    Frankly, I'd stick with the Awk version unless it's a pedagogical exercise.
    Actually, I'd go further: have a script that simply sums up all the numbers
    in its input, and add 'cut' to the pipeline to extract the columns first.
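
A generic summing script of that kind might look like the sketch below, written for Python 3. The function names and the file name `sumnums.py` used afterwards are made up for illustration, and the input is assumed to contain only whitespace-separated integers.

```python
import sys

def sum_numbers(lines):
    """Sum every whitespace-separated integer found in an iterable of lines."""
    total = 0
    for line in lines:
        for token in line.split():
            total += int(token)
    return total

def main():
    # Meant to be called when the file is run as a filter on standard input.
    print(sum_numbers(sys.stdin))
```

With `main()` called under the usual `if __name__ == "__main__":` guard and the file saved as `sumnums.py`, the pipeline could then read `fold -w 128 <filename> | cut -c64-65 | python sumnums.py`.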

    Eddie
     
    Eddie Corns, Jun 17, 2004
    #3
