Text parsing

Discussion in 'Python' started by Michiel Sikma, Aug 20, 2006.

  1. Hello everybody.

    Inspired by an example from the book Beginning Python: From Novice to
    Professional, I started working on a simple text parser which I can
    hopefully then extend into a more comprehensive system. I've got a
    little problem, though.

    My code:

    ---- test.py ----
    import sys

    def preparse(file):
        block = []
        for line in file:
            if line.strip():
                block.append(line)
            elif block:
                yield ''.join(block).strip()
                block = []
        yield '\n'

    def makeList(file):
        testOutput = list(preparse(file))
        print testOutput

    testInput = open("test", "r")

    makeList(testInput)
    ----

    ---- test ----
    test1
    test2

    test3
    test4
    test5
    test6

    test7
    test8

    test9

    test10
    ----

    When I run test.py, it prints this:
    michiel-sikmas-computer:~/Desktop msikma$ python test.py
    ['test1\ntest2', 'test3\ntest4\ntest5\ntest6', 'test7\ntest8',
    'test9', '\n']

    What happened to "test10"? It seems to be gone unless I add two
    linebreaks at the end of the file.

    Greets,

    Michiel Sikma
     
    Michiel Sikma, Aug 20, 2006
    #1

  2. Michiel Sikma wrote:

    > My code:
    >
    > ---- test.py ----
    > import sys
    >
    > def preparse(file):
    >     block = []
    >     for line in file:
    >         if line.strip():
    >             block.append(line)
    >         elif block:
    >             yield ''.join(block).strip()
    >             block = []

    +     yield ''.join(block).strip()

    Because your line "test10\n" is still in `block` at this point: a block
    is only yielded when a blank line follows it, and your file ends without
    one.

    >     yield '\n'
    >
    > […]
    >
    > ---- test ----
    > test1
    > test2
    >
    > test3
    > test4
    > test5
    > test6
    >
    > test7
    > test8
    >
    > test9
    >
    > test10
    > ----
    >
    > When I run test.py, it prints this:
    > michiel-sikmas-computer:~/Desktop msikma$ python test.py
    > ['test1\ntest2', 'test3\ntest4\ntest5\ntest6', 'test7\ntest8',
    > 'test9', '\n']
    >
    > What happened to "test10"? It seems to be gone unless I add two
    > linebreaks at the end of the file.


    Ciao,
    Marc 'BlackJack' Rintsch
     
    Marc 'BlackJack' Rintsch, Aug 20, 2006
    #2
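
The fix suggested in the reply can be sketched as a self-contained script. This is only an illustration under a few assumptions: it is written for Python 3 (the thread's code is Python 2), it reads from an in-memory `io.StringIO` instead of the `test` file, it drops the trailing `yield '\n'`, and the `if block:` guard is an addition not present in the thread, there so that input ending in blank lines does not produce an empty string.

```python
import io

def preparse(lines):
    """Yield each run of consecutive non-blank lines as one stripped string."""
    block = []
    for line in lines:
        if line.strip():
            # Non-blank line: keep collecting into the current block.
            block.append(line)
        elif block:
            # Blank line after some content: the block is complete.
            yield ''.join(block).strip()
            block = []
    # Flush the last block. Input that does not end with a blank line
    # still has its final lines sitting in `block` when the loop ends.
    # The guard is an assumption added here; the reply yields unconditionally.
    if block:
        yield ''.join(block).strip()

# Shortened stand-in for the "test" file from the first post.
sample = io.StringIO("test1\ntest2\n\ntest3\n\ntest10\n")
blocks = list(preparse(sample))
print(blocks)  # ['test1\ntest2', 'test3', 'test10']
```

With the final flush in place, "test10" appears in the result even though the input has no trailing blank line.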