Re: file reading by record separator (not line by line)

Discussion in 'Python' started by Steve Howell, Jun 1, 2007.

  1. Steve Howell

    Steve Howell Guest

    --- Tijs <> wrote:

    > Steve Howell wrote:
    > > [...] but I wonder if the Python community
    > > couldn't help a lot of newbies (or insufficiently
    > > caffeinated non-newbies) by any of the following:
    > >

    > Well, I'm not a newbie, and I always make sure to be
    > thoroughly caffeinated before sitting down for coding.


    :)

    > But I think the itertools example in the parent post
    > shows a problem with that package: unless you have
    > intimate knowledge of the package, it is not clear what
    > the code does when reading it. In other words, it is
    > not intuitive.
    >


    Agreed.

    > Perhaps a dedicated "block-read" module that wraps any
    > read()-able object would be better. At least it would
    > immediately be clear what the code means.
    >
    > from blockread import BlockReader
    >
    > b = BlockReader(f, boundary='>')
    > for block in b:
    >     # whatever
    >


    Yep, I like this idea. You might have a few
    variations:

    def simple_block_reader(f, start_char='>'):
        # returns list or iterator where each item
        # is a list of lines all belonging to the same
        # block
        ...

    def regex_block_reader(f, start_regex, end_regex=None):
        # start_regex is a regular expression that marks
        # the start of a block
        # if end_regex is None, then end of block is
        # signified by start of next block or end of file
        ...

    def block_reader3(f, start_method, end_method=None):
        # start_method, end_method should be functions
        # that evaluate a line and return True/False
        #
        # if end_method is None, then end of block is
        # signified by start of next block or end of
        # input stream
        ...
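
    The third variation is general enough to cover the other two, so here is
    a rough sketch of how it might look as a generator. The name
    block_reader and the handling of edge cases are assumptions, not
    anything agreed in this thread: lines before the first start line are
    dropped, and a new start line closes an open block even when
    end_method is given.

```python
import io


def block_reader(lines, start_method, end_method=None):
    """Yield lists of lines; a block opens where start_method(line) is true."""
    block = None  # None means we are not inside a block yet
    for line in lines:
        if start_method(line):
            if block is not None:
                yield block  # the next start line closes the previous block
            block = [line]
        elif block is not None:
            block.append(line)
            if end_method is not None and end_method(line):
                yield block  # an explicit end line closes the block
                block = None
    if block is not None:
        yield block  # end of input closes the last open block


# Usage with an in-memory file; any iterable of lines would do.
sample = io.StringIO(">seq1\nACGT\nTTGA\n>seq2\nGGCC\n")
for block in block_reader(sample, lambda line: line.startswith(">")):
    print(len(block), block[0].rstrip())
```

    Because it is a generator, only one block is held in memory at a time.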





     
    Steve Howell, Jun 1, 2007
    #1

  2. Steve Howell

    Tijs Guest

    Steve Howell wrote:

    >>
    >> from blockread import BlockReader
    >>
    >> b = BlockReader(f, boundary='>')
    >> for block in b:
    >> # whatever
    >>

    >
    > Yep, I like this idea. You might have a few
    > variations:
    >


    Yes, or a single one that takes a wide range of construction possibilities,
    like strings, lambdas or regexes in various keyword parameters.

    BlockReader(f, start='>')
    BlockReader(f, start=re.compile('>|<'), end='---')
    BlockReader(f, start=lambda x: x.startswith('>'))

    Maybe make variations for character-based readers and line-based readers.
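
    A line-based version of that single class could be sketched roughly as
    below. Only the BlockReader name and the start/end keywords come from
    the calls above; the helper _as_predicate and the rule that a plain
    string means "line starts with this" are assumptions made for
    illustration.

```python
import re


def _as_predicate(matcher):
    """Turn a string prefix, compiled regex, or callable into a line test."""
    if matcher is None:
        return None
    if isinstance(matcher, str):
        return lambda line: line.startswith(matcher)
    if hasattr(matcher, "match"):  # duck-typed check for a compiled regex
        return lambda line: matcher.match(line) is not None
    if callable(matcher):
        return matcher
    raise TypeError("start/end must be a str, a compiled regex, or a callable")


class BlockReader:
    """Iterate over blocks (lists of lines) taken from an iterable of lines."""

    def __init__(self, f, start, end=None):
        self._lines = f
        self._is_start = _as_predicate(start)
        self._is_end = _as_predicate(end)

    def __iter__(self):
        block = None  # None means no block is open
        for line in self._lines:
            if self._is_start(line):
                if block is not None:
                    yield block  # a new start closes the previous block
                block = [line]
            elif block is not None:
                block.append(line)
                if self._is_end is not None and self._is_end(line):
                    yield block
                    block = None
        if block is not None:
            yield block  # end of input closes the last block


# Usage: the three construction styles from the post.
data = [">a\n", "1\n", "---\n", "x\n", "<b\n", "2\n"]
for reader in (
    BlockReader(data, start=">"),
    BlockReader(data, start=re.compile(">|<"), end="---"),
    BlockReader(data, start=lambda x: x.startswith(">")),
):
    print([len(block) for block in reader])
```

    Normalizing everything to a predicate up front keeps the iteration loop
    free of type checks.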

    --

    Regards,
    Tijs
     
    Tijs, Jun 1, 2007
    #2

  3. Steve Howell

    Neil Cerutti Guest

    On 2007-06-01, Tijs <> wrote:
    > Steve Howell wrote:
    >>>
    >>> from blockread import BlockReader
    >>>
    >>> b = BlockReader(f, boundary='>')
    >>> for block in b:
    >>> # whatever

    >>
    >> Yep, I like this idea. You might have a few
    >> variations:

    >
    > Yes, or a single one that takes a wide range of construction
    > possibilities, like strings, lambdas or regexes in various
    > keyword parameters.
    >
    > BlockReader(f, start='>')
    > BlockReader(f, start=re.compile('>|<'), end='---')
    > BlockReader(f, start=lambda x: x.startswith('>'))
    >
    > Maybe make variations for character-based readers and
    > line-based readers.


    I would prefer "f.readlines(delim='>')" etc., a la C++
    std::getline.
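
    File objects have no delim argument on readlines(), so this stays a
    wish, but a free function gets close. A minimal sketch, assuming a
    text-mode file; the names read_records and chunk_size are made up for
    illustration. Like std::getline, it discards the delimiter itself.

```python
import io


def read_records(f, delim=">", chunk_size=8192):
    """Yield the text between occurrences of delim, reading f in chunks."""
    buffer = ""
    while True:
        chunk = f.read(chunk_size)
        if not chunk:
            break  # end of file
        buffer += chunk
        # Everything before the last delimiter is complete; the tail may
        # still grow (or hold half a multi-character delimiter), so keep it.
        *records, buffer = buffer.split(delim)
        yield from records
    if buffer:
        yield buffer  # trailing record with no closing delimiter


# Usage: a file that starts with the delimiter yields an empty first
# record, which is filtered out here.
sample = io.StringIO(">seq1\nACGT\n>seq2\nGG\n")
for record in read_records(sample, delim=">"):
    if record:
        print(repr(record))
```

    Reading in chunks rather than lines means a record separator need not
    sit at the start of a line.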

    --
    Neil Cerutti
     
    Neil Cerutti, Jun 1, 2007
    #3
  4. George Sakkis

    On Jun 1, 7:00 am, Steve Howell <> wrote:
    > --- Tijs <> wrote:
    >
    > > Yes, or a single one that takes a wide range of
    > > construction possibilities,
    > > like strings, lambdas or regexes in various keyword
    > > parameters.

    >
    > > BlockReader(f, start='>')
    > > BlockReader(f, start=re.compile('>|<'), end='---')
    > > BlockReader(f, start=lambda x: x.startswith('>'))

    >
    > Definitely. I like your idea for regexes that you
    > just pass the method in, rather than the regex. It
    > means fewer variations, and it also leads to slightly
    > more explicit code from the user, without being too
    > cumbersome.
    >
    > Do you have any free time on your hands? It seems
    > like it would be fairly straightforward to do a quick
    > prototype implementation of this. I'm off to work
    > soon, so I can't do it today, but maybe Sunday.


    I'm afraid I beat you to it :)

    http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/521877

    George
     
    George Sakkis, Jun 2, 2007
    #4
