Re: Fast forward-backward (write-read)

Discussion in 'Python' started by Dennis Lee Bieber, Oct 24, 2012.

  1. On Tue, 23 Oct 2012 16:35:40 -0700, emile declaimed the
    following in gmane.comp.python.general:

    > On 10/23/2012 04:19 PM, David Hutto wrote:
    > > forward = [line.rstrip('\n') for line in f.readlines()]

    >
    > f.readlines() will be big(!) and have overhead... and forward results in
    > something again as big.
    >

    Well, since file objects are iterable, could one just drop the
    .readlines()? ( ... line in f )
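A minimal sketch of the suggestion above. The `io.StringIO` object and its sample text are stand-ins for the real file (assumptions for illustration only). Both forms produce the same result, but the second skips the intermediate list that `readlines()` builds:

```python
import io

# Stand-in for the real file; the sample text is illustrative only.
sample = "alpha\nbeta\ngamma\n"

# Original form: readlines() builds a full list of lines first,
# then the comprehension builds a second list from it.
f = io.StringIO(sample)
forward_readlines = [line.rstrip('\n') for line in f.readlines()]

# Suggested form: iterate over the file object directly, so only the
# result list is built.
f = io.StringIO(sample)
forward_iter = [line.rstrip('\n') for line in f]
```

Either way, the resulting list still holds every line of the file at once.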

    > > backward = [line.rstrip('\n') for line in reversed(forward)]

    >
    > and defining backward looks to me to require space to build backward and
    > hold reversed(forward)
    >

    And since the line-ends have already been stripped from forward,
    backward should just be:

    backward = reversed(forward)
    --
    Wulfraed Dennis Lee Bieber AF6VN
    HTTP://wlfraed.home.netcom.com/
     
    Dennis Lee Bieber, Oct 24, 2012
    #1

  2. On Wed, 24 Oct 2012 01:23:58 -0400, Dennis Lee Bieber wrote:

    > On Tue, 23 Oct 2012 16:35:40 -0700, emile declaimed the
    > following in gmane.comp.python.general:
    >
    >> On 10/23/2012 04:19 PM, David Hutto wrote:
    >> > forward = [line.rstrip('\n') for line in f.readlines()]

    >>
    >> f.readlines() will be big(!) and have overhead... and forward results
    >> in something again as big.
    >>

    > Well, since file objects are iterable, could one just drop the
    > .readlines() ? ( ... line in f )


    Yes, but the bottleneck is still that the list comprehension will run to
    completion, trying to process the entire 100+ GB file in one go.
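One way to avoid running to completion on the forward pass is a generator, which hands back one line at a time. This is only a sketch: `stripped_lines` is a hypothetical helper name, and `io.StringIO` stands in for the real file. It does nothing for the backward pass.

```python
import io

def stripped_lines(f):
    """Yield each line of f without its trailing newline, one at a time."""
    for line in f:
        yield line.rstrip('\n')

# Stand-in for the real file; the sample text is illustrative only.
f = io.StringIO("alpha\nbeta\ngamma\n")

lazy = stripped_lines(f)   # generator created; no line has been read yet
first = next(lazy)         # reads and strips only the first line
rest = list(lazy)          # consumes the remaining lines
```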

    [...]
    > And since the line-ends have already been stripped from forward,
    > backward should just be:
    >
    > backward = reversed(forward)


    reversed returns a lazy iterator, but it requires that forward is a
    non-lazy (eager) sequence. So again you're stuck trying to read the entire
    file into RAM.
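The point above can be checked directly: `reversed` raises `TypeError` when given a generator. The second half of the sketch shows one possible workaround that nobody in this thread proposed: reading the file from the end in fixed-size blocks. `reverse_lines` and `block_size` are hypothetical names, and the sketch assumes a file opened in binary mode with `'\n'` line endings; `io.BytesIO` stands in for the real file.

```python
import io
import os

# reversed() needs a sequence (or __reversed__); a generator has neither.
try:
    reversed(line for line in ["a", "b"])
    generator_reversible = True
except TypeError:
    generator_reversible = False

def reverse_lines(f, block_size=4096):
    """Yield the lines of binary file f from last to first, without newlines."""
    f.seek(0, os.SEEK_END)
    pos = f.tell()
    tail = b''            # partial line carried over from the block read before
    first_block = True
    while pos > 0:
        size = min(block_size, pos)
        pos -= size
        f.seek(pos)
        block = f.read(size) + tail
        if first_block:
            # Drop the single newline that terminates the file, if present.
            if block.endswith(b'\n'):
                block = block[:-1]
            first_block = False
        lines = block.split(b'\n')
        tail = lines[0]   # may be incomplete; the next block completes it
        for line in reversed(lines[1:]):
            yield line
    if not first_block:
        yield tail        # the first line of the file

# Tiny block size only to exercise lines that span block boundaries.
data = io.BytesIO(b"alpha\nbeta\ngamma\n")
backward = list(reverse_lines(data, block_size=4))
```

Memory use stays around one block plus the longest line, rather than the whole file.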

    --
    Steven
     
    Steven D'Aprano, Oct 24, 2012
    #2

  3. On 24 Oct 2012 08:05:02 GMT, Steven D'Aprano declaimed the
    following in gmane.comp.python.general:

    >
    > Yes, but the bottleneck is still that the list comprehension will run to
    > completion, trying to process the entire 100+ GB file in one go.
    >

    Conceded, but 100GB once still has to be better than 100GB twice <G>
    [or, as an algorithm used for smaller data sets, the non-readlines
    version may fit in memory when the other fails]

    --
    Wulfraed Dennis Lee Bieber AF6VN
    HTTP://wlfraed.home.netcom.com/
     
    Dennis Lee Bieber, Oct 24, 2012
    #3
