Re: Fast forward-backward (write-read)

Discussion in 'Python' started by Virgil Stokes, Oct 23, 2012.

  1. On 23-Oct-2012 19:56, Tim Chase wrote:
    > On 10/23/12 12:17, Virgil Stokes wrote:
    >> On 23-Oct-2012 18:09, Tim Chase wrote:
    >>>> Finally, to my question --- What is a fast way to write these
    >>>> variables to an external file and then read them in
    >>>> backwards?
    >>> Am I missing something, or would the fairly-standard "tac"
    >>> utility do the reversal you want? It should[*] be optimized to
    >>> handle on-disk files in a smart manner.

    >> Not sure about "tac" --- could you provide more details on this
    >> and/or a simple example of how it could be used for fast reversed
    >> "reading" of a data file?

    > Well, if you're reading input.txt (and assuming it's one record per
    > line, separated by newlines), you can just use
    >
    > tac < input.txt > backwards.txt
    >
    > which will create a secondary file that is the first file in reverse
    > order. Your program can then process this secondary file in-order
    > (which would be backwards from your source).
    >
    > I might have misunderstood your difficulty, but it _sounded_ like
    > you just want to reverse the order of a file.

    Yes, I do wish to reverse the order, but the "forward in time" file will be
    in binary.

    --V
    Virgil Stokes, Oct 23, 2012
    #1
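Since `tac` works on newline-separated text, a binary file needs a different approach. A minimal sketch, assuming every record has the same fixed size: the layout below (three little-endian float64 values) and the names `RECORD`, `write_forward`, and `read_backward` are hypothetical, chosen only for illustration. Records are written in forward order and then read from the last one to the first by seeking to each record's offset.

```python
import os
import struct

# Hypothetical record layout: three little-endian float64 values (24 bytes).
RECORD = struct.Struct("<3d")


def write_forward(path, records):
    """Append-free forward write: pack each record and write it in order."""
    with open(path, "wb") as f:
        for rec in records:
            f.write(RECORD.pack(*rec))


def read_backward(path):
    """Yield records from last to first by seeking to each record's offset."""
    count = os.path.getsize(path) // RECORD.size
    with open(path, "rb") as f:
        for i in range(count - 1, -1, -1):
            f.seek(i * RECORD.size)
            yield RECORD.unpack(f.read(RECORD.size))
```

One seek and one small read per record is simple but may be slow for very large files; reading several records per call, or memory-mapping the file, are the usual next steps.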

  2. Paul Rubin

    Virgil Stokes writes:
    > Yes, I do wish to reverse the order, but the "forward in time" file
    > will be in binary.


    I really think it will be simplest to just write the file in forward
    order, then use mmap to read it one record at a time. It might be
    possible to squeeze out a little more performance with reordering tricks
    but that's the first thing to try.
    Paul Rubin, Oct 24, 2012
    #2
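The mmap suggestion above can be sketched with the standard-library `mmap` module. This is an illustration under assumptions, not the poster's code: the record layout (three little-endian float64 values) and the name `iter_records_reversed` are hypothetical. The file is mapped read-only and each record is unpacked directly from the mapping, starting at the end.

```python
import mmap
import struct

# Hypothetical record layout: three little-endian float64 values (24 bytes).
RECORD = struct.Struct("<3d")


def iter_records_reversed(path):
    """Yield fixed-size records from a memory-mapped file, last record first."""
    with open(path, "rb") as f:
        # Length 0 maps the whole file; mmap raises ValueError for an empty file.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            count = len(mm) // RECORD.size
            for i in range(count - 1, -1, -1):
                # unpack_from reads straight from the mapping without slicing it.
                yield RECORD.unpack_from(mm, i * RECORD.size)
```

The operating system pages data in on demand, so only the parts of the file actually touched need to be resident in memory.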

  3. Dave Angel

    On 10/24/2012 03:14 AM, Virgil Stokes wrote:
    > On 24-Oct-2012 01:46, Paul Rubin wrote:
    >> Virgil Stokes writes:
    >>> Yes, I do wish to reverse the order, but the "forward in time" file
    >>> will be in binary.

    >> I really think it will be simplest to just write the file in forward
    >> order, then use mmap to read it one record at a time. It might be
    >> possible to squeeze out a little more performance with reordering tricks
    >> but that's the first thing to try.

    > Thanks Paul,
    > I am working on this approach now...


    If you're using mmap to map the whole file, you'll need 64-bit Windows to
    start with. I'd be interested to know if Windows will allow you to mmap
    100 GB at one stroke. Have you tried it, or are you starting by figuring
    out how to access the data from the mmap?

    --

    DaveA
    Dave Angel, Oct 28, 2012
    #3
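If mapping the whole file in one call turns out to be a problem, one alternative is to map a window at a time and walk the windows from the end of the file. A minimal sketch, assuming only the standard `mmap` module: the function name `iter_windows_reversed` and the default window size are hypothetical. The `offset` argument of `mmap.mmap` must be a multiple of `mmap.ALLOCATIONGRANULARITY`, so the window size is required to be one too.

```python
import mmap
import os


def iter_windows_reversed(path, window_bytes=64 * mmap.ALLOCATIONGRANULARITY):
    """Yield (offset, data) pairs, mapping one window at a time, last window first."""
    if window_bytes <= 0 or window_bytes % mmap.ALLOCATIONGRANULARITY:
        raise ValueError("window_bytes must be a positive multiple of "
                         "mmap.ALLOCATIONGRANULARITY")
    size = os.path.getsize(path)
    if size == 0:
        return  # nothing to map; mmap refuses empty files
    with open(path, "rb") as f:
        # Offset of the last (possibly short) window.
        start = (size - 1) // window_bytes * window_bytes
        while start >= 0:
            length = min(window_bytes, size - start)
            with mmap.mmap(f.fileno(), length,
                           access=mmap.ACCESS_READ, offset=start) as mm:
                yield start, mm[:]  # copy the window out as bytes
            start -= window_bytes
```

Each window is returned in forward byte order, so the caller still has to walk the records inside it backwards. Choosing a window size that is also a multiple of the record size keeps records from straddling two windows.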
  4. On 28-Oct-2012 12:18, Dave Angel wrote:
    > On 10/24/2012 03:14 AM, Virgil Stokes wrote:
    >> On 24-Oct-2012 01:46, Paul Rubin wrote:
    >>> Virgil Stokes writes:
    >>>> Yes, I do wish to reverse the order, but the "forward in time" file
    >>>> will be in binary.
    >>> I really think it will be simplest to just write the file in forward
    >>> order, then use mmap to read it one record at a time. It might be
    >>> possible to squeeze out a little more performance with reordering tricks
    >>> but that's the first thing to try.

    >> Thanks Paul,
    >> I am working on this approach now...

    > If you're using mmap to map the whole file, you'll need 64-bit Windows to
    > start with. I'd be interested to know if Windows will allow you to mmap
    > 100 GB at one stroke. Have you tried it, or are you starting by figuring
    > out how to access the data from the mmap?

    Thanks very much for pursuing my query, Dave.

    I have not tried it yet --- temporarily side-tracked; but, I will post my
    findings on this issue.
    Virgil Stokes, Oct 28, 2012
    #4
  5. On 28 October 2012 14:20, Virgil Stokes wrote:
    > On 28-Oct-2012 12:18, Dave Angel wrote:
    >>
    >> On 10/24/2012 03:14 AM, Virgil Stokes wrote:
    >>>
    >>> On 24-Oct-2012 01:46, Paul Rubin wrote:
    >>>>
    >>>> Virgil Stokes writes:
    >>>>>
    >>>>> Yes, I do wish to reverse the order, but the "forward in time" file
    >>>>> will be in binary.
    >>>>
    >>>> I really think it will be simplest to just write the file in forward
    >>>> order, then use mmap to read it one record at a time. It might be
    >>>> possible to squeeze out a little more performance with reordering tricks
    >>>> but that's the first thing to try.
    >>>
    >>> Thanks Paul,
    >>> I am working on this approach now...

    >>
    >> If you're using mmap to map the whole file, you'll need 64-bit Windows to
    >> start with. I'd be interested to know if Windows will allow you to mmap
    >> 100 GB at one stroke. Have you tried it, or are you starting by figuring
    >> out how to access the data from the mmap?

    >
    > Thanks very much for pursuing my query, Dave.
    >
    > I have not tried it yet --- temporarily side-tracked; but, I will post my
    > findings on this issue.


    If you are going to use mmap then look at numpy.memmap. It wraps
    Python's mmap so that you can access the contents of the mapped binary
    file as if it were a NumPy array. This means that you don't need to
    handle the bytes -> float conversions yourself.

    >>> import numpy
    >>> a = numpy.array([4,5,6], numpy.float64)
    >>> a
    array([ 4., 5., 6.])
    >>> with open('tmp.bin', 'wb') as f: # write forwards
    ...     a.tofile(f)
    ...     a.tofile(f)
    ...
    >>> a2 = numpy.memmap('tmp.bin', numpy.float64)
    >>> a2
    memmap([ 4., 5., 6., 4., 5., 6.])
    >>> a2[3]
    4.0
    >>> a2[5:2:-1] # read backwards
    memmap([ 6., 5., 4.])


    Oscar
    Oscar Benjamin, Oct 28, 2012
    #5
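When each time step stores several values, the same numpy.memmap idea extends to whole records. A minimal sketch, assuming the file holds fixed-length records of float64 values written with `tofile`: the function name `iter_records_reversed` and its `fields_per_record` parameter are hypothetical. The flat mapping is reshaped into one row per record and the rows are iterated in reverse.

```python
import numpy as np


def iter_records_reversed(path, fields_per_record, dtype=np.float64):
    """Yield each record (one row of the mapped file) from last to first."""
    # mode="r" maps the file read-only; nothing is read until rows are accessed.
    flat = np.memmap(path, dtype=dtype, mode="r")
    # One row per record; fails if the file size is not a whole number of records.
    records = flat.reshape(-1, fields_per_record)
    for row in records[::-1]:
        # Copy the row so the caller gets an ordinary in-memory array.
        yield np.array(row)
```

Reversing with `records[::-1]` only creates a view with a negative stride, so no data is copied until an individual row is converted to an array.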
  6. On 2012-10-28 19:21, Oscar Benjamin wrote:
    > On 28 October 2012 14:20, Virgil Stokes wrote:
    >> On 28-Oct-2012 12:18, Dave Angel wrote:
    >>> On 10/24/2012 03:14 AM, Virgil Stokes wrote:
    >>>> On 24-Oct-2012 01:46, Paul Rubin wrote:
    >>>>> Virgil Stokes writes:
    >>>>>> Yes, I do wish to reverse the order, but the "forward in time" file
    >>>>>> will be in binary.
    >>>>> I really think it will be simplest to just write the file in forward
    >>>>> order, then use mmap to read it one record at a time. It might be
    >>>>> possible to squeeze out a little more performance with reordering tricks
    >>>>> but that's the first thing to try.
    >>>> Thanks Paul,
    >>>> I am working on this approach now...
    >>> If you're using mmap to map the whole file, you'll need 64-bit Windows to
    >>> start with. I'd be interested to know if Windows will allow you to mmap
    >>> 100 GB at one stroke. Have you tried it, or are you starting by figuring
    >>> out how to access the data from the mmap?

    >> Thanks very much for pursuing my query, Dave.
    >>
    >> I have not tried it yet --- temporarily side-tracked; but, I will post my
    >> findings on this issue.

    > If you are going to use mmap then look at numpy.memmap. It wraps
    > Python's mmap so that you can access the contents of the mapped binary
    > file as if it were a NumPy array. This means that you don't need to
    > handle the bytes -> float conversions yourself.
    >
    > >>> import numpy
    > >>> a = numpy.array([4,5,6], numpy.float64)
    > >>> a
    > array([ 4., 5., 6.])
    > >>> with open('tmp.bin', 'wb') as f: # write forwards
    > ...     a.tofile(f)
    > ...     a.tofile(f)
    > ...
    > >>> a2 = numpy.memmap('tmp.bin', numpy.float64)
    > >>> a2
    > memmap([ 4., 5., 6., 4., 5., 6.])
    > >>> a2[3]
    > 4.0
    > >>> a2[5:2:-1] # read backwards
    > memmap([ 6., 5., 4.])
    >
    >
    > Oscar

    Thanks Oscar!
    Virgil Stokes, Oct 28, 2012
    #6
