Re: Fast forward-backward (write-read)

Discussion in 'Python' started by Oscar Benjamin, Oct 24, 2012.

  1. On 23 October 2012 15:31, Virgil Stokes wrote:
    > I am working with some rather large data files (>100GB) that contain time
    > series data. The data (t_k,y(t_k)), k = 0,1,...,N are stored in ASCII
    > format. I perform various types of processing on these data (e.g. moving
    > median, moving average, and Kalman-filter, Kalman-smoother) in a sequential
    > manner and only a small number of these data need be stored in RAM when
    > being processed. When performing Kalman-filtering (forward in time pass, k =
    > 0,1,...,N) I need to save to an external file several variables (e.g. 11*32
    > bytes) for each (t_k, y(t_k)). These are inputs to the Kalman-smoother
    > (backward in time pass, k = N,N-1,...,0). Thus, I will need to input these
    > variables saved to an external file from the forward pass, in reverse order
    > --- from last written to first written.
    >
    > Finally, to my question --- What is a fast way to write these variables to
    > an external file and then read them in backwards?


    You mentioned elsewhere that you are using numpy. I'll assume that the
    data you want to read/write are numpy arrays.

    Numpy arrays can be written very efficiently in binary form using
    tofile/fromfile:

    >>> import numpy
    >>> a = numpy.array([1, 2, 5], numpy.int64)
    >>> a
    array([1, 2, 5])
    >>> with open('data.bin', 'wb') as f:
    ...     a.tofile(f)
    ...

    You can then reload the array with:

    >>> with open('data.bin', 'rb') as f:
    ...     a2 = numpy.fromfile(f, numpy.int64)
    ...
    >>> a2
    array([1, 2, 5])
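
    Note that tofile writes only the raw bytes, with no dtype or shape
    header, so the reader has to supply both. Here is a rough sketch of how
    a forward pass might append one record per time step and read them all
    back. The names N_VARS, write_forward and read_all, the float64 dtype
    and the 11 values per step are my assumptions for illustration, not
    your actual layout:

```python
import os
import tempfile

import numpy

N_VARS = 11  # assumed number of values saved per time step (hypothetical)


def write_forward(path, n_steps):
    """Append one row of N_VARS float64 values per time step to a raw binary file."""
    with open(path, 'wb') as f:
        for k in range(n_steps):
            # Stand-in for the filter's real per-step output.
            state = numpy.full(N_VARS, float(k))
            state.tofile(f)  # raw bytes only: no dtype or shape header is stored


def read_all(path):
    """Read the whole file back; dtype and row length must match what was written."""
    with open(path, 'rb') as f:
        flat = numpy.fromfile(f, numpy.float64)
    return flat.reshape(-1, N_VARS)


path = os.path.join(tempfile.mkdtemp(), 'states.bin')
write_forward(path, 5)
states = read_all(path)
print(states.shape)
print(states[::-1][0])  # the last time step comes first after reversing the rows
```

    Reading everything at once like this is only sensible for small files;
    for files of the size you describe the data would have to be read in
    chunks.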

    Numpy arrays can be reversed before writing or after reading by slicing
    with a step of -1:

    >>> a2
    array([1, 2, 5])
    >>> a2[::-1]
    array([5, 2, 1])

    Assuming you wrote the file forwards you can make an iterator to yield
    the file in chunks backwards like so (untested):

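    def read_backwards(f, dtype, chunksize=1024 ** 2):
        # f must be a real file opened in binary mode; chunksize counts items.
        dtype = numpy.dtype(dtype)
        nbytes = chunksize * dtype.itemsize
        f.seek(0, 2)  # jump to the end of the file
        fpos = f.tell()
        while fpos > nbytes:
            fpos -= nbytes  # step back one chunk *before* reading it
            f.seek(fpos, 0)
            yield numpy.fromfile(f, dtype, chunksize)[::-1]
        # Whatever is left at the start of the file is the last (short) chunk.
        f.seek(0, 0)
        yield numpy.fromfile(f, dtype, fpos // dtype.itemsize)[::-1]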
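
    One caveat: reversing a flat array also reverses the order of the
    values inside each time step. If every step stores several variables,
    you probably want to reverse the order of the records only. A minimal
    sketch of that idea follows; the function name read_records_backwards
    and the parameters record_len and records_per_chunk are hypothetical,
    and it assumes fixed-size records of a single dtype:

```python
import os
import tempfile

import numpy


def read_records_backwards(path, dtype, record_len, records_per_chunk=4096):
    """Yield records (rows of record_len items) from last written to first."""
    dtype = numpy.dtype(dtype)
    record_bytes = record_len * dtype.itemsize
    n_records = os.path.getsize(path) // record_bytes
    with open(path, 'rb') as f:
        end = n_records
        while end > 0:
            start = max(0, end - records_per_chunk)
            f.seek(start * record_bytes)
            count = (end - start) * record_len
            chunk = numpy.fromfile(f, dtype, count).reshape(-1, record_len)
            # Reverse the row order only; values inside each record keep their order.
            for record in chunk[::-1]:
                yield record
            end = start


# Usage with a small throwaway file of 10 records, 3 values each.
path = os.path.join(tempfile.mkdtemp(), 'records.bin')
numpy.arange(30, dtype=numpy.float64).tofile(path)
records = list(read_records_backwards(path, numpy.float64, 3, records_per_chunk=4))
print(len(records))
print(records[0], records[-1])
```

    Only one chunk is held in memory at a time, so records_per_chunk can be
    tuned to trade memory for fewer seeks.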


    Oscar
     
    Oscar Benjamin, Oct 24, 2012