Re: fastest data / database format for reading large files

Discussion in 'Python' started by Chris Rebert, Oct 28, 2012.

    On Tue, Oct 16, 2012 at 11:35 AM, Pradipto Banerjee
    <> wrote:
    > I am working with a series of large files with sizes 4 to 10GB and may need to read these files repeatedly. What data format (e.g. pickle, JSON, CSV) is considered the fastest for reading via Python?


    Pickle /ought/ to be fastest, since it's binary (unless you use the
    oldest protocol version) and native to Python. Be sure to specify
    HIGHEST_PROTOCOL and use cPickle.
    http://docs.python.org/2/library/pickle.html#module-cPickle
    http://docs.python.org/2/library/pickle.html#pickle.HIGHEST_PROTOCOL
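    A minimal sketch of that advice (the try/except import falls back to the plain pickle module on Python 3, where the C implementation is used automatically; the data and file name are just placeholders):

    ```python
    try:
        import cPickle as pickle  # Python 2: C implementation, much faster
    except ImportError:
        import pickle  # Python 3: C accelerator is used transparently
    import os
    import tempfile

    data = {"ids": list(range(1000)), "label": "sample"}

    # Write with HIGHEST_PROTOCOL: a compact binary format,
    # faster to read back than the default text protocol.
    path = os.path.join(tempfile.mkdtemp(), "data.pkl")
    with open(path, "wb") as f:
        pickle.dump(data, f, pickle.HIGHEST_PROTOCOL)

    # Read it back.
    with open(path, "rb") as f:
        loaded = pickle.load(f)
    ```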

    You might consider using SQLite (or some other database) if you will
    be doing queries over the data that would be amenable to SQL or
    similar.
    http://docs.python.org/2/library/sqlite3.html
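    For example, a sketch with the stdlib sqlite3 module (the table name and columns here are invented for illustration; use a file path instead of ":memory:" for real data, so the import cost is paid once and later runs only query what they need):

    ```python
    import sqlite3

    conn = sqlite3.connect(":memory:")  # a file path persists across runs
    conn.execute("CREATE TABLE readings (ts TEXT, value REAL)")
    conn.executemany("INSERT INTO readings VALUES (?, ?)",
                     [("2012-10-01", 1.5), ("2012-10-02", 2.5)])
    conn.commit()

    # Pull back only the matching rows rather than re-reading a 10GB file.
    rows = conn.execute(
        "SELECT ts, value FROM readings WHERE value > 2").fetchall()
    conn.close()
    ```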

    Cheers,
    Chris

    P.S. The verbose disclaimer at the end of your emails is kinda annoying...
     
