Binary file output using python

Chi Yin Cheung

Hi,
Is there a way in Python to output binary files? I need Python to
write out a stream of 5 million floating point numbers, separated by
some separator, but it seems that all Python supports natively is string
output, which is extremely space-inefficient.

I tried using the pickle module, but it crashed every time because of
the large amount of data involved.

Thanks for your help!
 
kyosohma

You can create a binary file by doing something like this:

f = open(r'filename', 'wb')  # 'wb' = write in binary mode
f.write(b'1,2,3,4,5,6')
f.close()

See also: http://www.devshed.com/c/a/Python/File-Management-in-Python/

Have fun!

Mike
 
Michael Hoffman

Chi said:
Hi,
Is there a way in Python to output binary files? I need Python to
write out a stream of 5 million floating point numbers, separated by
some separator, but it seems that all python supports natively is string
information output, which is extremely space inefficient.

I recommend using PyTables for this sort of thing. It also allows you to
choose from several compression algorithms. I'm using it to store files
with 22000 x (2000, 12) datasets, or 528 million Float64s.
 
Michael Hoffman

Michael said:
I recommend using PyTables for this sort of thing. It also allows you to
choose from several compression algorithms. I'm using it to store files
with 22000 x (2000, 12) datasets, or 528 million Float64s.

Addendum: it should also deal with endianness issues I wouldn't want to
handle myself, so your code will also be portable.
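For anyone who does end up handling byte order by hand, here is a minimal sketch of the issue. It uses the standard-library struct module rather than PyTables, is written for Python 3, and the sample value is just an illustration:

```python
import struct

value = 3.141592653589793  # arbitrary sample value

# '<' forces little-endian and '>' forces big-endian; with either prefix
# 'd' is a standard 8-byte IEEE 754 double.
little = struct.pack('<d', value)
big = struct.pack('>d', value)

# The two encodings hold the same bytes in opposite order.
same_bytes_reversed = (little == big[::-1])

# Reading back with the matching prefix recovers the value regardless of
# the byte order of the machine doing the reading.
restored, = struct.unpack('<d', little)
```

Writing an explicit prefix into the format string is what makes the file portable; without one, struct uses the native byte order of whatever machine runs the code.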
 
Thomas Dybdahl Ahle

On Tue, 17 Apr 2007 11:07:38 -0700, kyosohma wrote:

I don't understand. To me it seems like there is no space difference:

[thomas@localhost ~]$ python
Python 2.4.4 (#1, Oct 23 2006, 13:58:00)
[GCC 4.1.1 20061011 (Red Hat 4.1.1-30)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
[thomas@localhost ~]$ ls -l test test2
-rw-rw-r-- 1 thomas thomas 88888890 17 apr 22:28 test
-rw-rw-r-- 1 thomas thomas 88888890 17 apr 22:27 test2
[thomas@localhost ~]$
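The observation above can be reproduced with a short sketch (Python 3; the temporary directory and the sample string are assumptions, since the original session is not shown in full). The file mode does not change what is stored: the same characters written in text mode and in binary mode take up the same number of bytes.

```python
import os
import tempfile

text = "1,2,3,4,5,6"

# Hypothetical scratch location; the file names mirror the listing above.
workdir = tempfile.mkdtemp()
text_path = os.path.join(workdir, "test")
binary_path = os.path.join(workdir, "test2")

# Text mode: write the string as-is.
with open(text_path, "w") as f:
    f.write(text)

# Binary mode: write the very same characters as ASCII bytes.
with open(binary_path, "wb") as f:
    f.write(text.encode("ascii"))

text_size = os.path.getsize(text_path)
binary_size = os.path.getsize(binary_path)
```

Both files come out at 11 bytes. Any space saving has to come from encoding the numbers differently, not from the mode passed to open().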
 
bvukov

On Tue, 17 Apr 2007 11:07:38 -0700, kyosohma wrote:

I don't understand. To me it seems like there is no space difference:

[thomas@localhost ~]$ python
Python 2.4.4 (#1, Oct 23 2006, 13:58:00)
[GCC 4.1.1 20061011 (Red Hat 4.1.1-30)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> f = open("test2", "w")
[thomas@localhost ~]$ ls -l test test2
-rw-rw-r-- 1 thomas thomas 88888890 17 apr 22:28 test
-rw-rw-r-- 1 thomas thomas 88888890 17 apr 22:27 test2
[thomas@localhost ~]$

That's OK, but he might also take a look at the 'struct' module, which
can solve the "stream of 5 million floating point numbers, separated by
some separator" part of the issue (if a binary format is needed). From
the Python docs...
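A minimal sketch of the struct approach, assuming Python 3 and a handful of made-up sample values. It is an illustration, not the excerpt from the docs referred to above:

```python
import struct

values = [0.5, 1.25, -3.0, 1e10]  # sample data, purely illustrative

# '<' = little-endian, followed by one 'd' (8-byte double) per value,
# e.g. '<4d' for four values.
fmt = '<%dd' % len(values)
packed = struct.pack(fmt, *values)

# Every value occupies exactly 8 bytes, so no separator is needed to
# tell where one number ends and the next begins.
unpacked = list(struct.unpack(fmt, packed))
```

The packed bytes can be written to a file opened with 'wb'. For millions of values it is probably more practical to pack and write in chunks than to build one huge format string.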
 
Peter Otten

Chi said:
Is there a way in Python to output binary files? I need Python to
write out a stream of 5 million floating point numbers, separated by
some separator, but it seems that all python supports natively is string
information output, which is extremely space inefficient.

I'd tried using the pickle module, but it crashed whenever I tried using
it due to the large amount of data involved.

A minimalistic alternative is array.tofile()/fromfile(), but pickle should
handle a list, say, of 5 million floating point numbers just fine. What
exactly are you doing to provoke a crash, and what does it look like?
Please give minimal code and the traceback.

Peter
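A minimal sketch of the array.tofile()/fromfile() route, assuming Python 3. The file name and the sample values are made up for illustration:

```python
import array
import os
import tempfile

# 'd' is the typecode for C doubles; the values are sample data.
values = array.array('d', (i * 0.5 for i in range(1000)))

# Hypothetical scratch file.
path = os.path.join(tempfile.mkdtemp(), "floats.bin")

# tofile() dumps the raw machine representation of every item.
with open(path, "wb") as f:
    values.tofile(f)

size = os.path.getsize(path)

# fromfile() needs to be told how many items to read back.
restored = array.array('d')
with open(path, "rb") as f:
    restored.fromfile(f, len(values))
```

The file holds exactly len(values) * values.itemsize bytes with no separators. Note that the data is stored in the machine's native byte order, so a file written this way is not automatically portable to a machine with a different byte order.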
 
Nick Craig-Wood

Peter Otten said:
A minimalistic alternative is array.tofile()/fromfile(), but pickle should
handle a list, say, of 5 million floating point numbers just fine. What
exactly are you doing to provoke a crash, and what does it look like?
Please give minimal code and the traceback.

cPickle worked fine when I tried it...
-rw-r--r-- 1 ncw ncw 45010006 Apr 19 18:43 z
0
This indicates that each float took 9 bytes to store, which is 1 byte
more than a 64-bit float would normally take.

The pickle dump / load each took about 2 seconds.
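The 9-bytes-per-float figure can be checked with a sketch like the one below. It assumes Python 3 (where cPickle has been folded into pickle) and binary protocol 2; the post above does not say which protocol was used, and the list is smaller than 5 million items to keep the run short:

```python
import pickle
import random

n = 100000  # smaller than 5 million; an assumption to keep this quick
values = [random.random() for _ in range(n)]

# Protocol 2 is a binary protocol: each float is stored as a 1-byte
# opcode followed by the 8-byte double.
data = pickle.dumps(values, protocol=2)
bytes_per_float = len(data) / n

restored = pickle.loads(data)
```

bytes_per_float comes out just over 9, in line with the file size quoted above. The older text protocol 0 stores each float as a decimal string instead and is noticeably larger.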
 
