How to write fast into a file in python?

Steven D'Aprano

On Sat, 18 May 2013 08:49:55 +0100, Fábio Santos
<[email protected]> declaimed the following in
gmane.comp.python.general:

Windows certainly may mess with what you write to your file, if it is
opened in Text mode instead of Binary mode. In text mode, Windows will:

- interpret Ctrl-Z characters as End Of File when reading;

- convert \r\n to \n when reading;

- convert \n to \r\n when writing.

http://msdn.microsoft.com/en-us/library/z5hh6ee9(v=vs.80).aspx
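
A minimal sketch of the write-side translation, written for Python 3 (the thread itself used Python 2.7). Passing newline="\r\n" explicitly reproduces what text mode does by default on Windows, so the result should be the same on any platform. The helper written_bytes and the file name demo.txt are made up for illustration.

```python
import os
import tempfile

def written_bytes(data, **open_kwargs):
    """Write data to a temporary file and return the raw bytes on disk."""
    path = os.path.join(tempfile.mkdtemp(), "demo.txt")  # hypothetical file name
    with open(path, **open_kwargs) as f:
        f.write(data)
    with open(path, "rb") as f:
        return f.read()

# Text mode with Windows-style translation forced on: "\n" is written as "\r\n".
translated = written_bytes("one\ntwo\n", mode="w", newline="\r\n")

# Binary mode: the bytes are written exactly as given, on every platform.
untouched = written_bytes(b"one\ntwo\n", mode="wb")

print(repr(translated))
print(repr(untouched))
```

Comparing the two results shows why a file written in text mode on Windows ends up larger than the number of characters passed to write().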

Neither... It goes back to Teletype machines where one sent a
carriage return to move the printhead back to the left, then sent a line
feed to advance the paper (while the head was still moving left), and in
some cases also provided a rub-out character (a do-nothing) to add an
additional character time delay.

Yes, if you go back far enough, you get to teletype machines. But Windows
inherits its text-mode behaviour from DOS, which inherits it from CP/M.

TRS-80 Mod 1-4 used <cr> for "new line", I believe Apple used <lf>
for "new line"...

I can't comment about TRS, but classic Apple Macs (up to System 9) used
carriage return \r as the line separator.
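
As a small illustration of the three conventions mentioned here (\n, \r and \r\n), the sketch below uses Python 3's str.splitlines(), which recognises all of them. The sample strings are invented for the example.

```python
# The same three lines saved with three historical line-ending conventions.
unix_text = "alpha\nbeta\ngamma"
classic_mac_text = "alpha\rbeta\rgamma"
windows_text = "alpha\r\nbeta\r\ngamma"

# str.splitlines() treats "\n", "\r" and "\r\n" each as one line boundary.
lines = [text.splitlines() for text in (unix_text, classic_mac_text, windows_text)]

print(lines[0])
print(lines[0] == lines[1] == lines[2])
```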
 
Carlos Nepomuceno

----------------------------------------
Date: Sat, 18 May 2013 22:41:32 -0400
From: (e-mail address removed)
To: (e-mail address removed)
Subject: Re: How to write fast into a file in python?



That's backwards. '\r\n' on Windows, IF you omit the b in the mode when
creating the file.

Indeed! My mistake just made me find out that Acorn used that inversion on Acorn MOS.

According to this[1] (at page 449) the OSNEWL routine outputs '\n\r'.

What the hell were those guys thinking??? :p

"OSNEWL
This call issues an LF CR (line feed, carriage return) to the currently selected
output stream. The routine is entered at &FFE7."

[1] http://regregex.bbcmicro.net/BPlusUserGuide-1.07.pdf
 
Carlos Nepomuceno

Thanks Dan! I've never used CPython or PyPy. Will try them later.

I think the main difference between your create_file_numbers_file_like()
and the fastwrite5.py I sent earlier is that I've used cStringIO
instead of StringIO. It took 12s less using cStringIO.

My timings are much higher, but I used Python 2.7.5:

C:\src\Python>python create_file_numbers.py
time taken to write a file of size 52428800  is  39.1199457743 seconds

time taken to write a file of size 52428800  is  14.8704800436 seconds

time taken to write a file of size 52428800  is  23.0011990985 seconds


I've downloaded bufsock.py and python2x3.py. For the latter, it was hard to extract the source code from the web page.

Can I use them in my projects? I'm not familiar with the UCI license[1]. How does it differ from the GPL?




[1] http://stromberg.dnsalias.org/~dstromberg/UCI-license.html

Carlos Nepomuceno

BTW, I've downloaded from the following places:

http://stromberg.dnsalias.org/svn/bufsock/trunk/bufsock.py
http://stromberg.dnsalias.org/~dstromberg/backshift/documentation/html/python2x3-pysrc.html

Are those the latest versions?

________________________________
Date: Sat, 18 May 2013 12:38:30 -0700
Subject: Re: How to write fast into a file in python?
From: (e-mail address removed)
To: (e-mail address removed)
CC: (e-mail address removed)


With CPython 2.7.3:
./t
time taken to write a file of size 52428800 is 15.86 seconds

time taken to write a file of size 52428800 is 7.91 seconds

time taken to write a file of size 52428800 is 9.64 seconds


With pypy-1.9:
./t
time taken to write a file of size 52428800 is 3.708232 seconds

time taken to write a file of size 52428800 is 4.868304 seconds

time taken to write a file of size 52428800 is 1.93612 seconds
Here's the code:

#!/usr/local/pypy-1.9/bin/pypy
#!/usr/bin/python

# Python 2 code, as posted.
import sys
import time
import StringIO

sys.path.insert(0, '/usr/local/lib')
import bufsock

def create_file_numbers_old(filename, size):
    # Baseline: one write() per line, checking f.tell() on every iteration.
    start = time.clock()

    value = 0
    with open(filename, "w") as f:
        while f.tell() < size:
            f.write("{0}\n".format(value))
            value += 1

    end = time.clock()

    print "time taken to write a file of size", size, " is ", (end - start), "seconds \n"

def create_file_numbers_bufsock(filename, intended_size):
    # Track the size by hand and let bufsock batch the writes.
    start = time.clock()

    value = 0
    with open(filename, "w") as f:
        bs = bufsock.bufsock(f)
        actual_size = 0
        while actual_size < intended_size:
            string = "{0}\n".format(value)
            # len(string) already counts the "\n"; the extra 1 is as posted.
            actual_size += len(string) + 1
            bs.write(string)
            value += 1
        bs.flush()

    end = time.clock()

    print "time taken to write a file of size", intended_size, " is ", (end - start), "seconds \n"


def create_file_numbers_file_like(filename, intended_size):
    # Build everything in a StringIO buffer, then write it out in one call.
    start = time.clock()

    value = 0
    with open(filename, "w") as f:
        file_like = StringIO.StringIO()
        actual_size = 0
        while actual_size < intended_size:
            string = "{0}\n".format(value)
            # len(string) already counts the "\n"; the extra 1 is as posted.
            actual_size += len(string) + 1
            file_like.write(string)
            value += 1
        file_like.seek(0)
        f.write(file_like.read())

    end = time.clock()

    print "time taken to write a file of size", intended_size, " is ", (end - start), "seconds \n"

create_file_numbers_old('output.txt', 50 * 2**20)
create_file_numbers_bufsock('output2.txt', 50 * 2**20)
create_file_numbers_file_like('output3.txt', 50 * 2**20)
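
For readers on Python 3, here is a rough sketch of the same "build it in memory, write it once" idea. It is an adaptation, not the code that produced the timings in this thread: io.StringIO replaces both StringIO and cStringIO, time.perf_counter() stands in for time.clock() (removed in Python 3.8), and the function name, output path and 1 MiB size are chosen only to keep the demo quick.

```python
import io
import os
import tempfile
import time

def create_file_numbers_buffered(filename, intended_size):
    """Build the whole payload in memory, then write it with a single call."""
    start = time.perf_counter()
    buffer = io.StringIO()
    actual_size = 0
    value = 0
    while actual_size < intended_size:
        line = "{0}\n".format(value)
        actual_size += len(line)  # ASCII only, so one character is one byte
        buffer.write(line)
        value += 1
    # newline="" switches off newline translation, so the size also holds on Windows.
    with open(filename, "w", newline="") as f:
        f.write(buffer.getvalue())
    return time.perf_counter() - start

target = os.path.join(tempfile.mkdtemp(), "output_buffered.txt")  # hypothetical path
elapsed = create_file_numbers_buffered(target, 1 * 2**20)
written = os.path.getsize(target)
print("wrote", written, "bytes in", round(elapsed, 3), "seconds")
```

The file can overshoot the requested size by up to one line, because the loop only checks the size before appending each line.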





Chris Angelico

Thanks Dan! I've never used CPython or PyPy. Will try them later.

CPython is the "classic" interpreter, written in C. It's the one
you'll get from the obvious download links on python.org.

ChrisA
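
If it is unclear which interpreter a script is running under, the standard library can report it. A minimal sketch for Python 3; the printed values depend on the installation, so none are shown here.

```python
import platform
import sys

# Name of the running implementation, for example "CPython" or "PyPy".
implementation = platform.python_implementation()

# First token of sys.version is the language version number.
version = sys.version.split()[0]

print(implementation, version)
```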
 
Carlos Nepomuceno

Oops! I meant to say Cython. Never mind...

MRAB

----------------------------------------
Date: Sat, 18 May 2013 22:41:32 -0400
From: (e-mail address removed)
To: (e-mail address removed)
Subject: Re: How to write fast into a file in python?



That's backwards. '\r\n' on Windows, IF you omit the b in the mode when
creating the file.

Indeed! My mistake just made me find out that Acorn used that inversion on Acorn MOS.

According to this[1] (at page 449) the OSNEWL routine outputs '\n\r'.

What the hell were those guys thinking??? :p

Doing it that way saved a few bytes.

Code was something like this:

FFE3 .OSASCI CMP #&0D
FFE5 BNE OSWRCH
FFE7 .OSNEWL LDA #&0A
FFE9 JSR OSWRCH
FFEC LDA #&0D
FFEE .OSWRCH ...

This means that the contents of the accumulator would always be
preserved by a call to OSASCI.

"OSNEWL
This call issues an LF CR (line feed, carriage return) to the currently selected
output stream. The routine is entered at &FFE7."

[1] http://regregex.bbcmicro.net/BPlusUserGuide-1.07.pdf
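
The fall-through trick in the listing above can be modelled in a few lines of Python. This is only a toy model: it assumes OSWRCH leaves the accumulator unchanged (the post implies this, but the listing does not show the routine), and the names make_machine, output and final_a are invented for the sketch.

```python
def make_machine():
    """Return (osasci, output) for a toy model of the three entry points."""
    output = []

    def oswrch(a):
        # Write the character held in the accumulator; A is assumed unchanged.
        output.append(a)
        return a

    def osnewl():
        a = 0x0A            # LDA #&0A
        oswrch(a)           # JSR OSWRCH
        a = 0x0D            # LDA #&0D
        return oswrch(a)    # execution falls through into OSWRCH

    def osasci(a):
        if a != 0x0D:       # CMP #&0D / BNE OSWRCH
            return oswrch(a)
        return osnewl()     # execution falls through into OSNEWL

    return osasci, output

osasci, output = make_machine()

# Feed "Hi" followed by a carriage return and record A after each call.
final_a = [osasci(ord(c)) for c in "Hi\r"]

print(bytes(output))
print(final_a == [ord(c) for c in "Hi\r"])
```

Emitting CR last means the accumulator already holds &0D, the value OSASCI was called with, when the routine returns, so no instructions are needed to save and restore it. Sending CR LF instead would have left &0A in the accumulator.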
 
Carlos Nepomuceno

Oh well! I just got a flashback to the old days on the 8-bit assembly line...

Dirty deeds done dirt cheap! lol

----------------------------------------
Date: Sun, 19 May 2013 16:44:55 +0100
From: (e-mail address removed)
To: (e-mail address removed)
Subject: Re: How to write fast into a file in python?

----------------------------------------
Date: Sat, 18 May 2013 22:41:32 -0400
From: (e-mail address removed)
To: (e-mail address removed)
Subject: Re: How to write fast into a file in python?

On 05/18/2013 01:00 PM, Carlos Nepomuceno wrote:
Python really writes '\n\r' on Windows. Just check the files.

 
