Bug in file.write()?

Patrick Useldinger

Hi,

I think I found a bug in the write method of file objects. It seems as
if before writing each block, a check was done in order to verify
that there is enough space left for the *whole* file, not for the
*remaining* data to be written.

It happens both under 2.3 and 2.2.3.
Any ideas?

========================== Python 2.3 ==========================
I:\My Programs\dfc>b2

start dfc.py [v.0.19.final (July 31st, 2003)] @ 2003-08-01 00:21:48
Python 2.3.final running on win32
reading configuration file dfcCfgBackupCd
instantiating processor(s) .
paths & includes/excludes taken from configuration file
creating initial reference point i:\dfc\ref\dfcRefBackupCd.dfc
..copying i:\dfc\arc\dfcArchive cheetah 20030731-234648 F.zip to
f:\dfcArchive cheetah 20030731-234648 F.zip
Traceback (most recent call last):
  File "I:\My Programs\dfc\dfc.py", line 199, in ?
    dfc.doProcess(cfgFile.DFCProcTags)
  File "I:\My Programs\dfc\dfc.py", line 144, in doProcess
    newStat=self.newStat.get(fileName,None)) == None:
  File "I:\My Programs\dfc\dfc.py", line 129, in process
    return self.pubSub.publishMessage(self.pubSubProcess,kwargs,checkRet=True)
  File "I:\My Programs\dfc\PubSub.py", line 170, in publishMessage
    retVal.append(subscriber(**dict(args)))
  File "I:\My Programs\dfc\dfcProcCopy.py", line 20, in process
    shutil.copyfile(fileName,toFile)
  File "C:\Python23\lib\shutil.py", line 39, in copyfile
    copyfileobj(fsrc, fdst)
  File "C:\Python23\lib\shutil.py", line 24, in copyfileobj
    fdst.write(buf)
IOError: [Errno 28] No space left on device

I:\My Programs\dfc>dir f:
Volume in drive F is Backup 01
Volume Serial Number is E2CB-1650

Directory of F:\

18/05/2002 15:39 <DIR> .
18/05/2002 15:39 <DIR> ..
01/08/2003 00:25 299.630.592 dfcArchive cheetah 20030731-234648 F.zip
1 File(s) 299.630.592 bytes
2 Dir(s) 299.636.736 bytes free

========================= Python 2.2.3 =========================

I:\My Programs\dfc>i:\py222\python.exe dfc.py dfcCfgBackupCd

start dfc.py [v.0.19.final (July 31st, 2003)] @ 2003-08-01 00:29:08
Python 2.2.3.final running on win32
reading configuration file dfcCfgBackupCd
instantiating processor(s) .
paths & includes/excludes taken from configuration file
creating initial reference point i:\dfc\ref\dfcRefBackupCd.dfc
..copying i:\dfc\arc\dfcArchive cheetah 20030731-234648 F.zip to
f:\dfcArchive cheetah 20030731-234648 F.zip
Traceback (most recent call last):
  File "dfc.py", line 199, in ?
    dfc.doProcess(cfgFile.DFCProcTags)
  File "dfc.py", line 144, in doProcess
    newStat=self.newStat.get(fileName,None)) == None:
  File "dfc.py", line 129, in process
    return self.pubSub.publishMessage(self.pubSubProcess,kwargs,checkRet=True)
  File "PubSub.py", line 170, in publishMessage
    retVal.append(subscriber(**dict(args)))
  File "dfcProcCopy.py", line 20, in process
    shutil.copyfile(fileName,toFile)
  File "i:\py222\lib\shutil.py", line 30, in copyfile
    copyfileobj(fsrc, fdst)
  File "i:\py222\lib\shutil.py", line 20, in copyfileobj
    fdst.write(buf)
IOError: [Errno 28] No space left on device
========================================================================
 
John Machin

> Hi,
>
> I think I found a bug in the write method of file objects. It seems as
> if before writing each block, a check was done in order to verify
> that there is enough space left for the *whole* file, not for the
> *remaining* data to be written.

This is very unlikely --- it is difficult to determine reliably the
available space on a disk or tape volume; if one were really
interested, one would do it once before trying to write the file, not
before writing each block. No, the expected implementation, which you
can check against reality by inspecting the source (fileobject.c), is
just to attempt to write, and throw an informative-as-possible
exception if the write fails.

Despite the appearance that you have about 6KB margin of safety, you
will probably find that you don't have any --- the quoted size of the
source file doesn't include the unused space in the last cluster or
block or whatever. It's a while since I've had to concern myself with
this sort of detail, but it's a good bet that the "chunk" size on your
disk is 8Kb or more and you're out of luck.
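
The rounding effect described above can be sketched in a few lines of modern Python 3 (not the 2.2/2.3 interpreters used in this thread). The function names and the idea of passing the cluster size in by hand are assumptions made for illustration; the real allocation unit depends on the file system and is not stated here.

```python
def allocated_size(file_size, cluster_size):
    """Bytes a file occupies when space is handed out in whole clusters."""
    if cluster_size <= 0:
        raise ValueError("cluster_size must be positive")
    # Ceiling division: a partly used last cluster still costs a full cluster.
    clusters = -(-file_size // cluster_size)
    return clusters * cluster_size


def fits(file_size, free_bytes, cluster_size):
    """True if the rounded-up allocation fits into the reported free space."""
    return allocated_size(file_size, cluster_size) <= free_bytes
```

A file whose size is not a multiple of the cluster size needs up to one cluster more than its byte count suggests, so a small apparent margin can vanish. Directory and allocation metadata are not modelled in this sketch.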
> It happens both under 2.3 and 2.2.3.
> Any ideas?
[snip]

>   File "C:\Python23\lib\shutil.py", line 39, in copyfile
>     copyfileobj(fsrc, fdst)
>   File "C:\Python23\lib\shutil.py", line 24, in copyfileobj
>     fdst.write(buf)
> IOError: [Errno 28] No space left on device
>
> I:\My Programs\dfc>dir f:
> Volume in drive F is Backup 01
> Volume Serial Number is E2CB-1650
>
> Directory of F:\
>
> 18/05/2002 15:39 <DIR> .
> 18/05/2002 15:39 <DIR> ..
> 01/08/2003 00:25 299.630.592 dfcArchive cheetah 20030731-234648 F.zip
> 1 File(s) 299.630.592 bytes
> 2 Dir(s) 299.636.736 bytes free
[snip]
 
Bengt Richter

> This is very unlikely --- it is difficult to determine reliably the
> available space on a disk or tape volume; if one were really
> interested, one would do it once before trying to write the file, not
> before writing each block. No, the expected implementation, which you
> can check against reality by inspecting the source (fileobject.c), is
> just to attempt to write, and throw an informative-as-possible
> exception if the write fails.
>
> Despite the appearance that you have about 6KB margin of safety, you
> will probably find that you don't have any --- the quoted size of the
> source file doesn't include the unused space in the last cluster or
> block or whatever. It's a while since I've had to concern myself with
> this sort of detail, but it's a good bet that the "chunk" size on your
> disk is 8Kb or more and you're out of luck.
Besides chunks of data, directories and allocation info also take up nonzero
space, and unless you have a file system with a fixed reserve for that stuff
(maybe some unusual custom floppy format?) it's going to take its byte out
of the total. How much will depend on how that nondata stuff is implemented.
A thick tree of directories max deep and all single-cluster data files will
surely be different from a single large file.
> > It happens both under 2.3 and 2.2.3.
> > Any ideas?
[snip]

> >   File "C:\Python23\lib\shutil.py", line 39, in copyfile
> >     copyfileobj(fsrc, fdst)
> >   File "C:\Python23\lib\shutil.py", line 24, in copyfileobj
> >     fdst.write(buf)
> > IOError: [Errno 28] No space left on device
> >
> > I:\My Programs\dfc>dir f:
> > Volume in drive F is Backup 01
> > Volume Serial Number is E2CB-1650
> >
> > Directory of F:\
> >
> > 18/05/2002 15:39 <DIR> .
> > 18/05/2002 15:39 <DIR> ..
> > 01/08/2003 00:25 299.630.592 dfcArchive cheetah 20030731-234648 F.zip
> > 1 File(s) 299.630.592 bytes
> > 2 Dir(s) 299.636.736 bytes free
[snip]

Regards,
Bengt Richter
 
Bengt Richter

> Bengt Richter wrote:
[Actually this is not the part I (Bengt) wrote, though I agree
that a file.write bug per se is unlikely ;-)]
> This is actually a CD-RW, and it has no files on it. I have used that
> very same CD earlier, and was able to fill it to its max.
>
> The margin is 50%, or 299 000 000 bytes, as you can see below. If I copy
> the file via the NT Explorer, it works ok, so the CD-RW is not the problem.
Sorry, I didn't really read your traceback. I went with the "6kb" ;-)
I wonder (1) whether the error message is for real or is another error
condition improperly reported, (2) whether you are in a threaded situation
where some kind of race condition could be involved, (3) whether a buffer
underrun could be the real cause behind (1), and (4) whether you can reformat
the CD-RW (as opposed to deleting files), or use a fresh blank, and still get
the same problem. Perhaps there could be a fragmentation problem?

Is the CD-RW set up to look like an ordinary disk to applications?
If so, you could try (note file extension ;-)

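# Write 299,630,592 filler bytes in large chunks, straight to the CD-RW.
# (Python 2 string literals; under Python 3 these would need to be bytes, e.g. b'-'.)
f = open(r'f:\dfcArchive cheetah 20030731-234648 F.zip_fake', 'wb')
million = '-' * (1000 * 1000)   # one 1,000,000-byte block
rest = '+' * 630592             # remainder, so the total matches the failed copy
for m in range(299):
    f.write(million)
f.write(rest)
f.close()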

And see if that gets there. Those are big chunks with no disk read delays,
vs shutil's 16k read-source/write-dest loop. This might give a clue for
chasing underrun or other timing problems.
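
As a complement to the big-chunk test, a chunked copy that reports how many bytes reached the destination before the first error might help narrow things down. This is only a sketch in modern Python 3 (not the 2.2/2.3 used in this thread). The function name `copy_with_offset` and the 16 KB default are assumptions for illustration, and it does not reproduce shutil's actual implementation.

```python
import errno


def copy_with_offset(src_path, dst_path, chunk_size=16 * 1024):
    """Copy src to dst in fixed-size chunks.

    Returns (bytes_written, error). error is None on success, otherwise
    the OSError raised by the failing call.
    """
    written = 0
    try:
        # buffering=0 makes each write go to the OS directly, so the byte
        # count is accurate when an error such as ENOSPC is raised.
        with open(src_path, "rb") as src, open(dst_path, "wb", buffering=0) as dst:
            while True:
                buf = src.read(chunk_size)
                if not buf:
                    break
                view = memoryview(buf)
                while len(view) > 0:
                    n = dst.write(view)  # unbuffered writes may be partial
                    written += n
                    view = view[n:]
    except OSError as exc:
        return written, exc
    return written, None


def is_disk_full(error):
    """True if the error is the 'No space left on device' case (Errno 28)."""
    return error is not None and error.errno == errno.ENOSPC
```

Running it with different `chunk_size` values would show whether the failure offset depends on the chunk size or always lands at the same byte count.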

If there is contention for the CDRW between threads or processes somehow,
maybe you have to arrange for a single writer fed by a queue or such?
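
A minimal sketch of the single-writer idea, in modern Python 3 rather than the 2.2/2.3 of this thread. All names here (`writer_loop`, `producer`, `write_via_queue`) are made up for illustration, and nothing in the thread confirms that dfc.py is threaded at all.

```python
import queue
import threading

_STOP = object()  # sentinel: tells the writer that no more data is coming


def writer_loop(path, chunks, errors):
    """Sole owner of the output file: drains the queue and writes each chunk."""
    try:
        with open(path, "wb") as out:
            while True:
                item = chunks.get()
                if item is _STOP:
                    break
                out.write(item)
    except OSError as exc:
        errors.append(exc)  # e.g. a disk-full error, handed back to the caller


def producer(chunks, batch):
    """Any number of these may run; they never touch the file themselves."""
    for chunk in batch:
        chunks.put(chunk)


def write_via_queue(path, batches):
    """Run one writer thread fed by one producer thread per batch."""
    chunks = queue.Queue()  # unbounded, so producers never block on a dead writer
    errors = []
    writer = threading.Thread(target=writer_loop, args=(path, chunks, errors))
    writer.start()
    workers = [threading.Thread(target=producer, args=(chunks, b)) for b in batches]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    chunks.put(_STOP)
    writer.join()
    return errors
```

The order in which chunks from different producers reach the file is not deterministic; the point is only that a single thread ever issues writes to the device.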

Hope this gives you a useful idea.

Regards,
Bengt Richter
 
John Machin

[John]
Despite the appearance that you have about 6KB margin of safety, you
[Patrick]
The margin is 50%, or 299 000 000 bytes, as you can see below. If I copy
the file via the NT Explorer, it works ok, so the CD-RW is not the problem.
[Bengt]
Sorry, I didn't really read your traceback. I went with the "6kb" ;-)

John>
Sorry, Bengt, don't believe everything you read. I misread the
tracebacks.
What Patrick's output seems to show is that after the failed copy, the
CDRom is almost exactly half full. This is suspicious. It's also
suspicious that the total is about 600Mb; shouldn't it be 640Mb or
700Mb??
What Patrick's output doesn't show is what is the size of the input
file (from the I: drive). This would be very useful information. After
the failure, there is a 299Mb file on the CDRom. Is that the whole
file or not? Does the failure happen after 0% 50% or 100% of the time
taken by a successful copy using the NT Explorer? After the failure,
can you copy say 100Kb file to the CDRom i.e. is the 299Mb free space
spurious?

[Bengt] I wonder if 1) the error message is for real or is another
error condition improperly reported

John> Me too. It's not impossible for a write failure of various sorts
to be reported as "file full".

Further observations: (1) IMHO you are taking a big gamble using stdio
to copy to a CDrom. Is I: a network drive? I was under the impression
that even with "professional" purpose-built CD-burning software, one
should burn from a hard-drive, not be running other applications at
the same time, ...

Is the type of CD that you are using the sort that allows you to write
multiple times but only the last "session" is visible, and the
previous sessions still take up space -- i.e. it's not erasable, you
can't treat it in all respects just like a big fat 700Mb floppy disk?
 
Patrick Useldinger

Hi all,
> What Patrick's output seems to show is that after the failed copy, the
> CDRom is almost exactly half full. This is suspicious. It's also
> suspicious that the total is about 600Mb; shouldn't it be 640Mb or
> 700Mb??

It's a CD-RW 80 min, formatted with B's Clip with file system UDF1.50.
After formatting, Windows Explorer shows a capacity of 572MB, with 50KB
being used (for catalogue and allocation tables, I suppose).
> What Patrick's output doesn't show is what is the size of the input
> file (from the I: drive). This would be very useful information. After
> the failure, there is a 299Mb file on the CDRom. Is that the whole
> file or not?

The original file is 308.711KB in size; so the copied file is 94,94% of
the original.
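
The percentage can be checked with a little arithmetic. This sketch assumes that Explorer's "KB" means 1024 bytes and takes the partial-file size from the directory listing further down in this post (300.122.112 bytes), which differs slightly from the 299.630.592 bytes of the earlier run.

```python
KB = 1024  # assumption: Explorer reports sizes in units of 1024 bytes

original_bytes = 308711 * KB   # "308.711KB", the source file on I:
copied_bytes = 300122112       # partial file on F: in the listing of this run

percent_copied = 100.0 * copied_bytes / original_bytes
missing_bytes = original_bytes - copied_bytes

print("%.2f%% copied, %d bytes missing" % (percent_copied, missing_bytes))
```

Under that assumption the copy stopped roughly 16 MB short of the end of the file.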
> Does the failure happen after 0% 50% or 100% of the time
> taken by a successful copy using the NT Explorer?

I just did a test, it is after 92,5% of the time used by Explorer.
> After the failure,
> can you copy say 100Kb file to the CDRom i.e. is the 299Mb free space
> spurious?

I just copied OpenOffice 1.0.3, i.e. 50MB, without any problem:

Directory of F:\

02/08/2003 22:41 <DIR> .
02/08/2003 22:41 <DIR> ..
03/08/2003 08:58 300.122.112 dfcArchive cheetah 20030731-234648 F.zip
12/04/2003 18:54 51.955.157 OOo_1.0.3_Win32Intel_install.zip
2 File(s) 352.077.269 bytes
2 Dir(s) 248.174.592 bytes free
> [Bengt] I wonder if 1) the error message is for real or is another
> error condition improperly reported
> John> Me too. It's not impossible for a write failure of various sorts
> to be reported as "file full".

Quite possible, as I repeated a similar test with a floppy disk, but
that one worked OK (with a file larger than 50% of the disk's capacity).
I have also repeated the procedure with different CD-RWs of different
makes and capacities, but the 50% figure stays the same.
> Further observations: (1) IMHO you are taking a big gamble using stdio
> to copy to a CDrom. Is I: a network drive? I was under the impression
> that even with "professional" purpose-built CD-burning software, one
> should burn from a hard-drive, not be running other applications at
> the same time, ...

I: is my local 2nd harddisk, so I am really copying from a local
harddisk to a CD-RW. Is there any other means I should have used in
Python? I used shutil.copy because it works cross-platform, and because
it copies permission information.
> Is the type of CD that you are using the sort that allows you to write
> multiple times but only the last "session" is visible, and the
> previous sessions still take up space -- i.e. it's not erasable, you
> can't treat it in all respects just like a big fat 700Mb floppy disk?

Not with UDF1.5, as far as I know. I've been using CD-RWs for a few
years now, and never experienced this situation.

Regards,
-Patrick
 
Bengt Richter

[... more on CDRW mystery ...]

By any chance are you doing compression implicitly with the compressor using
the destination disk as temp space in some way that requires a complete
temp image before copying and before deleting? Maybe if so it can be
reconfigured to use space elsewhere or just rename (hm, rename on CDRW
isn't implemented as copy, based on renamee not being modifiable, is it?)

Still wonder about threads too...

Regards,
Bengt Richter
 
'cHVAdm8ubHU=\\n'.decode('base64')

Bengt said:
> By any chance are you doing compression implicitly with the compressor using
> the destination disk as temp space in some way that requires a complete
> temp image before copying and before deleting? Maybe if so it can be
> reconfigured to use space elsewhere or just rename (hm, rename on CDRW
> isn't implemented as copy, based on renamee not being modifiable, is it?)
No, the step that created the zip completely runs on the harddisk. The
step that fails simply does a copy.

Is anyone able to reproduce the error? i.e.
1- take a file that's more than 50% of the target CD-RW in size
2- do a simple shutil.copy(from, to)
and get the same result?
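
For anyone willing to try, the steps above could be wrapped in a small script like the following. It is a sketch in modern Python 3 (`shutil.disk_usage` does not exist in the 2.2/2.3 interpreters used in this thread), and the function name `try_copy` and the keys of the result dictionary are made up for illustration.

```python
import errno
import os
import shutil


def try_copy(src, dst_dir):
    """Copy src into dst_dir with shutil.copy and record what happened."""
    dst = os.path.join(dst_dir, os.path.basename(src))
    result = {
        "size": os.path.getsize(src),
        "free_before": shutil.disk_usage(dst_dir).free,
        "error": None,
        "disk_full_reported": False,
    }
    try:
        shutil.copy(src, dst)
    except OSError as exc:
        result["error"] = exc
        # Errno 28 is the "No space left on device" case from the traceback.
        result["disk_full_reported"] = exc.errno == errno.ENOSPC
    # A partial file may be left behind after a failed copy.
    result["bytes_at_dest"] = os.path.getsize(dst) if os.path.exists(dst) else 0
    result["free_after"] = shutil.disk_usage(dst_dir).free
    return result
```

Comparing `size`, `bytes_at_dest` and the two free-space figures on a failing run would show whether the copy really ran out of room or stopped with plenty of space left.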

I'd just like to know if it is my PC only, or a more general issue. Just
to keep the developers from chasing phantoms.
> Still wonder about threads too...
If it doesn't make any difference to your well-being, then don't. That's
what Buddha said ;-)
 
