Accessing large (>2GB) file: fopen/fread succeeds, open/read fails

Rolf Schroedter

(Sorry for cross-posting).
I need to access large files > 2GByte (Linux, WinXP/NTFS)
using the standard C-library calls.
Until today I thought I knew how to do it, namely for
Win32:
Use open(), read(), _telli64(), _lseeki64() with type __int64
Linux/Cygwin:
#define _FILE_OFFSET_BITS 64
Use open(), read(), lseek() with type off_t

My environment is
- WindowsXP, SP2 + latest updates
- VC 6.0 + SP6
- NTFS

Now the following little program using sequential reads
fails for me when using low-level File I/O functions
open(), read()
but succeeds with stream I/O
fopen(), fread().

What drives me crazy is that for a large project - a GUI using the fltk
library - I'm doing heavy open(),read() and _lseeki64() skipping through
the same large video files without any problem.

Any ideas are highly appreciated.

Here is the small sample program. Note that the seek() is only
to accelerate the test, it can be removed without changing the result.
Use it with any >2GB file:


/**********************************************************************
* open/read fails, fopen/fread succeeds for files >2GB
***********************************************************************/
//#define _INTEGRAL_MAX_BITS 64

#include <io.h>
#include <fcntl.h>

#include <stdio.h>
#include <stdlib.h>

/**********************************************************************
* Main
***********************************************************************/
int main(int argc, char **argv)
{
int i, result;
char buf[1000000];

/*
* Parse command line parameters
*/
if( (argc < 2) || (argc % 2 != 0) ) {
printf("call: %s filename\n", argv[0]);
exit(1);
}
#if 0
{
int fh;
fh = open(argv[1], O_RDONLY | O_BINARY);
if( fh < 0 ) {
printf("Unable to open FITS '%s'\n", argv[1]);
perror("");
exit(1);
}
lseek(fh, 2000*1000000, SEEK_SET);

i = 0;
while( ! eof(fh) ) {
result = read(fh, buf, sizeof(buf));
printf("%d: read %d bytes\n", ++i, result);
}
}
#else
{
FILE *f;
f = fopen(argv[1], "rb");
if( f == NULL ) {
printf("Unable to open FITS '%s'\n", argv[1]);
perror("");
exit(1);
}
fseek(f, 2000*1000000, SEEK_SET);

i = 0;
while( ! feof(f) ) {
result = fread(buf, 1, sizeof(buf), f);
printf("%d: read %d bytes\n", ++i, result);
}
}
#endif
printf("<eof>\n");
}
 
Michael Mair

Rolf said:
(Sorry for cross-posting).
I need to access large files > 2GByte (Linux, WinXP/NTFS)
using the standard C-library calls.
Until today I thought I knew how to do it, namely for
Win32:
Use open(), read(), _telli64(), _lseeki64() with type __int64
Linux/Cygwin:
#define _FILE_OFFSET_BITS 64
Use open(), read(), lseek() with type off_t

My environment is
- WindowsXP, SP2 + latest updates
- VC 6.0 + SP6
- NTFS

Now the following little program using sequential reads
fails for me when using low-level File I/O functions
open(), read()
but succeeds with stream I/O
fopen(), fread().

This is fine from the C point of view -- we do not know open() and
read()... If you want to restrict yourself to standard C library
calls, you will have to go with fopen() and fread().
For large files, ftell()/fseek() are obviously not the right choice;
use fgetpos()/fsetpos() instead.
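
As a rough illustration of the fgetpos()/fsetpos() suggestion, here is a minimal sketch in standard C. The helper name peek_bytes is hypothetical, not something from the thread. It remembers the current position, reads ahead, and then returns to the remembered position without ever storing the offset in a long.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Read up to n bytes from the current position of f into buf, then
 * restore the stream to the position it had on entry.
 * Returns the number of bytes read, or (size_t)-1 if the position
 * could not be saved or restored.
 * (peek_bytes is a hypothetical helper name used for illustration.)
 */
size_t peek_bytes(FILE *f, void *buf, size_t n)
{
    fpos_t pos;   /* opaque position; not limited to the range of long */
    size_t got;

    assert(f != NULL && buf != NULL);

    if (fgetpos(f, &pos) != 0)
        return (size_t)-1;

    got = fread(buf, 1, n, f);

    /* A successful fsetpos() also clears the end-of-file indicator. */
    if (fsetpos(f, &pos) != 0)
        return (size_t)-1;

    return got;
}
```

One caveat: fpos_t is opaque, so fsetpos() can only return to a position previously obtained from fgetpos() on the same stream. It does not provide a portable way to jump to an arbitrary byte offset beyond 2GB, and whether fpos_t actually covers such offsets depends on the implementation.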

[snip]


Cheers
Michael
 
Old Wolf

Rolf said:
(Sorry for cross-posting).
I need to access large files > 2GByte (Linux, WinXP/NTFS)
using the standard C-library calls.
Until today I thought I knew how to do it, namely for
Win32:
Use open(), read(), _telli64(), _lseeki64() with type __int64
Linux/Cygwin:
#define _FILE_OFFSET_BITS 64
Use open(), read(), lseek() with type off_t

None of those is a standard C library call.

Now the following little program using sequential reads
fails for me when using low-level File I/O functions
open(), read()
but succeeds with stream I/O
fopen(), fread().

The standard functions work and the non-standard ones don't.
Not surprising..

For help with the non-standard functions, you should ask in
a system-specific group.

i = 0;
while( ! feof(f) ) {
result = fread(buf, 1, sizeof(buf), f);
printf("%d: read %d bytes\n", ++i, result);
}

Incorrect use of feof. (Read its man page, or the newsgroup FAQ).
 
Peter Nilsson

Old said:
Incorrect use of feof. (Read its man page, or the newsgroup FAQ).

Strictly speaking, the sample doesn't show incorrect use of
feof(), in the way of resultant undefined behaviour. The problem
here is that the last loop will show "##: read 0 bytes", and
there is an infinite loop if a read error occurs.

[I found it sad that M$'s sample of feof shows just such a loop.

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dv_wcecrt4/html/erlrffeof.asp

Unfortunately, the construct is correct within the sample, but is
_highly_ misleading.]

In a less trivial loop the potential for more serious error is
greatly increased.
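
For comparison, a common way to avoid both symptoms is to let the return value of fread() drive the loop and to consult ferror() only afterwards. The sketch below assumes a hypothetical helper named count_chunks; it keeps the byte count in a size_t and prints the same progress lines as the original sample.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Read f to the end in chunks of at most bufsize bytes, printing one
 * line per chunk. Stores the number of non-empty chunks in *chunks.
 * Returns 0 if the loop ended at end-of-file, -1 on a read error.
 * (count_chunks is a hypothetical helper name used for illustration.)
 */
int count_chunks(FILE *f, char *buf, size_t bufsize, unsigned long *chunks)
{
    size_t got;

    assert(f != NULL && buf != NULL && bufsize > 0 && chunks != NULL);

    *chunks = 0;
    /* fread() returns 0 at end-of-file and on error, so the loop
       terminates in both cases and never reports a 0-byte read. */
    while ((got = fread(buf, 1, bufsize, f)) > 0) {
        ++*chunks;
        printf("%lu: read %lu bytes\n", *chunks, (unsigned long)got);
    }

    /* Only now distinguish end-of-file from a read error. */
    return ferror(f) ? -1 : 0;
}
```

Here feof() and ferror() are used after the loop to find out why reading stopped, rather than before each read to predict whether it will succeed.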
 
Old Wolf

Peter said:
Strictly speaking, the sample doesn't show incorrect use of
feof(), in the way of resultant undefined behaviour. The problem
here is that the last loop will show "##: read 0 bytes", and
there is an infinite loop if a read error occurs.

Right. There is another problem which I didn't notice at first:
fread() returns a size_t but the OP assigns it to an int. If
the returned count exceeded INT_MAX (2^31 - 1 for a 32-bit int),
the conversion would either give an implementation-defined result
or raise an implementation-defined signal (possibly aborting the
program).
 
Rolf Schroedter

Thanks for your remarks.
But please be aware that this is a quick-and-dirty example
kept small to illustrate the problem.

If there was only this sample I would conclude that
open/read doesn't work for MS-Windows.
But I have a larger program using FLTK, where
heavy low-level read/_lseeki64 work fine on files >2GB...

I see that probably c.l.c is not the right place to ask.
I would need something like comp.lang.c.cross-platform
 
Olof Lagerkvist

[FUT: c.o.m-w.p.win32]

Rolf said:
Thanks for your remarks.
But please be aware that this is a quick-and-dirty example
kept small to illustrate the problem.

If there was only this sample I would conclude that
open/read doesn't work for MS-Windows.
But I have a larger program using FLTK, where
heavy low-level read/_lseeki64 work fine on files >2GB...

I see that probably c.l.c is not the right place to ask.
I would need something like comp.lang.c.cross-platform

Eh, no. As others have pointed out, open/read/write etc. are not part of
any C standard at all; they are the low-level I/O API of the POSIX
standards. The Win32 API is not POSIX-compatible, and the CRT libraries
that run under Win32 support only a small, emulated subset of the POSIX
API: open/read/write etc. are translated to
CreateFile/ReadFile/WriteFile etc., so they are not low-level APIs in the
Win32 subsystem.

So if you really want low-level I/O and not stream I/O, I would
suggest you use the low-level API of each operating system, i.e.
open/read/write on Linux and other POSIX systems (including Cygwin or
Interix under Windows), and CreateFile/ReadFile/WriteFile under Win32.
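
A minimal sketch of that split, assuming a hypothetical wrapper read_at() and a hypothetical offset typedef file_off (neither appears in the thread). The Win32 branch uses CreateFileA, SetFilePointerEx and ReadFile; SetFilePointerEx needs Windows 2000 or later and may need a newer Platform SDK than the headers shipped with VC 6.0. The POSIX branch relies on _FILE_OFFSET_BITS 64 so that off_t is 64 bits wide.

```c
#define _FILE_OFFSET_BITS 64   /* must precede every #include on Linux/Cygwin */

#include <assert.h>
#include <limits.h>
#include <stddef.h>

#if defined(_WIN32)
#include <windows.h>
typedef __int64 file_off;      /* hypothetical name for a 64-bit offset */

/* Read up to n bytes starting at byte `offset` of the file `path`.
   Returns the number of bytes read (0 at end-of-file) or -1 on error. */
long read_at(const char *path, file_off offset, void *buf, unsigned long n)
{
    HANDLE h;
    LARGE_INTEGER pos;
    DWORD got = 0;
    long result = -1;

    assert(path != NULL && buf != NULL && n <= (unsigned long)LONG_MAX);

    h = CreateFileA(path, GENERIC_READ, FILE_SHARE_READ, NULL,
                    OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
    if (h == INVALID_HANDLE_VALUE)
        return -1;

    pos.QuadPart = offset;     /* full 64-bit position, no 2GB limit */
    if (SetFilePointerEx(h, pos, NULL, FILE_BEGIN) &&
        ReadFile(h, buf, (DWORD)n, &got, NULL))
        result = (long)got;

    CloseHandle(h);
    return result;
}
#else
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>
typedef off_t file_off;        /* 64 bits wide with _FILE_OFFSET_BITS 64 */

/* Same contract as the Win32 version above. */
long read_at(const char *path, file_off offset, void *buf, unsigned long n)
{
    int fd;
    ssize_t got = -1;

    assert(path != NULL && buf != NULL && n <= (unsigned long)LONG_MAX);

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    if (lseek(fd, offset, SEEK_SET) != (off_t)-1)
        got = read(fd, buf, n);

    close(fd);
    return (long)got;
}
#endif
```

Opening and closing the file on every call keeps the sketch short; a real program that skips through a video file would more likely keep the handle open and only reposition it.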
 
