file of exact size

W

Wade Ward

How do I use the C programming language to create a file that is exactly 2563695577
bytes? The one line I think I know in this program is:
long long m = 2563695577;

EOF on my implementation is carriage return line feed (I think). I don't
know if that is the 2563695578th byte or even the 2563695578th and the
2563695579th.

The number in question is about two and a half billion. I can't remember
what the minimum maximum is for this datatype, but I think I'm within an
order of magnitude. Telling me to read the manual won't work, because I
don't have a C compiler.

Thanks in advance,
Gruß,
--
Wade Ward
(e-mail address removed)
'If they took all the "And it came to pass's" out
of the Book of Mormon, it would be a pamphlet.'
--Mark Twain
 
M

Michal Nazarewicz

Wade Ward said:
How do I use the C Programming Language to create a files that is
2563695577 bytes.

I believe this is a platform-dependent question. Consult your operating
system's newsgroup. 2563695577 is more than the maximum value of a signed
32-bit integer (2147483647), and therefore some systems may have problems
operating on such files (in fact, in some situations it may be impossible
because of file system limitations) unless you compile your program in
a special way (i.e. on Unix you'd compile your program so that
it uses 64-bit file offsets instead of 32-bit). But as I've said, it
is all implementation-specific.
 
F

Flash Gordon

Michal Nazarewicz wrote, On 15/09/07 08:48:
I believe it is platform dependent question. Consult your operating
system's newsgroup. 2563695577 is more than a maximum value of a 32-bit
integer number and therefore some systems may have problems operating on
such files (in fact, in some situations it may be impossible to do
because of file system's limitations) unless you compile your program in
special way

On some systems it is impossible. Full stop. Some systems cannot have a
file system that large. For instance Windows used to only support
partitions up to 2GB, and within that 2GB some space was used for
structures, so there was no place large enough to store such a file.

(i.e. in Unix you'd compile your program in such a way that
it'll use 64-bit file offsets instead of 32-bit). But as I've said it
is all implementation specific.

It may, of course, be possible to create such a file using purely
standard C by, for example, creating the file and writing that number of
characters.
 
R

Richard Heathfield

Flash Gordon said:

It may, of course, be possible to create such a file using purely
standard C by, for example, creating the file and writing that number
of characters.

It's trivial, of course, to make the attempt.

#include <stdio.h>
#include <stdlib.h>
int main(void)
{
    /* 2282899 * 1123 == 2563695577, the requested byte count */
    int errupt = 0;
    unsigned long wait = 2282899UL;
    fputs("please wait\n", stderr);
    while(!errupt && wait--)
    {
        unsigned int erest = 1123;
        while(!errupt && erest--)
        {
            errupt = (putchar('\n') == EOF);
        }
    }
    if(errupt)
    {
        fputs("rats!\n", stderr);
    }
    return errupt ? EXIT_FAILURE : EXIT_SUCCESS;
}
 
W

Wade Ward

Michal Nazarewicz said:
I believe it is platform dependent question. Consult your operating
system's newsgroup. 2563695577 is more than a maximum value of a 32-bit
integer number and therefore some systems may have problems operating on
such files (in fact, in some situations it may be impossible to do
because of file system's limitations) unless you compile your program in
special way (i.e. in Unix you'd compile your program in such a way that
it'll use 64-bit file offsets instead of 32-bit). But as I've said it
is all implementation specific.

I think if I knocked off 90% of the number in the original post, I would be
within 32 bits, but I don't think that part of it is important. LLONG_MAX
is around there.

Let's say, instead, that you want to create a thousand files of size 10^8
bytes. The implementation-specific part, how your machine or platform
represents EOF, is much smaller than the files in question.

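#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int i, loop = 1000;
    long long n, m = 100000000;
    char name[32];

    for (i = 0; i < loop; ++i)
    {
        FILE *fp;
        sprintf(name, "file%04d.bin", i);  /* hypothetical file name */
        fp = fopen(name, "wb");            /* binary mode: no newline translation */
        if (fp == NULL)
            return EXIT_FAILURE;
        for (n = 0; n < m; ++n)            /* write m copies of one character */
            if (putc('x', fp) == EOF)
                return EXIT_FAILURE;
        if (fclose(fp) == EOF)
            return EXIT_FAILURE;
    }
    return 0;
}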
 
F

Flash Gordon

Wade Ward wrote, On 15/09/07 11:13:
Of course it's unlikely that we can do this with the software on a
television remote.

A machine running Windows 95 is hardly a television remote. There are
other file systems used on PCs and servers that do not support files
that large.

OP just dug up K&R2. I think this is easily within the realm of ISO C.

I stated that it was potentially possible and even one way it could be
done in ISO C so I don't see what you are getting at here, unless you
were posting to agree with me.

Depending on what is really wanted there may well be a better way to
achieve it, such as using sparse files, but it all depends on the real
requirements and the system in question. I certainly would not want to
wait whilst such a file was written.
 
W

Wade Ward

Flash Gordon said:
Michal Nazarewicz wrote, On 15/09/07 08:48:

On some systems it is impossible. Full stop. Some systems cannot have a
file system that large. For instance Windows used to only support
partitions up to 2GB, and within that 2GB some space was used for
structures, so there was no place large enough to store such a file.

Of course it's unlikely that we can do this with the software on a
television remote.

It may, of course, be possible to create such a file using purely standard
C by, for example, creating the file and writing that number of
characters.

OP just dug up K&R2. I think this is easily within the realm of ISO C.
 
E

Erik Trulsson

Wade Ward said:
How do I use the C programming language to create a file that is exactly 2563695577
bytes? The one line I think I know in this program is:
long long m = 2563695577;

EOF on my implementation is carriage return line feed (I think). I don't
know if that is the 2563695578th byte or even the 2563695578th and the
2563695579th.

The end-of-*line* marker on your system might be carriage return-line feed.
It is almost certainly not used for EOF. In fact, on most systems there is
no specific character stored in the file to mark the end; the system just
keeps track of where the file ends in some other way.

The EOF value that many library functions can return is some negative int value
whose exact value you do not need to know, and which you should not try to
write to the file anyway.
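
To make the point about EOF concrete, here is a minimal sketch. The function name count_bytes_after_write and the file name in the usage note are made up for illustration. It writes n bytes in binary mode, then reads the file back and counts bytes until getc() returns EOF. On common desktop systems the count should equal n, because EOF is only a return value and nothing extra is stored in the file; a system that pads binary files could behave differently, so treat this as an illustration rather than a guarantee.

```c
#include <stdio.h>

/* Write n copies of 'x' to path in binary mode, then read the file back
   and count bytes until getc() reports EOF.  Returns the count, or -1 on
   any error.  The function name is illustrative only. */
long count_bytes_after_write(const char *path, long n)
{
    FILE *f = fopen(path, "wb");
    long i, count = 0;

    if (f == NULL)
        return -1;
    for (i = 0; i < n; i++) {
        if (putc('x', f) == EOF) {
            fclose(f);
            return -1;
        }
    }
    if (fclose(f) == EOF)
        return -1;

    f = fopen(path, "rb");
    if (f == NULL)
        return -1;
    while (getc(f) != EOF)   /* EOF is what getc() returns, not a byte in the file */
        count++;
    if (ferror(f)) {         /* distinguish a read error from a real end of file */
        fclose(f);
        return -1;
    }
    fclose(f);
    return count;
}
```

For example, count_bytes_after_write("eof_demo.bin", 5) would be expected to return 5 on such systems: five bytes written, five bytes read, and no sixth "EOF byte".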

The number in question is about two and a half billion. I can't remember
what the minimum maximum is for this datatype, but I think I'm within an
order of magnitude. Telling me to read the manual won't work, because I
don't have a C compiler.

Thanks in advance,
Gruß,

The simplest (but not necessarily fastest) way is just to create a file and
write the desired number of bytes to it. Not all systems can accommodate
files that large, so it might not be possible to create such a file, but if
it doesn't work there is not much you can do about it.

The following program should do the trick and be about as portable as it can get:



#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    FILE *f;
    int res;
    long count1=2147483647; /* The largest value a 'long' is guaranteed to hold */
    long count2= 416211930; /* 416211930 == 2563695577 - 2147483647 */

    f=fopen("bigfile","wb");
    if(f == NULL)
    {
        return EXIT_FAILURE;
    }

    while(count1 > 0)
    {
        res=putc('x', f);
        if(res == EOF)
        {
            fclose(f);
            return EXIT_FAILURE;
        }
        count1--;
    }
    while(count2 > 0)
    {
        res=putc('x', f);
        if(res == EOF)
        {
            fclose(f);
            return EXIT_FAILURE;
        }
        count2--;
    }

    fclose(f);
    return 0;
}





If I had been willing to restrict myself to C99, I could have used a single count
variable of type 'long long' and just a single loop instead of two, but I did it this
way to be portable to implementations that do not support 'long long' as well.


It might also be worth noting that writing a single character at a time is not the most
efficient way of doing it - but improving that is left as an exercise for the reader.

Modifying the program so that it can be used to create a file with an arbitrary size
is also left as an exercise.

Oh, and the fclose() calls in the program are not actually necessary - see if you can
figure out why.
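
One possible take on the "arbitrary size" exercise, as a sketch only and assuming a C99 implementation: a single loop driven by an unsigned long long count, plus a helper that turns a decimal string (for instance a command-line argument) into that count. The names parse_size and make_file are invented here and are not from the program above.

```c
#include <ctype.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Parse a non-negative decimal size.  Returns 0 and stores the value in
   *out on success, -1 on malformed or out-of-range input. */
int parse_size(const char *s, unsigned long long *out)
{
    char *end;
    unsigned long long v;

    if (!isdigit((unsigned char)s[0]))   /* rejects "", "-1", leading spaces */
        return -1;
    errno = 0;
    v = strtoull(s, &end, 10);           /* C99 */
    if (errno != 0 || *end != '\0')
        return -1;
    *out = v;
    return 0;
}

/* Write exactly `size` bytes of 'x' to `path`, one character at a time.
   Returns 0 on success, -1 on failure. */
int make_file(const char *path, unsigned long long size)
{
    FILE *f = fopen(path, "wb");

    if (f == NULL)
        return -1;
    while (size > 0) {
        if (putc('x', f) == EOF) {
            fclose(f);
            return -1;
        }
        size--;
    }
    return fclose(f) == EOF ? -1 : 0;    /* a failed close can mean lost data */
}
```

A main() could pass argv[1] to parse_size and the result to make_file. Whether a request for 2563695577 bytes actually succeeds still depends on the OS and file system, as discussed elsewhere in the thread.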
 
C

Charlie Gordon

Richard Heathfield said:
Flash Gordon said:



It's trivial, of course, to make the attempt.

#include <stdio.h>
#include <stdlib.h>
int main(void)
{
    int errupt = 0;
    unsigned long wait = 2282899UL;
    fputs("please wait\n", stderr);
    while(!errupt && wait--)
    {
        unsigned int erest = 1123;
        while(!errupt && erest--)
        {
            errupt = (putchar('\n') == EOF);
        }
    }
    if(errupt)
    {
        fputs("rats!\n", stderr);
    }
    return errupt ? EXIT_FAILURE : EXIT_SUCCESS;
}

'\n' is probably not a wise choice of character for this test ;-)
If the test succeeds, the file produced on a Windows system by redirecting
stdout to a new file will likely be twice the size wanted.
 
W

Wade Ward

Flash Gordon said:
Wade Ward wrote, On 15/09/07 11:13:

I stated that it was potentially possible and even one way it could be
done in ISO C so I don't see what you are getting at here, unless you were
posting to agree with me.

Depending on what is really wanted there may well be a better way to
achieve it, such as using sparse files, but it all depends on the real
requirements and the system in question. I certainly would not want to
wait whilst such a file was written.

Yeah, well, I let Heathfield's program run for about fifteen minutes. I'm
missing something and my IQ isn't getting bigger tonight, so I'll let it
rest.

The benchmark to beat is eight minutes. Sleep on it.

unsigned long wait = 2282899UL;
fputs("please wait\n", stderr);
while(!errupt && wait--)
{
unsigned int erest = 1123;
while(!errupt && erest--)

Tja.
 
C

Charlie Gordon

Wade Ward said:
Yeah, well, I let Heathfield's program run for about fifteen minutes. I'm
missing something and my IQ isn't getting bigger tonight, so I'll let it
rest.

The benchmark to beat is eight minutes. Sleep on it.

unsigned long wait = 2282899UL;
fputs("please wait\n", stderr);
while(!errupt && wait--)
{
unsigned int erest = 1123;
while(!errupt && erest--)

Tja.

Given today's average hardware performance, 1 minute seems a good goal for
this benchmark.
Using fwrite with a decent buffer size should do it.
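
A rough sketch of the fwrite idea follows. The 64 KiB chunk size is an arbitrary assumption, and the name make_file_buffered is invented for illustration; no timing is claimed for it.

```c
#include <stdio.h>
#include <string.h>

#define CHUNK 65536u   /* assumed buffer size; tune for your system */

/* Write exactly `size` bytes of 'x' to `path` using fwrite in chunks.
   Returns 0 on success, -1 on failure.  Needs C99 for unsigned long long. */
int make_file_buffered(const char *path, unsigned long long size)
{
    static char buf[CHUNK];
    FILE *f = fopen(path, "wb");

    if (f == NULL)
        return -1;
    memset(buf, 'x', sizeof buf);
    while (size > 0) {
        /* The last chunk may be shorter than the buffer. */
        size_t n = size < sizeof buf ? (size_t)size : sizeof buf;
        if (fwrite(buf, 1, n, f) != n) {
            fclose(f);
            return -1;
        }
        size -= n;
    }
    return fclose(f) == EOF ? -1 : 0;
}
```

The stream is opened in binary mode so that the byte count written matches the byte count stored, at least on systems that do not pad binary files.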
 
M

Michal Nazarewicz

Wade Ward said:
I think if I knocked off 90% of the number in the original post, I would be
within 32 bits, but I don't think that part of it is important.

It may be important. In fact, it is. If a file system holds the file size as
a 32-bit signed number (not very clever), you won't be able to create
a file larger than 2^31-1 bytes. If it's a 32-bit unsigned number, then
the limit is 2^32-1 bytes. But hey! A file system may hold the file size as
a 24-bit unsigned number, in which case you won't be able to create a file
of 16 MiB or more.
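
The arithmetic behind those limits can be checked with a short sketch. The helper names max_file_size and size_fits are hypothetical, and the size-field widths are the examples from this post, not a description of any particular file system.

```c
#include <stdio.h>

/* Largest value representable in a size field of `bits` bits (1..64).
   A signed field loses one bit to the sign. */
unsigned long long max_file_size(unsigned bits, int is_signed)
{
    unsigned value_bits = is_signed ? bits - 1 : bits;

    if (value_bits >= 64)
        return 18446744073709551615ULL;   /* 2^64 - 1 */
    return (1ULL << value_bits) - 1;
}

/* Nonzero if a file of `size` bytes fits in such a size field. */
int size_fits(unsigned long long size, unsigned bits, int is_signed)
{
    return size <= max_file_size(bits, is_signed);
}
```

With these, max_file_size(32, 1) is 2147483647, max_file_size(32, 0) is 4294967295, and max_file_size(24, 0) is 16777215. The requested 2563695577 bytes therefore fits an unsigned 32-bit size field but not a signed 32-bit or an unsigned 24-bit one.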
Let's say, instead, that you want to create a thousand files of size 10^8
bytes. The implementation-specific part, how your machine or platform
represents EOF, is much smaller than the files in question.

Uhm? I don't understand the part about representing EOF... From what
I know, EOF is usually not represented in any way; instead, the file system
holds the number of bytes the file contains (at least if we are talking about
files).
 
R

Richard Heathfield

Charlie Gordon said:

If the test succeeds, the file produced on a Windows system by
redirecting stdout to a new file will likely be twice the size wanted.

<shrug> A broken OS is indeed one possible obstacle to producing the
file as specified. There are other obstacles too, some of them rather
more difficult to overcome.
 
C

Charlie Gordon

Richard Heathfield said:
Charlie Gordon said:



<shrug> A broken OS is indeed one possible obstacle to producing the
file as specified. There are other obstacles too, some of them rather
more difficult to overcome.

The new-line translation issue is easy to solve: instead of stdout, we need
to use a stream opened in binary mode. FILE *fp = fopen("bigfile", "wb");
Writing the appropriate number of bytes to fp should produce a file with the
appropriate size...
- if the OS can handle that size,
- and the file system can too,
- and there is enough available space,
- and there are no write errors,
- and we let the program run to completion,
- and "bigfile" is a regular file name (as opposed to some OS specific
device or pipe name),
- and the program can create the file,
- and it can write to the file system
- and some other program does not mess with the file
- ...

Note also that some OSes have system-specific functions to truncate or
extend files to a certain size. POSIX has two that you might find useful,
but discussing them further is off topic in this forum:

#include <unistd.h>
#include <sys/types.h>

int truncate(const char *path, off_t length);
int ftruncate(int fd, off_t length);
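
For readers on a POSIX system, a minimal sketch of how ftruncate() might be used for this follows. It is POSIX, not standard C. The function names are invented for illustration, and the _FILE_OFFSET_BITS define is the convention used on glibc-based systems to get a 64-bit off_t on 32-bit platforms; other systems may need something else.

```c
#define _FILE_OFFSET_BITS 64   /* glibc convention: request a 64-bit off_t */
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

/* POSIX only.  Create (or empty) `path` and set its length to `size`
   without writing the bytes one by one.  Returns 0 on success, -1 on error. */
int make_file_posix(const char *path, off_t size)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);

    if (fd == -1)
        return -1;
    if (ftruncate(fd, size) == -1) {
        close(fd);
        return -1;
    }
    return close(fd);
}

/* Report the size recorded by the file system, or (off_t)-1 on error. */
off_t file_size_posix(const char *path)
{
    struct stat st;

    if (stat(path, &st) == -1)
        return (off_t)-1;
    return st.st_size;
}
```

The extended region reads back as zero bytes. Whether the file system stores it sparsely or allocates every block is up to the file system.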
 
W

Walter Roberson

The new-line translation issue is easy to solve: instead of stdout, we need
to use a stream opened in binary mode. FILE *fp = fopen("bigfile", "wb");
Writing the appropriate number of bytes to fp should produce a file with the
appropriate size...
- if the OS can handle that size,
- and the file system can too,
[...]

And if the OS doesn't happen to pad out binary files to
a full multiple of the sector size and use some kind of
mechanism (e.g., writing an end-of-file marker) to keep track
of how big the file is "really".

And if the OS doesn't happen to toss in some trailing NULs on
the binary file.


This all relates strongly to the FAQ question asking how to find
out how big a file is, the answer to which is "You can't be sure
using standard C facilities, not even for binary files"
 
R

Richard Heathfield

Walter Roberson said:

This all relates strongly to the FAQ question asking how to find
out how big a file is, the answer to which is "You can't be sure
using standard C facilities, not even for binary files"

Right. This is precisely why I wrote "It's trivial, of course, to make
the attempt" rather than "It's trivial to solve this problem".

I note from followups that there were performance complaints about my
attempt.

<shrug width="gallic">
People ask if something is possible, and you show them how it might be,
and then they moan because it doesn't run in nothing flat with no
memory consumption, despite these constraints not being stated in the
original problem.

Well, people *will* be people, I guess.
</shrug>
 
F

Flash Gordon

Charlie Gordon wrote, On 15/09/07 13:32:
Given today's average hardware performance, 1 minute seems a good goal for
this benchmark.
Using fwrite with a decent buffer size should do it.

Pick the right system and the right method and I would expect closer to
1s. Actually, I would expect a lot *less* than 1s. E.g.

markg@brenda:~$ rm /tmp/big
markg@brenda:~$ time ./a.out

real 0m0.002s
user 0m0.000s
sys 0m0.000s
markg@brenda:~$ ls -l /tmp/big
-rw-r--r-- 1 markg markg 2282899 2007-09-15 19:02 /tmp/big
markg@brenda:~$

For the earlier mentioned size of 2563695577 my method did not work, and
for various reasons (including portability) my method might not be
suitable to the OP. I did not use standard C to do this so the code is
not topical here.
 
K

Keith Thompson

Wade Ward said:
How do I use the C programming language to create a file that is exactly 2563695577
bytes? The one line I think I know in this program is:
long long m = 2563695577;

EOF on my implementation is carriage return line feed (I think). I don't
know if that is the 2563695578th byte or even the 2563695578th and the
2563695579th.

The number in question is about two and a half billion. I can't remember
what the minimum maximum is for this datatype, but I think I'm within an
order of magnitude. Telling me to read the manual won't work, because I
don't have a C compiler.

Why do you want to do this? Creating a file of an exact specified
size without saying anything about its contents seems like a very odd
requirement.
 
W

Walter Roberson

Why do you want to do this? Creating a file of an exact specified
size without saying anything about its contents seems like a very odd
requirement.

Personally I find the latter an even odder requirement -- someone
wants to know a C program, but doesn't have a C compiler? And what
does having a C compiler have to do with reading C documentation,
a great deal of which is readily available through any decent
search engine?
 
W

Walter Roberson

Charlie Gordon wrote, On 15/09/07 13:32:
Pick the right system and the right method and I would expect closer to
1s. Actually, I would expect a lot *less* than 1s. E.g.
for various reasons (including portability) my method might not be
suitable to the OP. I did not use standard C to do this so the code is
not topical here.

Heck, if you allow non-portability, you might not need to write any
code at all. For example, SGI IRIX provides "mkfile" to create a
file of any given size. If you really want C code, you could wrap
the call with system() ;-)

Then there are methods using a pair of fseek() (a pair because fseek
takes a signed long so you can't seek that far in one call) followed by
writing a byte, or go non-portable and just open the file and
ftruncate64() it to the size you want, or (non-portable again)
fseek64() followed by writing a byte. Or there's always the good old
unix utility "dd", no C code required...
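
The "pair of fseek() followed by writing a byte" idea might look roughly like this sketch. The name make_file_by_seek is made up for illustration. Standard C does not say what happens when you seek past the end of a binary stream, so this relies on systems (POSIX ones, for instance) where the gap is filled with zero bytes, and on the implementation accepting a file position beyond LONG_MAX.

```c
#include <stdio.h>

/* Seek `first` bytes from the start, then `second` more bytes, then write
   a single byte.  On success the file is first + second + 1 bytes long.
   Returns 0 on success, -1 on failure. */
int make_file_by_seek(const char *path, long first, long second)
{
    FILE *f = fopen(path, "wb");

    if (f == NULL)
        return -1;
    if (fseek(f, first, SEEK_SET) != 0 ||
        fseek(f, second, SEEK_CUR) != 0 ||   /* second hop, relative to the first */
        putc('\0', f) == EOF) {
        fclose(f);
        return -1;
    }
    return fclose(f) == EOF ? -1 : 0;
}
```

For the size in this thread the call would be make_file_by_seek("bigfile", 2147483647L, 416211929L), since 2147483647 + 416211929 + 1 == 2563695577. On an implementation with a 32-bit long and no large-file support, the second fseek() may simply fail.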
 
