Large Files

raj

Hi friends,

In an interview I was asked to write a C program to create a large file
of 8GB

The first 4GB is filled with "Hello"

and the second 4GB is filled with "World"

Sorry to say that I don't know how to do that in an elegant way. I think
it is a trick question depending on if size_t is 32 bits or 64 bits.

Does anybody know how?

Thanks for answering!
 
James Kuyper

raj said:
Hi friends,

In an interview I was asked to write a C program to create a large file
of 8GB

The first 4GB is filled with "Hello"

and the secod 4GB is filled with "World"

Sorry to say that I don't know how to do that in an elegant way. I think
it is a trick question depending on if size_t is 32 bits or 64 bits.

Does anybody know how?

Go to groups.google.com and search comp.lang.c for messages with "large
files" in the name. The most recent occurrence was 2007-11-08.
 
Malcolm McLean

raj said:
Hi friends,

In an interview I was asked to write a C program to create a large file
of 8GB

The first 4GB is filled with "Hello"

and the secod 4GB is filled with "World"

Sorry to say that I don't know how to do that in an elegant way. I think
it is a trick question depending on if size_t is 32 bits or 64 bits.

Does anybody know how?

Thanks for answering!
A long can count to at least 2G. Since you only need 4/5 G copies for the
"Hello" and another 4/5 G for the "World", you are just within limits.

It would be prudent to check ferror after each call to fprintf / fwrite,
since it is not unlikely that the filesystem cannot support such large
files, or will run out of space. However, it is just an ordinary C function
call job, no different from a requirement to write 1K of each.

That assumes a cross-platform question. Particular architectures may have
poor standard libraries that require special calls for large files. You
can't reasonably be expected to know all these details, though the questioner
might not realise that, in which case it is a tricky social situation but
not a technical one.
 
Eric Sosman

raj said:
Hi friends,

In an interview I was asked to write a C program to create a large file
of 8GB

The first 4GB is filled with "Hello"

and the secod 4GB is filled with "World"

Sorry to say that I don't know how to do that in an elegant way. I think
it is a trick question depending on if size_t is 32 bits or 64 bits.

Does anybody know how?

Output 800000000 copies of "Hello", then output 800000000
copies of "World". Finally, use ferror() to see whether any
I/O errors occurred, and make sure fclose() succeeds before
your program declares success.

Notes:

1) The symbol "4GB" usually means 4294967296 to computer
people, but the task would be impossible if that were the
case in this instance: both "Hello" and "World" are five
bytes long, and 4294967296 is not divisible by five. Therefore
the prefix "G" presumably denotes its meaning under international
standards, namely, 1000000000. The assignment therefore calls
for 4000000000 bytes to be filled with each word, not 4294967296.
Besides making the task possible, this observation will make your
program run about seven percent faster; be sure to point this
out to the interviewer, who will be impressed with your devotion
to efficiency.

2) Since the task does not mention writing any newline
characters, the output cannot be a well-formed text stream
because each line of such a stream ends with a '\n'. (Even
on systems where an unterminated line is allowed, the length
of the generated line would exceed the portable limit.) So
we conclude that the output is to be a binary stream; keep
this in mind when you call fopen().
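
The recipe above could be sketched roughly as follows. This is only an illustration, not code from the post: the helper names write_copies and make_file are hypothetical, and the sketch assumes the decimal reading of "4GB", so that each word is written 800000000 times.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write `count` back-to-back copies of `word`
   (no terminator, no newline). Returns 0 on success, -1 on error. */
int write_copies(FILE *fp, const char *word, unsigned long count)
{
    size_t len = strlen(word);
    unsigned long i;

    for (i = 0; i < count; i++)
        if (fwrite(word, 1, len, fp) != len)
            return -1;
    return ferror(fp) ? -1 : 0;
}

/* Hypothetical helper: create `path` as a binary stream holding
   `count` copies of "Hello" followed by `count` copies of "World". */
int make_file(const char *path, unsigned long count)
{
    FILE *fp = fopen(path, "wb");   /* binary mode, per note 2 */
    int ok;

    if (fp == NULL)
        return -1;
    ok = write_copies(fp, "Hello", count) == 0
      && write_copies(fp, "World", count) == 0;
    if (fclose(fp) != 0)            /* a failed close is still a failure */
        ok = 0;
    return ok ? 0 : -1;
}
```

Calling make_file("big-file", 800000000UL) would request the full 8,000,000,000 bytes. The count fits in an unsigned long on any conforming implementation, since that type must hold at least 4294967295.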
 
Tor Rustad

raj said:
Hi friends,

In an interview I was asked to write a C program to create a large file
of 8GB

The first 4GB is filled with "Hello"

and the secod 4GB is filled with "World"

Sorry to say that I don't know how to do that in an elegant way. I think
it is a trick question depending on if size_t is 32 bits or 64 bits.

Does anybody know how?

#include <stdio.h>
#include <stdlib.h>

#define FNAME "big-file"

int main(void)
{
    int rc=EXIT_FAILURE, i,j,k;
    FILE *out = fopen(FNAME,"w+");

    if (out != NULL)
    {
        printf("Writing 4Gb 'Hello' to file '%s'...\n", FNAME);

        for (i=0; i<4*1024; i++)
            for (j=0; j<1024; j++)
                for (k=0; k<1024; k++)
                    fprintf(out, "%c", "Hello"[k%5]);

        printf("Writing 4Gb 'World' to file '%s'...\n", FNAME);
        for (i=0; i<4*1024; i++)
            for (j=0; j<1024; j++)
                for (k=0; k<1024; k++)
                    fprintf(out, "%c", "World"[k%5]);

        fclose(out);
        rc = EXIT_SUCCESS;
    }
    return rc;
}
 
Tor Rustad

Eric Sosman wrote:

[...]
Output 800000000 copies of "Hello", then output 800000000
copies of "World". Finally, use ferror() to see whether any
I/O errors occurred, and make sure fclose() succeeds before
your program declares success.

Good point, I forgot to call ferror()! :)
Notes:

1) The symbol "4GB" usually means 4294967296 to computer
people, but the task would be impossible if that were the
case in this instance: both "Hello" and "World" are five
bytes long, and 4294967296 is not divisible by five. Therefore
the prefix "G" presumably denotes its meaning under international
standards, namely, 1000000000. The assignment therefore calls

I don't agree here: filling doesn't mean the last word has to be a complete
"Hello" or "World".

Hence, using the 1000x1000x1000 or the 1024x1024x1024 definition of
gigabyte shouldn't make a difference.
2) Since the task does not mention writing any newline
characters, the output cannot be a well-formed text stream
because each line of such a stream ends with a '\n'. (Even
on systems where an unterminated line is allowed, the length
of the generated line would exceed the portable limit.) So
we conclude that the output is to be a binary stream; keep
this in mind when you call fopen().

Another good point.
 
Ben Pfaff

Tor Rustad said:
for (i=0; i<4*1024; i++)
for (j=0; j<1024; j++)
for (k=0; k<1024; k++)
fprintf(out, "%c", "Hello"[k%5]);

1024 is not evenly divisible by 5, so this will lead to an uneven
boundary between the end of one kilobyte of output and the start
of the next.
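
One way to avoid that boundary problem, sketched here under some assumptions, is to write from a buffer whose length is a multiple of the word length, so every full chunk starts at the first letter and only the final chunk is cut short. The function name fill_bytes and the constants CHUNK_WORDS and MAX_WORD are hypothetical, and the sketch relies on C99's unsigned long long for the byte count.

```c
#include <stdio.h>
#include <string.h>

#define MAX_WORD    16     /* hypothetical limit on the word length */
#define CHUNK_WORDS 4096   /* hypothetical: copies of the word per chunk */

/* Write exactly `total` bytes, cycling through `word` without a break
   in the pattern. Returns 0 on success, -1 on error. */
int fill_bytes(FILE *fp, const char *word, unsigned long long total)
{
    static char buf[CHUNK_WORDS * MAX_WORD];
    size_t len = strlen(word);
    size_t chunk, i;

    if (len == 0 || len > MAX_WORD)
        return -1;
    chunk = CHUNK_WORDS * len;            /* always a multiple of len */
    for (i = 0; i < chunk; i++)
        buf[i] = word[i % len];
    while (total > 0) {
        /* the last chunk may be shorter and may end mid-word */
        size_t n = total < chunk ? (size_t)total : chunk;
        if (fwrite(buf, 1, n, fp) != n)
            return -1;
        total -= n;
    }
    return 0;
}
```

With this shape, fill_bytes(fp, "Hello", 4294967296ULL) would end on a lone "H", because 4294967296 leaves a remainder of 1 when divided by 5.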
 
santosh

raj said:
Hi friends,

In an interview I was asked to write a C program to create a large
file of 8GB

The first 4GB is filled with "Hello"

and the secod 4GB is filled with "World"

Sorry to say that I don't know how to do that in an elegant way. I
think it is a trick question depending on if size_t is 32 bits or 64
bits.

Does anybody know how?

Thanks for answering!

What's up with posters posting the same questions repeatedly, every few
weeks or months? Is this a concerted troll attempt, or collective
stupidity?

Now coming to your question, the C language says nothing about the
characteristics of disk files. This is purely a system issue, primarily
a filesystem one. Please consult your system's documentation to
determine whether and how such files are creatable.
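
As one example of such a system-specific detail, and only as an assumption about one family of platforms: on POSIX-like systems the file position type is off_t, and glibc widens it to 64 bits when _FILE_OFFSET_BITS is defined as 64 before any header is included. The function names below are hypothetical; the sketch merely reports whether an 8,000,000,000-byte position is representable.

```c
/* Assumption: a POSIX-like system whose C library honours this macro
   (glibc does). It has to appear before the first #include. */
#define _FILE_OFFSET_BITS 64

#include <limits.h>
#include <stdio.h>
#include <sys/types.h>

/* Width in bits of off_t, the type fseeko()/ftello() use for positions. */
int file_offset_bits(void)
{
    return (int)(sizeof(off_t) * CHAR_BIT);
}

/* Nonzero if a signed off_t can hold 8,000,000,000, which needs
   33 value bits plus a sign bit. */
int can_address_8gb(void)
{
    return file_offset_bits() > 33;
}
```

Even where this check passes, the filesystem itself may still refuse a file of that size, so the write and close errors need checking regardless.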
 
Paul Hsieh

In an interview I was asked to write a C program to create a large file
of 8GB

The first 4GB is filled with "Hello"
and the second 4GB is filled with "World"

Sorry to say that I don't know how to do that in an elegant way. I think
it is a trick question depending on if size_t is 32 bits or 64 bits.

The way to deal with > 32 bits elegantly, is to use 64 bits:

#include <stdio.h>
#include <stdlib.h>
#include <string.h> /* strlen */
#include "pstdint.h" /* http://www.pobox.com/~qed/pstdint.h */

int write4GB (char * rept, FILE * fp) {
    int64_t ofs;
    size_t slen = strlen (rept);

    /* ofs is the byte count that will have been written after each call */
    for (ofs = slen;
         ofs < INT64_C(4294967296);
         ofs += slen) {
        fprintf (fp, "%s", rept);
        if (ferror (fp)) return -__LINE__;
    }
    /* truncate rept in place so the final write lands exactly on 4GB */
    rept[(size_t) (INT64_C(4294967296)+slen-ofs)] = '\0';
    fprintf (fp, "%s", rept);
    if (ferror (fp)) return -__LINE__;
    return 0;
}

int main () {
    char hello[] = "Hello";
    char world[] = "World";
    FILE * fp = fopen ("file.txt", "w");
    int ret = EXIT_FAILURE;

    if (fp) {
        if (0 == write4GB (hello, fp) && 0 == write4GB (world, fp))
            ret = EXIT_SUCCESS;
        fclose (fp);
    }
    return ret;
}

You could solve this with 32 bits and a do { ... } while(), but you
know what? Life is too short, and you are IO limited anyways.
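
For the curious, the 32-bit do { ... } while() variant might look something like the sketch below. It is a guess at the idea rather than code from this post: a 32-bit unsigned counter starting at 0 wraps back to 0 after exactly 2^32 increments, which is the binary 4GB. The name fill_until_wrap and its start parameter are hypothetical (the parameter exists so the loop can be tried on a small byte count), and uint32_t is assumed to come from <stdint.h>.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Write bytes cycling through `word` until the 32-bit counter wraps
   to 0. With start == 0 that is exactly 2^32 bytes.
   Returns 0 on success, -1 on error. */
int fill_until_wrap(FILE *fp, const char *word, uint32_t start)
{
    size_t len = strlen(word);
    size_t i = 0;            /* position inside the word */
    uint32_t n = start;      /* wraps modulo 2^32 */

    if (len == 0)
        return -1;
    do {
        if (putc(word[i], fp) == EOF)
            return -1;
        if (++i == len)
            i = 0;
    } while (++n != 0);
    return 0;
}
```

A do-while is needed because the counter is 0 both before the first byte and after the last one, so a loop that tests before writing would never start.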
 
Roland Pibinger

if (fp) {
if (0 == write4GB (hello, fp) && 0 == write4GB (world, fp))
ret = EXIT_SUCCESS;
fclose (fp);
}
return ret;

Ignoring the return value of fclose (fp) means that some error
conditions are reported as success.
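
A minimal sketch of a close that does not lose such errors, assuming a hypothetical helper named close_checked: buffered data is often only flushed by fclose(), so a full disk may first show up there.

```c
#include <stdio.h>

/* Hypothetical helper: close `fp` and return 0 only if no earlier
   write failed and fclose() itself succeeded; otherwise return -1. */
int close_checked(FILE *fp)
{
    int failed = ferror(fp);   /* must be read before fclose() */

    if (fclose(fp) != 0)       /* the final flush can fail here */
        failed = 1;
    return failed ? -1 : 0;
}
```

In the quoted main, ret would then be set to EXIT_SUCCESS only when both write4GB calls return 0 and close_checked (fp) returns 0 as well.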
 
Tor Rustad

Ben said:
Tor Rustad said:
for (i=0; i<4*1024; i++)
for (j=0; j<1024; j++)
for (k=0; k<1024; k++)
fprintf(out, "%c", "Hello"[k%5]);

1024 is not evenly divisible by 5, so this will lead to a uneven

5 isn't a factor in 2^64 either. :)
boundary between the end of one kilobyte of output and the start
of the next.

Yup, which was the reason I didn't print the whole word on each
fprintf() call.
 
