Large Files

Discussion in 'C Programming' started by raj, Dec 16, 2007.

  1. raj

    raj Guest

    Hi friends,

    In an interview I was asked to write a C program to create a large file
    of 8GB

    The first 4GB is filled with "Hello"

    and the second 4GB is filled with "World"

    Sorry to say that I don't know how to do that in an elegant way. I think
    it is a trick question depending on if size_t is 32 bits or 64 bits.

    Does anybody know how?

    Thanks for answering!
     
    raj, Dec 16, 2007
    #1

  2. raj

    James Kuyper Guest

    raj wrote:
    > Hi friends,
    >
    > In an interview I was asked to write a C program to create a large file
    > of 8GB
    >
    > The first 4GB is filled with "Hello"
    >
    > and the second 4GB is filled with "World"
    >
    > Sorry to say that I don't know how to do that in an elegant way. I think
    > it is a trick question depending on if size_t is 32 bits or 64 bits.
    >
    > Does anybody know how?


    Go to groups.google.com and search comp.lang.c for messages with "large
    files" in the name. The most recent occurrence was 2007-11-08.
     
    James Kuyper, Dec 16, 2007
    #2

  3. "raj" <> wrote in message
    > Hi friends,
    >
    > In an interview I was asked to write a C program to create a large file
    > of 8GB
    >
    > The first 4GB is filled with "Hello"
    >
    > and the second 4GB is filled with "World"
    >
    > Sorry to say that I don't know how to do that in an elegant way. I think
    > it is a trick question depending on if size_t is 32 bits or 64 bits.
    >
    > Does anybody know how?
    >
    > Thanks for answering!
    >

    A long will give you 2G of space. Since you only need 4/5 G for the
    "Hello" and another 4/5 G for the "World", you are just within limits.

    It would be prudent to check ferror after each call to fprintf / fwrite,
    since it is not unlikely that the filesystem cannot support such large
    files, or will run out of space. However, it is just an ordinary C function
    call job, no different in any way from a requirement to write 1K
    of each.
    That assumes a cross-platform question. Particular architectures may have
    poor standard libraries that require special calls for large files. You
    can't reasonably be expected to know all these details, though the questioner
    might not realise that, in which case it is a tricky social but not technical
    situation.
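
The per-call checking suggested above could look roughly like the sketch below. The helper name write_repeated and its return convention are made up for illustration; only fwrite and ferror come from the standard library.

```c
#include <stdio.h>
#include <string.h>

/* Write `count` copies of `word` to `fp`, stopping at the first failure.
   Returns 0 on success and -1 on error. (Hypothetical helper.) */
int write_repeated(FILE *fp, const char *word, unsigned long count)
{
    size_t len = strlen(word);
    unsigned long i;

    for (i = 0; i < count; i++) {
        /* With an element size of 1, fwrite returns the number of bytes
           written; a short count means the write failed. */
        if (fwrite(word, 1, len, fp) != len || ferror(fp))
            return -1;
    }
    return 0;
}
```

Because stdio buffers output, a full disk may only be noticed when the buffer is flushed, so a clean return here does not yet prove the data reached the file.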

    --
    Free games and programming goodies.
    http://www.personal.leeds.ac.uk/~bgy1mm
     
    Malcolm McLean, Dec 16, 2007
    #3
  4. raj

    Ian Collins Guest

    Malcolm McLean wrote:
    > "raj" <> wrote in message
    >> Hi friends,
    >>
    >> In an interview I was asked to write a C program to create a large file
    >> of 8GB
    >>
    >> The first 4GB is filled with "Hello"
    >>
    >> and the second 4GB is filled with "World"
    >>
    >> Sorry to say that I don't know how to do that in an elegant way. I think
    >> it is a trick question depending on if size_t is 32 bits or 64 bits.
    >>
    >> Does anybody know how?
    >>
    >> Thanks for answering!
    >>

    > A long will give you 2G of space.


    Says who?

    --
    Ian Collins.
     
    Ian Collins, Dec 16, 2007
    #4
  5. raj

    Eric Sosman Guest

    raj wrote:
    > Hi friends,
    >
    > In an interview I was asked to write a C program to create a large file
    > of 8GB
    >
    > The first 4GB is filled with "Hello"
    >
    > and the second 4GB is filled with "World"
    >
    > Sorry to say that I don't know how to do that in an elegant way. I think
    > it is a trick question depending on if size_t is 32 bits or 64 bits.
    >
    > Does anybody know how?


    Output 800000000 copies of "Hello", then output 800000000
    copies of "World". Finally, use ferror() to see whether any
    I/O errors occurred, and make sure fclose() succeeds before
    your program declares success.

    Notes:

    1) The symbol "4GB" usually means 4294967296 to computer
    people, but the task would be impossible if that were the
    case in this instance: both "Hello" and "World" are five
    bytes long, and 4294967296 is not divisible by five. Therefore
    the prefix "G" presumably denotes its meaning under international
    standards, namely, 1000000000. The assignment therefore calls
    for 4000000000 bytes to be filled with each word, not 4294967296.
    Besides making the task possible, this observation will make your
    program run about seven percent faster; be sure to point this
    out to the interviewer, who will be impressed with your devotion
    to efficiency.

    2) Since the task does not mention writing any newline
    characters, the output cannot be a well-formed text stream
    because each line of such a stream ends with a '\n'. (Even
    on systems where an unterminated line is allowed, the length
    of the generated line would exceed the portable limit.) So
    we conclude that the output is to be a binary stream; keep
    this in mind when you call fopen().
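
A minimal sketch of the recipe above, under stated assumptions: the function name make_hello_world_file and the idea of passing the copy count as a parameter are illustrative, not part of the original task. Passing 800000000UL gives the 4000000000 + 4000000000 byte file; that count fits in an unsigned long on any conforming implementation.

```c
#include <stdio.h>

/* Create `path` holding `copies` repetitions of "Hello" followed by
   `copies` repetitions of "World". Returns 0 on success, -1 on failure.
   (Hypothetical helper.) */
int make_hello_world_file(const char *path, unsigned long copies)
{
    FILE *fp = fopen(path, "wb");     /* binary stream: no line structure */
    unsigned long i;
    int failed;

    if (fp == NULL)
        return -1;

    for (i = 0; i < copies; i++)
        fputs("Hello", fp);
    for (i = 0; i < copies; i++)
        fputs("World", fp);

    failed = ferror(fp);              /* did any write so far fail? */
    if (fclose(fp) != 0)              /* buffered data may fail only here */
        failed = 1;
    return failed ? -1 : 0;
}
```

Whether the call succeeds for the full size still depends on the filesystem and on how the C library handles files of that size.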

    --
    Eric Sosman
     
    Eric Sosman, Dec 16, 2007
    #5
  6. raj

    Tor Rustad Guest

    raj wrote:
    > Hi friends,
    >
    > In an interview I was asked to write a C program to create a large file
    > of 8GB
    >
    > The first 4GB is filled with "Hello"
    >
    > and the second 4GB is filled with "World"
    >
    > Sorry to say that I don't know how to do that in an elegant way. I think
    > it is a trick question depending on if size_t is 32 bits or 64 bits.
    >
    > Does anybody know how?


    #include <stdio.h>
    #include <stdlib.h>

    #define FNAME "big-file"

    int main(void)
    {
        int rc = EXIT_FAILURE, i, j, k;
        FILE *out = fopen(FNAME, "w+");

        if (out != NULL)
        {
            printf("Writing 4Gb 'Hello' to file '%s'...\n", FNAME);

            for (i = 0; i < 4*1024; i++)
                for (j = 0; j < 1024; j++)
                    for (k = 0; k < 1024; k++)
                        fprintf(out, "%c", "Hello"[k%5]);

            printf("Writing 4Gb 'World' to file '%s'...\n", FNAME);
            for (i = 0; i < 4*1024; i++)
                for (j = 0; j < 1024; j++)
                    for (k = 0; k < 1024; k++)
                        fprintf(out, "%c", "World"[k%5]);

            fclose(out);
            rc = EXIT_SUCCESS;
        }
        return rc;
    }


    --
    Tor < | tr i-za-h a-z>
     
    Tor Rustad, Dec 16, 2007
    #6
  7. raj

    Tor Rustad Guest

    Eric Sosman wrote:

    [...]

    >
    > Output 800000000 copies of "Hello", then output 800000000
    > copies of "World". Finally, use ferror() to see whether any
    > I/O errors occurred, and make sure fclose() succeeds before
    > your program declares success.


    Good point, I forgot to call ferror()! :)

    > Notes:
    >
    > 1) The symbol "4GB" usually means 4294967296 to computer
    > people, but the task would be impossible if that were the
    > case in this instance: both "Hello" and "World" are five
    > bytes long, and 4294967296 is not divisible by five. Therefore
    > the prefix "G" presumably denotes its meaning under international
    > standards, namely, 1000000000. The assignment therefore calls


    I don't agree here: filling doesn't mean the last word has to be a
    complete "Hello" or "World".

    Hence, whether you use the 1000x1000x1000 or the 1024x1024x1024
    definition of a gigabyte shouldn't make a difference.
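
To make the arithmetic behind the two readings concrete, here is a small sketch. The struct fill_plan and the function plan_fill are hypothetical names; the function only computes how many whole words fit and how many bytes of a truncated last word remain.

```c
/* How a region of `total_bytes` splits into whole copies of a word of
   `word_len` bytes plus a truncated final copy. (Hypothetical helper.) */
struct fill_plan {
    unsigned long long full_copies;   /* complete repetitions */
    unsigned long long leftover;      /* bytes of the partial last copy */
};

struct fill_plan plan_fill(unsigned long long total_bytes,
                           unsigned long long word_len)
{
    struct fill_plan p = { 0, 0 };

    if (word_len != 0) {              /* avoid dividing by zero */
        p.full_copies = total_bytes / word_len;
        p.leftover    = total_bytes % word_len;
    }
    return p;
}
```

For a five-byte word, 4294967296 bytes gives 858993459 full copies plus 1 leftover byte, while 4000000000 bytes gives exactly 800000000 copies with nothing left over.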

    > 2) Since the task does not mention writing any newline
    > characters, the output cannot be a well-formed text stream
    > because each line of such a stream ends with a '\n'. (Even
    > on systems where an unterminated line is allowed, the length
    > of the generated line would exceed the portable limit.) So
    > we conclude that the output is to be a binary stream; keep
    > this in mind when you call fopen().


    Another good point.

    --
    Tor < | tr i-za-h a-z>
     
    Tor Rustad, Dec 16, 2007
    #7
  8. raj

    Ben Pfaff Guest

    Tor Rustad <> writes:

    > for (i=0; i<4*1024; i++)
    > for (j=0; j<1024; j++)
    > for (k=0; k<1024; k++)
    > fprintf(out, "%c", "Hello"[k%5]);
    >


    1024 is not evenly divisible by 5, so this will lead to an uneven
    boundary between the end of one kilobyte of output and the start
    of the next.
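
One way to avoid the restart at each block boundary is to carry the position within the word from one chunk to the next. The sketch below is an illustration only: fill_pattern, the 4096-byte buffer size, and the use of unsigned long long (C99) for the byte count are all assumptions of this example.

```c
#include <stdio.h>
#include <string.h>

/* Write exactly `total` bytes so that the byte at offset n is
   word[n % strlen(word)]. Returns 0 on success, -1 on a write error
   or an empty word. (Hypothetical helper.) */
int fill_pattern(FILE *fp, const char *word, unsigned long long total)
{
    char buf[4096];
    size_t len = strlen(word);
    size_t phase = 0;                 /* index in word of the next byte */
    unsigned long long written = 0;

    if (len == 0)
        return -1;

    while (written < total) {
        unsigned long long left = total - written;
        size_t chunk = left < sizeof buf ? (size_t)left : sizeof buf;
        size_t i;

        for (i = 0; i < chunk; i++) {
            buf[i] = word[phase];
            phase = (phase + 1) % len;   /* carried across chunks */
        }
        if (fwrite(buf, 1, chunk, fp) != chunk)
            return -1;
        written += chunk;
    }
    return 0;
}
```

Writing whole buffers with fwrite also avoids one fprintf call per byte, which is likely to matter for a file of this size.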
    --
    "What is appropriate for the master is not appropriate for the novice.
    You must understand the Tao before transcending structure."
    --The Tao of Programming
     
    Ben Pfaff, Dec 17, 2007
    #8
  9. raj

    santosh Guest

    raj wrote:

    > Hi friends,
    >
    > In an interview I was asked to write a C program to create a large
    > file of 8GB
    >
    > The first 4GB is filled with "Hello"
    >
    > and the second 4GB is filled with "World"
    >
    > Sorry to say that I don't know how to do that in an elegant way. I
    > think it is a trick question depending on if size_t is 32 bits or 64
    > bits.
    >
    > Does anybody know how?
    >
    > Thanks for answering!


    What's up with posters posting the same questions repeatedly, every few
    weeks or months? Is this a concerted troll attempt, or collective
    stupidity?

    Now coming to your question, the C language says nothing about the
    characteristics of disk files. This is purely a system issue, primarily
    a filesystem one. Please consult your system's documentation to
    determine whether and how such files are creatable.
     
    santosh, Dec 17, 2007
    #9
  10. raj

    Paul Hsieh Guest

    On Dec 16, 2:24 pm, raj <> wrote:
    > In an interview I was asked to write a C program to create a large file
    > of 8GB
    >
    > The first 4GB is filled with "Hello"
    > and the second 4GB is filled with "World"
    >
    > Sorry to say that I don't know how to do that in an elegant way. I think
    > it is a trick question depending on if size_t is 32 bits or 64 bits.


    The way to deal with > 32 bits elegantly, is to use 64 bits:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h> /* strlen */
    #include "pstdint.h" /* http://www.pobox.com/~qed/pstdint.h */

    /* Writes exactly 2^32 bytes of the repeated string. The final,
       partial copy is produced by truncating rept in place. */
    int write4GB (char * rept, FILE * fp) {
        int64_t ofs;
        size_t slen = strlen (rept);

        for (ofs = slen;
             ofs < INT64_C(4294967296);
             ofs += slen) {
            fprintf (fp, "%s", rept);
            if (ferror (fp)) return -__LINE__;
        }
        /* Bytes still owed at this point: between 1 and slen. */
        rept[(size_t) (INT64_C(4294967296)+slen-ofs)] = '\0';
        fprintf (fp, "%s", rept);
        if (ferror (fp)) return -__LINE__;
        return 0;
    }

    int main () {
        char hello[] = "Hello";
        char world[] = "World";
        FILE * fp = fopen ("file.txt", "w");
        int ret = EXIT_FAILURE;

        if (fp) {
            if (0 == write4GB (hello, fp) && 0 == write4GB (world, fp))
                ret = EXIT_SUCCESS;
            fclose (fp);
        }
        return ret;
    }

    You could solve this with 32 bits and a do { ... } while(), but you
    know what? Life is too short, and you are IO limited anyways.
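
One possible reading of the 32-bit do { ... } while() remark, sketched under assumptions: the counter is masked so it wraps to zero after mask + 1 steps, which lets a 32-bit counter drive exactly 2^32 iterations when mask is 0xFFFFFFFFUL. The name write_pow2_bytes and the mask parameter are inventions of this example, and mask must be one less than a power of two.

```c
#include <stdio.h>
#include <string.h>

/* Write mask + 1 bytes of the repeating pattern `word`, counting with a
   value that never exceeds `mask`. Returns 0 on success, -1 on error or
   an empty word. (Hypothetical helper.) */
int write_pow2_bytes(FILE *fp, const char *word, unsigned long mask)
{
    size_t len = strlen(word);
    size_t phase = 0;                 /* index in word of the next byte */
    unsigned long n = 0;

    if (len == 0)
        return -1;

    do {
        if (putc(word[phase], fp) == EOF)
            return -1;
        phase = (phase + 1) % len;
        n = (n + 1) & mask;           /* wraps to 0 after mask + 1 steps */
    } while (n != 0);

    return 0;
}
```

The do-while form matters because the counter starts at zero: a plain while (n != 0) loop would never run its body.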

    --
    Paul Hsieh
    http://www.pobox.com/~qed/
    http://bstring.sf.net/
     
    Paul Hsieh, Dec 17, 2007
    #10
  11. "Ian Collins" <> wrote in message
    > Malcolm McLean wrote:
    >
    >> A long will give you 2G of space.

    >
    > Says who?
    >

    ANSI / ISO/
     
    Malcolm McLean, Dec 17, 2007
    #11
  12. In article <>,
    Malcolm McLean <> wrote:
    >
    >"Ian Collins" <> wrote in message
    >> Malcolm McLean wrote:
    >>
    >>> A long will give you 2G of space.

    >>
    >> Says who?
    >>

    >ANSI / ISO/


    Perhaps you meant "at least" in your original statement.
    The C standards permit larger long.
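
The "at least" point can be checked on a given implementation with LONG_MAX from <limits.h>, which the standard requires to be no smaller than 2147483647. The helper name fits_in_long below is made up for this sketch, and unsigned long long assumes C99.

```c
#include <limits.h>

/* Returns 1 if `value` is representable as a long on this
   implementation, 0 otherwise. (Hypothetical helper.) */
int fits_in_long(unsigned long long value)
{
    return value <= (unsigned long long)LONG_MAX;
}
```

fits_in_long(2147483647ULL) is 1 everywhere, while fits_in_long(4294967296ULL) is 0 where long is 32 bits and 1 where it is 64 bits.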
    --
    So you found your solution
    What will be your last contribution?
    -- Supertramp (Fool's Overture)
     
    Walter Roberson, Dec 17, 2007
    #12
  13. On Sun, 16 Dec 2007 23:12:57 -0800 (PST), Paul Hsieh wrote:
    > if (fp) {
    > if (0 == write4GB (hello, fp) && 0 == write4GB (world, fp))
    > ret = EXIT_SUCCESS;
    > fclose (fp);
    > }
    > return ret;


    Ignoring the return value of fclose (fp) means that some error
    conditions are reported as success.
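
A small sketch of how the close result might be folded into the exit status. The helper finish_output and its writes_ok parameter are hypothetical; fclose returning 0 on success is standard behaviour.

```c
#include <stdio.h>
#include <stdlib.h>

/* Close `fp` and report success only if the earlier writes succeeded
   AND fclose() itself returned 0. (Hypothetical helper.) */
int finish_output(FILE *fp, int writes_ok)
{
    int close_ok = (fclose(fp) == 0);   /* flushes any buffered data */

    return (writes_ok && close_ok) ? EXIT_SUCCESS : EXIT_FAILURE;
}
```

The stream is closed in either case, so a caller can return the result of finish_output directly from main.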


    --
    Roland Pibinger
    "The best software is simple, elegant, and full of drama" - Grady Booch
     
    Roland Pibinger, Dec 17, 2007
    #13
  14. raj

    Tor Rustad Guest

    Ben Pfaff wrote:
    > Tor Rustad <> writes:
    >
    >> for (i=0; i<4*1024; i++)
    >> for (j=0; j<1024; j++)
    >> for (k=0; k<1024; k++)
    >> fprintf(out, "%c", "Hello"[k%5]);
    >>

    >
    > 1024 is not evenly divisible by 5, so this will lead to an uneven


    5 isn't a factor in 2^64 either. :)

    > boundary between the end of one kilobyte of output and the start
    > of the next.


    Yup, which was the reason I didn't print the whole word on each
    fprintf() call.

    --
    Tor < | tr i-za-h a-z>
     
    Tor Rustad, Dec 18, 2007
    #14