iostream and files larger than 4GB

Discussion in 'C++' started by Robert Kochem, Jul 21, 2008.

  1. Hi,

    I am relatively new to C++ regarding its functions and libraries. I need
    to access files larger than 4GB, which is AFAIK not possible with the STL
    iostream - at least not when using a 32-bit compiler. iostream was my
    favorite as my code has to work on files as well as on memory buffers...

    Could somebody please tell me which functions/classes are best in this
    case?

    BTW: I am currently using Visual C++ 2008 on Win32, but I want to write
    my code "as portable as possible".

    Robert
     
    Robert Kochem, Jul 21, 2008
    #1

  2. Victor Bazarov wrote:

    > Robert Kochem wrote:
    >> I am relatively new to C++ regarding its functions and libraries. I need
    >> to access files larger than 4GB, which is AFAIK not possible with the STL
    >> iostream - at least not when using a 32-bit compiler. iostream was my
    >> favorite as my code has to work on files as well as on memory buffers...

    >
    > Have you actually tried and failed, or is that only your speculation?


    If you get a "possible loss of data" warning when feeding seekg() with
    a 64-bit integer - what would you expect?
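
    For illustration, a minimal sketch of the kind of code that triggers
    that warning (the file name is made up; whether std::streamoff is 32 or
    64 bits wide is left to the implementation):

        #include <fstream>

        int main()
        {
            std::ifstream f("big.bin", std::ios::binary);
            long long pos = 5LL * 1024 * 1024 * 1024; // 5 GB needs 64 bits
            // If std::streamoff is only 32 bits wide, the argument is
            // silently narrowed here - hence "possible loss of data".
            f.seekg(pos);
            return 0;
        }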

    > AFAIK, even standard C Library functions like fread and fseek should
    > work with large files. And since C++ I/O streams are relatively thin
    > wrappers around C streams, those are expected to work just as well.
    > Write a program, see if you get it to work, if not, post your code and
    > explain the situation.


    That may work for files, but can I use it on memory streams as well?

    Robert
     
    Robert Kochem, Jul 21, 2008
    #2

  3. Hi,

    Most 32-bit OSes have a limitation of 2GB (maybe 4GB, I am not sure
    about that) per file. If the OS can handle both, like for instance a Sun
    V440 or 64-bit Ubuntu or lots of others, you can compile with -m64 with
    gcc (and link to 64-bit versions of all libraries) and usually have the
    full 64-bit range. Sometimes you have to add something like
    LARGE_FILE_SUPPORT, from the top of my memory.

    I haven't tried it, but maybe the same applies to 64-bit MS-Windows.


    Regards, Ron AF Greve

    http://www.InformationSuperHighway.eu

    "Robert Kochem" <> wrote in message
    news:1bewsomn9lgep.1f5do5nbjwgh3$...
    > Hi,
    >
    > I am relatively new to C++ regarding its functions and libraries. I need
    > to access files larger than 4GB, which is AFAIK not possible with the STL
    > iostream - at least not when using a 32-bit compiler. iostream was my
    > favorite as my code has to work on files as well as on memory buffers...
    >
    > Could somebody please tell me which functions/classes are best in this
    > case?
    >
    > BTW: I am currently using Visual C++ 2008 on Win32, but I want to write
    > my code "as portable as possible".
    >
    > Robert
     
    Ron AF Greve, Jul 21, 2008
    #3
  4. Victor Bazarov wrote:

    >> If you get a "possible loss of data" warning when feeding seekg() with
    >> a 64-bit integer - what would you expect?

    >
    > I expect not to use seekg then. Or switch to a better implementation of
    > the library.


    That is easy to say - but what else should I use?

    >> That may work for files, but can I use it on memory streams as well?

    >
    > I don't know what those are, sorry.


    Maybe that was not the correct name in the C++ realm: I need an
    abstraction of the underlying data source. My code has to work on files
    as well as on memory buffers, and I call a stream that uses a memory
    buffer as its source a memory stream.
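
    For what it's worth, a minimal sketch of what I mean, assuming the
    standard stream classes are enough (std::istream as the common
    interface, std::ifstream for files, std::istringstream for memory
    buffers; the file name is made up):

        #include <fstream>
        #include <iostream>
        #include <sstream>
        #include <string>

        // Works on any stream, regardless of the underlying source.
        long long count_zero_bytes(std::istream& in)
        {
            long long n = 0;
            char c;
            while (in.get(c))
                if (c == '\0')
                    ++n;
            return n;
        }

        int main()
        {
            std::ifstream file("big.bin", std::ios::binary);
            std::cout << count_zero_bytes(file) << '\n';

            std::string buffer("\0\1\0\2", 4);  // memory buffer as source
            std::istringstream mem(buffer);
            std::cout << count_zero_bytes(mem) << '\n'; // prints 2
            return 0;
        }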

    Robert
     
    Robert Kochem, Jul 21, 2008
    #4
  5. Ron AF Greve wrote:

    > Most 32-bit OSes have a limitation of 2GB (maybe 4GB, I am not sure
    > about that) per file.


    Sorry, but I can't believe that. Do you really mean that e.g. a 32-bit
    Linux filesystem cannot handle files larger than 4GB?

    Robert
     
    Robert Kochem, Jul 21, 2008
    #5
  6. tni, Jul 21, 2008
    #6
  7. Hi,

    In the past there certainly was a time when it couldn't. Currently I
    don't have a pure 32-bit Linux installation, although I could test with
    64-bit and compile for 32-bit (maybe tomorrow). And I am sure a lot of
    OSes indeed don't. Look up your flavor and search for large file
    support.

    Regards, Ron AF Greve

    http://www.InformationSuperHighway.eu

    "Robert Kochem" <> wrote in message
    news:...
    > Ron AF Greve wrote:
    >
    >> Most 32-bit OSes have a limitation of 2GB (maybe 4GB, I am not sure
    >> about that) per file.

    >
    > Sorry, but I can't believe that. Do you really mean that e.g. a 32-bit
    > Linux filesystem cannot handle files larger than 4GB?
    >
    > Robert
     
    Ron AF Greve, Jul 21, 2008
    #7
  8. Robert Kochem wrote:
    > Ron AF Greve wrote:
    >
    >> Most 32-bit OSes have a limitation of 2GB (maybe 4GB, I am not sure
    >> about that) per file.

    >
    > Sorry, but I can't believe that. Do you really mean that e.g. a 32-bit
    > Linux filesystem cannot handle files larger than 4GB?


    I don't think so either. The 64-bit file API is in no way related to
    the 64-bit extension of the CPU; even 8-bit CPUs could deal with 64-bit
    numbers.

    It is a compile-time feature of the runtime library. At the operating
    system level there are either two sets of API functions, with and
    without large file support, or optional 64-bit extension parameters to
    the 32-bit API functions (like Win32). Unfortunately the C++ runtimes
    are not the first ones to support this.

    For tasks like that I do not recommend using the iostream libraries at
    all. Usually they are not tuned for maximum performance. Sometimes the
    implementations are more like case studies. And writing files that
    large /is/ a question of performance. You might want to control the
    caching of the content. Or you might do the I/O asynchronously.
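
    As a hedged sketch of the Win32 route (not portable; the file name is
    made up): the native API takes 64-bit offsets even in a 32-bit process,
    e.g. via SetFilePointerEx.

        #include <windows.h>
        #include <iostream>

        int main()
        {
            // FILE_FLAG_NO_BUFFERING / FILE_FLAG_OVERLAPPED in CreateFileA
            // would give control over caching and asynchronous I/O.
            HANDLE h = CreateFileA("big.bin", GENERIC_READ, FILE_SHARE_READ,
                                   NULL, OPEN_EXISTING,
                                   FILE_ATTRIBUTE_NORMAL, NULL);
            if (h == INVALID_HANDLE_VALUE)
                return 1;

            LARGE_INTEGER pos;
            pos.QuadPart = 5LL * 1024 * 1024 * 1024; // seek to 5 GB
            if (!SetFilePointerEx(h, pos, NULL, FILE_BEGIN))
                return 1;

            char buf[4096];
            DWORD got = 0;
            ReadFile(h, buf, sizeof buf, &got, NULL);
            std::cout << "read " << got << " bytes\n";

            CloseHandle(h);
            return 0;
        }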


    Marcel
     
    Marcel Müller, Jul 21, 2008
    #8
  9. On Jul 22, 7:14 am, "Ron AF Greve" <ron@localhost> wrote:

    Please stop top-posting.

    > Hi,
    >
    > Most 32-bit OSes have a limitation of 2GB (maybe 4GB, I am not sure
    > about that) per file.


    That's nonsense.

    > If the OS can handle both, like for instance a Sun V440 or


    A V440 is a machine, not an OS.

    > 64-bit Ubuntu or lots of others you can compile with -m64 with gcc (and
    > link to 64-bit versions of all libraries) and usually have the full
    > 64-bit range.


    Also nonsense.

    Ian.
     
    , Jul 22, 2008
    #9
  10. On Jul 21, 8:39 pm, Victor Bazarov <> wrote:
    > Robert Kochem wrote:
    > > I am relatively new to C++ regarding its functions and
    > > libraries. I need to access files larger than 4GB, which is
    > > AFAIK not possible with the STL iostream - at least not when
    > > using a 32-bit compiler. iostream was my favorite as my code
    > > has to work on files as well as on memory buffers...


    > Have you actually tried and failed, or is that only your
    > speculation?


    It's really implementation-defined. I know that some
    implementations do have this restriction.

    > > Could somebody please tell me which functions/classes are
    > > best in this case?


    > > BTW: I am currently using Visual C++ 2008 on Win32, but I
    > > want to write my code "as portable as possible".


    > AFAIK, even standard C Library functions like fread and fseek
    > should work with large files.


    According to what or who? The standards (both C and C++) are
    really very, very vague about this (intentionally). I think
    about all you can portably count on is that you can read
    anything you can write. If the library doesn't allow writing
    files with more than some upper limit of characters, then
    there's no reason to assume that it can read them.

    From a quality of implementation point of view, of course, one
    would expect that the library not introduce additional
    restrictions not present in the OS. But backwards compatibility
    issues sometimes pose problems: changing the size of off_t on a
    Posix implementation breaks binary compatibility, for example.
    So libc.so (the dynamic object which contains the system API and
    the basic C library under Solaris) must stick with 32 bit file
    offsets, or existing binaries will cease to work. And if
    libc.so uses a 32 bit file offset, then any new code which links
    against it must, too. So by default, fopen uses a 32 bit file
    offset, and only allows access to the first 4 GB of a file, at
    least in programs compiled in 32 bit mode. I don't know how
    Windows handles this, but I'd be surprised if they didn't
    encounter the same problems, at least to some degree.
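
    On Posix systems the usual opt-in for new 32-bit code is the large
    file support macros; a minimal sketch, assuming a glibc-style
    implementation (the file name is made up):

        // Must be defined before any system header is included.
        #define _FILE_OFFSET_BITS 64

        #include <stdio.h>
        #include <sys/types.h>

        int main()
        {
            FILE* f = fopen("big.bin", "rb");
            if (!f)
                return 1;
            // With the macro in effect, off_t is 64 bits, and fseeko and
            // ftello take and return it instead of long.
            off_t pos = (off_t)5 * 1024 * 1024 * 1024; // 5 GB
            if (fseeko(f, pos, SEEK_SET) != 0)
                return 1;
            printf("now at %lld\n", (long long)ftello(f));
            fclose(f);
            return 0;
        }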

    The obvious solution would be to have three models, instead of
    two: a pure 32 bit mode for legacy code, a 32 bit mode with 64
    bit file offsets for new 32 bit code, and a 64 bit mode. On the
    other hand, even coping with two different models on the same
    machine can be confusing enough.

    --
    James Kanze (GABI Software) email:
    Conseils en informatique orientée objet/
    Beratung in objektorientierter Datenverarbeitung
    9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
     
    James Kanze, Jul 22, 2008
    #10
  11. On Jul 21, 10:07 pm, Marcel Müller <> wrote:
    > Robert Kochem wrote:
    > > Ron AF Greve schrieb:


    > >> Most 32-bit OSes have a limitation of 2GB (maybe 4GB, I am
    > >> not sure about that) per file.


    > > Sorry, but I can't believe that. Do you really mean that
    > > e.g. a 32-bit Linux filesystem cannot handle files larger
    > > than 4GB?


    > I don't think so either. The 64-bit file API is in no way
    > related to the 64-bit extension of the CPU; even 8-bit CPUs
    > could deal with 64-bit numbers.


    Posix requires off_t to be a typedef to a signed integral type.
    It also requires that the file size, in bytes, be held in an
    off_t. In the days before long long, the largest signed
    integral type was long, normally 32 bits on a 32 bit machine.
    Which meant that file sizes were limited to 2GB. (Of course,
    back then, a file of more than 2GB wouldn't fit on most disks.)

    The integration of large file support has been extremely
    complex, since breaking existing binaries (which dynamically
    link to the system API) was not considered an acceptable option.
    The result is that by default, both 32 bit Solaris and 32 bit
    Linux do not support files greater than 2GB. (I think that both
    have means to do so; it's highly unlikely, however, that the
    C++, or even the C standard library use these.)
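
    A quick sketch to see what a given compiler/runtime combination
    actually provides (Posix only, because of off_t):

        #include <sys/types.h>  // off_t
        #include <iostream>     // also brings in std::streamoff

        int main()
        {
            std::cout << "off_t:     " << sizeof(off_t) * 8 << " bits\n"
                      << "streamoff: " << sizeof(std::streamoff) * 8
                      << " bits\n";
            return 0;
        }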

    --
    James Kanze (GABI Software) email:
    Conseils en informatique orientée objet/
    Beratung in objektorientierter Datenverarbeitung
    9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
     
    James Kanze, Jul 22, 2008
    #11
  12. James Kanze wrote:
    > The result is that by default, both 32 bit Solaris and 32 bit
    > Linux do not support files greater than 2GB. (I think that both
    > have means to do so; it's highly unlikely, however, that the
    > C++, or even the C standard library use these.)


    It is part of the Unix 98 spec, so it is supported by libc/glibc. E.g.,
    look at:
    http://ac-archive.sourceforge.net/largefile/libc.html
     
    tni, Jul 22, 2008
    #12
  13. James Kanze wrote:

    > But backwards compatibility
    > issues sometimes pose problems: changing the size of off_t on a
    > Posix implementation breaks binary compatibility, for example.
    > So libc.so (the dynamic object which contains the system API and
    > the basic C library under Solaris) must stick with 32 bit file
    > offsets, or existing binaries will cease to work. And if
    > libc.so uses a 32 bit file offset, then any new code which links
    > against it must, too.


    I don't know how Solaris implements this in particular, but it could be
    solved by providing legacy compatibility libs for older binaries. (I
    think that, for example, FreeBSD does it that way; it has had a 64-bit
    off_t since at least 1996, IIRC.)
     
    Matthias Buelow, Jul 22, 2008
    #13
  14. James Kanze wrote:

    > The result is that by default, both 32 bit Solaris and 32 bit
    > Linux do not support files greater than 2GB. (I think that both


    What do you mean by that? At least Linux (i386) has had support for
    files >2GB for many years now, "out of the box" (that is, by default).
     
    Matthias Buelow, Jul 22, 2008
    #14
  15. Hi Robert,

    I did a test on Sun Solaris using gcc, and indeed even in 32-bit mode
    it can handle large files. So apparently the limitation has become a
    thing of the past.

    However, in case you doubt that this has ever been a problem, here is
    some nice reading material.

    http://www.unix.org/version2/whatsnew/lfs20mar.html#1.1


    Regards, Ron AF Greve

    http://www.InformationSuperHighway.eu

    "Robert Kochem" <> wrote in message
    news:...
    > Ron AF Greve wrote:
    >
    >> Most 32-bit OSes have a limitation of 2GB (maybe 4GB, I am not sure
    >> about that) per file.

    >
    > Sorry, but I can't believe that. Do you really mean that e.g. a 32-bit
    > Linux filesystem cannot handle files larger than 4GB?
    >
    > Robert
     
    Ron AF Greve, Jul 22, 2008
    #15
  16. On Jul 21, 3:14 pm, "Ron AF Greve" <ron@localhost> wrote:
    > Hi,
    >
    > Most 32-bit OSes have a limitation of 2GB (maybe 4GB, I am not sure
    > about that) per file.



    That's not true. See http://en.wikipedia.org/wiki/Comparison_of_file_systems
    Most file systems in use today will have no problem with 4GB files.
     
    progmanos, Jul 22, 2008
    #16
  17. On Jul 22, 3:16 pm, Matthias Buelow <> wrote:
    > James Kanze wrote:
    > > The result is that by default, both 32 bit Solaris and 32 bit
    > > Linux do not support files greater than 2GB. (I think that both


    > What do you mean by that? At least Linux (i386) has had
    > support for files >2GB for many years now, "out of the box"
    > (that is, by default).


    The OS, yes, but at least on the 32 bit implementations I have
    access to, off_t is an int32_t, which means (indirectly) that
    the standard FILE* and fstream will have problems with them.

    --
    James Kanze (GABI Software) email:
    Conseils en informatique orientée objet/
    Beratung in objektorientierter Datenverarbeitung
    9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
     
    James Kanze, Jul 23, 2008
    #17
