Saving and reloading a container to/from disk

Discussion in 'C++' started by jacob navia, Mar 31, 2010.

  1. jacob navia

    jacob navia Guest

    Hi

    What would be the best way to save and reload later a container
    to/from disk in C++?

    Thanks
    jacob navia, Mar 31, 2010
    #1

  2. jacob navia

    Ian Collins Guest

    Ian Collins, Mar 31, 2010
    #2

  3. jacob navia

    jacob navia Guest

    Ian Collins wrote:
    > On 03/31/10 07:07 PM, jacob navia wrote:
    >> Hi
    >>
    >> What would be the best way to save and reload later a container
    >> to/from disk in C++?

    >
    > It depends, there isn't really a best way. Do you want portability?
    >
    > http://www.boost.org/doc/libs/1_42_0/libs/serialization/doc/index.html
    >
    > Is a good place to start.
    >


    Thanks.

    I downloaded the boost libraries, and compiled them on my machine.
    Unzipped, the source code is 269MB. Compilation took 14 minutes.

    Machine Mac-pro OS X with 8 CPUs and 12GB RAM.

    Then, I compiled the example of the serialization.
    The class being saved/restored looks like this:

    original schedule
    6:24 bob
    0x0x100200440 34°135'52.56" 134°22'78.3" 24th Street and 10th Avenue
    0x0x1002004d0 35°137'23.456" 133°35'54.12" State street and Cathedral
    Vista Lane
    0x0x100200530 35°136'15.456" 133°32'15.3" White House

    when restored, the restored stuff looks like this:
    6:24
    0x0x100200e30 34°135'52.56" 134°22'78.3" 24th Street and 10th Avenue
    0x0x100200f40 35°137'23.456" 133°35'54.12" State street and Cathedral
    Vista Lane
    0x0x1002012c0 35°136'15.456" 133°32'15.3" White House


    As you can see the name "bob" is missing.

    The same bug appears with all other saved/restored class instances.
    I am not fluent enough in C++ to figure this out, sorry.

    But maybe this is a bug in the example; I can't determine what
    the reason is.

    jacob
    jacob navia, Mar 31, 2010
    #3
  4. On 2010-03-31, jacob navia <> wrote:
    > Hi
    >
    > What would be the best way to save and reload later a container
    > to/from disk in C++?
    >
    > Thanks


    I'm not sure how best to deal with binary data, but for
    numbers and text, I would use JSON (escaping special
    characters as appropriate, etc).

    It's well-understood, lightweight and simple, and
    portable across many languages.

    With binary data you could base64-encode it or something,
    but you'll be looking at significant bloat for large
    structures. Or you could NUL-separate fields, replacing
    actual NUL characters with \001s, and actual \001s with
    \001\001s.

    --
    Andrew Poelstra
    http://www.wpsoftware.net/andrew
    Andrew Poelstra, Mar 31, 2010
    #4
  5. jacob navia

    jacob navia Guest

    Andrew Poelstra wrote:
    > On 2010-03-31, jacob navia <> wrote:
    >> Hi
    >>
    >> What would be the best way to save and reload later a container
    >> to/from disk in C++?
    >>
    >> Thanks

    >
    > I'm not sure how best to deal with binary data, but for
    > numbers and text, I would use JSON (escaping special
    > characters as appropriate, etc).
    >
    > It's well-understood, lightweight and simple, and
    > portable across many languages.
    >
    > With binary data you could base64-encode it or something,
    > but you'll be looking at significant bloat for large
    > structures. Or you could NUL-separate fields, replacing
    > actual NUL characters with \001s, and actual \001s with
    > \001\001s.
    >


    Well, but that is a significant development effort. I thought that the
    STL would provide something to save/restore a container.
    jacob navia, Mar 31, 2010
    #5
  6. On 2010-03-31, jacob navia <> wrote:
    > Andrew Poelstra a écrit :
    >> On 2010-03-31, jacob navia <> wrote:
    >>> Hi
    >>>
    >>> What would be the best way to save and reload later a container
    >>> to/from disk in C++?
    >>>
    >>> Thanks

    >>
    >> I'm not sure how best to deal with binary data, but for
    >> numbers and text, I would use JSON (escaping special
    >> characters as appropriate, etc).
    >>
    >> It's well-understood, lightweight and simple, and
    >> portable across many languages.
    >>
    >> With binary data you could base64-encode it or something,
    >> but you'll be looking at significant bloat for large
    >> structures. Or you could NUL-separate fields, replacing
    >> actual NUL characters with \001s, and actual \001s with
    >> \001\001s.
    >>

    >
    > Well, but that is a significant development. I thought that the
    > STL would provide something to save/restore a container.


    I haven't looked that deeply into it, but the only serialization
    method I have heard of is to overload the << and >> operators so you
    can work with streams. From there you have to iterate over your
    container, structuring the data as you see fit.
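
    A rough sketch of that approach, assuming a made-up record type: Stop,
    save() and load() are hypothetical names for illustration, not part of
    any library. Each record goes on its own line, and the element count is
    written first so the reader knows how many records to expect.

```cpp
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record type, used only for illustration.
struct Stop {
    int minutes;        // some numeric field
    std::string name;   // may contain spaces, but not newlines
};

// Write one record per line: the number, a space, then the name.
std::ostream& operator<<(std::ostream& os, const Stop& s) {
    return os << s.minutes << ' ' << s.name << '\n';
}

// Read one record back; the name is everything up to the end of the line.
std::istream& operator>>(std::istream& is, Stop& s) {
    is >> s.minutes;
    is.ignore(1);                 // skip the single separator space
    std::getline(is, s.name);
    return is;
}

// Save: element count first, then each element.
void save(std::ostream& os, const std::vector<Stop>& v) {
    os << v.size() << '\n';
    for (const Stop& s : v) os << s;
}

// Load: read the count, then up to that many elements.
std::vector<Stop> load(std::istream& is) {
    std::vector<Stop> v;
    std::size_t n = 0;
    if (!(is >> n)) return v;
    Stop s;
    for (std::size_t i = 0; i < n && (is >> s); ++i) v.push_back(s);
    return v;
}
```

    The same functions work with std::ofstream and std::ifstream. The
    format is a design choice: names containing newlines would need some
    escaping scheme on top of this.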

    I can't think of any way in general that STL containers could save
    themselves without confusing the data with their own control
    characters.

    --
    Andrew Poelstra
    http://www.wpsoftware.net/andrew
    Andrew Poelstra, Mar 31, 2010
    #6
  7. jacob navia

    Kai-Uwe Bux Guest

    jacob navia wrote:

    > Andrew Poelstra a écrit :
    >> On 2010-03-31, jacob navia <> wrote:
    >>> Hi
    >>>
    >>> What would be the best way to save and reload later a container
    >>> to/from disk in C++?
    >>>
    >>> Thanks

    >>
    >> I'm not sure how best to deal with binary data, but for
    >> numbers and text, I would use JSON (escaping special
    >> characters as appropriate, etc).
    >>
    >> It's well-understood, lightweight and simple, and
    >> portable across many languages.
    >>
    >> With binary data you could base64-encode it or something,
    >> but you'll be looking at significant bloat for large
    >> structures. Or you could NUL-separate fields, replacing
    >> actual NUL characters with \001s, and actual \001s with
    >> \001\001s.
    >>

    >
    > Well, but that is a significant development. I thought that the
    > STL would provide something to save/restore a container.


    That would be a little tricky because everything is templated. Now, suppose
    you want to save/restore a vector<T>. The most natural thing would be to use
    operator<< and operator>> for serializing and deserializing vector elements.
    That, however, would suppose that for elements of type T both operations are
    truly inverse. This does not even hold for the standard types (e.g., double
    or std::string).

    Then, for set<T,C>, it is not clear what to do about the comparison
    predicate. Also: would you save/restore allocator objects or would you
    decide to ignore the issue?

    For simple cases, there is std::copy() and the use of stream iterators.
    Also, containers can be initialized from a pair of iterators. Everything
    more complex needs a custom solution.
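
    For the simple case, a minimal sketch could look like this. The names
    save_ints and load_ints are made up for illustration; the element type
    is int because operator<< and operator>> round-trip it exactly.

```cpp
#include <algorithm>
#include <istream>
#include <iterator>
#include <ostream>
#include <sstream>
#include <vector>

// Write every element followed by a newline.
void save_ints(std::ostream& os, const std::vector<int>& v) {
    std::copy(v.begin(), v.end(), std::ostream_iterator<int>(os, "\n"));
}

// Build the container directly from a pair of stream iterators;
// reading stops at end of input or at the first token that is not an int.
std::vector<int> load_ints(std::istream& is) {
    return std::vector<int>(std::istream_iterator<int>(is),
                            std::istream_iterator<int>());
}
```

    Passing a std::ofstream to save_ints and a std::ifstream to load_ints
    gives the save/reload pair for a file on disk.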


    Best

    Kai-Uwe Bux
    Kai-Uwe Bux, Mar 31, 2010
    #7
  8. jacob navia

    Brian Guest

    On Mar 31, 9:59 am, Kai-Uwe Bux <> wrote:
    > jacob navia wrote:
    > > Andrew Poelstra a écrit :
    > >> On 2010-03-31, jacob navia <> wrote:
    > >>> Hi

    >
    > >>> What would be the best way to save and reload later a container
    > >>> to/from disk in C++?

    >
    > >>> Thanks

    >
    > >> I'm not sure how best to deal with binary data, but for
    > >> numbers and text, I would use JSON (escaping special
    > >> characters as appropriate, etc).

    >
    > >> It's well-understood, lightweight and simple, and
    > >> portable across many languages.

    >
    > >> With binary data you could base64-encode it or something,
    > >> but you'll be looking at significant bloat for large
    > >> structures. Or you could NUL-separate fields, replacing
    > >> actual NUL characters with \001s, and actual \001s with
    > >> \001\001s.

    >
    > > Well, but that is a significant development. I thought that the
    > > STL would provide something to save/restore a container.

    >
    > That would be a little tricky because everything is templated. Now, suppose
    > you want to save/restore a vector<T>. The most natural thing would be to use
    > operator<< and operator>> for serializing and deserializing vector elements.
    > That, however, would suppose that for elements of type T both operations are
    > truly inverse. This does not even hold for the standard types (e.g., double
    > or std::string).
    >
    > Then, for set<T,C>, it is not clear what to do about the comparison
    > predicate. Also: would you save/restore allocator objects or would you
    > decide to ignore the issue?
    >
    >


    I don't know any serialization library that does
    anything with comparison predicates or the allocators.
    It's certainly possible that users want to use
    different comparison predicates in different contexts,
    and that attempting to make them use the same one
    would cause them problems.


    Brian Wood
    http://webEbenezer.net
    (651) 251-9384
    Brian, Mar 31, 2010
    #8
  9. jacob navia

    Brian Guest

    Brian, Mar 31, 2010
    #9
  10. jacob navia

    jacob navia Guest

    Kai-Uwe Bux wrote:
    > That would be a little tricky because everything is templated. Now, suppose
    > you want to save/restore a vector<T>. The most natural thing would be to use
    > operator<< and operator>> for serializing and deserializing vector elements.
    > That, however, would suppose that for elements of type T both operations are
    > truly inverse. This does not even hold for the standard types (e.g., double
    > or std::string).
    >


    Excuse me but I do not understand that. If I write a double value into a
    file, I will obtain the same value when I read it later if I store it in
    binary form.

    fwrite(&Double,1,sizeof(double),stream);
    fread(&Double,1,sizeof(double),stream);

    will leave the value of Double unchanged. Obviously, only if you use the
    same CPU type for both operations.

    I just do not see how that could be wrong. Maybe you care to explain?

    Thanks

    > Then, for set<T,C>, it is not clear what to do about the comparison
    > predicate.


    The container could be read in an incomplete way so that most values are
    retrieved but function pointers aren't.

    > Also: would you save/restore allocator objects or would you
    > decide to ignore the issue?
    >


    See above.
    jacob navia, Mar 31, 2010
    #10
  11. jacob navia

    Ian Collins Guest

    On 04/ 1/10 08:56 AM, jacob navia wrote:
    > Kai-Uwe Bux a écrit :
    >> That would be a little tricky because everything is templated. Now,
    >> suppose you want to save/restore a vector<T>. The most natural thing
    >> would be to use operator<< and operator>> for serializing and
    >> deserializing vector elements. That, however, would suppose that for
    >> elements of type T both operations are truly inverse. This does not
    >> even hold for the standard types (e.g., double or std::string).

    >
    > Excuse me but I do not understand that. If I write a double value into a
    > file, I will obtain the same value when I read it later if I store it in
    > binary form.
    >
    > fwrite(&Double,1,sizeof(double),stream);
    > fread(&Double,1,sizeof(double),stream);
    >
    > will leave the value of Double unchanged. Obviously if you use the same
    > CPU type for bothoperations.


    There's the issue, there isn't a universal interchange format for
    doubles. For a serialisation scheme to be worthwhile, it would have to
    be platform independent.

    >> Then, for set<T,C>, it is not clear what to do about the comparison
    >> predicate.

    >
    > The container could be read in an incomplete way so that most values are
    > retrieved but function pointers aren't.


    Then you lose the container's meta-data. That information has to be
    stored somewhere, either with the data, or in the code that retrieves
    it. It's a fair call not to serialise it, but it is a limitation.

    I use JSON to serialise data between applications and languages
    (particularly in web applications), but it only preserves the data part
    (at least for C and C++, dynamic languages can recover the object
    structure as well). Most of the data I find I wish to transfer or
    archive tends to be in simple containers like std::vector where the
    meta-data tends to be less important. For more complex data structures,
    I use XML. But I still have to provide my own in and out operators for
    each non-POD type.
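
    To give an idea of what such a hand-written "out" function looks like,
    here is a small sketch. Point and to_json are hypothetical names for
    illustration; the record holds only integers, so no string escaping is
    needed, which a real JSON writer would have to handle.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record type, used only for illustration.
struct Point { int x; int y; };

// Hand-written output function for one type: emits a JSON object.
std::string to_json(const Point& p) {
    std::ostringstream os;
    os << "{\"x\":" << p.x << ",\"y\":" << p.y << '}';
    return os.str();
}

// Any vector whose element type has a to_json overload becomes a JSON array.
template <typename T>
std::string to_json(const std::vector<T>& v) {
    std::string out = "[";
    for (std::size_t i = 0; i < v.size(); ++i) {
        if (i != 0) out += ',';
        out += to_json(v[i]);
    }
    return out + "]";
}
```

    Only the data survives: nothing in the output says that the array came
    from a std::vector rather than some other container.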

    What C++ really needs for this is reflection!

    --
    Ian Collins
    Ian Collins, Mar 31, 2010
    #11
  12. jacob navia

    Guest

    On Mar 31, 2:56 pm, jacob navia <> wrote:
    > Kai-Uwe Bux wrote:
    >
    > > That would be a little tricky because everything is templated. Now, suppose
    > > you want to save/restore a vector<T>. The most natural thing would be to use
    > > operator<< and operator>> for serializing and deserializing vector elements.
    > > That, however, would suppose that for elements of type T both operations are
    > > truly inverse. This does not even hold for the standard types (e.g., double
    > > or std::string).

    >
    > Excuse me but I do not understand that. If I write a double value into a
    > file, I will obtain the same value when I read it later if I store it in
    > binary form.
    >
    >         fwrite(&Double,1,sizeof(double),stream);
    >         fread(&Double,1,sizeof(double),stream);
    >
    > will leave the value of Double unchanged. Obviously if you use the same
    > CPU type for bothoperations.
    >
    > I just do not see how that could be wrong. Maybe you care to explain?



    Well, the C++ compiler for zOS can be run (with the equivalent of a
    command line switch) to use either hex or binary (IEEE) float. Most
    definitely not the same format, even if doubles are 64 bits long in
    both cases. The saga of long doubles on Windows (and other x86
    platforms) is another example, where some compilers treat long double
    as a 64 bit IEEE double, and others use the 80 bit double extended
    format.

    Anyway, I'd say dumping a container to disk is significantly less
    useful if it’s not portable. So binary formats present the usual
    issues. At least a text-ish format should be an option, and then you
    have issues trying to produce 100% faithful portable representations
    of things like floats. For example, some IEEE implementations store
    information about the type of NaN (beyond the QNaN/SNaN distinction)
    in the mantissa.
    , Mar 31, 2010
    #12
  13. jacob navia

    Kai-Uwe Bux Guest

    jacob navia wrote:

    > Kai-Uwe Bux a écrit :
    >> That would be a little tricky because everything is templated. Now,
    >> suppose you want to save/restore a vector<T>. The most natural thing
    >> would be to use operator<< and operator>> for serializing and
    >> deserializing vector elements. That, however, would suppose that for
    >> elements of type T both operations are truly inverse. This does not even
    >> hold for the standard types (e.g., double or std::string).
    >>

    >
    > Excuse me but I do not understand that. If I write a double value into a
    > file, I will obtain the same value when I read it later if I store it in
    > binary form.
    >
    > fwrite(&Double,1,sizeof(double),stream);
    > fread(&Double,1,sizeof(double),stream);
    >
    > will leave the value of Double unchanged. Obviously if you use the same
    > CPU type for bothoperations.
    >
    > I just do not see how that could be wrong. Maybe you care to explain?

    [...]

    a) I did not claim that your proposed code could be wrong. I made a claim
    about operator<< and operator>> not being inverse for the type double. That
    is a consequence of operator<< doing some rounding. For std::string it is a
    consequence of the way white space is treated by operator<< and operator>>.

    b) Your code will work for double (with the restrictions you mentioned about
    being tied to a given CPU, or more accurately to a particular binary
    format for doubles). But that code will not work for many other types, e.g.,
    std::string. Thus, it does not address the main point I made, namely, that
    the containers of the standard library are templated.

    c) Another case where the templating causes a problem is something like
    this: Suppose you have a hierarchy of classes such as

    class Student {};
    class Freshman : public Student {};
    class Sophomore : public Student {};
    class Junior : public Student {};
    class Senior : public Student {};

    and

    typedef vector< Student* > Class;

    If you want to serialize objects of type Class, you have to decide what to
    do about the pointers. Very likely, you want to dump some unique student id,
    probably retrieved by some member function. That is very particular to the
    specific problem and a generic solution is unlikely to match your needs.

    d) Summary: in any particular case, there is a valid solution; but there
    seems to be no _generic_ solution in sight. The standard library does not
    even attempt to give a generic solution.
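
    Point (a) can be checked with a small experiment. This is only a
    sketch; the three function names are made up for illustration. The
    first writes a double with the stream's default formatting, the second
    asks for std::numeric_limits<double>::max_digits10 digits, and the third
    sends a std::string through operator<< and operator>>.

```cpp
#include <iomanip>
#include <limits>
#include <sstream>
#include <string>

// Write a double with the stream's default formatting and read it back.
double roundtrip_default(double x) {
    std::stringstream ss;
    ss << x;                 // default precision is 6 significant digits
    double y = 0.0;
    ss >> y;
    return y;
}

// Same, but with enough decimal digits to identify the double uniquely.
double roundtrip_full(double x) {
    std::stringstream ss;
    ss << std::setprecision(std::numeric_limits<double>::max_digits10) << x;
    double y = 0.0;
    ss >> y;
    return y;
}

// Write a string with operator<< and read it back with operator>>.
std::string roundtrip_string(const std::string& s) {
    std::stringstream ss;
    ss << s;
    std::string t;
    ss >> t;                 // extraction stops at the first whitespace
    return t;
}
```

    With x = 0.1 + 0.2, roundtrip_default(x) does not compare equal to x,
    while roundtrip_full(x) does on the usual IEEE implementations. And
    roundtrip_string("hello world") comes back as "hello".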


    >> Then, for set<T,C>, it is not clear what to do about the comparison
    >> predicate.

    >
    > The container could be read in an incomplete way so that most values are
    > retrieved but function pointers aren't.
    >
    >> Also: would you save/restore allocator objects or would you
    >> decide to ignore the issue?

    >
    > See above.


    Yes, ignoring the issue is a _valid_ design decision in this case. I did
    that too when I experimented with serialization.


    Best

    Kai-Uwe Bux
    Kai-Uwe Bux, Mar 31, 2010
    #13
  14. jacob navia

    jacob navia Guest

    a écrit :
    >
    > Well, the C++ compiler for zOS can be run (with the equivalent of a
    > command line switch) to use either hex or binary (IEEE) float. Most
    > definitely not the same format, even if doubles are 64 bits long in
    > both cases. The saga of long doubles on Windows (and other x86
    > platforms) is another example, where some compilers treat long double
    > as a 64 bit IEEE double, and other use the 80 bit double extended
    > format.
    >


    (1) Obviously, if you are working with zOS you should use the same
        command line switch for reading and writing your double data.
        Why is that so difficult to understand?
    (2) Obviously too, double data is not portable among compilers that
        use different representations of the data. If one compiler
        implements long double as 64 bits and the other as 80 bits,
        they aren't compatible and you should recompile both the reader
        and the writer.
        Why is that so difficult to understand?

    Here we arrive at philosophical questions. C++ is all about bells
    and whistles. Rather than providing a solution that would be very useful
    for most users but would fail in some special cases it is decided that
    nothing should be provided so that everyone rolls their own.

    Modulo bugs, the Boost solution seems to be working. Other solutions were
    presented in this discussion (http://webEbenezer.net by Brian).
    jacob navia, Mar 31, 2010
    #14
  15. jacob navia

    Ian Collins Guest

    On 04/ 1/10 10:29 AM, jacob navia wrote:
    > a écrit :
    >>
    >> Well, the C++ compiler for zOS can be run (with the equivalent of a
    >> command line switch) to use either hex or binary (IEEE) float. Most
    >> definitely not the same format, even if doubles are 64 bits long in
    >> both cases. The saga of long doubles on Windows (and other x86
    >> platforms) is another example, where some compilers treat long double
    >> as a 64 bit IEEE double, and other use the 80 bit double extended
    >> format.
    >>

    >
    > (1) Obviously if you are working with zOS you rather use the same
    > command line switch for reading and writing your double data.
    > Why is that so difficult to understand?


    How do you enforce that?

    > (2) Obviously too, double data is not portable among compilers that
    > use different representations of the data. If one compiler
    > implements long double as 64 bits and the other as 80 bits
    > they aren't compatible and you should recompile both the reader
    > and the writer.


    What if you don't have the source for one or both of them?

    > Why is that so difficult to understand?


    It isn't difficult to understand, it's impractical. There isn't a
    standard format for double.

    > Here we arrive at philosophical questions. C++ is all about bells
    > and whistles. Rather than providing a solution that would be very useful
    > for most users but would fail in some special cases it is decided that
    > nothing should be provided so that everyone rolls its own.


    Have you read the responses here? The problem of serialisation is
    complex and multi-layered. Sure a naive solution would work for a
    subset of types, but that subset is small. As soon as you add any
    complexity to your objects (even something as trivial as float or
    pointers), the solution breaks down. So it isn't "very useful for most
    users".

    --
    Ian Collins
    Ian Collins, Mar 31, 2010
    #15
  16. jacob navia

    Guest

    On Mar 31, 4:29 pm, jacob navia <> wrote:
    > a écrit :
    >
    >
    >
    > > Well, the C++ compiler for zOS can be run (with the equivalent of a
    > > command line switch) to use either hex or binary (IEEE) float.  Most
    > > definitely not the same format, even if doubles are 64 bits long in
    > > both cases.  The saga of long doubles on Windows (and other x86
    > > platforms) is another example, where some compilers treat long double
    > > as a 64 bit IEEE double, and other use the 80 bit double extended
    > > format.

    >
    > (1) Obviously if you are working with zOS you rather use the same
    >      command line switch for reading and writing your double data.
    >      Why is that so difficult to understand?
    > (2) Obviously too, double data is not portable among compilers that
    >      use different representations of the data. If one compiler
    >      implements long double as 64 bits and the other as 80 bits
    >      they aren't compatible and you should recompile both the reader
    >      and the writer.
    >      Why is that so difficult to understand?



    It's not difficult to understand at all. The problem is that you
    stated that it should not be a problem "if you use the same CPU type
    for both operations", which is clearly incorrect. It's not even
    consistent for a single compiler on a single OS on that one CPU type.


    > Here we arrive at philosophical questions. C++ is all about bells
    > and whistles. Rather than providing a solution that would be very useful
    > for most users but would fail in some special cases it is decided that
    > nothing should be provided so that everyone rolls its own.
    >
    > Modulo bugs the Boost solution seems to be working. Other solutions were
    > presented in this discussion (http://webEbenezer.net by Brian)



    The rest of my post (and others) points out that there are portability
    issues with binary formats (and those do actually matter to some of
    us, even if *you* don't care), and that any common format will likely
    have some issues with some of the odder corners of type
    representations.
    And FWIW, Boost.Serialization does use a text format.

    Note that Java and .NET have relatively strong support for
    serialization, but then they also include complete specifications of
    the datatypes.

    That being said, I would not object at all to adding something like
    Boost.Serialization to the STL...
    , Apr 1, 2010
    #16
  17. jacob navia

    jacob navia Guest

    Ian Collins wrote:
    > On 04/ 1/10 10:29 AM, jacob navia wrote:
    >> a écrit :
    >>>
    >>> Well, the C++ compiler for zOS can be run (with the equivalent of a
    >>> command line switch) to use either hex or binary (IEEE) float. Most
    >>> definitely not the same format, even if doubles are 64 bits long in
    >>> both cases. The saga of long doubles on Windows (and other x86
    >>> platforms) is another example, where some compilers treat long double
    >>> as a 64 bit IEEE double, and other use the 80 bit double extended
    >>> format.
    >>>

    >>
    >> (1) Obviously if you are working with zOS you rather use the same
    >> command line switch for reading and writing your double data.
    >> Why is that so difficult to understand?

    >
    > How do you enforce that?
    >


    Very easy.

    Each time you do that you crash or obtain wrong results.

    :)

    Why should the language protect the programmer from himself
    and from every possible error?

    It is well known that if you compile a shared object (dll)
    with structure alignment turned off, and you use it with a
    main executable with structure alignment turned on at 16 bytes, passing
    double data between the shared object and the main program will not
    work.

    How do you enforce that structure alignment is the same?


    >> (2) Obviously too, double data is not portable among compilers that
    >> use different representations of the data. If one compiler
    >> implements long double as 64 bits and the other as 80 bits
    >> they aren't compatible and you should recompile both the reader
    >> and the writer.

    >
    > What if you don't have the source for one or both of them?
    >
    >> Why is that so difficult to understand?

    >
    > It isn't difficult to understand, it's impractical. There isn't a
    > standard format for double.
    >


    What?

    And the IEEE-754 format?


    >> Here we arrive at philosophical questions. C++ is all about bells
    >> and whistles. Rather than providing a solution that would be very useful
    >> for most users but would fail in some special cases it is decided that
    >> nothing should be provided so that everyone rolls its own.

    >
    > Have you read the responses here? The problem of serialisation is
    > complex and multi-layered. Sure a naive solution would work for a
    > subset of types, but that subset is small. As soon as you add any
    > complexity to your objects (even something as trivial as float or
    > pointers), the solution breaks down. So it isn't "very useful for most
    > users".
    >


    OK. Let's agree that we disagree here.
    jacob navia, Apr 1, 2010
    #17
  18. jacob navia

    jacob navia Guest

    a écrit :
    > On Mar 31, 4:29 pm, jacob navia <> wrote:
    >> a écrit :
    >>
    >>
    >>
    >>> Well, the C++ compiler for zOS can be run (with the equivalent of a
    >>> command line switch) to use either hex or binary (IEEE) float. Most
    >>> definitely not the same format, even if doubles are 64 bits long in
    >>> both cases. The saga of long doubles on Windows (and other x86
    >>> platforms) is another example, where some compilers treat long double
    >>> as a 64 bit IEEE double, and other use the 80 bit double extended
    >>> format.

    >> (1) Obviously if you are working with zOS you rather use the same
    >> command line switch for reading and writing your double data.
    >> Why is that so difficult to understand?
    >> (2) Obviously too, double data is not portable among compilers that
    >> use different representations of the data. If one compiler
    >> implements long double as 64 bits and the other as 80 bits
    >> they aren't compatible and you should recompile both the reader
    >> and the writer.
    >> Why is that so difficult to understand?

    >
    >
    > It's not difficult to understand at all. The problem is that you
    > stated that it should not be a problem "if you use the same CPU type
    > for bothoperations. " Which is clearly incorrect. It's not even
    > consistent for a single compiler on a single OS on that one CPU type.
    >
    >


    With this logic, assignment of a "double" field in a structure should be
    forbidden:

    file.h
    struct foo { char a; double b; };

    file1.c
    foo a;
    extern foo b;

    // ...

    a.b = b.b; // this will not work

    file2.c
    foo b;

    I compile file1.c with the compilation flag "No structure alignment".
    I compile file2.c with the structure alignment to 16 bytes.

    Consequence: Since that can't be enforced, assignment to a structure
    field of type double from another structure should be forbidden.


    You just can't protect the programmer from all possible mistakes.
    jacob navia, Apr 1, 2010
    #18
  19. jacob navia

    gwowen Guest

    On Mar 31, 7:07 am, jacob navia <> wrote:
    > Hi
    >
    > What would be the best way to save and reload later a container
    > to/from disk in C++?
    >
    > Thanks


    Here's a solution, what constitutes "best" depends on what you
    consider important:

    If the container contains Plain-Old-Data types or pointers to PODs,
    first fread()/fwrite() the size() [if it's variable]. After that just
    iterate over the elements, and fread()/fwrite() the data,
    dereferencing as appropriate as you go. If you've got pointers to
    polymorphic types, make sure their base type has a [virtual]
    serialize(), and that each derived type's implementation writes enough
    extra header information as the first element to determine its type.

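    // Could easily be a static member function...
    // (sketch: needs <cstdio> and <stdexcept>; the ::tag constants are
    // placeholders for whatever header each serialize() wrote first)
    Base* unserialize()
    {
        FILE* file = fopen("filename", "rb");
        if (!file)
            throw std::runtime_error("Cannot open file in Base* unserialize()");
        // read header, determine the derived type
        int tag = fgetc(file);
        switch (tag) {
        case DerivedType1::tag:
            return DerivedType1::unserialize(file);
        case DerivedType2::tag:
            return DerivedType2::unserialize(file);
        // etc.
        default:
            fclose(file);
            throw std::runtime_error(
                "Unrecognised derived type header in Base* unserialize()");
        }
    }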

    Dumping the vtable / function pointers, even if you can find them, is
    a recipe for disaster.
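
    For the POD case, the size-then-elements idea might be sketched as
    follows. The names save_doubles and load_doubles are made up for
    illustration, and the count is stored as a 64-bit integer by assumption.
    The file is only readable by a program built with the same double
    representation and byte order, as discussed earlier in the thread.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <vector>

// Write the element count, then the raw bytes of the elements.
// Not portable: depends on this build's double format and byte order.
bool save_doubles(const char* path, const std::vector<double>& v) {
    std::FILE* f = std::fopen(path, "wb");
    if (!f) return false;
    const std::uint64_t n = v.size();
    bool ok = std::fwrite(&n, sizeof n, 1, f) == 1;
    if (ok && n != 0)
        ok = std::fwrite(v.data(), sizeof(double), v.size(), f) == v.size();
    return std::fclose(f) == 0 && ok;   // fclose flushes; check it too
}

// Read the count, then that many elements.
// A real loader should sanity-check the count before resizing.
bool load_doubles(const char* path, std::vector<double>& v) {
    std::FILE* f = std::fopen(path, "rb");
    if (!f) return false;
    std::uint64_t n = 0;
    bool ok = std::fread(&n, sizeof n, 1, f) == 1;
    if (ok) {
        v.resize(static_cast<std::size_t>(n));
        if (n != 0)
            ok = std::fread(v.data(), sizeof(double), v.size(), f) == v.size();
    }
    std::fclose(f);
    return ok;
}
```

    Both functions return false on any I/O failure, so the caller can tell
    a truncated file from a successful load.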
    gwowen, Apr 1, 2010
    #19
  20. jacob navia

    Jorgen Grahn Guest

    On Wed, 2010-03-31, Andrew Poelstra wrote:
    > On 2010-03-31, jacob navia <> wrote:
    >> Hi
    >>
    >> What would be the best way to save and reload later a container
    >> to/from disk in C++?
    >>
    >> Thanks

    >
    > I'm not sure how best to deal with binary data, but for
    > numbers and text, I would use JSON (escaping special
    > characters as appropriate, etc).
    >
    > It's well-understood, lightweight and simple, and
    > portable across many languages.


    Don't know anything about JSON, but:

    > With binary data you could base64-encode it or something,
    > but you'll be looking at significant bloat for large
    > structures. Or you could NUL-separate fields, replacing
    > actual NUL characters with \001s, and actual \001s with
    > \001\001s.


    What would this buy him? He's saving the data to file, not to paper
    or a text-only medium like Usenet. Apart from being able to print it,
    base64 has exactly the same weaknesses as whatever binary representation
    lies under the surface.

    To the original poster, I have no general answer. I'd recommend some
    format suited to his application, not to its current implementation.

    /Jorgen

    --
    // Jorgen Grahn <grahn@ Oo o. . .
    \X/ snipabacken.se> O o .
    Jorgen Grahn, Apr 1, 2010
    #20