Object (de)serialization

Philip Pemberton

Hi guys,
I'm trying to write the contents of a set of classes to a file in a
reasonably portable way. Obviously I also want to be able to read the
files back into memory later on. At this point, my serializer works fine;
I can create an object (or several objects) and save them to a file. Now
I need to get them back out of the file...

I've been reading the C++ FAQ (notably Section 36, Serialization and
Unserialization) and I've been writing small prototype apps to try and
learn how all this stuff works. I came up with this, based on the textual
description in C++ FAQ 36.8:

#include <map>
#include <string>
#include <iostream>

using namespace std;

class Shape {
public:
    Shape() { cerr<<"ctor: Shape\n"; };
    static std::map<std::string, Shape *> creationMap;
    virtual Shape *create(string data) const =0;
    virtual string getType() const =0;
    static Shape *deserialise(string data) {
        return creationMap[data]->create(data);
    }
};

class Triangle : public Shape {
public:
    Triangle() {
        cerr<<"ctor: Triangle\n";
        creationMap["triangle"] = new Triangle();
    }
    virtual Shape *create(string data) const {
        if (creationMap.count(data) == 0) throw -1;
        return new Triangle();
    }
    virtual string getType() const {return "triangle";}
};

int main(void)
{
    Shape *x = Shape::deserialise("triangle");

    return 0;
}

This looks fine to me, and it compiles -- but it won't link:
philpem@cougar:~/dev$ g++ -o test test.cpp && ./test
/tmp/ccDHDpxd.o: In function `Shape::deserialise(std::basic_string<char,
std::char_traits<char>, std::allocator<char> >)':
test.cpp:(.text._ZN5Shape11deserialiseESs[Shape::deserialise
(std::basic_string<char, std::char_traits<char>, std::allocator<char> >)]
+0x12): undefined reference to `Shape::creationMap'
collect2: ld returned 1 exit status

Compiler is g++ (gcc) 4.4.1-4ubuntu8, on Ubuntu Linux 9.10 (32-bit).

Can anyone see what I'm doing wrong here? I'm sure it's blatantly obvious
to an experienced developer, but this is the first time I've tried to
implement something like this, and I'm really not having much luck...

Thanks,
Phil.
 
Richard Herring

Philip said:

Your subject line is wrong: try something more like "linker complains
about missing static class member" ;-)
I'm trying to write the contents of a set of classes to a file in a
reasonably portable way. Obviously I also want to be able to read the
files back into memory later on. At this point, my serializer works fine;
I can create an object (or several objects) and save them to a file. Now
I need to get them back out of the file...

I've been reading the C++ FAQ (notably Section 36, Serialization and
Unserialization) and I've been writing small prototype apps to try and
learn how all this stuff works. I came up with this, based on the textual
description in C++ FAQ 36.8:

#include <map>
#include <string>
#include <iostream>

using namespace std;

class Shape {
public:
Shape() { cerr<<"ctor: Shape\n"; };
static std::map<std::string, Shape *> creationMap;

That's a declaration. Where's the corresponding definition?

std::map<std::string, Shape *> Shape::creationMap;

(hint: if this is shape.h it probably ought to be in shape.cpp)
virtual Shape *create(string data) const =0;
virtual string getType() const =0;
static Shape *deserialise(string data) {
return creationMap[data]->create(data);
}
};

[...]


This looks fine to me, and it compiles -- but it won't link:
philpem@cougar:~/dev$ g++ -o test test.cpp && ./test
/tmp/ccDHDpxd.o: In function `Shape::deserialise(std::basic_string<char,
std::char_traits<char>, std::allocator<char> >)':
test.cpp:(.text._ZN5Shape11deserialiseESs[Shape::deserialise
(std::basic_string<char, std::char_traits<char>, std::allocator<char> >)]
+0x12): undefined reference to `Shape::creationMap'

Confirmation: there's no definition of Shape::creationMap.
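
For reference, a minimal sketch of the difference between the two, assuming everything sits in a single translation unit (in a real project the definition would normally go in shape.cpp). The helper registrySize() is hypothetical and only there to show that the member is usable once it has been defined:

```cpp
#include <cstddef>
#include <map>
#include <string>

class Shape {
public:
    virtual ~Shape() {}
    // Declaration only: announces the member, but reserves no storage.
    static std::map<std::string, Shape*> creationMap;
};

// Definition: exactly one per program, at namespace scope.
// Leaving this line out produces the "undefined reference" link error.
std::map<std::string, Shape*> Shape::creationMap;

// Hypothetical helper, for illustration only.
std::size_t registrySize() {
    return Shape::creationMap.size();
}
```

With the definition in place the linker can resolve every use of Shape::creationMap, including the one inside Shape::deserialise.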
 
Brian

Hi guys,
I'm trying to write the contents of a set of classes to a file in a
reasonably portable way. Obviously I also want to be able to read the
files back into memory later on. At this point, my serializer works fine;
I can create an object (or several objects) and save them to a file. Now
I need to get them back out of the file...

I've been reading the C++ FAQ (notably Section 36, Serialization and
Unserialization) and I've been writing small prototype apps to try and
learn how all this stuff works. I came up with this, based on the textual
description in C++ FAQ 36.8:

#include <map>
#include <string>
#include <iostream>

using namespace std;

class Shape {
        public:
                Shape() { cerr<<"ctor: Shape\n"; };
                static std::map<std::string, Shape *> creationMap;
                virtual Shape *create(string data) const =0;
                virtual string getType() const =0;
                static Shape *deserialise(string data) {
                        return creationMap[data]->create(data);
                }

};

class Triangle : public Shape {
        public:
                Triangle() {
                        cerr<<"ctor: Triangle\n";
                        creationMap["triangle"] = new Triangle();
                }

That default constructor looks like trouble. Perhaps you
could move the second line to another function.


Brian Wood
http://webEbenezer.net
(651) 251-9384
 
Philip Pemberton

class Triangle : public Shape {
        public:
                Triangle() {
                        cerr<<"ctor: Triangle\n";
                        creationMap["triangle"] = new
                        Triangle();
                }

That default constructor looks like trouble. Perhaps you could move the
second line to another function.

The thing is, I need some way of creating an arbitrary object (in this
case a Shape) based on a given ID string.

Basically, I'm saving objects to/reading objects from a "chunky" file
format (a bit like EA IFF 85). The file is structured into chunks, which
have this format:
  4-byte FOURCC (Chunk ID)
  8-byte length
  Chunk payload
A chunk may have children, in which case the MSBit of the length is set.
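
To make the layout concrete, a header in this format could be decoded roughly as follows. This is only a sketch: the names ChunkHeader and parseChunkHeader are hypothetical, and big-endian byte order (as used by IFF) is an assumption, since the description above does not state the byte order. It also assumes a compiler that provides <cstdint>.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical in-memory view of the 12-byte chunk header.
struct ChunkHeader {
    std::string id;        // 4-byte FOURCC
    std::uint64_t length;  // payload length with the flag bit stripped
    bool hasChildren;      // most significant bit of the length field
};

// Decodes a header from the first 12 bytes of `bytes`.
// Big-endian order is an assumption, not something the post specifies.
ChunkHeader parseChunkHeader(const std::vector<std::uint8_t>& bytes) {
    if (bytes.size() < 12) {
        throw std::runtime_error("chunk header truncated");
    }
    ChunkHeader h;
    h.id.assign(bytes.begin(), bytes.begin() + 4);

    std::uint64_t raw = 0;
    for (std::size_t i = 4; i < 12; ++i) {
        raw = (raw << 8) | bytes[i];
    }
    const std::uint64_t childFlag = std::uint64_t(1) << 63;
    h.hasChildren = (raw & childFlag) != 0;
    h.length = raw & ~childFlag;
    return h;
}
```

Using the top bit as a flag leaves 63 bits for the length, which is still far more than any real file will need.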

What I want is to have as little copy-pasted code as possible, while
still having easy-to-read code. For serialisation, I've got two functions
in Chunk:
  vector<uint8_t> Serialise()
  virtual vector<uint8_t> SerialisePayload() =0;
(there's also a pure-virtual getChunkID() fn which returns a 4-character
std::string containing the FOURCC code; this is implemented in all child
classes)

Chunk::Serialise calls this->SerialisePayload() to get the payload data,
then outputs the header and payload into a vector and returns it. The
idea being that the headers are common to all chunks, but payload data
depends on the specific class being serialised.
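
The split described above might look something like the sketch below. Several details are assumptions rather than the actual code: the functions are made const, the length is written big-endian, the children flag is ignored, and ByteChunk is a hypothetical concrete chunk invented for the example.

```cpp
#include <cstdint>
#include <string>
#include <vector>

class Chunk {
public:
    virtual ~Chunk() {}

    // Common part: FOURCC, 8-byte length, then the class-specific payload.
    std::vector<std::uint8_t> Serialise() const {
        const std::vector<std::uint8_t> payload = SerialisePayload();
        const std::string id = getChunkID();  // expected to be 4 characters

        std::vector<std::uint8_t> out(id.begin(), id.end());
        const std::uint64_t len = payload.size();
        // Big-endian length: an assumption, the byte order is not given.
        for (int shift = 56; shift >= 0; shift -= 8) {
            out.push_back(static_cast<std::uint8_t>((len >> shift) & 0xFF));
        }
        out.insert(out.end(), payload.begin(), payload.end());
        return out;
    }

    virtual std::string getChunkID() const = 0;

protected:
    // Class-specific part, supplied by each concrete chunk.
    virtual std::vector<std::uint8_t> SerialisePayload() const = 0;
};

// Hypothetical concrete chunk carrying a single byte.
class ByteChunk : public Chunk {
public:
    explicit ByteChunk(std::uint8_t v) : value(v) {}
    virtual std::string getChunkID() const { return "BYTE"; }

protected:
    virtual std::vector<std::uint8_t> SerialisePayload() const {
        return std::vector<std::uint8_t>(1, value);
    }

private:
    std::uint8_t value;
};
```

Only SerialisePayload() and getChunkID() vary per class, so a new chunk type never has to repeat the header-writing code.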

Now back to the deserialisation problem...

At this point I haven't even managed to get an example implementation of
the C++FAQ deserialiser working -- the static ctors aren't being called,
so the std::map doesn't contain anything, thus the code bombs (current
version throws an exception, the one I posted segfaults)...

Thanks,
Phil.
 
Philip Pemberton

That's a declaration. Where's the corresponding definition?

std::map<std::string, Shape *> Shape::creationMap;

*bangs head on desk*

Up until you posted that, I had no idea that static member variables had
to be defined in the implementation as well as declared in the class...
It's so obvious, I can't believe I missed it...

Thanks!
--
Phil.
http://www.philpem.me.uk/
If mail bounces, replace "09" with the last two digits of the current
year.
 
Brian

class Triangle : public Shape {
        public:
                Triangle() {
                        cerr<<"ctor: Triangle\n";
                        creationMap["triangle"] = new
                        Triangle();
                }
That default constructor looks like trouble.  Perhaps you could move the
second line to another function.

The thing is, I need some way of creating an arbitrary object (in this
case a Shape) based on a given ID string.

Basically, I'm saving objects to/reading objects from a "chunky" file
format (a bit like EA IFF 85). The file is structured into chunks, which
have this format:
  4-byte FOURCC (Chunk ID)
  8-byte length
  Chunk payload
A chunk may have children, in which case the MSBit of the length is set.

What I want is to have as little copy-pasted code as possible, while
still having easy-to-read code. For serialisation, I've got two functions
in Chunk:
  vector<uint8_t> Serialise()
  virtual vector<uint8_t> SerialisePayload() =0;
(there's also a pure-virtual getChunkID() fn which returns a 4-character
std::string containing the FOURCC code; this is implemented in all child
classes)

I'm not sure I'm following you, but the way I do it, a code
generator outputs a constant for each type being marshalled.
For your code it would produce this:

uint32_t const Shape_num = 7001;
uint32_t const Triangle_num = 7002;


Sending an object involves sending its "type number",
and the receiver uses the type numbers to interpret
the input. There's some related information about
this here -- http://webEbenezer.net/release/110.html .
That page describes how I've switched from using
virtual functions like your create to using
"stream" constructors which don't need to be
virtual.

Chunk::Serialise calls this->SerialisePayload() to get the payload data,
then outputs the header and payload into a vector and returns it. The
idea being that the headers are common to all chunks, but payload data
depends on the specific class being serialised.

The above sounds somewhat similar to how I do it, but you've
got different terminology. I talk about messages, message
IDs and message lengths. Typically a message id is embedded
first into the stream, then a message length and then the
message/payload.


Brian Wood
http://webEbenezer.net
(651) 251-9384
 
Thomas J. Gritzan

On 25.01.2010 20:31, Philip Pemberton wrote:
class Triangle : public Shape {
public:
Triangle() {
cerr<<"ctor: Triangle\n";
creationMap["triangle"] = new
Triangle();
}

That default constructor looks like trouble. Perhaps you could move the
second line to another function.
[...]
Now back to the deserialisation problem...

At this point I haven't even managed to get an example implementation of
the C++FAQ deserialiser working -- the static ctors aren't being called,
so the std::map doesn't contain anything, thus the code bombs (current
version throws an exception, the one I posted segfaults)...

The map isn't filled because you don't create triangle, so the line
creationMap["triangle"] = new Triangle();
isn't executed. You have to move this line somewhere else so that it's
invoked before you use creationMap, like a registerShape function
that'll be called from main.

But instead of using this prototype-based mechanism, I suggest using a
factory functor and storing a boost::function in creationMap, if you
have access to Boost (std::tr1::function is the same). Example:

#include <map>
#include <string>
#include <iostream>
#include <typeinfo>   // needed for typeid
#include <boost/function.hpp>

using namespace std;

class Shape {
public:
    Shape() { cerr << "ctor: Shape\n"; }
    static Shape* deserialise(string data) {
        // note: an unregistered type yields an empty boost::function,
        // and calling it throws boost::bad_function_call
        return creationMap[data]();
    }
    // add virtual d'tor to allow typeid / delete through base pointer
    virtual ~Shape() {}
protected:
    typedef boost::function<Shape*()> creation_func;
    static void registerShape(std::string type, creation_func factory) {
        creationMap[type] = factory;
    }

    template <typename T>
    static Shape* create() {
        return new T;
    }
private:
    static std::map<std::string, creation_func> creationMap;
};

/*static*/ std::map<std::string, Shape::creation_func> Shape::creationMap;

class Triangle : public Shape {
public:
    Triangle() {
        cerr << "ctor: Triangle\n";
    }
    static void registerClass() {
        registerShape("triangle", &Shape::create<Triangle>);
    }
};

int main()
{
    Triangle::registerClass();
    Shape *x = Shape::deserialise("triangle");

    // checks if x has correct type:
    cerr << typeid(*x).name() << endl;
    delete x;
}
 
Branimir Maksimovic

Thomas said:
class Triangle : public Shape {
public:
Triangle() {
cerr << "ctor: Triangle\n";
}
static void registerClass() {
registerShape("triangle", &Shape::create<Triangle>);
}
};

int main()
{
Triangle::registerClass();
Shape *x = Shape::deserialise("triangle");

// checks if x has correct type:
cerr << typeid(*x).name() << endl;
delete x;
}

Perfect, I've used this method since 1999.
 
Philip Pemberton

The above sounds somewhat similar to how I do it, but you've got
different terminology. I talk about messages, message IDs and message
lengths. Typically a message id is embedded first into the stream, then
a message length and then the message/payload.

That's pretty much what I'm doing. Four bytes to tell you what the chunk
is, eight more to specify its length, then a <length>-sized block of data
(the Payload).

Same concept, different terminology.
 
Philip Pemberton

At this point I haven't even managed to get an example implementation
of the C++FAQ deserialiser working -- the static ctors aren't being
called, so the std::map doesn't contain anything, thus the code bombs
(current version throws an exception, the one I posted segfaults)...

The map isn't filled because you don't create triangle, so the line
creationMap["triangle"] = new Triangle();
isn't executed. You have to move this line somewhere else so that it's
invoked before you use creationMap, like a registerShape function
that'll be called from main.

I've actually shuffled it into a "TriangleInitialiser" class --

static class TriangleInitialiser {
public:
    TriangleInitialiser() {
        cerr<<"ctor: TriangleInitialiser\n";
        if (Shape::creationMap.count("triangle") == 0) {
            Shape::creationMap["triangle"] = new Triangle();
        }
    }
} _x_Initialiser_Triangle;

(Obviously this is a test, and any real code would be hiding the map
behind a couple of functions -- RegisterPrototype and FreePrototypes)

This stays in Triangle.cpp and isn't referenced by (or, thanks to the
static prefix, even accessible from) any other file. That leaves the
problem of deallocating the memory (admittedly only a few bytes, but it's
still more fluff to wade through in the Valgrind log). Adding a
destructor to Shape deals with that:

~Shape() {
    // dealloc the prototypes
    while (!creationMap.empty()) {
        std::map<std::string, Shape *>::iterator i = creationMap.begin();
        Shape *x = (*i).second;
        creationMap.erase(i);
        delete x;
    }
};

The catch being that the "delete x" invokes ~Shape again, thus (AIUI) it
will consume one stack level for each prototype in the map... I've had a
quick play, but it doesn't seem to be possible to specify that a
destructor applies to the base class, but not any derived classes.

I didn't use an iterator loop because AIUI calling erase() on a container
or map invalidates any iterators active against it. The "Shape *x" is
there for a similar reason.
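
One way around the recursive destructor would be to keep the cleanup out of ~Shape altogether and put it in a static function that is called once, for instance at the end of main(). The sketch below assumes that approach. RegisterPrototype and FreePrototypes follow the names mentioned earlier in this post, PrototypeCount is a hypothetical helper, and Shape is left non-abstract to keep the example short. For std::map, erase() only invalidates iterators to the erased element, and this loop erases nothing until every prototype has been deleted.

```cpp
#include <cstddef>
#include <map>
#include <string>

class Shape {
public:
    virtual ~Shape() {}  // virtual, and it no longer touches the registry

    // Takes ownership of `proto`, replacing any earlier prototype.
    static void RegisterPrototype(const std::string& type, Shape* proto) {
        Shape*& slot = creationMap[type];
        delete slot;  // deleting a null pointer is a no-op
        slot = proto;
    }

    // Call once when the prototypes are no longer needed;
    // it is not invoked from any destructor.
    static void FreePrototypes() {
        typedef std::map<std::string, Shape*>::iterator Iter;
        for (Iter i = creationMap.begin(); i != creationMap.end(); ++i) {
            delete i->second;  // nothing is erased here, so i stays valid
        }
        creationMap.clear();
    }

    // Hypothetical helper, for illustration only.
    static std::size_t PrototypeCount() { return creationMap.size(); }

private:
    static std::map<std::string, Shape*> creationMap;
};

std::map<std::string, Shape*> Shape::creationMap;

class Triangle : public Shape {};
```

Because ~Shape does nothing to the map, destroying an ordinary Shape no longer wipes out the prototypes as a side effect.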

But instead of using this prototype-based mechanism, I suggest using a
factory functor and storing a boost::function in creationMap, if you
have access to Boost (std::tr1::function is the same). Example:
(snip code)

That looks better than my solution, but I'm not keen on adding Boost to
my application's build dependencies. As nice as it is, it's an utter pig
to build on Win32 (IIRC last time I did it, I had to build Cmake, which
was great fun). Dead easy on *nix, but unfortunately this code has to
work in the Evil Empire too...

The only other thing I'm not keen on is having to hack around with main()
to add new chunks, although that's probably solvable by putting the
registration stuff in a static class's ctor or a global
RegisterChunkDeserialisers() function.

Still not as nice as just being able to create a module with two classes,
and have that module auto-initialise and register on startup (see
TriangleInitialiser above). But that said, I'm still concerned about what
happens if the TriangleInitialiser object gets initialised before
Shape... unless the compiler is clever enough to figure out that Shape
needs setting up first (probably not, even though it is gcc).

Thanks,
Phil.
 
Thomas J. Gritzan

On 26.01.2010 01:19, Philip Pemberton wrote:
(snip code)

That looks better than my solution, but I'm not keen on adding Boost to
my application's build dependencies. As nice as it is, it's an utter pig
to build on Win32 (IIRC last time I did it, I had to build Cmake, which
was great fun). Dead easy on *nix, but unfortunately this code has to
work in the Evil Empire too...

1) Most of Boost's code is header only, so you don't need to build
anything. Just put the headers somewhere and add the path to the
compiler's include path.
2) If your compiler supports TR1, it has std::tr1::function, which is
practically the same.
TR1 has some really nice classes like random number generators, regular
expression parsers, smart pointers and std::tr1::function.
See <http://en.wikipedia.org/wiki/C++_Technical_Report_1>.

AFAIK, Visual Studio 2008 has had TR1 support since Service Pack 1.
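
If neither Boost nor TR1 is an option, a plain function pointer is probably enough for this particular case, because the factories carry no state. The sketch below is a variation on the earlier example rather than a tested drop-in replacement. The explicit "unknown type" check and the public create<T>() are additions made for the illustration.

```cpp
#include <map>
#include <stdexcept>
#include <string>

class Shape {
public:
    virtual ~Shape() {}
    virtual std::string getType() const = 0;

    // A plain function pointer stands in for boost::function<Shape*()>.
    typedef Shape* (*creation_func)();

    static void registerShape(const std::string& type, creation_func factory) {
        creationMap[type] = factory;
    }

    static Shape* deserialise(const std::string& type) {
        std::map<std::string, creation_func>::const_iterator it =
            creationMap.find(type);
        if (it == creationMap.end()) {
            throw std::runtime_error("unknown shape type: " + type);
        }
        return it->second();  // call the registered factory
    }

    // The address of a static member function template instantiation
    // is an ordinary function pointer.
    template <typename T>
    static Shape* create() { return new T; }

private:
    static std::map<std::string, creation_func> creationMap;
};

std::map<std::string, Shape::creation_func> Shape::creationMap;

class Triangle : public Shape {
public:
    virtual std::string getType() const { return "triangle"; }
    static void registerClass() {
        registerShape("triangle", &Shape::create<Triangle>);
    }
};
```

Usage is the same as before: call Triangle::registerClass() once, then Shape::deserialise("triangle") returns a new Triangle that the caller must delete.
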
The only other thing I'm not keen on is having to hack around with main()
to add new chunks, although that's probably solvable by putting the
registration stuff in a static class's ctor or a global
RegisterChunkDeserialisers() function.

Still not as nice as just being able to create a module with two classes,
and have that module auto-initialise and register on startup (see
TriangleInitialiser above). But that said, I'm still concerned about what
happens if the TriangleInitialiser object gets initialised before
Shape... unless the compiler is clever enough to figure out that Shape
needs setting up first (probably not, even though it is gcc).

[10.13] How do I prevent the "static initialization order fiasco"?
http://www.parashift.com/c++-faq-lite/ctors.html#faq-10.13
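
A rough sketch of the "construct on first use" idea from that FAQ entry, applied to the registry. The names registry() and make() are hypothetical, and a function-local static object is used here where the FAQ's basic version uses a pointer to a heap object:

```cpp
#include <map>
#include <string>

class Shape {
public:
    virtual ~Shape() {}
    virtual std::string getType() const = 0;
    typedef Shape* (*creation_func)();

    // Construct on first use: the map is created the first time
    // registry() is called, so it exists before any registrar uses it.
    static std::map<std::string, creation_func>& registry() {
        static std::map<std::string, creation_func> theMap;
        return theMap;
    }
};

class Triangle : public Shape {
public:
    virtual std::string getType() const { return "triangle"; }
    static Shape* make() { return new Triangle; }
};

namespace {
// Hypothetical registrar object; in a real project it would sit in
// Triangle.cpp, and its constructor runs during static initialisation.
struct TriangleInitialiser {
    TriangleInitialiser() {
        Shape::registry()["triangle"] = &Triangle::make;
    }
} triangleInitialiser;
}
```

One caveat worth checking: if Triangle.cpp ends up in a static library and nothing else refers to it, some linkers will leave the object file out, and the registrar never runs.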
 
Brian

Perfect, I've used this method since 1999.

Here's how I'd do it:

class SendCompressedBuffer;
class Counter;

class Shape {
public:
    template <typename B>
    explicit Shape(B* buf);

    // Add virtual d'tor to allow delete through base pointer
    virtual ~Shape() {}

    virtual inline void
    Send(SendCompressedBuffer* buf, bool = false) const;

    virtual inline void
    CalculateMarshallingSize(Counter&) const;

    template <typename B>
    static Shape* BuildPolyInstance(B* buf);

    virtual void Draw() const =0;
};


class Triangle : public Shape {
public:
    template <typename B>
    explicit Triangle(B* buf);

    virtual inline void
    Send(SendCompressedBuffer* buf, bool = false) const;

    virtual inline void
    CalculateMarshallingSize(Counter&) const;

    virtual void Draw();
};


The full output from the C++ Middleware Writer given the
above as input is here --
http://webEbenezer.net/posts/buildpoly.hh

And here's a portion of that output:

uint32_t const Shape_num = 7001;
uint32_t const Triangle_num = 7002;

template <typename B>
inline Shape*
Shape::BuildPolyInstance(B* buf)
{
    uint32_t type_num;

    buf->Give(type_num);
    switch (type_num) {
    case Triangle_num:
        return new Triangle(buf);

    default:
        throw failure("Shape::BuildPolyInstance: Unknown type");
    }
}

---------------------------------------------------------------

If there were other concrete, derived classes they would
be added to the switch statement. I think this is both
simpler and more complete than what has previously been
outlined -- there are Send/serialization functions and
the Draw method indicates Shape is an abstract class.
The automated generation of the type numbers helps to
conserve bandwidth. I don't recommend sending/receiving
class names as strings in a real application.
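
For anyone who wants to try the type-number dispatch without the generated code, here is a cut-down, self-contained sketch. ReceiveBuffer is a hypothetical stand-in that models only a Give() call, TypeNum() is added purely for the illustration, and std::runtime_error replaces the failure exception. None of this is the actual C++ Middleware Writer output.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for the receive buffer; only Give() is modelled.
class ReceiveBuffer {
public:
    explicit ReceiveBuffer(const std::vector<std::uint32_t>& words)
        : words_(words), pos_(0) {}

    // Hands out the next 32-bit value, or throws when the data runs out.
    void Give(std::uint32_t& value) {
        if (pos_ >= words_.size()) {
            throw std::runtime_error("ReceiveBuffer: out of data");
        }
        value = words_[pos_++];
    }

private:
    std::vector<std::uint32_t> words_;
    std::size_t pos_;
};

std::uint32_t const Triangle_num = 7002;

class Shape {
public:
    virtual ~Shape() {}
    virtual std::uint32_t TypeNum() const = 0;  // illustration only

    template <typename B>
    static Shape* BuildPolyInstance(B* buf);
};

class Triangle : public Shape {
public:
    // "Stream" constructor; a real class would read its members from buf.
    template <typename B>
    explicit Triangle(B*) {}

    virtual std::uint32_t TypeNum() const { return Triangle_num; }
};

template <typename B>
Shape* Shape::BuildPolyInstance(B* buf) {
    std::uint32_t type_num;
    buf->Give(type_num);  // the type number precedes the object's data
    switch (type_num) {
    case Triangle_num:
        return new Triangle(buf);
    default:
        throw std::runtime_error("Shape::BuildPolyInstance: Unknown type");
    }
}
```

The trade-off against the map-based registry is that every new derived class means touching the switch, which is tolerable when a generator writes it.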


Brian Wood
http://webEbenezer.net
(651) 251-9384
 
Brian

Here's how I'd do it:

class SendCompressedBuffer;
class Counter;

class Shape {
public:
   template <typename B>
   explicit Shape(B* buf);

   // Add virtual d'tor to allow delete through base pointer
   virtual ~Shape() {}

   virtual inline void
   Send(SendCompressedBuffer* buf, bool = false) const;

   virtual inline void
   CalculateMarshallingSize(Counter&) const;

   template <typename B>
   static Shape* BuildPolyInstance(B* buf);

   virtual void Draw() const =0;

};

class Triangle : public Shape {
public:
   template <typename B>
   explicit Triangle(B* buf);

   virtual inline void
   Send(SendCompressedBuffer* buf, bool = false) const;

   virtual inline void
   CalculateMarshallingSize(Counter&) const;

   virtual void Draw();

};

It looks like I accidentally dropped a const in that
Draw function. It should be

virtual void Draw() const;


Brian Wood
http://webEbenezer.net
(651) 251-9384
 
