Automating Serialization?

Brian

Recently on the Boost Users list someone started a thread with the
title "Automating Serialization?". I had been thinking about copying
the thread over here; then someone in the thread got a little
defensive, so I decided to discuss it here. I'll use a line of dashes
to separate the posts from each other. In the middle of the dashes
I'll describe how that post relates to the others.

------------------------------------------ OP -------------------------------------------

Hi,

By design Boost Serialization requires the user to list each field
that needs to be serialised.

This is in contrast, for example, with Java and some other languages
where serialisation is supported (kind of) at the language level.

The current approach requires some code duplication. We have to
declare a field. We have to manually (de)serialise it. If we need to
make a change, we make it in at least two distinct places.

Is there any way to automate the process of serialisation, perhaps
harnessing the power of the preprocessor? E.g. we could label the
fields that need to be serialised. Is there anything in Boost that
could help?

Many thanks,
Paul

Paul Bilokon, Vice President
Citigroup | FX - Options Trading | Quants
33 Canada Square | Canary Wharf | Floor 3
London, E14 5LB
Phone: +44 20 798-62191
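
Paul's question about the preprocessor can be illustrated with an X-macro, where the field list is written once and expanded twice. This is only a sketch under assumptions: ACCOUNT_FIELDS, TextOut, TextIn and round_trip are names made up for this example, and the toy archives are stand-ins that mimic the `ar & field` style rather than Boost classes.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical field list: every serialised field is named exactly once here.
#define ACCOUNT_FIELDS(X) \
    X(int, id)            \
    X(double, balance)    \
    X(std::string, owner)

// Toy archives (stand-ins, not Boost classes). Both expose operator&
// so that a single serialize() can drive saving and loading.
struct TextOut {
    std::ostream& os;
    template <class T>
    TextOut& operator&(const T& value) { os << value << '\n'; return *this; }
};

struct TextIn {
    std::istream& is;
    template <class T>
    TextIn& operator&(T& value) { is >> value; return *this; }
};

struct Account {
    // First expansion: the member declarations.
#define DECLARE_FIELD(type, name) type name{};
    ACCOUNT_FIELDS(DECLARE_FIELD)
#undef DECLARE_FIELD

    // Second expansion: the serialize() body, generated from the same list.
    template <class Archive>
    void serialize(Archive& ar) {
#define SERIALIZE_FIELD(type, name) ar & name;
        ACCOUNT_FIELDS(SERIALIZE_FIELD)
#undef SERIALIZE_FIELD
    }
};

// Saves an Account to text and loads it back into a fresh object.
inline Account round_trip(Account original) {
    std::stringstream buffer;
    TextOut out{buffer};
    original.serialize(out);

    Account copy;
    TextIn in{buffer};
    copy.serialize(in);
    assert(!buffer.fail());  // every field was read back successfully
    return copy;
}
```

Adding a field then means adding one line to ACCOUNT_FIELDS. The toy text format splits on whitespace, so a string field containing spaces would not survive the round trip.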

---------------------------------------- first reply to OP -----------------------------------

The C++ Middleware Writer doesn't have that problem --
http://webEbenezer.net/comparison.html . (There's a performance
section on that page using Boost 1.38. We're in the process of
updating that page using Boost 1.41 and hope to have those results
on line in the next two weeks.)


---------------------------------- second reply to OP ---------------------------

Stefan Strasser

not that I'm aware of, but I even think that's a good thing.
the serialize() function represents a file format, and you want file
formats to be stable and not being changed because someone added a
runtime field to a class.
usually when you do want to add a serialized field you'd also want old
versions of the file still to be readable, so you end up writing
custom (versioned) deserialization code anyway, even if your language
has built-in serialization support.

you could use some compile time code generator to write default
serialization code for you, using e.g. OpenC++, GCC-XML, or Doxygen,
but I doubt those generated functions would stay there very long.
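
Stefan's point about old files staying readable can be sketched as follows. In Boost Serialization the version reaches `serialize(Archive& ar, const unsigned int version)` and is declared with BOOST_CLASS_VERSION; the code below does not use Boost and writes the version number by hand instead. Settings, its fields and its text format are all assumptions made up for this example.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical class that gained a field in format version 2.
struct Settings {
    static constexpr unsigned current_version = 2;

    std::string name;
    int retries = 3;  // added in version 2; version 1 streams lack it

    void save(std::ostream& os) const {
        // Toy format: whitespace-separated tokens, so name must be one token.
        assert(name.find(' ') == std::string::npos);
        os << current_version << ' ' << name << ' ' << retries << '\n';
    }

    // Returns false if the stream is malformed or comes from a newer format.
    bool load(std::istream& is) {
        unsigned version = 0;
        if (!(is >> version >> name)) return false;
        if (version > current_version) return false;
        if (version >= 2) {
            if (!(is >> retries)) return false;
        } else {
            retries = 3;  // default for data written before the field existed
        }
        return true;
    }
};
```

The branch on `version` is the custom deserialization code Stefan describes: a generator could emit save(), but the choice of default for old streams has to come from a person.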


--------------------------------------- third reply to OP ------------------------------------

> Hi,
>
> By design Boost Serialization requires the user to list each field that
> needs to be serialised.
>
> This is in contrast, for example, with Java and some other languages where
> serialisation is supported (kind of) at the language level.

No, Paul, C++ has no reflection and the reason is that unlike Java it
does not define an ABI. Consider that in C++ not only is the size of
the built-in types not standardized, but even some operations have
"implementation-defined" semantics.

> The current approach requires some code duplication. We have to declare a
> field. We have to manually (de)serialise it. If we need to make a change, we
> make it in at least two distinct places.

You are thinking in terms of reflection. Think in terms of states and
invariants and it'll make more sense. For example, if you have an
array of items and a pointer, and your invariant is that the pointer
always points to the last element in the array, the pointer should not
be serialized.

Emil Dotchevski
Reverge Studios, Inc.
http://www.revergestudios.com/reblog/index.php?n=ReCode
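
Emil's array-and-pointer example can be sketched like this. It is a minimal illustration, not code from the thread: the Items class, its member names and its text format are assumptions. Only the elements are written out, and the pointer is rebuilt from the invariant after loading.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <vector>

// Hypothetical class with the invariant from the post above:
// last_ is null when items_ is empty, otherwise it points at items_.back().
class Items {
public:
    Items() = default;
    // Copying would leave last_ pointing into the source object, so forbid it.
    Items(const Items&) = delete;
    Items& operator=(const Items&) = delete;

    void push(int v) { items_.push_back(v); fix_last(); }
    const int* last() const { return last_; }
    std::size_t size() const { return items_.size(); }

    // Only the elements are saved; the pointer is derived state.
    void save(std::ostream& os) const {
        os << items_.size();
        for (int v : items_) os << ' ' << v;
    }

    void load(std::istream& is) {
        std::size_t n = 0;
        is >> n;
        items_.assign(n, 0);
        for (int& v : items_) is >> v;
        assert(!is.fail());
        fix_last();  // re-establish the invariant instead of reading a pointer
    }

private:
    void fix_last() { last_ = items_.empty() ? nullptr : &items_.back(); }

    std::vector<int> items_;
    int* last_ = nullptr;
};
```

A field-by-field generator would see two members here and could not tell that one of them is determined by the other.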


------------------------------- my reply to Stefan Strasser ------------------------

> not that I'm aware of, but I even think that's a good thing.
> the serialize() function represents a file format, and you want file
> formats to be stable and not being changed because someone
> added a runtime field to a class.

"runtime field" ?

> usually when you do want to add a serialized field you'd also want old
> versions of the file still to be readable, so you end up writing custom
> (versioned) deserialization code anyway, even if your language has built-in
> serialization support.

I recommend avoiding the versioning support in Boost Serialization.
It runs counter to good development practices by averting the type
system. Consider for example a class called Account that uses
versioning to support multiple releases of a product. In the usual
case, later releases will have more fields and added complexity
than earlier releases. Support then for a client at an early release,
say 1.1, becomes inefficient since Account is being used to
handle both 1.2 and 1.1 users. If instead Account_11 and
Account_12 are used -- with Account_12 probably derived from
Account_11 -- this weakness is avoided. Additionally this approach
is beneficial from a testing perspective. In a 1.2 server, 1.1
clients are supported using Account_11, which has already been
tested and is not messed with to support 1.2 clients.

> you could use some compile time code generator to write default
> serialization code for you, using e.g. OpenC++, GCC-XML, or
> Doxygen, but I doubt those generated functions would stay there
> very long.

I'm not exactly sure what you are saying here, but if a user is
simply improving the names of some of his fields, the functions
won't stay the same very long. Automating this helps with a
common problem of forgetting to update the serialization
functions and the compiler then barking. I don't claim though
that every class should be handled this way. The C++ Middleware
Writer allows users to turn off the automated generation of
marshalling functions if that is desired. In my experience, it is
unusual to turn that functionality off.
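
The Account_11 / Account_12 idea from this reply can be sketched as below. This is hand-written illustration code, not output of the C++ Middleware Writer. The field names, the send/receive functions and the echo dispatcher are assumptions made up for the example.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Release 1.1 wire format: id and owner.
struct Account_11 {
    int id = 0;
    std::string owner;

    void send(std::ostream& os) const { os << id << ' ' << owner; }
    void receive(std::istream& is) { is >> id >> owner; }
};

// Release 1.2 adds a field and reuses the 1.1 code without modifying it.
struct Account_12 : Account_11 {
    int credit_limit = 0;

    void send(std::ostream& os) const {
        Account_11::send(os);
        os << ' ' << credit_limit;
    }
    void receive(std::istream& is) {
        Account_11::receive(is);
        is >> credit_limit;
    }
};

// Hypothetical dispatch in a 1.2 server: each client release gets its own type,
// so a 1.1 client never touches the 1.2 code path.
inline std::string echo(const std::string& release, std::istream& is) {
    assert(release == "1.1" || release == "1.2");
    std::ostringstream reply;
    if (release == "1.1") {
        Account_11 account;
        account.receive(is);
        account.send(reply);
    } else {
        Account_12 account;
        account.receive(is);
        account.send(reply);
    }
    return reply.str();
}
```

The trade-off against a single versioned Account is that callers have to pick the type up front, which is what pushes the release number into the type system.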



------------------------------- my reply to Emil Dotchevski --------------------

> You are thinking in terms of reflection. Think in terms of states and
> invariants and it'll make more sense. For example, if you have an
> array of items and a pointer, and your invariant is that the pointer
> always points to the last element in the array, the pointer should not be
> serialized.

I don't know if that is a common situation, but there are two ways to
handle that with the C++ Middleware Writer. One as I mentioned
in another post is to turn off the automatic generation of marshalling
functions. The other is to place those fields which shouldn't be
included in the marshalling process within an #ifdef SERVER_SIDE
-- http://webEbenezer.net/ifdefSERVER.html .
The macro is turned on when servers are built and off when clients
are built. This approach enables servers to have a complete view of
a type and clients to have an accurate, but limited view of the same
type.
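
A rough picture of the #ifdef SERVER_SIDE layout, shown with hand-written marshalling so it stands on its own. The Quote type, its fields and the marshal/unmarshal functions are assumptions for illustration only; the code the C++ Middleware Writer actually generates is not shown in this thread.

```cpp
#include <cassert>
#include <sstream>
#include <string>

struct Quote {
    int instrument_id = 0;
    double price = 0.0;
#ifdef SERVER_SIDE
    // Hypothetical server-only field: present in server builds, never marshalled.
    double desk_margin = 0.0;
#endif
};

// Stand-in for generated marshalling code: it touches only the shared fields,
// so client and server builds agree on the wire format.
inline std::string marshal(const Quote& q) {
    std::ostringstream os;
    os << q.instrument_id << ' ' << q.price;
    return os.str();
}

inline Quote unmarshal(const std::string& wire) {
    std::istringstream is(wire);
    Quote q;
    is >> q.instrument_id >> q.price;
    assert(!is.fail());
    return q;
}
```

Building with -DSERVER_SIDE adds desk_margin to the struct without changing what marshal() writes, which is the "complete view on the server, limited view on the client" arrangement described above.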


-------------------------------- Stefan Strasser's reply to me ------------------

> "runtime field" ?

field that exists only at runtime, i.e. a non-serialized field. I was
assuming serialization used for persistence.

> I recommend avoiding the versioning support in Boost Serialization.
> It runs counter to good development practices by averting the type
> system. Consider for example a class called Account that uses
> versioning to support multiple releases of a product. In the usual
> case, later releases will have more fields and added complexity
> than earlier releases. Support then for a client at an early release,
> say 1.1, becomes inefficient since Account is being used to

that's a very specific case. you can use different types for
versioning using boost.serialization.
more often you want the evolved types to handle old files/streams/...

> I'm not exactly sure what you are saying here, but if a user is
> simply improving the names of some of his fields, the functions
> won't stay the same very long. Automating this helps with a

the same as above, that most serialization functions you might want to
generate automatically at the start of a project end up being custom
serialization functions anyway.

> common problem of forgetting to update the serialization
> functions and the compiler then barking. I don't claim though
> that every class should be handled this way. The C++ Middleware
> Writer allows users to turn off the automated generation of
> marshalling functions if that is desired. In my experience, it is
> unusual to turn that functionality off.

I don't want to start anything, but you already spend the better part
of your messages on this list advertising your ebenezer thing, so I
don't think we need yet another discussion about it.

-------------------------------------------------------------------------------------------------------

I'm going to reply to Stefan Strasser's last post in a separate post.


Brian Wood
http://www.webEbenezer.net
 
