Serialization - filesystem or DBMS

Antimon

Hi,
I'm working on a game server (not a huge project) and I need to save the
whole world state somewhere.

The game world will be object oriented and there can be many different
class types, so I can't just create 4-5 tables in a database and store
the information. Instead, I want an "ISerializable" interface with two
methods: "Serialize" and "Deserialize". Serialize will return that
object's byte[] representation, and Deserialize will use a byte[] to
construct the same object from scratch.
All "ISerializable" objects will be mapped in a static
"Hashtable<Serial, ISerializable>", where "Serial" is an id class
assigned to each ISerializable on construction, so I can write
Serial ids as object pointers during serialization.
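
Roughly what I have in mind (just a sketch, names are illustrative):

    import java.util.Hashtable;
    import java.util.Map;

    // Serial: a small immutable id wrapper, assigned on construction
    final class Serial {
        private final long value;
        Serial(long value) { this.value = value; }
        long value() { return value; }
        public int hashCode() { return (int) (value ^ (value >>> 32)); }
        public boolean equals(Object o) {
            return o instanceof Serial && ((Serial) o).value == value;
        }
    }

    interface ISerializable {
        Serial getSerial();             // id assigned at construction
        byte[] serialize();             // object -> byte[] representation
        void deserialize(byte[] data);  // rebuild the object's state from a byte[]
    }

    // global map so Serial ids can stand in for object references on disk
    final class World {
        static final Map<Serial, ISerializable> OBJECTS =
                new Hashtable<Serial, ISerializable>();
    }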

This approach seems fine to me, so now I'm deciding whether to implement
it on the filesystem or on an RDBMS. If I use an RDBMS, there will only
be 2 tables :) One for class types, one for object instances, with the
binary data returned from the Serialize method stored in the database.
It seems there's no real point in using an RDBMS for something like
this, but if I use the filesystem I will need to suspend the server every 30
minutes or so and dump the whole world to a file, so a crash may cause
a time warp. With an RDBMS I can have a continuous saving mechanism:
I can place modified objects into a queue, and a thread can write them
to the database continuously. Then if the server goes offline for some
reason, I would only lose the data in the serialization buffer, which is
far less than 30 minutes' worth. An RDBMS would also allow
splitting the server across 2 machines (one for the server application, one for
the RDBMS layer) without any effort.
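
For the queue idea I'm picturing something like the following (it builds on
the ISerializable sketch above; the table and column names are made up):

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.LinkedBlockingQueue;

    // background writer: drains modified objects and flushes them to the db
    final class SaveThread extends Thread {
        private final BlockingQueue<ISerializable> dirty =
                new LinkedBlockingQueue<ISerializable>();
        private final Connection con;

        SaveThread(Connection con) { this.con = con; }

        void markDirty(ISerializable o) { dirty.offer(o); }

        public void run() {
            try {
                PreparedStatement ps = con.prepareStatement(
                        "UPDATE objects SET data = ? WHERE serial = ?");
                while (!isInterrupted()) {
                    ISerializable o = dirty.take();   // blocks until work arrives
                    ps.setBytes(1, o.serialize());
                    ps.setLong(2, o.getSerial().value());
                    ps.executeUpdate();
                }
            } catch (Exception e) {
                e.printStackTrace();                  // real code would retry/log
            }
        }
    }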

On the other hand, a filesystem mechanism would be very easy to implement
and maintain.

All suggestions are welcome :) Please help me decide what to do.
 
zero

It seems like you already have the pros and cons thought out quite well.
An rdbms seems like a heavy tool for this, but for performance it may be
necessary. On the other hand, maybe you could use a separate thread that
continuously (or at least sooner than every 30 minutes) saves the world
state to file, without affecting performance much. As for the difficulty
in implementing and maintaining, it just depends on what you're used to.
Using an rdbms in Java isn't really that much more complicated than using
files. I think both options are about equal, so it doesn't really matter
which you choose. Just make a decision, and stick with it.
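
For instance, something along these lines could snapshot the world on its
own thread, using the Serialize method you already plan to have (the class
name, file name and interval are just examples):

    import java.io.DataOutputStream;
    import java.io.FileOutputStream;
    import java.util.Map;
    import java.util.Timer;
    import java.util.TimerTask;

    // periodic snapshot on a background (daemon) thread
    class AutoSaver {
        static void start(final Map<Serial, ISerializable> world) {
            Timer timer = new Timer(true);
            timer.schedule(new TimerTask() {
                public void run() {
                    try {
                        DataOutputStream out = new DataOutputStream(
                                new FileOutputStream("world.snapshot"));
                        synchronized (world) {          // keep the dump consistent
                            for (ISerializable o : world.values()) {
                                byte[] data = o.serialize();
                                out.writeLong(o.getSerial().value());
                                out.writeInt(data.length);
                                out.write(data);
                            }
                        }
                        out.close();
                    } catch (Exception e) {
                        e.printStackTrace();
                    }
                }
            }, 0, 5 * 60 * 1000L);                      // every five minutes
        }
    }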
 
Roedy Green

All "ISerializable" objects will be mapped in a static
"Hashtable<Serial, ISerializable>" where "Serial" will be an id number
class assigned to each ISerializable on construction. So i can write
Serial id's as object pointers while serialization.

Why not just use Java's built-in serialisation? You do one I/O and
Java chases the dependent objects for you.
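
In outline, something like this saves the whole world graph in one write
(the class and file name are just examples):

    import java.io.FileOutputStream;
    import java.io.ObjectOutputStream;
    import java.io.Serializable;

    class WorldSaver {
        // one writeObject; Java walks every object reachable from the root
        static void save(Serializable world) throws Exception {
            ObjectOutputStream out = new ObjectOutputStream(
                    new FileOutputStream("world.ser"));
            try {
                out.writeObject(world);
            } finally {
                out.close();
            }
        }
    }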
 
Antimon

I'm not familiar with Java's built-in serialization. I was thinking
about using the "Externalizable" interface, but I'm worried about error
handling.
I mean, with my own scheme, if I simply remove a class type from the server,
I can detect the missing constructor and ask whether to ignore that type
during deserialization.
"Serializable", on the other hand, produces object instance data that is
far too big. And there's not much difference between Externalizable and
my approach: explicit write and read methods have to be implemented
either way. I just need to take care of pointers and such,
but I gain full control over serialization.
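
For example, with Externalizable, an object that points to another world
object would write just the Serial id, roughly like this (the class and
field names are invented):

    import java.io.Externalizable;
    import java.io.IOException;
    import java.io.ObjectInput;
    import java.io.ObjectOutput;

    // references to other world objects are stored as Serial ids, not objects
    class Item implements Externalizable {
        private long serial;       // this object's own id
        private long ownerSerial;  // "pointer" to another world object

        public Item() { }          // public no-arg ctor required by Externalizable

        public void writeExternal(ObjectOutput out) throws IOException {
            out.writeLong(serial);
            out.writeLong(ownerSerial);
        }

        public void readExternal(ObjectInput in) throws IOException {
            serial = in.readLong();
            ownerSerial = in.readLong();  // resolve via the Serial map after loading
        }
    }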
 
Dimitri Maziuk

Antimon sez:
....
It seems there's no real point in using an RDBMS for something like
this, but if I use the filesystem I will need to suspend the server every 30
minutes or so and dump the whole world to a file, so a crash may cause
a time warp. With an RDBMS I can have a continuous saving mechanism.

I think the main advantage of an RDBMS would be transactions: a crash may
cause a time warp, but the world will be restored to a consistent state.
If you use the filesystem, you'll have to deal with the possibility of a
crash during a file write.

There are a couple of persistence packages you should look into, like
db4o and Hibernate.

Dima
 
Roedy Green

I think the main advantage of an RDBMS would be transactions: a crash may
cause a time warp, but the world will be restored to a consistent state.
If you use the filesystem, you'll have to deal with the possibility of a
crash during a file write.

You can log transactions without a DBMS; you just commit every x
seconds or so to make sure they are fully written to disk. This is
considerably faster than all the before-looks and after-looks you
might do for a database transaction. The disadvantage is you have to
replay the updating transactions, including the calculations, against
an intact database backup, which can take quite a while before you
have recovered. This was the technique I used back in the 70s for
central banking, on computers with less RAM and CPU power than today's
desktops.
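
In Java terms the idea is just an append-only log that gets forced to disk
every so often; a bare-bones sketch (file name and record format invented
for the example):

    import java.io.DataOutputStream;
    import java.io.FileOutputStream;
    import java.io.IOException;

    // append-only transaction log, forced to disk on every commit
    class TransactionLog {
        private final FileOutputStream file;
        private final DataOutputStream out;

        TransactionLog() throws IOException {
            file = new FileOutputStream("world.log", true /* append */);
            out = new DataOutputStream(file);
        }

        synchronized void log(long serial, byte[] update) throws IOException {
            out.writeLong(serial);       // which object changed
            out.writeInt(update.length);
            out.write(update);           // the change itself
        }

        // call every x seconds; afterwards the logged entries are safely on disk
        synchronized void commit() throws IOException {
            out.flush();
            file.getFD().sync();
        }
    }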
 
Richard Wheeldon

Antimon said:
I'm not familiar with java's built-in Serialization.

http://mindprod.com/jgloss/serialization.html
Thought I'd save Roedy the trouble :)
"Serializable" on the other hand, produces too big object instance
data.

Sounds like you need to take a look at the "transient" keyword; Java's
serialization should take care of most of this. You could also look at
XML-based serialization, which is very bloated but much easier to
repair manually if required.
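
i.e. mark the fields you don't want written out, roughly (class and fields
are just an example):

    import java.io.Serializable;

    class Monster implements Serializable {
        int hitPoints;                    // saved by default serialization
        transient java.awt.Image sprite;  // skipped; rebuild it after loading
    }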

Alternatively, if you want the RDBMS features but don't want the bloat,
have you tried looking at an embedded database such as hsql?

Richard
 
Hiran Chaudhuri

Roedy Green said:
Why not just use Java's built-in serialisation? You do one I/O and
Java chases the dependent objects for you.

If I understood the earlier point correctly, the RDBMS is mainly wanted for
transactionality. That is one point, but then there is also the overhead of
actually mapping the data to the database.

Maybe some other solution would come in handy. How about an object-oriented
DBMS? Or XML, whether in the filesystem or in a database...

Hiran
 
Roedy Green

The disadvantage is you have to
replay the updating transactions, including the calculations, against
an intact database backup, which can take quite a while before you
have recovered. This was the technique I used back in the 70s for
central banking, on computers with less RAM and CPU power...

The other disadvantage is that you must take your database offline
periodically for backup. For most businesses you can take your
website down for maintenance (providing only lookups) for long enough to
copy the flat files, which is considerably quicker than any sort of
record-by-record backup.
 
isamura

"Roedy Green" wrote ...
: On Sun, 04 Dec 2005 22:30:07 GMT, Roedy Green
: indirectly quoted someone who said :
:
: >The disadvantage is you have to
: >replay the updating transactions, including the calculations, against
: >an intact database backup. This can take quite a while before you
: >have recovered. This was the technique I used back in the 70s for
: >central banking on computers with les
: the other disadvantage is you must take your database offline
: periodically for backup. For most businesses you can take your
: website down for maintenance providing only lookup for long enough to
: copy the flat files, something considerably quicker than any sort of
: record by record backup.
:
This is not necessarily true if you use MySQL. You can set up slaves to mirror the master db and get
instant backup copies. You can even go further by stopping a slave and backing that up. Perhaps other
RDBMSs also have this capability.

..k
 
Roedy Green

: For most businesses you can take your website down for maintenance
: (providing only lookups) for long enough to copy the flat files, which is
: considerably quicker than any sort of record-by-record backup.

This is not necessarily true if you use MySQL. You can set up slaves to
mirror the master db and get instant backup copies. You can even go further
by stopping a slave and backing that up. Perhaps other RDBMSs also have
this capability.

We are differing on the meaning of quicker. The whole point of
using an advanced SQL engine is that it lets you back up without shutting
down, and you can't get much quicker than an instantaneous backup. But in
another sense, e.g. in terms of total CPU cycles or total number of
I/Os, such a record-by-record backup has much more total overhead
than shutting down and backing up a flat file a meg at a pop.
 
Dimitri Maziuk

Roedy Green sez:
You can log transactions without a DBMS; you just commit every x
seconds or so to make sure they are fully written to disk. This is
considerably faster than all the before-looks and after-looks you
might do for a database transaction. The disadvantage is you have to
replay the updating transactions, including the calculations, against
an intact database backup, which can take quite a while before you
have recovered. This was the technique I used back in the 70s for
central banking, on computers with less RAM and CPU power than today's
desktops.

What about the granularity of your saves: did you have lots of small files
or one huge one? How many copies of each? Did you run round-robin on
2 files, or did you create a new save file every time? If it was one file,
how big was it and how long did it take to write? If it was lots of
little ones, what did you do to avoid name clashes etc. when creating
new ones?

Sure you can log transactions without a DBMS, if you want to write all
that code.

Dima
 
Roedy Green

What about the granularity of your saves: did you have lots of small files
or one huge one? How many copies of each? Did you run round-robin on
2 files, or did you create a new save file every time? If it was one file,
how big was it and how long did it take to write? If it was lots of
little ones, what did you do to avoid name clashes etc. when creating
new ones?

Back in the 70s you typically logged to mag tape. This was your way of
being sure you had everything captured, no matter how terrible the
crash.
 
Roedy Green

Back in the 70s you typically logged to mag tape. This was your way of
being sure you had everything captured, no matter how terrible the
crash.

Typically back then you assigned two tape drives that automatically
toggled back and forth. So long as the operator got the new empty
tape up in time there was no delay.
 
Larry Coon

Roedy said:
The other disadvantage is that you must take your database offline
periodically for backup.

This is definitely not true for Sybase, and I assume it's
also false for most/all modern DBMSs.


Larry Coon
University of California
 
Roedy Green

This is definitely not true for Sybase, and I assume it's
also false for most/all modern DBMSs.

But I was not describing an SQL database; I was talking about
roll-your-own transaction replay.
 
