Java Serializing in C/C++?

Ramon F Herrera · Feb 25, 2008

My client application is written in C/C++ and runs on Windows.

Then, on the server side I have Linux, with a Java or Oracle listening
for queries.

My problem is the large number of arguments going back and forth. What
I have been doing is to pass all the arguments in a colon-separated
string, or even in several lines (separated by <CR>). It has become
increasingly troublesome to keep the client and server programs in
sync.

I keep on hearing about serialization, but have never used it. This
seems to be the problem that serialization is supposed to solve.

I would like to serialize my data in C/C++ before sending it to the
server and unpack the multiple results.

Is there any package out there, written in C/C++ that will prepare the
serialized "packets" and unpack them?

TIA,

-Ramon

Ian Collins · Feb 25, 2008

Ramon said:
My client application is written in C/C++ and runs on Windows

I would like to serialize my data in C/C++ before sending it to the
server and unpack the multiple results.

There's no such language as C/C++. There's C and there's C++, how one
serialises data differs significantly between the two. Which one are
you using?

Ramon F Herrera · Feb 25, 2008

There's no such language as C/C++. There's C and there's C++, how one
serialises data differs significantly between the two. Which one are
you using?

Thanks, Ian.

I am willing to program the relevant code in either C or in C++
depending on which one is more convenient for interfacing with Java.

-Ramon

Ian Collins · Feb 25, 2008

Ramon said:
Thanks, Ian.

I am willing to program the relevant code in either C or in C++
depending on which one is more convenient for interfacing with Java.

Then you are probably better off in C++. There isn't any native
serialisation support in C++, so you will either have to search for a
package that is compatible with Java, or roll your own, which isn't too
big a deal.

Glen Dayton · Feb 25, 2008

Ian said:
Then you are probably better off in C++. There isn't any native
serialisation support in C++, so you will either have to search for a
package that is compatible with Java, or roll your own, which isn't too
big a deal.

Java interfaces nicely with C++ with its JNI interface, and some
compilers will automatically generate the Java headers for you.

Serialization usually refers to the preparation of an object so
that it may be placed in persistent storage and restored from there.

The problem of packing and unpacking results between a client
and server is more closely related to marshalling.

Rather than concentrating on a low-level protocol in which you
must specify the marshalling of arguments, can you use a more
abstract interface in which your Java objects and C++ objects
interact via CORBA or SOAP? For CORBA objects I've used both
the TAO IDL and the AT&T Omniorb. For SOAP I've had great
success with gSOAP. I think it's easier to use CORBA between
Java and C++.

Glen

James Kanze · Feb 25, 2008

My client application is written in C/C++ and runs on Windows.

Then, on the server side I have Linux, with a Java or Oracle listening
for queries.

My problem is the large number of arguments going back and
forth. What I have been doing is to pass all the arguments in
a colon-separated string, or even in several lines (separated
by <CR>). It has become increasingly troublesome to keep the
client and server programs in sync.

By keeping the client and server programs in sync, what do you
mean exactly? Making sure that both are using the same version
of the protocol? Or is it a problem of one of the two losing
its place in the data stream?

I keep on hearing about serialization, but have never used it.
This seems to be the problem that serialization is supposed to
solve.

It seems to me you are using it. You're sending data on a
serial link, and recovering it on the other side.

I would like to serialize my data in C/C++ before sending it
to the server and unpack the multiple results.

Is there any package out there, written in C/C++ that will
prepare the serialized "packets" and unpack them?

There are a lot of them, but I don't see where they'll solve the
problem you have.

The surest solution is to use a self-identifying representation
for the data. If the data have a complicated structure, or
you're already using XML elsewhere in the project, and have the
parser, you might consider XML. In most cases, however, it is
overkill, and simply sending attribute-value pairs will be
largely sufficient. Then code so that missing attributes have a
default value, and extra attributes are ignored (except for
generating a message in the log), and your code should be able
to handle most version changes without problems, and will always
know exactly where it is in the data.
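
To make the attribute-value idea concrete, here is a minimal C++ sketch of a tolerant reader. The layout (one name=value pair per line), the function name parseMessage, and the field names used with it are assumptions made for illustration, not anything prescribed in this thread.

```cpp
#include <iostream>
#include <map>
#include <sstream>
#include <string>

// Hypothetical message layout: one "name=value" pair per line.
using Fields = std::map<std::string, std::string>;

// Parse attribute-value pairs. Every known attribute starts at its default,
// so a missing attribute keeps that default. Unknown attributes and
// malformed lines are reported to the log and otherwise ignored.
Fields parseMessage(const std::string& text, const Fields& defaults,
                    std::ostream& log)
{
    Fields result = defaults;
    std::istringstream in(text);
    std::string line;
    while (std::getline(in, line)) {
        // Tolerate CR LF line endings.
        if (!line.empty() && line.back() == '\r') line.pop_back();
        if (line.empty()) continue;

        const std::string::size_type eq = line.find('=');
        if (eq == std::string::npos) {
            log << "malformed line ignored: " << line << '\n';
            continue;
        }
        const std::string name = line.substr(0, eq);
        const std::string value = line.substr(eq + 1);
        if (defaults.count(name) == 0) {
            log << "unknown attribute ignored: " << name << '\n';
            continue;
        }
        result[name] = value;
    }
    return result;
}
```

With defaults for, say, "user" and "limit", a message that carries only limit=25 plus an attribute the receiver has never heard of still parses: "user" keeps its default and the extra attribute produces one log line. The order of the pairs no longer matters, so nobody has to count arguments.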

This does result in a lot more data on the line. If this is a
problem, the version control problem can also be handled by
protocol negotiation on connection.

Ramon F Herrera · Feb 25, 2008

By keeping the client and server programs in sync, what do you
mean exactly? Making sure that both are using the same version
of the protocol?

That's what I mean. When I add a new field or move them around, I have
to place my finger on the screen and count the arguments. Then I get
busy working on one program and neglect to change the protocol in the
other affected programs.

The surest solution is to use a self-identifying representation
for the data. If the data have a complicated structure, or
you're already using XML elsewhere in the project, and have the
parser, you might consider XML. In most cases, however, it is
overkill, and simply sending attribute-value pairs will be
largely sufficient.

I arrived at the same conclusion (the attribute-value pair approach),
thanks for providing confirmation.

The most complicated data I have is a table. The XML parsing will have
to wait for a harder problem. It is overkill for this.

Thanks!

-Ramon

Gordon Beaton · Feb 25, 2008

That's what I mean. When I add a new field or move them around, I
have to place my finger on the screen and count the arguments. Then
I get busy working on one program and neglect to change the protocol
in the other affected programs.

Why don't the affected programs share the common library code that
defines the protocol?

/gordon

Lew · Feb 25, 2008

Glen said:
Rather than concentrating on a low-level protocol in which you must
specify the marshalling of arguments, can you use a more abstract
interface in which your Java objects and C++ objects interact via CORBA
or SOAP? For CORBA objects I've used both the TAO IDL and the AT&T
Omniorb. For SOAP I've had great success with gSOAP. I think it's
easier to use CORBA between Java and C++.

"Easier" is a relative term - you have to maintain the ORB, for sure.

The SOAP approach carries the benefit of all XML approaches - a
platform-neutral, semantically-void and completely portable format. SOAP
carries the additional benefit of a large community working to polish the
rough edges ongoingly.

These approaches might seem like "overkill", but setting up a SOAP stack is
hardly more difficult than putting together an ORB-based solution, at this
point probably much easier. Then you automatically get the benefit of easy
expandability if you decide later to widen to a network implementation.

Planning for the future is especially good when it involves negligible extra
effort in the present.

Roedy Green · Feb 25, 2008

I keep on hearing about serialization, but have never used it. This
seems to be the problem that serialization is supposed to solve.

Serialisation is for Java to Java communication. It is great for
sending complicated trees of various objects where the structure of
the tree changes each time. It is not good for long term storage,
since it does not deal well with changes in the source code for
objects.

For Java to C++, you would need Corba. It is conceivable that you
would use Serialisation or RMI for Java to Java then use JNI to pass
the data to C++, but that would be a nightmare maintaining all the
JNI. You might as well cook up a set of binary message formats to
exchange, perhaps using Java DataOutputStream format.
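
As a rough illustration of the DataOutputStream idea, the C++ sketch below produces and reads bytes in the layout that writeInt and writeUTF use on the Java side: an int is four bytes, most significant first, and a string is a two-byte big-endian byte count followed by the encoded characters. The helper names (putInt, putAsciiUTF, getInt) are made up for this example, and the string helper deliberately accepts only plain ASCII without NUL, where Java's modified UTF-8 and ordinary UTF-8 coincide.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Same layout as DataOutputStream.writeInt: four bytes, big-endian.
void putInt(Bytes& out, std::int32_t value)
{
    const std::uint32_t v = static_cast<std::uint32_t>(value);
    out.push_back(static_cast<std::uint8_t>(v >> 24));
    out.push_back(static_cast<std::uint8_t>(v >> 16));
    out.push_back(static_cast<std::uint8_t>(v >> 8));
    out.push_back(static_cast<std::uint8_t>(v));
}

// Same layout as DataOutputStream.writeUTF, restricted to ASCII text
// without NUL characters (an assumption that keeps the sketch short).
void putAsciiUTF(Bytes& out, const std::string& s)
{
    if (s.size() > 65535)
        throw std::length_error("string too long for a 2-byte length");
    for (unsigned char c : s)
        if (c == 0 || c > 0x7F)
            throw std::invalid_argument("only non-NUL ASCII is handled");
    out.push_back(static_cast<std::uint8_t>(s.size() >> 8));
    out.push_back(static_cast<std::uint8_t>(s.size() & 0xFF));
    out.insert(out.end(), s.begin(), s.end());
}

// Reads what DataOutputStream.writeInt wrote and advances pos.
std::int32_t getInt(const Bytes& in, std::size_t& pos)
{
    if (pos > in.size() || in.size() - pos < 4)
        throw std::out_of_range("truncated int");
    const std::uint32_t v = (static_cast<std::uint32_t>(in[pos]) << 24) |
                            (static_cast<std::uint32_t>(in[pos + 1]) << 16) |
                            (static_cast<std::uint32_t>(in[pos + 2]) << 8) |
                            static_cast<std::uint32_t>(in[pos + 3]);
    pos += 4;
    return static_cast<std::int32_t>(v);
}
```

On the Java side the matching calls would be DataInputStream.readInt() and readUTF(). Both ends still have to agree on the order of the fields, so this removes byte-order and framing guesswork but not the versioning problem.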

see http://mindprod.com/jgloss/serialization.html
http://mindprod.com/jgloss/corba.html
http://mindprod.com/jgloss/rmi.html

Ramon F Herrera · Feb 25, 2008

Why don't the affected programs share the common library code that
defines the protocol?

Good point. So far my applications and applets have been simple enough
that I have preferred to keep them in a single jar file.

-Ramon

k-e-n · Feb 25, 2008

My client application is written in C/C++ and runs on Windows.

Then, on the server side I have Linux, with a Java or Oracle listening
for queries.

My problem is the large number of arguments going back and forth. What
I have been doing is to pass all the arguments in a colon-separated
string, or even in several lines (separated by <CR>). It has become
increasingly troublesome to keep the client and server programs in
sync.

I keep on hearing about serialization, but have never used it. This
seems to be the problem that serialization is supposed to solve.

I would like to serialize my data in C/C++ before sending it to the
server and unpack the multiple results.

Is there any package out there, written in C/C++ that will prepare the
serialized "packets" and unpack them?

TIA,

-Ramon

This problem seems to be more about the application design and build
process.
Using the same/common code or message structure on both ends is a good
start.

While all of the other suggestions are very valid, a simple low tech
approach, would be to include the "message type revision number" as
the first parameter in all messages.
This should be the first thing that your server side code checks.

Also a simple start to a 'self-descriptive' message format would be to
always use the 2nd parameter/entry in all messages to be the count of
the number of remaining elements.

Obviously another thing you can add is the total message length, again
at a fixed location near the beginning of the message.

Create for yourself a MessageHeader class and think about useful
things it can contain: Message-Type, Type-Revision, Element-Count,
Byte-Count, ...
If you think you need to, add a Message-Header-Revision-Number so that
the MessageHeader itself does not get out of synch.
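
A header along these lines might look like the C++ sketch below. The colon-separated text encoding, the field order, and the function names are assumptions chosen to match the colon-separated strings mentioned at the start of the thread. The message type must not contain a colon.

```cpp
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical header carrying the fields suggested above.
struct MessageHeader {
    int headerRevision;        // revision of the header layout itself
    std::string messageType;   // e.g. "QUERY" (illustrative name)
    int typeRevision;          // revision of this message type
    std::size_t elementCount;  // number of elements that follow
    std::size_t byteCount;     // total payload length in bytes
};

// Encode as "headerRev:type:typeRev:elements:bytes".
std::string encodeHeader(const MessageHeader& h)
{
    std::ostringstream out;
    out << h.headerRevision << ':' << h.messageType << ':'
        << h.typeRevision << ':' << h.elementCount << ':' << h.byteCount;
    return out.str();
}

// Decode a header line. Throws if fields are missing, not numeric,
// or if the header revision is one this code does not know.
MessageHeader decodeHeader(const std::string& line)
{
    std::istringstream in(line);
    std::string part[5];
    for (std::string& p : part)
        if (!std::getline(in, p, ':'))
            throw std::runtime_error("header has too few fields");

    MessageHeader h;
    h.headerRevision = std::stoi(part[0]);
    if (h.headerRevision != 1)
        throw std::runtime_error("unsupported header revision");
    h.messageType = part[1];
    h.typeRevision = std::stoi(part[2]);
    h.elementCount = std::stoul(part[3]);
    h.byteCount = std::stoul(part[4]);
    return h;
}
```

The header revision comes first so that a receiver can reject a header it does not understand before it tries to interpret anything else.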

EJP · Feb 26, 2008

Ian said:
Then you are probably better off in C++. There isn't any native
serialisation support in C++, so you will either have to search for a
package that is compatible with Java, or roll your own, which isn't too
big a deal.

I doubt that you will find such a package, and rolling your own is an
*enormous* deal. You would have to provide the C++ package with enough
information to be able to serialize/deserialize every serializable class
in the JDK for a start, and do it the same way, and also a mechanism to
add your own application classes if you want to serialize them, and any
3rd-party stuff too.

In practice you can't do Serialization without a JVM.

I would look first at IDL for this, secondly XDR if you can find a Java
XDR implementation.

James Kanze · Feb 26, 2008

[...]

In practice you can't do Serialization without a JVM.

This is news to all of those of us who do it regularly. Corba
actually works better in C++ than in Java---you don't need all
of those helper classes for out and inout arguments.

James Kanze · Feb 26, 2008

Why don't the affected programs share the common library code that
defines the protocol?

If I understood him right, the programs are (or are likely to
be) in different languages. And of course, even using a common
library doesn't guarantee a solution to the versioning problem;
you've still got to ensure that the server and all of the
clients are using executables which have been linked with the
same version of the library.

James Kanze · Feb 26, 2008

Serialisation is for Java to Java communication.

Serialization is a generic term, and can be done in any
language. (I was doing serialization in C long before Java was
invented.)

It is great for sending complicated trees of various objects
where the structure of the tree changes each time. It is not
good for long term storage, since it does not deal well with
changes in the source code for objects.

Those might be the constraints of Java's serialization. It's
not a general constraint, and I've used serialization schemes
which managed version change well. (My experience with Java was
that the built-in serialization was pretty worthless. But then,
we were in an environment with some 60000 client machines,
spread out over several thousand sites, and it was a foregone
conclusion that we'd have to deal with different versions.)

For Java to C++, you would need Corba.

That's one solution. In his case, from what little he's said,
it might be overkill.

Note that whenever possible, it's a lot easier to debug
serialization formats which use text. Which is not the case for
either Java's built-in serialization or Corba.

James Kanze · Feb 26, 2008

[...]

While all of the other suggestions are very valid, a simple
low tech approach, would be to include the "message type
revision number" as the first parameter in all messages.
This should be the first thing that your server side code checks.

And what does it do if the revision numbers don't match?

For major changes, this is often the only alternative.
Normally, the revision number should be passed only once, as
part of the login procedure or the connection protocol, since
differences can go beyond the scope of a single message. But
doing so normally means maintaining code to handle several
different versions in the server.
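
One common way to do the once-per-connection check is for each side to announce the protocol versions it can speak and to settle on the highest one they share. The C++ sketch below shows only that selection step. The function name and the convention that versions are positive integers, with 0 meaning "no common version", are assumptions for the example.

```cpp
#include <algorithm>
#include <vector>

// Pick the highest protocol version supported by both sides.
// Versions are assumed to be positive; 0 means nothing in common,
// in which case the connection should be refused.
int negotiateVersion(const std::vector<int>& clientVersions,
                     const std::vector<int>& serverVersions)
{
    int best = 0;
    for (int v : clientVersions) {
        const bool shared =
            std::find(serverVersions.begin(), serverVersions.end(), v) !=
            serverVersions.end();
        if (shared && v > best) best = v;
    }
    return best;
}
```

The cost mentioned above does not go away: for every version the server lists, it needs code that can still handle that version's messages.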

coal · Feb 26, 2008

This problem seems to be more about the application design and build
process.
Using the same/common code or message structure on both ends is a good
start.

While all of the other suggestions are very valid, a simple low tech
approach, would be to include the "message type revision number" as
the first parameter in all messages.
This should be the first thing that your server side code checks.

Also a simple start to a 'self-descriptive' message format would be to
always use the 2nd parameter/entry in all messages to be the count of
the number of remaining elements.

Obviously another thing you can add is the total message length, again
at a fixed location near the beginning of the message.

Do you always use total message lengths? I'm of the opinion that if
communication is happening behind a single firewall, the total message
lengths can be omitted. That assumes an organization can trust its
employees. For the most part I think that is a safe
assumption, but there are some exceptions.

Create for yourself a MessageHeader class and think about useful
things it can contain: Message-Type, Type-Revision, Element-Count,
Byte-Count, ...

Why send the Element-Count? It seems redundant given the Message-Type
and version number.

If you agree that total message lengths are optional in some contexts,
the MessageHeader would have to reflect that.

Brian Wood
Ebenezer Enterprises
www.webebenezer.net

James Kanze · Feb 26, 2008

On Feb 25, 2:27 pm, k-e-n wrote:

[...]

Do you always use total message lengths? I'm of the opinion
that if communication is happening behind a single firewall,
the total message lengths can be omitted. That assumes an
organization can trust its employees. For the most part I
think that is a safe assumption, but there are some
exceptions.

Interestingly, the major use of total message lengths I've seen
is between processes communicating over a pipe. It's mainly an
optimization measure, but it can make a significant difference,
and it can make the code easier to write as well.
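
To show why a total length can simplify the reading side, here is a small C++ sketch that splits a byte stream into messages. The frame layout (a four-byte big-endian payload length followed by the payload) and the function names are assumptions for illustration. A real reader would probably also refuse lengths above some agreed maximum.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Append one frame: 4-byte big-endian payload length, then the payload.
void appendFrame(std::string& stream, const std::string& payload)
{
    const std::uint32_t n = static_cast<std::uint32_t>(payload.size());
    stream.push_back(static_cast<char>((n >> 24) & 0xFF));
    stream.push_back(static_cast<char>((n >> 16) & 0xFF));
    stream.push_back(static_cast<char>((n >> 8) & 0xFF));
    stream.push_back(static_cast<char>(n & 0xFF));
    stream += payload;
}

// Remove every complete frame from the front of buffer and return the
// payloads. An incomplete trailing frame stays in buffer until more
// bytes have been appended to it.
std::vector<std::string> extractFrames(std::string& buffer)
{
    std::vector<std::string> frames;
    std::size_t pos = 0;
    while (buffer.size() - pos >= 4) {
        std::uint32_t n = 0;
        for (int i = 0; i < 4; ++i)
            n = (n << 8) | static_cast<unsigned char>(buffer[pos + i]);
        if (buffer.size() - pos - 4 < n) break;  // payload not complete yet
        frames.push_back(buffer.substr(pos + 4, n));
        pos += 4 + static_cast<std::size_t>(n);
    }
    buffer.erase(0, pos);
    return frames;
}
```

Because the length arrives first, the receiver knows how many bytes to wait for and never has to scan the payload for a terminator. Whatever a read on the pipe returns is appended to the buffer, and extractFrames hands back only whole messages.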

Why send the Element-Count? It seems redundant given the
Message-Type and version number.

It depends. Suppose the Message-Type contains a variable length
array.

coal · Feb 26, 2008

On Feb 25, 2:27 pm, k-e-n wrote:

[...]

Do you always use total message lengths? I'm of the opinion
that if communication is happening behind a single firewall,
the total message lengths can be omitted. That assumes an
organization can trust its employees. For the most part I
think that is a safe assumption, but there are some
exceptions.

Interestingly, the major use of total message lengths I've seen
is between processes communicating over a pipe. It's mainly an
optimization measure, but it can make a significant difference,
and it can make the code easier to write as well.

I'm not sure why you describe it as an "optimization measure."
It seems to me that not calculating/sending/receiving a total msg
length is simpler if the context permits. What I got out of the
thread on clc++m about denial of service and serialization is that
using a total msg length is important from a security perspective.

It depends. Suppose the Message-Type contains a variable length
array.

You know from the msgid and version number that the message has one
high-level element - a variable length array. And the length of the
array is prepended to the array data as part of the payload. I guess
there is header-like info embedded in the payload the way I think
about it. It could be made part of a header but I don't think there
is anything to be gained from that. If some messages don't have
variable length data there is a little bit of unnecessary overhead in
having an element count.

Brian Wood
