Low-latency alternative to Java Object Serialization

Discussion in 'Java' started by Giovanni Azua, Oct 1, 2011.

  1. Hi again :)

    I have this lite Client-Server framework based on Blocking IO using classic
    java.net.* Sockets (must develop it myself for a grad course project). I
    currently pass data over the Sockets via Serialization, i.e.
    ObjectOutputStream#writeObject(...) and ObjectInputStream#readObject(...).
    Can anyone recommend a Serialization framework that would outperform the
    vanilla Java default Serialization?

    Three years ago I worked for a "high frequency trading" company and they
    avoided default Java Serialization like "the devil to the cross" (this is a
    Spanish idiom btw ... :) because of its latency. However, I must say that their
    remoting framework dated back to the Java stone age and my guess is that the
    default Serialization must have improved over the years; I don't have hard
    numbers to judge though. I remember the JBoss middleware having its own
    Serialization framework for this very same reason ... have to check
    that too.

    Can anyone advise what would be better than Java Serialization without
    requiring an unreasonably heavy dependency footprint?

    Many thanks in advance,
    Best regards,
    Giovanni
     
    Giovanni Azua, Oct 1, 2011
    #1

  2. Lew (Guest)

    Giovanni Azua wrote:
    > I have this lite Client-Server framework based on Blocking IO using classic
    > java.net.* Sockets (must develop it myself for a grad course project). The
    > way I am using to pass data over the Sockets is via Serialization i.e.
    > ObjectOutputStream#writeObject(...) and ObjectInputStream#readObject(...) I
    > was wondering if anyone can recommend a Serialization framework that would
    > outperform the vanilla Java default Serialization?
    >
    > Three years ago I worked for a "high frequency trading" company and they
    > avoided default Java Serialization like "the devil to the cross" this is a
    > Spanish idiom btw ... :) due to its latency. However, I must say that their
    > remoting framework dated back to the Java stone age and my guess is that the
    > default Serialization must have improved over the years; I don't have hard
    > numbers to judge though. I remember JBoss Middleware implementation having
    > some Serialization framework for this very same reason ... have to check
    > that too.
    >
    > Can anyone advise what would be better than Java Serialization without
    > requiring an unreasonably heavy dependency footprint?


    Side bar: What exactly do you mean by "latency" here?

    Serialization assumes no knowledge on the restoring end about the structures to restore, so all knowledge has to reside in the serialization format.

    Circular dependencies, inheritance chains, the whole megillah has to be encoded into the serialized stream.

    Serialization is designed to store and restore object graphs, not the data in them.

    Take a page from web services and create an XML schema to represent the *data* you wish to transfer. This assumes knowledge on both ends of the structures used to hold the data, unlike object serialization, hence much less information must flow between the participants.

    Use JAXB to generate the classes used to process that schema and incorporate those classes into the protocol at both ends.

    Fast, standard and fairly low effort and low maintenance, assuming you have version control and continuous integration (CI).

    By "fast" I mean both to develop and to operate.

    You will write custom code to jam the data into your JAXB-generated structures and retrieve them therefrom.

    But you will be transmitting data via a format that omits the object graph overhead and focuses on just the data to share. The object-graph knowledge is coded into the application and need not be transferred.

    XML is awesome for this kind of task.

    --
    Lew
     
    Lew, Oct 1, 2011
    #2

  3. markspace (Guest)

    On 10/1/2011 5:46 AM, Giovanni Azua wrote:
    > Three years ago I worked for a "high frequency trading" company and they
    > avoided default Java Serialization like "the devil to the cross"



    Just because "avoid serialization" was a requirement for your previous
    work, doesn't mean that it should be a requirement for every project
    after that.

    Frequently, the low developer cost of Java serialization overrides all
    other concerns. The extra CPU time and network bandwidth it requires are
    usually very cheap. DO NOT work around Java serialization unless
    you are sure you need to, i.e., after careful analysis (and profiling)
    of a working app or prototype.

    If you do need to work around Java serialization, look at the
    Externalizable interface.

    http://java.sun.com/developer/technicalArticles/Programming/serialization/

    Note the sections on "gotchas" in that article. Esp. both the caching
    and the performance considerations.
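
    To illustrate what the caching gotcha presumably refers to (a minimal
    sketch; the Counter class and the method name are made up for the
    example): ObjectOutputStream remembers every object it has written, so
    writing the same instance again after changing it only sends a
    back-reference, unless reset() is called first.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical mutable message type, made up for this example.
class Counter implements Serializable {
    private static final long serialVersionUID = 1L;
    int value;
}

class CachingDemo {
    // Writes the same Counter twice (value 1, then value 2) and returns
    // the value the reader sees for the second write.
    static int secondValueSeen(boolean resetBetweenWrites) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                Counter c = new Counter();
                c.value = 1;
                out.writeObject(c);
                c.value = 2;
                if (resetBetweenWrites) {
                    out.reset(); // make the stream forget objects already written
                }
                out.writeObject(c);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                in.readObject(); // first copy
                return ((Counter) in.readObject()).value;
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(secondValueSeen(false)); // stale value from the first write
        System.out.println(secondValueSeen(true));  // current value
    }
}
```

    On a long-lived socket this matters in both directions: without reset()
    the receiver can see stale state, and the stream keeps a reference to
    every object ever written.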

    Totally rolling your own protocol is possible too if you need the utmost
    performance. 'Tain't hard. 'Tain't easy either. Data IO Streams are a
    good compromise between higher level serialization and raw sockets.

    <http://download.oracle.com/javase/7/docs/api/java/io/DataOutputStream.html>
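
    A minimal sketch of the Externalizable route, assuming a made-up Quote
    message with two fields (the class, field and helper names are
    hypothetical). The class writes and reads its own fields, so the stream
    carries no per-field metadata, although the class name still travels
    with it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

// Hypothetical message type; name and fields are made up for illustration.
class Quote implements Externalizable {
    private static final long serialVersionUID = 1L;

    String symbol;
    double price;

    // Externalizable requires a public no-arg constructor.
    public Quote() { }

    Quote(String symbol, double price) {
        this.symbol = symbol;
        this.price = price;
    }

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeUTF(symbol);
        out.writeDouble(price);
    }

    @Override
    public void readExternal(ObjectInput in) throws IOException {
        // Fields must be read in exactly the order they were written.
        symbol = in.readUTF();
        price = in.readDouble();
    }
}

class ExternalizableDemo {
    static byte[] toBytes(Quote quote) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(quote);
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    static Quote fromBytes(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (Quote) in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Quote copy = fromBytes(toBytes(new Quote("ACME", 12.5)));
        System.out.println(copy.symbol + " " + copy.price);
    }
}
```

    The calling code still uses writeObject/readObject, so this can be
    introduced one class at a time without touching the socket layer.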
     
    markspace, Oct 1, 2011
    #3
  4. On 10/01/2011 06:19 PM, Lew wrote:
    > Giovanni Azua wrote:
    >> I have this lite Client-Server framework based on Blocking IO using classic
    >> java.net.* Sockets (must develop it myself for a grad course project). The
    >> way I am using to pass data over the Sockets is via Serialization i.e.
    >> ObjectOutputStream#writeObject(...) and ObjectInputStream#readObject(...) I
    >> was wondering if anyone can recommend a Serialization framework that would
    >> outperform the vanilla Java default Serialization?
    >>
    >> Three years ago I worked for a "high frequency trading" company and they
    >> avoided default Java Serialization like "the devil to the cross" this is a
    >> Spanish idiom btw ... :) due to its latency. However, I must say that their
    >> remoting framework dated back to the Java stone age and my guess is that the
    >> default Serialization must have improved over the years; I don't have hard
    >> numbers to judge though. I remember JBoss Middleware implementation having
    >> some Serialization framework for this very same reason ... have to check
    >> that too.
    >>
    >> Can anyone advise what would be better than Java Serialization without
    >> requiring an unreasonably heavy dependency footprint?

    >
    > Side bar: What exactly do you mean by "latency" here?
    >
    > Serialization assumes no knowledge on the restoring end about the structures to restore, so all knowledge has to reside in the serialization format.
    >
    > Circular dependencies, inheritance chains, the whole megillah has to be encoded into the serialized stream.
    >
    > Serialization is designed to store and restore object graphs, not the data in them.
    >
    > Take a page from web services and create an XML schema to represent the *data* you wish to transfer. This assumes knowledge on both ends of the structures used to hold the data, unlike object serialization, hence much less information must flow between the participants.
    >
    > Use JAXB to generate the classes used to process that schema and incorporate those classes into the protocol at both ends.
    >
    > Fast, standard and fairly low effort and low maintenance, assuming you have version control and continuous integration (CI).
    >
    > By "fast" I mean both to develop and to operate.
    >
    > You will write custom code to jam the data into your JAXB-generated structures and retrieve them therefrom.
    >
    > But you will be transmitting data via a format that omits the object graph overhead and focuses on just the data to share. The object-graph knowledge is coded into the application and need not be transferred.
    >
    > XML is awesome for this kind of task.


    http://www.json.org/ might also be a good alternative which - depending
    on format etc. - can be less verbose. See http://json.org/example.html

    Kind regards

    robert
     
    Robert Klemme, Oct 1, 2011
    #4
  5. On 01.10.2011 21:35, jebblue wrote:
    > On Sat, 01 Oct 2011 21:13:40 +0200, Robert Klemme wrote:
    >
    >> On 10/01/2011 06:19 PM, Lew wrote:
    >>> Giovanni Azua wrote:
    >>>> I have this lite Client-Server framework based on Blocking IO using
    >>>> classic java.net.* Sockets (must develop it myself for a grad course
    >>>> project). The way I am using to pass data over the Sockets is via
    >>>> Serialization i.e. ObjectOutputStream#writeObject(...) and
    >>>> ObjectInputStream#readObject(...) I was wondering if anyone can
    >>>> recommend a Serialization framework that would outperform the vanilla
    >>>> Java default Serialization?
    >>>>
    >>> But you will be transmitting data via a format that omits the object
    >>> graph overhead and focuses on just the data to share. The object-graph
    >>> knowledge is coded into the application and need not be transferred.
    >>>
    >>> XML is awesome for this kind of task.

    >>
    >> http://www.json.org/ might also be a good alternative which - depending
    >> on format etc. - can be less verbose. See http://json.org/example.html
    >>

    >
    > JSON is convenient for JavaScript heads, it is not human readable,
    > this is one reason why XML exists in the first place.


    I am not sure why you say JSON is not human readable while XML is.
    Remember: for network transfer you would use the most compact form
    of either, which means that for XML you would not have line breaks and
    indentation. I'd say XML on one line with a reasonably complex
    structure is not human readable.

    > JSON was
    > a mistake, instead of coming up with an arcane hacked syntax
    > to replace XML; JavaScript should have been improved to handle
    > XML.


    That sounds like opinion to me. Can you provide any real arguments why
    XML should be chosen as a data transfer format over JSON?

    XML does have some overhead and often uses more bytes to represent the
    same structure.

    There's also an interesting discussion at stackoverflow:
    http://stackoverflow.com/questions/2636245/choosing-between-json-and-xml#2636380

    Kind regards

    robert

    --
    remember.guy do |as, often| as.you_can - without end
    http://blog.rubybestpractices.com/
     
    Robert Klemme, Oct 2, 2011
    #5
  6. On 01.10.2011 14:46, Giovanni Azua wrote:
    > Hi again :)
    >
    > I have this lite Client-Server framework based on Blocking IO using classic
    > java.net.* Sockets (must develop it myself for a grad course project). The
    > way I am using to pass data over the Sockets is via Serialization i.e.
    > ObjectOutputStream#writeObject(...) and ObjectInputStream#readObject(...) I
    > was wondering if anyone can recommend a Serialization framework that would
    > outperform the vanilla Java default Serialization?
    >
    > Three years ago I worked for a "high frequency trading" company and they
    > avoided default Java Serialization like "the devil to the cross" this is a
    > Spanish idiom btw ... :) due to its latency. However, I must say that their
    > remoting framework dated back to the Java stone age and my guess is that the
    > default Serialization must have improved over the years; I don't have hard
    > numbers to judge though. I remember JBoss Middleware implementation having
    > some Serialization framework for this very same reason ... have to check
    > that too.
    >
    > Can anyone advise what would be better than Java Serialization without
    > requiring an unreasonably heavy dependency footprint?


    Btw, there is a completely different option not mentioned so far: CORBA
    with IIOP, which was specifically designed for remote communication. Of
    course this would mean that you would have to replace your complete
    communication layer - but I wanted to mention it because I believe CORBA
    is used too rarely, since it somehow seems out of fashion. But if you
    look at the network bandwidth used, I believe CORBA is a pretty good
    contender compared to SOAP, for example.

    Kind regards

    robert

    --
    remember.guy do |as, often| as.you_can - without end
    http://blog.rubybestpractices.com/
     
    Robert Klemme, Oct 2, 2011
    #6
  7. Tom Anderson (Guest)

    On Sat, 1 Oct 2011, Giovanni Azua wrote:

    > I was wondering if anyone can recommend a Serialization framework that
    > would outperform the vanilla Java default Serialization?


    Swords not words:

    https://github.com/eishay/jvm-serializers/wiki/

    I sent them a patch to add JBoss Serialization a while ago, but they
    haven't taken it. I should try again now the project is on GitHub.

    > I remember JBoss Middleware implementation having some Serialization
    > framework for this very same reason ... have to check that too.


    It's pretty good. More or less plug-compatible with JDK serialization at
    the API level (as in, it doesn't need schema generation or weird
    interfaces or anything), and much faster. From what i remember of my
    benchmarks, it was faster than any of the textual formats, and only a bit
    slower than the schema-based binary formats like Protocol Buffers.

    tom

    --
    Now I am thoroughly confused. -- Colin Brace sums up RT3090 support
    in Linux
     
    Tom Anderson, Oct 3, 2011
    #7
  8. Tom Anderson (Guest)

    On Sat, 1 Oct 2011, Lew wrote:

    > Giovanni Azua wrote:
    >
    >> Can anyone advise what would be better than Java Serialization without
    >> requiring an unreasonably heavy dependency footprint?

    >
    > Serialization assumes no knowledge on the restoring end about the
    > structures to restore, so all knowledge has to reside in the
    > serialization format.
    >
    > Circular dependencies, inheritance chains, the whole megillah has to be
    > encoded into the serialized stream.
    >
    > Serialization is designed to store and restore object graphs, not the
    > data in them.
    >
    > Take a page from web services and create an XML schema to represent the
    > *data* you wish to transfer. This assumes knowledge on both ends of the
    > structures used to hold the data, unlike object serialization, hence
    > much less information must flow between the participants.
    >
    > Use JAXB to generate the classes used to process that schema and
    > incorporate those classes into the protocol at both ends.
    >
    > Fast, standard and fairly low effort and low maintenance, assuming you
    > have version control and continuous integration (CI).
    >
    > By "fast" I mean both to develop and to operate.


    Interesting. I do not believe this to be true. Specifically, i believe
    that: (a) developing an XML-based transfer format using JAXB will take
    considerably more effort than using standard serialization, or an equally
    convenient library such as JBoss Serialization, although still not a large
    amount of effort, certainly; (b) the data will be larger than
    with standard serialization (because the "object graph overhead" is not
    actually that large, and XML is much less space-efficient than
    serialization's binary format); and (c) the speed of operation, even
    assuming an infinitely fast network, will be lower.

    One get-out clause: for very short streams (one or a few objects), XML
    might beat standard serialization for space and speed. Standard
    serialization does have some per-class overhead, which is
    disproportionately expensive for short streams.
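
    The per-class overhead is easy to see for yourself. A rough sketch,
    assuming a made-up Sample class with an int and a double: it compares
    the bytes produced by ObjectOutputStream with the 12 bytes per record
    that a bare DataOutputStream needs for the same two fields. It measures
    nothing about XML, only how the fixed cost of standard serialization is
    spread over more objects as the stream gets longer.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical payload class, made up for this measurement.
class Sample implements Serializable {
    private static final long serialVersionUID = 1L;
    final int id;
    final double value;

    Sample(int id, double value) {
        this.id = id;
        this.value = value;
    }
}

class OverheadDemo {
    // Bytes needed to send `count` Sample objects with standard serialization.
    static int objectStreamSize(int count) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                for (int i = 0; i < count; i++) {
                    out.writeObject(new Sample(i, i * 0.5));
                }
            }
            return bytes.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Bytes needed to send the same fields as raw int + double pairs.
    static int dataStreamSize(int count) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(bytes)) {
                for (int i = 0; i < count; i++) {
                    out.writeInt(i);          // 4 bytes
                    out.writeDouble(i * 0.5); // 8 bytes
                }
            }
            return bytes.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        for (int count : new int[] {1, 10, 100}) {
            System.out.println(count + " objects: serialization="
                    + objectStreamSize(count) + " bytes, raw data="
                    + dataStreamSize(count) + " bytes");
        }
    }
}
```

    The class descriptor is written once per stream, so the bytes per object
    drop as the count grows; for a single object the descriptor dominates.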

    tom

    --
    Now I am thoroughly confused. -- Colin Brace sums up RT3090 support
    in Linux
     
    Tom Anderson, Oct 3, 2011
    #8
  9. Roedy Green (Guest)

    On Sat, 01 Oct 2011 14:35:17 -0500, jebblue <> wrote, quoted or
    indirectly quoted someone who said :

    >JSON is convenient for JavaScript heads, it is not human readable,

    JSON reads much like Java source code. I find it easier to
    understand than XML even though I know XML much better.

    You might have been looking at some compressed JSON, or encrypted SSL
    traffic. XML would be just as inscrutable if you compressed it. It
    compresses well. (This is not a compliment).
    --
    Roedy Green Canadian Mind Products
    http://mindprod.com
    It should not be considered an error when the user starts something
    already started or stops something already stopped. This applies
    to browsers, services, editors... It is inexcusable to
    punish the user by requiring some elaborate sequence to atone,
    e.g. open the task editor, find and kill some processes.
     
    Roedy Green, Oct 3, 2011
    #9
  10. Roedy Green (Guest)

    On Mon, 3 Oct 2011 19:24:20 +0100, Tom Anderson <>
    wrote, quoted or indirectly quoted someone who said :

    >Specifically, i believe
    >that: (a) developing an XML-based transfer format using JAXB will take
    >considerably more effort than using standard serialization


    Serialisation handles complex data structures, even loops. XML is
    limited to trees.
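
    A small sketch of the "even loops" part, using a made-up Node class:
    two nodes that point at each other survive a round trip through
    ObjectOutputStream/ObjectInputStream with the cycle intact.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical graph node, made up for this example.
class Node implements Serializable {
    private static final long serialVersionUID = 1L;
    final String name;
    Node next;

    Node(String name) {
        this.name = name;
    }
}

class CycleDemo {
    // Serializes the graph reachable from `root` and reads it back as a copy.
    static Node roundTrip(Node root) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(root);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Node) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Node a = new Node("a");
        Node b = new Node("b");
        a.next = b;
        b.next = a; // close the loop

        Node copy = roundTrip(a);
        // The copy is a new object graph whose cycle points back to itself.
        System.out.println(copy != a && copy.next.next == copy);
    }
}
```

    A tree-shaped format would need explicit ids and references to express
    the same structure.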

    Serialisation handles any imaginable data type without extra work. XML
    requires inventing an external character representation and a way of
    converting to chars and back.

    Serialisation is hard to upgrade. XML is easy. Serialisation pretty
    much requires everyone to stay in sync with identical software. XML
    allows clients with out of date software, software in other languages,
    or even no software at all.
    --
    Roedy Green Canadian Mind Products
    http://mindprod.com
     
    Roedy Green, Oct 4, 2011
    #10
  11. Roedy Green (Guest)

    On Sat, 01 Oct 2011 09:48:50 -0700, markspace <-@.> wrote, quoted or
    indirectly quoted someone who said :

    > The increase in CPU costs and network bandwidth it
    >requires is very cheap.


    I did a system for monitoring security cameras. The boss said
    efficiency in transport was the #1 priority because it limited how
    many cameras could be monitored at a remote site.

    I did it by defining a number of binary records and writing a method
    to read/write each type with DataInputStream/DataOutputStream. It is
    conceptually simple -- COBOL think -- and had almost no overhead. I could
    have written a program to generate the Java code to read and write each
    record given a data description, but the formats were stable enough I never
    bothered. There were heartbeat packets in times of no traffic to let
    each side know whether the other was still alive.
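
    Roughly what such a scheme can look like (a sketch only; the record
    types, field layout and names below are invented, not the ones from
    that system): each record starts with a one-byte type tag followed by
    fixed fields, and a heartbeat is just the smallest record type.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical record layout, invented for illustration.
class RecordCodec {
    static final byte HEARTBEAT = 0;    // tag + long timestamp = 9 bytes
    static final byte MOTION_EVENT = 1; // tag + int camera id + long timestamp = 13 bytes

    static byte[] heartbeat(long timestampMillis) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeByte(HEARTBEAT);
            out.writeLong(timestampMillis);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    static byte[] motionEvent(int cameraId, long timestampMillis) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeByte(MOTION_EVENT);
            out.writeInt(cameraId);
            out.writeLong(timestampMillis);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reads one record and returns a human-readable description of it.
    static String decode(byte[] record) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(record))) {
            byte type = in.readByte();
            switch (type) {
                case HEARTBEAT:
                    return "heartbeat at " + in.readLong();
                case MOTION_EVENT:
                    int cameraId = in.readInt();
                    long timestamp = in.readLong();
                    return "motion on camera " + cameraId + " at " + timestamp;
                default:
                    throw new IllegalArgumentException("unknown record type " + type);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(decode(heartbeat(42L)));
        System.out.println(decode(motionEvent(7, 1000L)));
    }
}
```

    Over a real connection the DataOutputStream would wrap the socket's
    output stream directly (ideally with a BufferedOutputStream in between)
    instead of a byte array.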

    --
    Roedy Green Canadian Mind Products
    http://mindprod.com
     
    Roedy Green, Oct 4, 2011
    #11
  12. Lew (Guest)

    Tom Anderson wrote:
    > Lew wrote:
    >> Use JAXB to generate the classes used to process that schema and
    >> incorporate those classes into the protocol at both ends.
    >>
    >> Fast, standard and fairly low effort and low maintenance, assuming you
    >> have version control and continuous integration (CI).
    >>
    >> By "fast" I mean both to develop and to operate.

    >
    > Interesting. I do not believe this to be true. Specifically, i believe
    > that: (a) developing an XML-based transfer format using JAXB will take
    > considerably more effort than using standard serialization, or an equally
    > convenient library such as JBoss Serialization, although still not a large
    > amount of effort, certainly; (b) the data will be larger than
    > with standard serialization (because the "object graph overhead" is not
    > actually that large, and XML is much less space-efficient than
    > serialization's binary format); and (c) the speed of operation, even
    > assuming an infinitely fast network, will be lower.
    >
    > One get-out clause: for very short streams (one or a few objects), XML
    > might beat standard serialization for space and speed. Standard
    > serialization does have some per-class overhead, which is
    > disproportionately expensive for short streams.


    Well, I haven't measured, but let's do a little gedankenexperiment.

    Fast to develop - serialization is actually tricky to do right. You can use the absolute defaults, but the world is littered with projects that had maintenance issues because serialization was done simple-mindedly. /Effective Java/ devotes an entire chapter to the topic. JAXB solutions, and I've made several, are very straightforward. Most of the effort goes into schema design, which is parallel to modeling so not even an overhead. I do think JAXB wins, but on balance assess that a competent programmer could do either one well with more-or-less similar effort.

    Fast to perform - XML is fast enough. Compressed, its bandwidth is not egregious. Overall I/O considerations should dominate, but I'll take a slight loss for the safety benefits of JAXB.

    --
    Lew
     
    Lew, Oct 4, 2011
    #12
  13. On 10/2/2011 5:07 AM, Robert Klemme wrote:
    > On 01.10.2011 21:35, jebblue wrote:
    >> On Sat, 01 Oct 2011 21:13:40 +0200, Robert Klemme wrote:
    >>
    >>> On 10/01/2011 06:19 PM, Lew wrote:
    >>>> Giovanni Azua wrote:
    >>>>> I have this lite Client-Server framework based on Blocking IO using
    >>>>> classic java.net.* Sockets (must develop it myself for a grad course
    >>>>> project). The way I am using to pass data over the Sockets is via
    >>>>> Serialization i.e. ObjectOutputStream#writeObject(...) and
    >>>>> ObjectInputStream#readObject(...) I was wondering if anyone can
    >>>>> recommend a Serialization framework that would outperform the vanilla
    >>>>> Java default Serialization?
    >>>>>
    >>>> But you will be transmitting data via a format that omits the object
    >>>> graph overhead and focuses on just the data to share. The object-graph
    >>>> knowledge is coded into the application and need not be transferred.
    >>>>
    >>>> XML is awesome for this kind of task.
    >>>
    >>> http://www.json.org/ might also be a good alternative which - depending
    >>> on format etc. - can be less verbose. See http://json.org/example.html
    >>>

    >>
    >> JSON is convenient for JavaScript heads, it is not human readable,
    >> this is one reason why XML exists in the first place.

    >
    > I am not sure why you say JSON is not human readable while XML is.
    > Remember: for network transfer you would use the most compressed format
    > of either which means that for XML you would not have line breaks and
    > indentation. I'd say an XML on one line with a reasonable complex
    > structure is not human readable.


    That matters for a WAN.

    On a LAN I doubt that the difference between JSON and XML
    will be noticeable (especially if it is gzipped on the wire).

    But the internet is usually a lot slower than gigabit inside the
    data center.

    >> JSON was
    >> a mistake, instead of coming up with an arcane hacked syntax
    >> to replace XML; JavaScript should have been improved to handle
    >> XML.

    >
    > That sounds like opinion to me. Can you provide any real arguments why
    > XML should be chosen for as a data transfer format over JSON?


    Unless limited bandwidth or ease of use in JavaScript is important,
    XML does seem better.

    XML schemas, namespaces etc. make it a lot more type safe
    and reusable among apps.

    But limited bandwidth and ease of use in JavaScript apply
    to all modern web apps and all smartphone apps, so that is
    a very big chunk of development today.

    Arne
     
    Arne Vajhøj, Nov 7, 2011
    #13
  14. On 10/1/2011 3:35 PM, jebblue wrote:
    > On Sat, 01 Oct 2011 21:13:40 +0200, Robert Klemme wrote:
    >
    >> On 10/01/2011 06:19 PM, Lew wrote:
    >>> Giovanni Azua wrote:
    >>>> I have this lite Client-Server framework based on Blocking IO using
    >>>> classic java.net.* Sockets (must develop it myself for a grad course
    >>>> project). The way I am using to pass data over the Sockets is via
    >>>> Serialization i.e. ObjectOutputStream#writeObject(...) and
    >>>> ObjectInputStream#readObject(...) I was wondering if anyone can
    >>>> recommend a Serialization framework that would outperform the vanilla
    >>>> Java default Serialization?
    >>>>
    >>> But you will be transmitting data via a format that omits the object
    >>> graph overhead and focuses on just the data to share. The object-graph
    >>> knowledge is coded into the application and need not be transferred.
    >>>
    >>> XML is awesome for this kind of task.

    >>
    >> http://www.json.org/ might also be a good alternative which - depending
    >> on format etc. - can be less verbose. See http://json.org/example.html
    >>

    >
    > JSON is convenient for JavaScript heads, it is not human readable,
    > this is one reason why XML exists in the first place.


    JSON is often more readable than XML for humans. You have the name
    and the value, and that is what you need in most cases. XML with
    heavy use of namespaces provides a lot of information for the
    programs reading it, but for the human mind all that stuff is
    more of a distraction from the essentials.

    Arne
     
    Arne Vajhøj, Nov 7, 2011
    #14