Re: a new java to c socket question

Discussion in 'Java' started by Steve Horsley, Jul 21, 2003.

  1. On Mon, 21 Jul 2003 11:14:15 -0500, Brad Chambers wrote:

    > I'm working with code that passes a structure over a socket connection
    > between a c client and server. I have an application where I need the
    > client-side to be implemented in java. Is there an easy way to create
    > and send a structure that will require no modifications to my server? I
    > haven't worked much with java, especially sockets, and am just beginning
    > to explore this option. Basically all the structure contains are two
    > fields, one representing a type (to determine whether the command is
    > even valid, and then to determine an appropriate action) and one with
    > some parameters for the command.
    >
    > Thanks,
    >
    > Brad


    Start by defining the protocol on the wire - byte by byte. This involves
    specifying the size of each field and the order of bytes. If the C
    programmer was sloppy and simply sent a memory image of the structure then
    watch out for extra alignment padding which may confuse you. You may need
    a protocol analyser to help reverse engineer this - Ethereal is a good
    free one. Also, the character set of any strings should be specified.
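
    To illustrate the padding point, here is a minimal sketch. It assumes a
    hypothetical C struct, struct cmd { char type; int param; }, sent as a raw
    memory image from a little-endian host whose compiler aligns int to 4
    bytes. The class and method names are invented for illustration, and the
    real layout should be confirmed with a capture.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

public class PaddedStruct {
    // Assumed layout: 1 byte type, 3 bytes padding, 4 bytes little-endian param.
    static byte[] encode(byte type, int param) {
        ByteBuffer bb = ByteBuffer.allocate(8).order(ByteOrder.LITTLE_ENDIAN);
        bb.put(type);
        bb.position(4);          // skip the three padding bytes (left as zero)
        bb.putInt(param);
        return bb.array();
    }

    public static void main(String[] args) {
        // The padding bytes appear as zeros between the type and the param.
        System.out.println(Arrays.toString(encode((byte) 1, 0x0A0B0C0D)));
    }
}
```

    If the server had sent the fields one at a time instead of the whole
    struct, the message would be 5 bytes rather than 8, which is why the wire
    format has to be pinned down first.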

    Building the message in java really consists of converting the data you
    want to transfer into a byte[] and then sending it. Reverse for receiving.
    Expect to do a lot of shift and mask work if you're dealing with int
    values.
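
    The shift and mask work might look like the following sketch. It assumes
    a hypothetical message of two 4-byte fields (type, then parameter) in
    network byte order; the helper names are illustrative only.

```java
public class WirePacking {
    // Store a 32-bit int as 4 bytes, most significant byte first.
    static void putIntBE(byte[] buf, int off, int v) {
        buf[off]     = (byte) (v >>> 24);
        buf[off + 1] = (byte) (v >>> 16);
        buf[off + 2] = (byte) (v >>> 8);
        buf[off + 3] = (byte) v;
    }

    // Rebuild the int; each byte is masked with 0xFF because Java bytes are signed.
    static int getIntBE(byte[] buf, int off) {
        return ((buf[off]     & 0xFF) << 24)
             | ((buf[off + 1] & 0xFF) << 16)
             | ((buf[off + 2] & 0xFF) << 8)
             |  (buf[off + 3] & 0xFF);
    }

    public static void main(String[] args) {
        byte[] msg = new byte[8];      // assumed layout: 4-byte type, 4-byte parameter
        putIntBE(msg, 0, 7);
        putIntBE(msg, 4, 0xCAFEBABE);
        System.out.println(getIntBE(msg, 0));
        System.out.println(Integer.toHexString(getIntBE(msg, 4)));
    }
}
```

    Without the 0xFF mask, any byte of 0x80 or above would sign-extend and
    corrupt the higher bits of the result.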

    For strings, convert to bytes with String.getBytes(encoding), where
    encoding is the name of the character set you know the C code is using.
    E.g. byte[] outBytes = myString.getBytes("ISO-8859-1"); Beware padding (and
    maybe null termination) requirements in the wire protocol. For decoding
    Strings, use the String constructor that takes the encoding name too.
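
    As a sketch of the padding and null-termination point, assume the C side
    expects a fixed-width char field (for example char params[16]) holding an
    ISO-8859-1 string that is always NUL-terminated. The field width and the
    class name are assumptions for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class FixedField {
    // Encode s into exactly `width` bytes, zero-padded, keeping at least one NUL at the end.
    static byte[] encode(String s, int width) {
        byte[] raw = s.getBytes(StandardCharsets.ISO_8859_1);
        byte[] field = new byte[width];             // new arrays are zero-filled
        int n = Math.min(raw.length, width - 1);    // truncate so the last byte stays NUL
        System.arraycopy(raw, 0, field, 0, n);
        return field;
    }

    // Decode up to the first NUL, or the whole field if there is none.
    static String decode(byte[] field) {
        int n = 0;
        while (n < field.length && field[n] != 0) {
            n++;
        }
        return new String(field, 0, n, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        byte[] field = encode("reset", 8);
        System.out.println(Arrays.toString(field));
        System.out.println(decode(field));
    }
}
```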

    You might find DataOutputStream useful because it can write short and int
    values in one call, but beware of the byte ordering issues. It is very
    unlikely that writeUTF will be useful for sending Strings.
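
    A minimal sketch of the byte ordering issue, assuming the structure is two
    32-bit ints (type and parameter). DataOutputStream always writes
    big-endian, so for a little-endian server each value is swapped first.
    The class name is illustrative, and a ByteArrayOutputStream stands in for
    the socket's output stream.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

public class StructWriter {
    // Write two ints; swap the bytes of each when the receiver is little-endian.
    static byte[] write(int type, int param, boolean littleEndian) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(littleEndian ? Integer.reverseBytes(type) : type);
            out.writeInt(littleEndian ? Integer.reverseBytes(param) : param);
        } catch (IOException e) {
            throw new UncheckedIOException(e);   // not expected for an in-memory stream
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(write(1, 2, false)));
        System.out.println(Arrays.toString(write(1, 2, true)));
    }
}
```

    In a real client the DataOutputStream would wrap socket.getOutputStream(),
    ideally through a BufferedOutputStream so the fields go out in one packet.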

    Steve
    Steve Horsley, Jul 21, 2003
    #1

  2. Steve Horsley

    Jon A. Cruz Guest

    Steve Horsley wrote:
    > For strings, convert to bytes with String.getBytes(encoding), where
    > encoding is the name of the character set you know the C code is using.
    > E.g. byte[] outBytes = myString.getBytes("ISO-8859-1");


    Remember to check the range from $80 through $9F. For Windows machines,
    this is where they will fail, and where "Cp1252" needs to be used instead.

    Of course, if you have any say at all, the strings should be sent in UTF-8.
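
    The difference in the $80 through $9F range can be seen with the euro
    sign, which Cp1252 places at 0x80 and ISO-8859-1 cannot represent at all.
    This is a sketch only; it assumes the JRE ships the windows-1252 charset
    (Java's canonical name for Cp1252), and the class name is illustrative.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class Latin1VsCp1252 {
    // First encoded byte of s in the given charset, as an unsigned value.
    static int firstByte(String s, Charset cs) {
        return s.getBytes(cs)[0] & 0xFF;
    }

    public static void main(String[] args) {
        String euro = "\u20AC";
        Charset cp1252 = Charset.forName("windows-1252");

        // Cp1252 maps the euro sign to a single byte in the 0x80-0x9F range.
        System.out.println(Integer.toHexString(firstByte(euro, cp1252)));
        // ISO-8859-1 has no euro sign, so getBytes substitutes '?' (0x3f).
        System.out.println(Integer.toHexString(firstByte(euro, StandardCharsets.ISO_8859_1)));
        // UTF-8 can represent it, at the cost of three bytes.
        System.out.println(euro.getBytes(StandardCharsets.UTF_8).length);
    }
}
```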
    Jon A. Cruz, Jul 22, 2003
    #2

  3. Steve Horsley

    Jon A. Cruz Guest

    Nigel Wade wrote:
    >
    > Yuk! It's time to move away from those outdated "C" algorithms,
    > ByteBuffers are the way to go. Create a byte[] of the correct size, wrap it
    > in a ByteBuffer, set the endianness to match what's required and then just
    > put variables to it. Do the reverse for reading.


    But with many formats, it's hard to tell exact size ahead of time. It
    also complicates code.



    > For char's it always
    > reads/writes 2 bytes so you still need to convert those to a format that C
    > will understand and use a put() rather than putChar() method.


    No, it's not.

    It's only always 2 bytes if you use UCS-2. If you use UTF-16 or UTF-8 or
    UTF-32, it can differ.


    In fact, for many applications UTF-8 is best.


    >
    > Much simpler than masking and shifting; let the ByteBuffer do the messy
    > stuff.


    Ahhh.... however

    >
    > Of course, you do need 1.4 to get ByteBuffer.
    >



    Which is the big gotcha. Even Mac OS X only has it if you paid to
    upgrade to Jaguar. No ByteBuffer for 10.1.x users.

    :-(
    Jon A. Cruz, Jul 23, 2003
    #3
  4. Steve Horsley

    Nigel Wade Guest

    Jon A. Cruz wrote:

    > Nigel Wade wrote:
    >>
    >> Yuk! It's time to move away from those outdated "C" algorithms,
    >> ByteBuffers are the way to go. Create a byte[] of the correct size, wrap
    >> it in a ByteBuffer, set the endianness to match what's required and then
    >> just put variables to it. Do the reverse for reading.

    >
    > But with many formats, it's hard to tell exact size ahead of time. It
    > also complicates code.


    You don't need to know ahead of time any size. You create the byte[] on the
    fly according to how much you need to read/write.

    Also, I find it simplifies the code. Are you saying that

    byte[] buffer = new byte[length];
    dataInputStream.readFully( buffer );

    ByteBuffer bb = ByteBuffer.wrap(buffer);
    bb.order( ByteOrder.LITTLE_ENDIAN );

    int i = bb.getInt();
    float f = bb.getFloat();

    is more complicated than doing all the byte swapping and shifting yourself?


    >
    >
    >
    > > For char's it always
    >> reads/writes 2 bytes so you still need to convert those to a format that
    >> C will understand and use a put() rather than putChar() method.

    >
    > No, it's not.
    >
    > It's only always 2 bytes if you use UCS-2. If you use UTF-16 or UTF-8 or
    > UTF-32, it can differ.


    I don't think so. ByteBuffer has no way of knowing what type of character
    coding you have used.

    The documentation for ByteBuffer.getChar() quite explicitly states it reads
    2 bytes, and increments the counter by 2. That's why I said to use
    get()/put() rather than getChar()/putChar() so you control how many bytes
    get read/written.
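
    The difference between putChar() and put() can be checked directly. A
    small sketch, with an illustrative class name, assuming the C side wants
    one byte per character:

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class CharVsBytes {
    // putChar() always writes a 2-byte Java char, whatever the character is.
    static int bytesUsedByPutChar(char c) {
        ByteBuffer bb = ByteBuffer.allocate(16);
        bb.putChar(c);
        return bb.position();
    }

    // put(byte[]) writes exactly the bytes the chosen encoding produced.
    static int bytesUsedByPut(String s) {
        ByteBuffer bb = ByteBuffer.allocate(16);
        bb.put(s.getBytes(StandardCharsets.US_ASCII));
        return bb.position();
    }

    public static void main(String[] args) {
        System.out.println(bytesUsedByPutChar('A'));   // 2
        System.out.println(bytesUsedByPut("A"));       // 1
    }
}
```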

    >
    >
    > In fact, for many applications UTF-8 is best.
    >
    >
    >>
    >> Much simpler than masking and shifting; let the ByteBuffer do the messy
    >> stuff.

    >
    > Ahhh.... however
    >
    >>
    >> Of course, you do need 1.4 to get ByteBuffer.
    >>

    >
    >
    > Which is the big gotcha. Even Mac OS X only has it if you paid to
    > upgrade to Jaguar. No ByteBuffer for 10.1.x users.
    >
    > :-(


    One day Mac will join the C21st. Even sgi have 1.4 for IRIX. ;-)

    --
    Nigel Wade, System Administrator, Space Plasma Physics Group,
    University of Leicester, Leicester, LE1 7RH, UK
    E-mail :
    Phone : +44 (0)116 2523548, Fax : +44 (0)116 2523555
    Nigel Wade, Jul 23, 2003
    #4
  5. Steve Horsley

    Jon A. Cruz Guest

    Nigel Wade wrote:
    > Jon A. Cruz wrote:
    >
    >
    >>Nigel Wade wrote:
    >>
    >>>Yuk! It's time to move away from those outdated "C" algorithms,
    >>>ByteBuffers are the way to go. Create a byte[] of the correct size, wrap
    >>>it in a ByteBuffer, set the endianness to match what's required and then
    >>>just put variables to it. Do the reverse for reading.

    >>
    >>But with many formats, it's hard to tell exact size ahead of time. It
    >>also complicates code.

    >
    >
    > You don't need to know ahead of time any size. You create the byte[] on the
    > fly according to how much you need to read/write.
    >
    > Also, I find it simplifies the code. Are you saying that
    >
    > byte[] buffer = new byte[length];
    > dataInputStream.readFully( buffer );
    >
    > ByteBuffer bb = ByteBuffer.wrap(buffer);
    > bb.order( ByteOrder.LITTLE_ENDIAN );
    >
    > int i = bb.getInt();
    > float f = bb.getFloat();
    >
    > is more complicated than doing all the byte swapping and shifting yourself?


    Yes, as 'length' doesn't exist for most of the stuff I'm talking about.

    Instead, DataInputStream or the LE equivalent is easier and safer.


    int i = dataInputStream.readInt();
    float f = dataInputStream.readFloat();




    Writing, though, is where it's harder.




    >
    > I don't think so. ByteBuffer has no way of knowing what type of character
    > coding you have used.


    Exactly. Which makes it inappropriate for this type of operation.


    >
    > The documentation for ByteBuffer.getChar() quite explicitly states it reads
    > 2 bytes, and increments the counter by 2. That's why I said to use
    > get()/put() rather than getChar()/putChar() so you control how many bytes
    > get read/written.


    Yes... but that's

    A) inefficient for many protocol implementations.

    B) incompatible with most existing protocols.
    Jon A. Cruz, Jul 24, 2003
    #5
  6. Steve Horsley

    Nigel Wade Guest

    Jon A. Cruz wrote:

    > Nigel Wade wrote:
    >> Jon A. Cruz wrote:
    >>
    >>
    >>>Nigel Wade wrote:
    >>>
    >>>>Yuk! It's time to move away from those outdated "C" algorithms,
    >>>>ByteBuffers are the way to go. Create a byte[] of the correct size, wrap
    >>>>it in a ByteBuffer, set the endianness to match what's required and then
    >>>>just put variables to it. Do the reverse for reading.
    >>>
    >>>But with many formats, it's hard to tell exact size ahead of time. It
    >>>also complicates code.

    >>
    >>
    >> You don't need to know ahead of time any size. You create the byte[] on
    >> the fly according to how much you need to read/write.
    >>
    >> Also, I find it simplifies the code. Are you saying that
    >>
    >> byte[] buffer = new byte[length];
    >> dataInputStream.readFully( buffer );
    >>
    >> ByteBuffer bb = ByteBuffer.wrap(buffer);
    >> bb.order( ByteOrder.LITTLE_ENDIAN );
    >>
    >> int i = bb.getInt();
    >> float f = bb.getFloat();
    >>
    >> is more complicated than doing all the byte swapping and shifting
    >> yourself?

    >
    > Yes, as 'length' doesn't exist for most of the stuff I'm talking about.


    If you don't know how much you have to read/write, how do you know how much
    to read/write?

    >
    > Instead, DataInputStream or the LE equivalent is easier and safer.


    Easier and safer how?

    >
    >
    > int i = dataInputStream.readInt();
    > float f = dataInputStream.readFloat();


    After which you have to start messing with shifting and masking and all
    that (I notice you didn't include that code)...

    Alternatively:

    byte[] buffer = new byte[8];
    ByteBuffer bb = ByteBuffer.wrap(buffer);
    bb.order( ByteOrder.LITTLE_ENDIAN );

    dataInputStream.readFully( buffer);
    bb.getInt();
    bb.getFloat();

    no nasty messing with shifts and masks. Of course you do have to know how
    big an int and float is, but if you don't know that you shouldn't be
    messing around sending data over the wire...

    >
    > Writing, though, is where it's harder.
    >
    >
    >
    >
    >>
    >> I don't think so. ByteBuffer has no way of knowing what type of character
    >> coding you have used.

    >
    > Exactly. Which makes it inappropriate for this type of operation.


    It can be used without any problem, you just have to create a byte buffer of
    the characters first (String will do that for you) then send it. It's
    perfectly appropriate, you just have to open your mind to the
    possibility...
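
    A sketch of the writing side with ByteBuffer, assuming a hypothetical
    little-endian message of an int type, an int byte count, and then that
    many bytes of UTF-8 text. The layout and the class name are assumptions,
    not the poster's protocol.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

public class CommandEncoder {
    // Build the message: type, text length in bytes, then the text itself.
    static byte[] encode(int type, String params) {
        byte[] text = params.getBytes(StandardCharsets.UTF_8);   // String does the encoding
        ByteBuffer bb = ByteBuffer.allocate(8 + text.length);
        bb.order(ByteOrder.LITTLE_ENDIAN);
        bb.putInt(type);
        bb.putInt(text.length);
        bb.put(text);            // raw bytes, not putChar()
        return bb.array();
    }

    // Read the text back out of an encoded message.
    static String decodeParams(byte[] msg) {
        ByteBuffer bb = ByteBuffer.wrap(msg).order(ByteOrder.LITTLE_ENDIAN);
        bb.getInt();                              // skip the type field
        byte[] text = new byte[bb.getInt()];
        bb.get(text);
        return new String(text, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] msg = encode(3, "go");
        System.out.println(msg.length);
        System.out.println(decodeParams(msg));
    }
}
```

    The resulting array can be handed to the socket's OutputStream with a
    single write() call.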

    >
    >
    >>
    >> The documentation for ByteBuffer.getChar() quite explicitly states it
    >> reads 2 bytes, and increments the counter by 2. That's why I said to use
    >> get()/put() rather than getChar()/putChar() so you control how many bytes
    >> get read/written.

    >
    > Yes... but that's
    >
    > A) inefficient for many protocol implementations.
    >
    > B) incompatible with most existing protocols.


    No, just don't try to use putChar()/getChar(). I'm getting a feeling of deja
    vu here.

    --
    Nigel Wade, System Administrator, Space Plasma Physics Group,
    University of Leicester, Leicester, LE1 7RH, UK
    E-mail :
    Phone : +44 (0)116 2523548, Fax : +44 (0)116 2523555
    Nigel Wade, Jul 24, 2003
    #6
  7. Steve Horsley

    Jon A. Cruz Guest

    Nigel Wade wrote:
    > If you don't know how much you have to read/write, how do you know how much
    > to read/write?


    You check as you enter each chunk or structure.


    >>Instead, DataInputStream or the LE equivalent is easier and safer.

    >
    >
    > Easier and safer how?


    You don't need to make an intermediate byte array.

    >>int i = dataInputStream.readInt();
    >>float f = dataInputStream.readFloat();

    >
    >
    > After which you have to start messing with shifting and masking and all that (I
    > notice you didn't include that code)...


    Here's a more detailed example snippet.

    int count = in.readInt();
    for ( int i = 0; i < count; i++ )
    {
        int type = in.readInt();
        switch ( type )
        {
            case ITEM_TYPE_NAMELIST:
            {
                int numPeople = in.readInt();
                Vector people = new Vector();
                // the inner loop needs its own index variable
                for ( int j = 0; j < numPeople; j++ )
                {
                    int len = in.readInt();            // length prefix, in bytes
                    byte[] nameBuf = new byte[ len ];
                    in.readFully( nameBuf );
                    String name = new String( nameBuf, "UTF-8" );
                    people.add( name );
                }
                obj.setNames( people );
            }
            break;

            case ITEM_TYPE_PRIORITY:
            {
                int priority = in.readInt();
                obj.setPriority( priority );
            }
            break;
            ....


    Note that DataInputStream does most all the shift and masking that
    ByteBuffer gives you. Perhaps that was part of what you weren't getting.


    > It can be used without any problem, you just have to create a byte buffer of
    > the characters first (String will do that for you) then send it. It's
    > perfectly appropriate, you just have to open your mind to the
    > possibility...


    Yes, but also using an output stream and/or dataoutputstream directly
    might give you the same abilities without the extra overhead.



    >
    >
    > No, just don't try to use putChar()/getChar(). I'm getting a feeling of deja
    > vu here.
    >


    Yes. Then again there's not a real benefit to using ByteBuffer over
    DataOutputStream.
    Jon A. Cruz, Jul 25, 2003
    #7
  8. Steve Horsley

    Nigel Wade Guest

    Jon A. Cruz wrote:

    > Nigel Wade wrote:


    >
    >>>int i = dataInputStream.readInt();
    >>>float f = dataInputStream.readFloat();

    >>
    >>
    >> After which you have to start messing with shifting and masking and all that
    >> (I notice you didn't include that code)...

    >
    > Here's a more detailed example snippet.
    >
    > int count = in.readInt();
    > for ( int i = 0; i < count; i++ )
    > {
    > int type = in.readInt();
    > switch ( type )
    > {
    > case ITEM_TYPE_NAMELIST:
    > {
    > int numPeople = in.readInt();
    > Vector people = new Vector();
    > for ( int j = 0; j < numPeople; j++ )
    > {
    > int len = in.readInt();
    > byte[] nameBuf = new byte[ len ];
    > in.readFully( nameBuf );
    > String name = new String( nameBuf, "UTF-8" );
    > people.add( name );
    > }
    > obj.setNames( people );
    > }
    > break;
    >
    > case ITEM_TYPE_PRIORITY:
    > {
    > int priority = in.readInt();
    > obj.setPriority( priority );
    > }
    > break;
    > ...
    >
    >
    > Note that DataInputStream does most all the shift and masking that
    > ByteBuffer gives you. Perhaps that was part of what you weren't getting.


    I see there's no shifting and masking involved there. I assumed in your
    original post when you mentioned shifting and masking for ints that you
    were referring to data of either endianness, as masking and shifting
    shouldn't be necessary for network byte order data.

    It's the byte swapping feature of ByteBuffer I use to read little endian
    data. That isn't in DataInput.
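
    For comparison, here is a sketch of reading one little-endian int both
    ways. DataInputStream has no byte order setting, but the value it returns
    can be swapped with Integer.reverseBytes (added in Java 5, so not
    available at the time of this thread). The class name is illustrative.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class LittleEndianRead {
    // DataInputStream reads big-endian, so swap the bytes afterwards.
    static int viaStream(byte[] wire) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(wire))) {
            return Integer.reverseBytes(in.readInt());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // ByteBuffer does the swap itself once the order is set.
    static int viaBuffer(byte[] wire) {
        return ByteBuffer.wrap(wire).order(ByteOrder.LITTLE_ENDIAN).getInt();
    }

    public static void main(String[] args) {
        byte[] wire = {0x78, 0x56, 0x34, 0x12};   // 0x12345678 as sent by a little-endian host
        System.out.println(Integer.toHexString(viaStream(wire)));
        System.out.println(Integer.toHexString(viaBuffer(wire)));
    }
}
```

    A little-endian float can be handled the same way on the stream side with
    Float.intBitsToFloat(Integer.reverseBytes(in.readInt())).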

    I misunderstood your meaning.

    --
    Nigel Wade, System Administrator, Space Plasma Physics Group,
    University of Leicester, Leicester, LE1 7RH, UK
    E-mail :
    Phone : +44 (0)116 2523548, Fax : +44 (0)116 2523555
    Nigel Wade, Jul 25, 2003
    #8
  9. Steve Horsley

    Jon A. Cruz Guest

    Nigel Wade wrote:
    >
    > I see there's no shifting and masking involved there. I assumed in your
    > original post when you mentioned shifting and masking for ints that you
    > were referring to data of either endianness, as masking and shifting
    > shouldn't be necessary for network byte order data.
    >
    > It's the byte swapping feature of ByteBuffer I use to read little endian
    > data. That isn't in DataInput.
    >
    > I misunderstood your meaning.
    >


    Not really.

    Just that I tend to do that sometimes, and sometimes not. Mainly just
    when it's appropriate. Since DataInputStream does that for
    network-byte-order normal data, it's often good to use.


    Other times, I'll have either a stream or a serializer/deserializer that
    does the explicit shifting internally.
    Jon A. Cruz, Jul 26, 2003
    #9
