C to Java socket input question

Richard

Level: Java newbie, C experienced
Platform: Linux and Win32, Intel

Another programmer and I are working on a small project together.
He's writing a server process in Java that accepts input from
processes I've written over a TCP connection. My processes are all
written in C; his are all done in Java. He's new to Java, and I've
never really used it.

My input is basically a stream of 32-bit unsigned integers (e.g., the
low-level form of IP addresses) along with the occasional
fixed-length char array.

We're starting to spin our wheels about how to present input to his
server, since there's no native 32-bit unsigned int in Java (so far
as I understand it). He's game, but is dealing with learning Java,
and we're at the point where we have to get past this issue and on to
other parts of what we're building. I'm not getting a very clear
picture of what the Java-side problem is, and I'm at the point where
I'll take on the obligation of learning what the problem is, and
doing any necessary data conversion, if that's what it takes to get
past this.

Any references to, or discussion of, this kind of IO problem would be
greatly appreciated.
 
David Zimmerman

Richard said:
My input is basically a stream of 32-bit unsigned integers (e.g., the
low-level form of IP addresses) along with the occasional
fixed-length char array.

Treat it as a 4 byte array. InetAddress.getByAddress() wants a byte[]
anyway.

Or put it in a signed 32-bit integer and never do arithmetic with it;
the bits are the same regardless of how they're interpreted.

Either way, mind the endianness of the data. The best idea here would be to
get the C program to put everything in network order, the same as Java's.
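
A minimal sketch of both options, assuming the C side writes each address as four octets in network (big-endian) order. The class and method names are made up for illustration, and a ByteArrayInputStream stands in for the socket's input stream.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.net.InetAddress;

// Hypothetical helper class; not part of any library.
final class AddressReadSketch {
    // Option 1: read four octets and hand them to InetAddress.
    static InetAddress readAddress(DataInputStream in) throws IOException {
        byte[] octets = new byte[4];
        in.readFully(octets);                 // blocks until all 4 bytes have arrived
        return InetAddress.getByAddress(octets);
    }

    // Option 2: keep the same 32 bits in a signed int.
    // DataInputStream.readInt() always reads big-endian (network order).
    static int readBits(DataInputStream in) throws IOException {
        return in.readInt();
    }

    public static void main(String[] args) throws IOException {
        // Two copies of 192.168.1.1 as they would appear on the wire.
        byte[] wire = {(byte) 192, (byte) 168, 1, 1, (byte) 192, (byte) 168, 1, 1};
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(wire));
        System.out.println(readAddress(in).getHostAddress());   // 192.168.1.1
        System.out.println(Integer.toHexString(readBits(in)));  // c0a80101
    }
}
```

With a real connection, the DataInputStream would wrap socket.getInputStream() instead of the byte array.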
 
Richard

(e-mail address removed) wrote...
My input is basically a stream of 32-bit unsigned integers (e.g., the
low-level form of IP addresses) along with the occasional
fixed-length char array.

Treat it as a 4 byte array. InetAddress.getByAddress() wants a byte[]
anyway.

Or put it in a signed 32-bit integer and never do arithmetic with it;
the bits are the same regardless of how they're interpreted.

Sorry for being brain-dead, but will he be able to do arithmetic on
it if it's in a 4-byte array? As in: unsigned char fbarray[4] ; ?

Either way, mind the endianness of the data. The best idea here would be to
get the C program to put everything in network order, the same as Java's.

Already been bit by that one. :)
 
Thomas Gagné

Don't send binary data. Send it in text. Both C and Java understand text and
you don't have to worry about network byte ordering or whether your CPUs are
big or little-endian.

Or you could use some middleware to make your job easier.
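
For comparison, here is a rough sketch of what the text-based approach could look like on the Java side. It assumes (this is not stated in the thread) that the C program prints each value as decimal ASCII digits followed by a newline; the class and method names are illustrative only.

```java
import java.io.BufferedReader;
import java.io.EOFException;
import java.io.IOException;
import java.io.StringReader;

// Hypothetical helper class; the one-value-per-line format is an assumption.
final class TextLineSketch {
    // Parses one decimal value per line. A long is used because it holds
    // the full unsigned 32-bit range without going negative.
    static long readUnsigned32(BufferedReader in) throws IOException {
        String line = in.readLine();
        if (line == null) {
            throw new EOFException("peer closed the connection");
        }
        long value = Long.parseLong(line.trim());
        if (value < 0 || value > 0xFFFFFFFFL) {
            throw new IOException("value out of 32-bit unsigned range: " + line);
        }
        return value;
    }

    public static void main(String[] args) throws IOException {
        // A StringReader stands in for an InputStreamReader over the socket.
        BufferedReader in = new BufferedReader(new StringReader("3232235777\n4294967295\n"));
        System.out.println(readUnsigned32(in)); // 3232235777
        System.out.println(readUnsigned32(in)); // 4294967295
    }
}
```

Byte order stops being an issue, but the character encoding and the line terminator now have to be agreed on instead.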
 
Richard

(e-mail address removed) wrote...
Ummm.... Maybe. Or maybe not. :-D

32-bit is *one* form of an IP address. Another is 128-bit addresses in IPv6.

It's best that you account for both.

<Sound of hand slapping forehead>

Yes, that's true.

see java.net.InetAddress, java.net.Inet4Address and
java.net.Inet6Address for more.

Fixed length char array? Or string? Logically which do you have?

String of fixed length.

And what encoding? If you want to avoid many-to-many complexity
nightmares, it's probably best to use UTF-8.

Oui. Of course.

Hmmm..... so maybe fixed might not be as robust.

But an inet address is not really a 32-bit unsigned int.

??? Unless you're referring to IPV6,

in_addr_t => uint32_t (<netinet/in.h>) => "unsigned 32-bit int"

So you can just
use Java's int, which is 32-bit signed. No problems.

Hmmmm.... you could search these newsgroups in Google or some such for
any posts where I've touched on this (I do when it comes up).

a few off hand:
Message-ID: <[email protected]>
Message-ID: <[email protected]>
Message-ID: <[email protected]_SPAM.net>
Message-ID: <[email protected]>
Message-ID: <[email protected]>
Message-ID: <[email protected]>

Thank you.
 
Sudsy

Why not just read as a stream of bytes and use
InetAddress.getByAddress( byte[] )
to convert into a format which can be conveniently
used by various Java methods?
So while "there's no native 32-bit unsigned int in Java",
there IS this wonderful class called InetAddress which
encapsulates IP addresses.
Drop me a line if you need some sample code.
 
Richard

(e-mail address removed) wrote...
[Since this reply is all java I've removed comp.lang.c from the post]

It's not a particularly difficult problem in Java. The class DataInputStream
almost does it; it has methods readUnsignedByte() and readUnsignedShort() -
quite why they didn't add a readUnsignedInt() I don't know.

But you could easily extend DataInputStream and add a readUnsignedInt()
method which returns a long. All you have to do then is open a Socket and
wrap its InputStream with your extended DataInputStream. To read unsigned
ints you just invoke readUnsignedInt() on the extended DataInputStream.
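
A minimal sketch of that extension might look like the following. The class name UnsignedDataInputStream is made up here; it relies on DataInputStream.readInt() reading four bytes in big-endian order, so the sender is assumed to use network order.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical subclass; DataInputStream itself has no readUnsignedInt().
class UnsignedDataInputStream extends DataInputStream {
    public UnsignedDataInputStream(InputStream in) {
        super(in);
    }

    // Reads four big-endian bytes and returns them as a value in 0..4294967295.
    // The mask clears the sign extension that the int-to-long widening adds.
    public long readUnsignedInt() throws IOException {
        return readInt() & 0xFFFFFFFFL;
    }

    public static void main(String[] args) throws IOException {
        byte[] wire = {(byte) 0xFF, (byte) 0xFF, (byte) 0xFF, (byte) 0xFE};
        try (UnsignedDataInputStream in =
                 new UnsignedDataInputStream(new ByteArrayInputStream(wire))) {
            System.out.println(in.readUnsignedInt()); // 4294967294
        }
    }
}
```

For a real connection, the stream would be built as new UnsignedDataInputStream(socket.getInputStream()).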

If you want to use unsigned ints in Java, the WBEM Services project has
developed some classes. You can find out more at http://wbemservices.sourceforge.net/

Let me crack out the Java doc now that I've got a good place to look.
Thanks very much for the WBEM services link, too.
 
Steve Horsley

Well, you have already had some good responses, but let me add my own
opinion anyway...

Start by defining the protocol between the machines. Do this by defining
the byte stream in terms of bytes, not any higher structure, since TCP
does not inherently support any higher structure. If you MUST send larger
structures such as IP addresses (yes I know you must), define how they
will be sent byte-by-byte. E.g. an IPv4 address goes as 4 bytes, in
the following order: 1.2.3.4 is sent as 0x01, 0x02, 0x03, 0x04.
This is how the RFCs define their protocols.

Now each of you can go about assembling and disassembling these messages
in the way that seems most natural to you and your language. This may well
turn out to be that the C code mainly handles addresses as unsigned ints
but the java end handles them as byte[4]. You can ignore each other's
implementation and concentrate on meeting the network spec.

By the way, NEVER send structures from C like this:
write(sock, &myStruct, sizeof myStruct);
This puts whatever padding the compiler inserted on the wire, and leaves the
byte order of each field up to the sending machine. Always specify the byte
stream format, and work to that.

Steve
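
As an illustration of working to a byte-level spec, the Java end could assemble and disassemble a 32-bit field one octet at a time. This is only a sketch under the assumption that the spec says "most significant byte first"; the class and method names are hypothetical.

```java
import java.util.Arrays;

// Hypothetical codec for one 4-octet field, most significant byte first.
final class WireFormatSketch {
    // Splits a 32-bit value into the four octets the spec asks for.
    static byte[] encode(int value) {
        return new byte[] {
            (byte) (value >>> 24),
            (byte) (value >>> 16),
            (byte) (value >>> 8),
            (byte) value
        };
    }

    // Rebuilds the 32-bit value from four octets starting at offset.
    // Each byte is masked with 0xFF because Java bytes are signed.
    static int decode(byte[] wire, int offset) {
        return ((wire[offset] & 0xFF) << 24)
             | ((wire[offset + 1] & 0xFF) << 16)
             | ((wire[offset + 2] & 0xFF) << 8)
             |  (wire[offset + 3] & 0xFF);
    }

    public static void main(String[] args) {
        byte[] wire = encode(0x01020304);          // the address 1.2.3.4
        System.out.println(Arrays.toString(wire)); // [1, 2, 3, 4]
        System.out.println(Integer.toHexString(decode(wire, 0))); // 1020304
    }
}
```

Neither side needs to know how the other stores the value internally; only the four octets and their order are shared.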
 
Jordan Zimmerman

Send it as a String. You have a problem sitting there in C as well.
Different machine types have different long int encodings. It's much safer
to send a String.
 
Richard

(e-mail address removed) wrote...
Why not just read as a stream of bytes and use
InetAddress.getByAddress( byte[] )
to convert into a format which can be conveniently
used by various Java methods?
So while "there's no native 32-bit unsigned int in Java",
there IS this wonderful class called InetAddress which
encapsulates IP addresses.
Drop me a line if you need some sample code.

Actually, IP addresses were just an example of the various unsigned
ints I have to pass the server. I should have been clearer.

Thanks to the guidance I've picked up here, we're already well past
our sticking point. Thanks to you and everyone who helped.
 
Richard

(e-mail address removed)_SPAM.net wrote...
Well, you have already had some good responses, but let me add my own
opinion anyway...

Start by defining the protocol between the machines. Do this by defining
the byte stream in terms of bytes, not any higher structure, since TCP
does not inherently support any higher structure.

Actually, we already had that; it's definitely the first place to
start. Turning the protocol spec into a working protocol was what
was hanging us up.
 
Roedy Green

Send it as a String. You have a problem sitting there in C as well.
Different machine types have different long int encodings. It's much safer
to send a String.

You have endian problems, but I think everything has settled on two's
complement for longs.

Strings have their own set of problems.

1. are they printable, 7bit, 8bit, 16 bit? If 16 bit, you have endian
issues again! PHHTT.

2. what encoding is being used? See
http://mindprod.com/jgloss/encoding.html

3. how are they terminated? Do they have length bytes, implied
lengths, or null terminators or some other separator/terminator?

Printable strings in a CSV file with 7-bit ascii are tractable, but
the others, you need to be just as on your toes as ever.

see http://mindprod.com/jgloss/products.html#CSV
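
To make those three questions concrete, here is one possible set of answers sketched in Java: a fixed-length field, UTF-8 encoded, padded with NUL bytes. The 16-byte width and all names are assumptions chosen for the example, not something taken from the thread's actual protocol.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical reader for a fixed-length, NUL-padded UTF-8 field.
final class FixedFieldSketch {
    static final int FIELD_LEN = 16; // assumed field width in bytes

    static String readField(DataInputStream in) throws IOException {
        byte[] raw = new byte[FIELD_LEN];
        in.readFully(raw);                 // always consume the whole field
        int len = 0;
        while (len < raw.length && raw[len] != 0) {
            len++;                         // stop at the first NUL pad byte
        }
        return new String(raw, 0, len, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // copyOf pads the encoded text with zero bytes up to FIELD_LEN.
        byte[] wire = Arrays.copyOf("h\u00e9llo".getBytes(StandardCharsets.UTF_8), FIELD_LEN);
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(wire));
        String s = readField(in);
        System.out.println(s + " (" + s.length() + " chars)");
    }
}
```

Note that the width is counted in bytes, not characters, so a sender that truncates in the middle of a multi-byte UTF-8 sequence will produce a replacement character on this end.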
 
Thomas G. Marshall

David Zimmerman said:
Richard said:
My input is basically a stream of 32-bit unsigned integers (e.g., the
low-level form of IP addresses) along with the occasional
fixed-length char array.

Treat it as a 4 byte array. InetAddress.getByAddress() wants a byte[]
anyway.

Or put it in a signed 32-bit integer and never do arithmetic with it;
the bits are the same regardless of how they're interpreted.

Most of the time even arithmetic is fine. You just get overflow effects.

For example, in 32 bits alone, adding a FFFFFFFF to a FFFFFFFF yields
FFFFFFFE regardless of whether you see it as -1 + -1 or not.

Then again, you can always use longs (in Java these are 64 bits).
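
The FFFFFFFF example can be checked with a few lines of Java. This is just a sketch with made-up names; the mask with 0xFFFFFFFFL is the usual way to view an int's bits as an unsigned value held in a long.

```java
// Hypothetical demo class for the addition example above.
final class UnsignedAddSketch {
    // Plain int addition wraps modulo 2^32, which is what unsigned
    // 32-bit addition in C does as well.
    static int addBits(int a, int b) {
        return a + b;
    }

    // Reinterprets the 32 bits as an unsigned value stored in a long.
    static long toUnsigned(int bits) {
        return bits & 0xFFFFFFFFL;
    }

    public static void main(String[] args) {
        int a = 0xFFFFFFFF;                                     // -1 as a signed int
        int sum = addBits(a, a);
        System.out.println(Integer.toHexString(sum));           // fffffffe
        System.out.println(toUnsigned(sum));                    // 4294967294
        // With longs nothing wraps, so the carry is kept:
        System.out.println(toUnsigned(a) + toUnsigned(a));      // 8589934590
    }
}
```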
 
David Zimmerman

Thomas said:
Most of the time even arithmetic is fine. You just get overflow effects.

For example, in 32 bits alone, adding a FFFFFFFF to a FFFFFFFF yields
FFFFFFFE regardless of whether you see it as -1 + -1 or not.

Everything except division ('/'), the remainder operator ('%'), right
shift ('>>'), and ordering comparisons should work fine.

Then again, you can always use longs (in Java these are 64 bits).

For some reason this always leaves a bad taste for me.
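
The caveat about '%' is easy to see in code. The sketch below compares the signed operators with two workarounds: widening to long by hand, and the Integer.remainderUnsigned / divideUnsigned / compareUnsigned helpers, which assume Java 8 or later (newer than this thread). Class and method names are illustrative.

```java
// Hypothetical demo of where signed and unsigned interpretations diverge.
final class UnsignedOpsSketch {
    // Unsigned remainder computed by widening both operands to long.
    static long remainderViaLong(int a, int b) {
        return (a & 0xFFFFFFFFL) % (b & 0xFFFFFFFFL);
    }

    public static void main(String[] args) {
        int x = 0xFFFFFFFE;  // 4294967294 if read as unsigned, -2 if read as signed

        System.out.println(x % 10);                              // -2
        System.out.println(remainderViaLong(x, 10));             // 4
        System.out.println(Integer.remainderUnsigned(x, 10));    // 4

        System.out.println(x / 2);                               // -1
        System.out.println(Integer.divideUnsigned(x, 2));        // 2147483647

        System.out.println(x < 1);                               // true
        System.out.println(Integer.compareUnsigned(x, 1) > 0);   // true
    }
}
```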
 
David Zimmerman

Richard said:
(e-mail address removed) wrote...
Richard said:
My input is basically a stream of 32-bit unsigned integers (e.g., the
low-level form of IP addresses) along with the occasional
fixed-length char array.

Treat it as a 4 byte array. InetAddress.getByAddress() wants a byte[]
anyway.

Sorry for being brain-dead, but will he be able to do arithmetic on
it if it's in a 4-byte array? As in: unsigned char fbarray[4] ; ?

No, but why do you want to do arithmetic on IP addresses?

(Actually, he can do arithmetic on byte arrays, but it's tedious as all
get out.)
 
Jon A. Cruz

Richard said:
(e-mail address removed) wrote...



??? Unless you're referring to IPV6,

in_addr_t => uint32_t (<netinet/in.h>) => "unsigned 32-bit int"

No, you're being a little confused by how your platform decides to deal
with it.

First hint is that uint32_t lives in stdint.h, which is a relatively new
creature. C99, IIRC.

However, instead of just looking to how it's implemented on one platform
you're looking at, we should look to the official standard.

RFC-791 Internet Protocol
"Addresses are fixed length of four octets..."

Four separate parts, totalling 32 bits. That seems quite clear.
However, some might be misled by the "(32 bits)" that follows.
When the RFCs say "octets", they are clearly talking
about what we commonly know as bytes.

And as to the contention that addresses are unsigned, well, not once in
the entire RFC does the word "unsigned" appear.

Just 32 bits. However you want to treat it. Or 4 octets.

In fact, inet addresses in Java are accessed as byte arrays, not ints or
longs.
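
A short sketch of that byte-array access, for anyone who still wants a single number out of it: InetAddress.getAddress() returns the octets in network order, and because Java bytes are signed each one needs masking with 0xFF before it is combined. The helper name is hypothetical.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper that folds an IPv4 address into an unsigned value.
final class AddressBytesSketch {
    static long toUnsigned32(InetAddress addr) {
        byte[] octets = addr.getAddress();   // 4 bytes for IPv4, 16 for IPv6
        if (octets.length != 4) {
            throw new IllegalArgumentException("not an IPv4 address");
        }
        long value = 0;
        for (byte octet : octets) {
            value = (value << 8) | (octet & 0xFF);  // mask removes the sign
        }
        return value;
    }

    public static void main(String[] args) throws UnknownHostException {
        InetAddress addr = InetAddress.getByAddress(new byte[] {(byte) 192, (byte) 168, 1, 1});
        System.out.println(addr.getAddress()[0]);   // -64, the signed view of 192
        System.out.println(toUnsigned32(addr));     // 3232235777
    }
}
```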
 
Jon A. Cruz

Jordan said:
Send it as a String. You have a problem sitting there in C as well.
Different machine types have different long int encodings. It's much safer
to send a String.

Actually, it's not.

Just follow the practices of the ancients and use octets in a specified
order. For some reason TCP/IP hasn't caused the end of the world yet. :)


Oh, and if you need to deal with floats, just use the IEEE 754 32- and
64-bit formats in some explicit byte order.
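
On the Java side, java.nio.ByteBuffer makes the byte order explicit for floats and doubles, which are IEEE 754 single and double precision in Java. The sketch below assumes big-endian was the order chosen; the 12-byte layout and the names are made up for the example.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical codec: one float followed by one double, big-endian.
final class FloatWireSketch {
    static byte[] encode(float f, double d) {
        return ByteBuffer.allocate(12)
                .order(ByteOrder.BIG_ENDIAN)   // stated explicitly, not left to default
                .putFloat(f)
                .putDouble(d)
                .array();
    }

    static float decodeFloat(byte[] wire) {
        return ByteBuffer.wrap(wire).order(ByteOrder.BIG_ENDIAN).getFloat(0);
    }

    static double decodeDouble(byte[] wire) {
        return ByteBuffer.wrap(wire).order(ByteOrder.BIG_ENDIAN).getDouble(4);
    }

    public static void main(String[] args) {
        byte[] wire = encode(1.0f, 0.5);
        // 1.0f is 0x3F800000 in IEEE 754, so the first octet is 0x3f.
        System.out.println(Integer.toHexString(wire[0] & 0xFF)); // 3f
        System.out.println(decodeFloat(wire));                   // 1.0
        System.out.println(decodeDouble(wire));                  // 0.5
    }
}
```

If the C side sends little-endian instead, ByteOrder.LITTLE_ENDIAN would be used in all three places.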
 
Richard

(e-mail address removed) wrote...
No, you're being a little confused by how your platform decides to deal
with it.

Good point, except that how any given platform "decides to deal with
it" is what the problem often boils down to.

First hint is that uint32_t lives in stdint.h, which is a relatively new
creature. C99, IIRC.

You wouldn't be confusing when and where a typedef exists with the
existence and use of unsigned 32-bit integer values, would you? (You
seem to have missed the transform from 'uint32_t' to 'unsigned 32-bit
int' I think.)

In fact, inet addresses in Java are accessed as byte arrays, not ints or
longs.

Hence my sending 4 bytes, in NBO of course, over to that Java
process: To make it easy for the Java programmer wrt how Java
"decides to deal with it."

Thanks for your points, though. It's very useful to be reminded of
first principles.
 
Thomas G. Marshall

Jon A. Cruz said:
Richard wrote:



Ummm.... Maybe. Or maybe not. :-D

32-bit is *one* form of an IP address. Another is 128-bit addresses
in IPv6.

It's best that you account for both.

I always wondered if they intended to have IP addresses for all machines in
the visible universe with that 128 bit thing. But I'm all for it.

What I'm not sure of is just how backward compatible it would be, and if we
are facing a y2k scenario of converting over to it.
 
