Reading Java byte[] data stream over standard input

sapsi

Hello,
I am using HadoopStreaming with a BinaryInputStream. What this
basically does is send a stream of bytes (the Java type is: private
byte[] bytes) to my Python program.

I have done a test like this:

    import sys

    while 1:
        x = sys.stdin.read(100)
        if x:
            print x
        else:
            break

Now, the incoming data is binary (though mine is actually merely ASCII
text), but the output is not what I expected. I expect, e.g.,

all/86000/114.310.151.209.60370-121.110.5.176.113\n62485.9718
118.010.241.12 60370 128.210.5.176

However, I get a 1 before "all" and a 4 just after the \n and before the 6.

My question is: how do I read binary data (Java's byte stream) from
stdin?
Or is this actually what I'm getting?

Thanks
Sapsi
 
sapsi

I should also mention that for some reason there are several binary
values popping in between. This behavior (for the input stream) is
not expected.
 
Marc 'BlackJack' Rintsch

If there's extra data in `x` then it was sent to stdin. Maybe there's
some extra information like string length, Java type information, or
checksums encoded in that data!?

Ciao,
Marc 'BlackJack' Rintsch
 
sapsi

Yes, that could be the case. Browsing through Hadoop's source, I see
that stdin in the above code is reading from a piped Java DataOutputStream.
I read about a library on the net, Javadata.py, that reads this, but it
has disappeared.
What is involved in reading from a DataOutputStream?

Thank you
Sapsi
 
Marc 'BlackJack' Rintsch

According to the Java docs of `DataInput` and `DataOutput` it is quite
simple. Most methods just seem to write the necessary bytes for the
primitive types except `writeUTF()` which prefixes the string data with
length information.

So if it is not Strings you are writing, then "hadoop" seems to throw
some extra information into the stream.

Ciao,
Marc 'BlackJack' Rintsch
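
To make the `DataOutput` description above concrete, here is a minimal sketch (Python 3) of decoding two kinds of values with the `struct` module. It relies on the documented behaviour that `writeInt()` emits 4 big-endian bytes and `writeUTF()` emits a 2-byte unsigned length followed by the encoded string. The helper names `read_exact`, `read_int` and `read_utf` are made up for illustration, and the sample bytes are built by hand. Whether Hadoop frames its records this way is not established in this thread, so treat it as a starting point for inspecting the stream, not as the Hadoop format.

```python
import io
import struct


def read_exact(stream, n):
    """Read exactly n bytes or raise EOFError (hypothetical helper)."""
    data = stream.read(n)
    if len(data) != n:
        raise EOFError("expected %d bytes, got %d" % (n, len(data)))
    return data


def read_int(stream):
    """Decode a value written by DataOutput.writeInt(): 4 bytes, big-endian, signed."""
    return struct.unpack(">i", read_exact(stream, 4))[0]


def read_utf(stream):
    """Decode a value written by DataOutput.writeUTF(): 2-byte length, then the bytes."""
    (length,) = struct.unpack(">H", read_exact(stream, 2))
    # Java writes "modified UTF-8"; for ordinary ASCII text this matches UTF-8.
    return read_exact(stream, length).decode("utf-8")


# Hand-built sample standing in for what a Java writer might produce.
sample = struct.pack(">i", 86000) + struct.pack(">H", 3) + b"all"
stream = io.BytesIO(sample)
print(read_int(stream), read_utf(stream))  # prints: 86000 all
```

On real input the same functions could be given `sys.stdin.buffer` instead of the `io.BytesIO` object.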
 
Giles Brown

In the past I've sent binary data to a Java applet reading from a
DataInputStream using xdrlib from the standard library. I'd expect
that it would work in the reverse direction, so I suggest you have a
look at that.

Giles
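
For readers trying the xdrlib suggestion today: `xdrlib` was deprecated in Python 3.11 and removed in 3.13, so the sketch below reproduces the two XDR encodings most relevant here using `struct` instead. An XDR int is 4 big-endian bytes (the same layout `DataInputStream.readInt()` expects), and an XDR string is a 4-byte length, the data, and zero padding up to a multiple of 4. The function names are hypothetical, and nothing in this thread confirms that Hadoop's stream is XDR-encoded.

```python
import struct


def xdr_pack_int(value):
    """Encode a signed 32-bit integer as 4 big-endian bytes."""
    return struct.pack(">i", value)


def xdr_pack_string(data):
    """Encode bytes as: 4-byte length, data, zero padding to a 4-byte boundary."""
    padding = (4 - len(data) % 4) % 4
    return struct.pack(">I", len(data)) + data + b"\x00" * padding


def xdr_unpack_string(buf, offset=0):
    """Decode one XDR string; return (data, offset of the next item)."""
    (length,) = struct.unpack_from(">I", buf, offset)
    start = offset + 4
    data = buf[start:start + length]
    padded = (length + 3) // 4 * 4  # round the data length up to a multiple of 4
    return data, start + padded


packed = xdr_pack_string(b"all")
print(repr(packed))                # b'\x00\x00\x00\x03all\x00'
print(xdr_unpack_string(packed))   # (b'all', 8)
```

Note the padding byte after `all`: a reader on the Java side would have to skip it explicitly, since `DataInputStream` only reads the primitive values it is asked for.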
 
John Machin

sapsi said:
I should also mention that for some reason there are several binary
values popping in between. This behavior (for the input stream) is
not expected.

Consider changing "print x" to "print repr(x)" ... this would mean that
you have a better chance of understanding what the extra or unexpected
popping-in bytes are.
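
A small sketch of the `repr` suggestion above, written for Python 3, where binary input should be read from `sys.stdin.buffer` instead of `sys.stdin`. The function name `repr_chunks` and the sample bytes are invented for illustration; the sample only imitates the kind of stray bytes described in the question and is not actual Hadoop output.

```python
import io


def repr_chunks(stream, size=100):
    """Yield repr() of each chunk read from a binary stream until EOF."""
    while True:
        chunk = stream.read(size)
        if not chunk:
            break
        yield repr(chunk)


# Made-up sample with two non-printable bytes mixed into ASCII text.
sample = io.BytesIO(b"\x01ab\n\x04cd")
for line in repr_chunks(sample):
    print(line)  # prints: b'\x01ab\n\x04cd'
```

To inspect the real stream, pass `sys.stdin.buffer` (after `import sys`) in place of the `io.BytesIO` object; the escaped output shows exactly which byte values surround the text.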
 
