Reading Java byte[] data stream over standard input

Discussion in 'Python' started by sapsi, May 19, 2008.

  1. sapsi

    sapsi Guest

    Hello,
    I am using HadoopStreaming with a BinaryInputStream. Basically, this
    sends a stream of bytes (the Java type is: private byte[] bytes) to my
    Python program.

    I ran a test like this:

        import sys

        # Read stdin in 100-byte chunks until end of input
        while 1:
            x = sys.stdin.read(100)
            if x:
                print x
            else:
                break

    Now, the incoming data is binary (though mine is actually just ASCII
    text), but the output is not what I expected. I expect, for example:

    all/86000/114.310.151.209.60370-121.110.5.176.113\n62485.9718
    118.010.241.12 60370 128.210.5.176

    However, I get a 1 before "all" and a 4 just after the \n and before
    the 6.

    My question is: how do I read binary data (Java's byte stream) from
    stdin?
    Or is this actually what I'm getting?

    Thanks
    Sapsi
     
    sapsi, May 19, 2008
    #1

  2. sapsi

    sapsi Guest

    I should also mention that several binary values are popping up in
    between. This behavior (for the input stream) is not expected.


    > Now, the incoming data is binary(though mine is actually merely ascii
    > text) but the output is not what is expected. I expect for e.g
    >
    > all/86000/114.310.151.209.60370-121.110.5.176.113\n62485.9718
    > 118.010.241.12 60370 128.210.5.176
    >
    > However i get a 1 before all and a 4 just after \n and before the 6.
    >
    > My question is : how do i read binary data(Java's byte stream) from
    > stdin?
    > Or is this actually what i'm getting?
    >
    > Thanks
    > Sapsi
     
    sapsi, May 19, 2008
    #2

  3. On Sun, 18 May 2008 22:11:33 -0700, sapsi wrote:

    > I am using HadoopStreaming using a BinaryInputStream. What this
    > basically does is send a stream of bytes ( the java type is : private
    > byte[] bytes) to my python program.
    >
    > I have done a test like this,
    > while 1:
    >     x=sys.stdin.read(100)
    >     if x:
    >         print x
    >     else:
    >         break
    >
    > Now, the incoming data is binary(though mine is actually merely ascii
    > text) but the output is not what is expected. I expect for e.g
    >
    > all/86000/114.310.151.209.60370-121.110.5.176.113\n62485.9718
    > 118.010.241.12 60370 128.210.5.176
    >
    > However i get a 1 before all and a 4 just after \n and before the 6.
    >
    > My question is : how do i read binary data(Java's byte stream) from
    > stdin?
    > Or is this actually what i'm getting?


    If there's extra data in `x` then it was sent to stdin. Maybe there's
    some extra information like string length, Java type information, or
    checksums encoded in that data!?

    Ciao,
    Marc 'BlackJack' Rintsch
     
    Marc 'BlackJack' Rintsch, May 19, 2008
    #3
  4. sapsi

    sapsi Guest

    Yes, that could be the case. Browsing through Hadoop's source, I see
    that stdin in the above code is reading from a piped Java
    DataOutputStream. I read about a library on the net, Javadata.py, that
    reads this, but it has disappeared.
    What is involved in reading from a DataOutputStream?

    Thank you
    Sapsi
     
    sapsi, May 19, 2008
    #4
  5. On Mon, 19 May 2008 00:14:25 -0700, sapsi wrote:

    > Yes, that could be the case. Browsing through Hadoop's source, I see
    > that stdin in the above code is reading from a piped Java
    > DataOutputStream. I read about a library on the net, Javadata.py, that
    > reads this, but it has disappeared.
    > What is involved in reading from a DataOutputStream?


    According to the Java docs of `DataInput` and `DataOutput` it is quite
    simple. Most methods just seem to write the necessary bytes for the
    primitive types except `writeUTF()` which prefixes the string data with
    length information.

    So if it is not Strings you are writing, then "hadoop" seems to throw
    some extra information into the stream.
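
    As a minimal sketch of what decoding such a stream could look like in
    Python 3, assuming the Java side used `writeInt()` and `writeUTF()`:
    the helper names `read_int` and `read_utf` are made up for this
    illustration, the sample bytes are built by hand, and whatever framing
    Hadoop itself adds is not covered here.

```python
import io
import struct


def read_int(stream):
    """Read a 4-byte big-endian signed int, the layout writeInt() produces."""
    data = stream.read(4)
    if len(data) < 4:
        raise EOFError("stream ended inside an int")
    return struct.unpack(">i", data)[0]


def read_utf(stream):
    """Read a writeUTF() string: 2-byte unsigned length, then the bytes."""
    header = stream.read(2)
    if len(header) < 2:
        raise EOFError("stream ended inside a length prefix")
    (length,) = struct.unpack(">H", header)
    payload = stream.read(length)
    if len(payload) < length:
        raise EOFError("stream ended inside a string")
    # Java writes "modified UTF-8"; for plain ASCII text this is
    # byte-for-byte the same as ordinary UTF-8.
    return payload.decode("utf-8")


# Hand-built sample standing in for stdin (hypothetical data).
sample = io.BytesIO(struct.pack(">i", 86000) + struct.pack(">H", 3) + b"all")
number = read_int(sample)
text = read_utf(sample)
```

    With real input the same helpers would probably be called on
    `sys.stdin.buffer`, so that Python hands over raw bytes instead of
    decoded text.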

    Ciao,
    Marc 'BlackJack' Rintsch
     
    Marc 'BlackJack' Rintsch, May 19, 2008
    #5
  6. Giles Brown

    Giles Brown Guest

    On 19 May, 06:11, sapsi wrote:
    > Hello,
    > I am using HadoopStreaming using a BinaryInputStream. What this
    > basically does is send a stream of bytes ( the java type is : private
    > byte[] bytes) to my python program.
    >
    > I have done a test like this,
    > while 1:
    >     x=sys.stdin.read(100)
    >     if x:
    >         print x
    >     else:
    >         break
    >
    > Now, the incoming data is binary(though mine is actually merely ascii
    > text) but the output is not what is expected. I expect for e.g
    >
    > all/86000/114.310.151.209.60370-121.110.5.176.113\n62485.9718
    > 118.010.241.12 60370 128.210.5.176
    >
    > However i get a 1 before all and a 4 just after \n and before the 6.
    >
    > My question is : how do i read binary data(Java's byte stream) from
    > stdin?
    > Or is this actually what i'm getting?
    >
    > Thanks
    > Sapsi


    In the past I've sent binary data to a java applet reading
    DataInputStream using xdrlib from the standard library. I'd expect
    that it would work in the reverse direction so I suggest you have a
    look at that.
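
    XDR integers and doubles are big-endian, which is the same byte order
    DataInputStream and DataOutputStream use for `int` and `double`. A rough
    sketch of the same idea, using the `struct` module rather than xdrlib
    (the function names and the sample values are hypothetical):

```python
import struct


def pack_for_data_input(count, value):
    """Pack an int and a double the way readInt()/readDouble() expect them."""
    return struct.pack(">id", count, value)


def unpack_from_data_output(data):
    """Reverse direction: decode bytes written by writeInt() then writeDouble()."""
    return struct.unpack(">id", data)


packed = pack_for_data_input(60370, 62485.9718)
count, value = unpack_from_data_output(packed)
```

    The `>` prefix forces big-endian order with no padding, so the result is
    always 4 + 8 bytes regardless of the machine it runs on.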

    Giles
     
    Giles Brown, May 19, 2008
    #6
  7. John Machin

    John Machin Guest

    sapsi wrote:
    > I should also mention that several binary values are popping up in
    > between. This behavior (for the input stream) is not expected.
    >
    >
    >> Now, the incoming data is binary(though mine is actually merely ascii
    >> text) but the output is not what is expected. I expect for e.g
    >>
    >> all/86000/114.310.151.209.60370-121.110.5.176.113\n62485.9718
    >> 118.010.241.12 60370 128.210.5.176
    >>
    >> However i get a 1 before all and a 4 just after \n and before the 6.
    >>
    >> My question is : how do i read binary data(Java's byte stream) from
    >> stdin?
    >> Or is this actually what i'm getting?
    >>


    Consider changing "print x" to "print repr(x)" ... this would mean that
    you have a better chance of understanding what the extra or unexpected
    popping-in bytes are.
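
    A small Python 3 sketch of that suggestion. The function name
    `dump_chunks` and the sample bytes are invented for illustration; the
    sample only assumes that the stray "1" and "4" are raw bytes 0x01 and
    0x04, which may not be what Hadoop actually sends.

```python
import io


def dump_chunks(stream, size=100):
    """Return the repr() of each chunk read from a binary stream."""
    lines = []
    while True:
        chunk = stream.read(size)
        if not chunk:  # empty bytes object means end of input
            break
        lines.append(repr(chunk))
    return lines


# Hypothetical input: a 0x01 byte before "all", a 0x04 byte after the newline.
sample = io.BytesIO(b"\x01all/86000\n\x0462485.9718")
shown = dump_chunks(sample)
```

    In a real script the stream would be `sys.stdin.buffer`, and each entry
    of the returned list would be printed; non-printable bytes then show up
    as escapes such as `\x01` instead of being lost on the terminal.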
     
    John Machin, May 19, 2008
    #7
