Ridiculous readInt() bug? Read-head not advancing far enough?

Discussion in 'Java' started by nobrow@eircom.net, Apr 13, 2005.

  1. Guest

    This one is just fantastic! I have a large binary file being processed
    in Java. After insertion of much debugging code, and with a hex editor
    I have discovered the following behaviour.

    At some point the DataInputStream (dis) which I am using to read the
    file hits the following sequence of bytes;

    .... 05 00 00 00 00 00 03 07 C0 08 05 24 18 4D 11 E0 A8 ...

    The following sequence of methods is executed:

    dis.read() ... gives 5 (0x5) ... fine
    dis.readLong() ... gives 198592 (0x00000000000307C0) ... fine
    dis.readInt() ... gives 134554648 (0x08052418) ... fine
    dis.readInt() ... gives 407704032 (0x184D11E0) ... WTF!?

    Notice anything about that last one? ... The last byte read by the
    preceding readInt() is being read as the first byte by this
    readInt()!!!!!
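
    For reference, the expected values can be checked away from the file by
    feeding the same bytes to a DataInputStream through a
    ByteArrayInputStream. This is only a sketch: the class name
    ReadSequenceCheck and the method decode are made up for illustration,
    the bytes are copied from the dump above, and it is written for a
    current JDK rather than 1.4.2.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical helper class, not part of the original program.
final class ReadSequenceCheck {
    // Bytes copied from the hex dump in the post.
    static final byte[] DUMP = {
        0x05,
        0x00, 0x00, 0x00, 0x00, 0x00, 0x03, 0x07, (byte) 0xC0,
        0x08, 0x05, 0x24, 0x18,
        0x4D, 0x11, (byte) 0xE0, (byte) 0xA8
    };

    // Performs read(), readLong(), readInt(), readInt() and returns the four values.
    static long[] decode(byte[] data) {
        try (DataInputStream dis = new DataInputStream(new ByteArrayInputStream(data))) {
            long[] out = new long[4];
            out[0] = dis.read();
            out[1] = dis.readLong();
            out[2] = dis.readInt();
            out[3] = dis.readInt();
            return out;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        for (long v : decode(DUMP)) {
            System.out.println("0x" + Long.toHexString(v));
        }
    }
}
```

    With a stream that advances correctly, the last value is 0x4D11E0A8
    rather than the 0x184D11E0 reported above.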

    The really annoying thing is that it's intermittent. Happens every time
    today. Worked fine yesterday. Happened every time the day before. There
    is nothing unusual about my system. No background processes that could
    be getting involved. No changes in it from day to day.

    I know posting code would be a good move but the program is quite
    involved and difficult to chop down into a minimal example. Suffice it
    to say that there is nothing complicated about the offending portion of
    the code. The DataInputStream is not being shared across threads or
    anything, and those methods are executed in succession, with nothing
    else happening in between.

    Am running on Linux.

    $ java -version
    java version "1.4.2_02"
    Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.2_02-b03)
    Java HotSpot(TM) Client VM (build 1.4.2_02-b03, mixed mode)

    Seriously! What's that about? Anyone ever seen anything like this before?
    , Apr 13, 2005
    #1

  2. El Guest

    Also, on some occasions, an EOFException is thrown, despite the fact
    that the DataInputStream is nowhere near the end of the file.
    El, Apr 13, 2005
    #2

  3. Daniel Dyer Guest

    On Wed, 13 Apr 2005 12:21:12 +0100, <> wrote:

    > This one is just fantastic! I have a large binary file being processed
    > in Java. After insertion of much debugging code, and with a hex editor
    > I have discovered the following behaviour.
    >
    > At some point the DataInputStream (dis) which I am using to read the
    > file hits the following sequence of bytes;
    >
    > ... 05 00 00 00 00 00 03 07 C0 08 05 24 18 4D 11 E0 A8 ...


    Where is the DataInputStream getting this data from? Can you be sure the
    bug is in the DataInputStream and not somewhere else? Have you wrapped
    the DataInputStream around some other input stream (is the data coming
    from a file or a socket)? If the bug is intermittent are you certain that
    the above sequence of bytes is exactly what is being fed to the
    DataInputStream every time?

    Dan.

    --
    Daniel Dyer
    http://www.footballpredictions.net
    Daniel Dyer, Apr 13, 2005
    #3
  4. Chris Uppal Guest

    wrote:

    > The really annoying thing is that its intermittent. Happens every time
    > today. Worked fine yesterday. Happened every time the day before. There
    > is nothing unusal about my system. No background processes that could
    > be getting involved. No changes in it from day to day.


    I think there must be something very strange about the stream you are reading
    from. The source to DataInputStream.readInt() is straightforward and could not
    possibly cause the results you are seeing (at least the 1.4.2 for Windows
    version is, I assume the Linux version is identical).

    Unless someone else recognises the symptoms, I think you'll have to give more
    detail about how you are creating the DataInputStream.

    Incidentally, can you reproduce the effect on a different Linux box (ideally
    one that does not have an identical installation)?

    -- chris
    Chris Uppal, Apr 13, 2005
    #4
  5. bugbear Guest

    wrote:
    > This one is just fantastic! I have a large binary file being processed
    > in Java. After insertion of much debugging code, and with a hex editor
    > I have discovered the following behaviour.
    >
    > At some point the DataInputStream (dis) which I am using to read the
    > file hits the following sequence of bytes;
    >
    > ... 05 00 00 00 00 00 03 07 C0 08 05 24 18 4D 11 E0 A8 ...
    >
    > The following sequence of methods are executed;
    >
    > dis.read() ... gives 5 (0x5) ... fine
    > dis.readLong() ... gives 198592 (0x00000000000307C0) ... fine
    > dis.readInt() ... gives 134554648 (0x08052418) ... fine
    > dis.readInt() ... gives 407704032 (0x184D11E0) ... WTF!?
    >
    > Notice anything about that last one? ... The last byte read by the
    > preceeding readInt() is being read as the first byte by this
    > readInt()!!!!!
    >
    > The really annoying thing is that its intermittent. Happens every time
    > today. Worked fine yesterday. Happened every time the day before. There
    > is nothing unusal about my system. No background processes that could
    > be getting involved. No changes in it from day to day.


    I would recommend interposing a BufferedInputStream between
    your DataInputStream and your actual InputStream, and
    messing around with the buffer size to see what happens.
    I suspect this will "stir the pot".

    This sounds (horribly) like a buffer boundary problem
    somewhere in your layers of InputStream-ness.
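
    The layering suggested here might look roughly like the sketch below.
    The class BufferedOpen and its methods are hypothetical names, the
    buffer sizes are arbitrary, and reading two ints is only a stand-in
    for whatever the real program reads.

```java
import java.io.BufferedInputStream;
import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical helper class for experimenting with buffer sizes.
final class BufferedOpen {
    // Layers the streams as raw -> BufferedInputStream -> DataInputStream.
    static DataInputStream wrap(InputStream raw, int bufferSize) {
        return new DataInputStream(new BufferedInputStream(raw, bufferSize));
    }

    // Reads `count` ints through a buffer of the given size, so runs can be compared.
    static int[] readInts(InputStream raw, int bufferSize, int count) {
        try (DataInputStream dis = wrap(raw, bufferSize)) {
            int[] out = new int[count];
            for (int i = 0; i < count; i++) {
                out[i] = dis.readInt();
            }
            return out;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the data file; the same values should appear for every size.
        int[] sizes = {1, 7, 512, 8192};
        for (int size : sizes) {
            int[] v = readInts(new FileInputStream(args[0]), size, 2);
            System.out.println(size + ": " + Integer.toHexString(v[0])
                    + " " + Integer.toHexString(v[1]));
        }
    }
}
```

    If the values change with the buffer size, that points at the layers
    below the DataInputStream rather than at readInt() itself.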

    BugBear
    bugbear, Apr 13, 2005
    #5
  6. Alex Buell Guest

    On Wed, 13 Apr 2005 14:15:13 +0100, bugbear
    <bugbear@trim_papermule.co.uk_trim> wrote:

    > wrote:
    >> This one is just fantastic! I have a large binary file being processed
    >> in Java. After insertion of much debugging code, and with a hex editor
    >> I have discovered the following behaviour.
    >>
    >> At some point the DataInputStream (dis) which I am using to read the
    >> file hits the following sequence of bytes;
    >>
    >> ... 05 00 00 00 00 00 03 07 C0 08 05 24 18 4D 11 E0 A8 ...
    >>
    >> The following sequence of methods are executed;
    >>
    >> dis.read() ... gives 5 (0x5) ... fine
    >> dis.readLong() ... gives 198592 (0x00000000000307C0) ... fine
    >> dis.readInt() ... gives 134554648 (0x08052418) ... fine
    >> dis.readInt() ... gives 407704032 (0x184D11E0) ... WTF!?
    >>
    >> Notice anything about that last one? ... The last byte read by the
    >> preceeding readInt() is being read as the first byte by this
    >> readInt()!!!!!
    >>
    >> The really annoying thing is that its intermittent. Happens every time
    >> today. Worked fine yesterday. Happened every time the day before. There
    >> is nothing unusal about my system. No background processes that could
    >> be getting involved. No changes in it from day to day.

    >
    >I would recommend interposing a BufferredInputStream between
    >your DataInputStream and your actual InputStream, and
    >messing around with the BufferSize to see what happens.
    >I suspect this wil "stir the pot".
    >
    >This sounds (horribly) like a buffer boundary problem
    >somewhere in your layers of InputStream-nes


    Change to BufferedReader instead. Isn't DataInputStream old hat
    anyway?

    Cheers,
    Alex.
    --
    http://www.munted.org.uk
    Alex Buell, Apr 13, 2005
    #6
  7. El Guest

    The DataInputStream is wrapping a FileInputStream.

    The nature of the app means that the file varies considerably from
    execution to execution. My OP was just one example, but the same thing
    happens (with different numbers) each execution.
    El, Apr 13, 2005
    #7
  8. El Guest

    There really is nothing special about how the stream is created. A
    FileInputStream is created and then turned to a DataInputStream.

    It is difficult to test on other systems as the project is cumbersome
    in terms of the other bits and pieces that have to be configured in
    order to run it, so I'm pretty much stuck where I am.
    El, Apr 13, 2005
    #8
  9. El Guest

    I threw a BufferedInputStream into the mix and it worked. I should add
    that it's only worked once ... the program is slow so it'll take a while
    to gain confidence in this result.

    That's just bad. You understand what the problem actually is?

    Thanks for the suggestion.
    El, Apr 13, 2005
    #9
  10. Nigel Wade Guest

    Alex Buell wrote:

    > On Wed, 13 Apr 2005 14:15:13 +0100, bugbear
    > <bugbear@trim_papermule.co.uk_trim> wrote:
    >
    >> wrote:
    >>> This one is just fantastic! I have a large binary file being processed
    >>> in Java. After insertion of much debugging code, and with a hex editor
    >>> I have discovered the following behaviour.
    >>>
    >>> At some point the DataInputStream (dis) which I am using to read the
    >>> file hits the following sequence of bytes;
    >>>
    >>> ... 05 00 00 00 00 00 03 07 C0 08 05 24 18 4D 11 E0 A8 ...
    >>>
    >>> The following sequence of methods are executed;
    >>>
    >>> dis.read() ... gives 5 (0x5) ... fine
    >>> dis.readLong() ... gives 198592 (0x00000000000307C0) ... fine
    >>> dis.readInt() ... gives 134554648 (0x08052418) ... fine
    >>> dis.readInt() ... gives 407704032 (0x184D11E0) ... WTF!?
    >>>
    >>> Notice anything about that last one? ... The last byte read by the
    >>> preceeding readInt() is being read as the first byte by this
    >>> readInt()!!!!!
    >>>
    >>> The really annoying thing is that its intermittent. Happens every time
    >>> today. Worked fine yesterday. Happened every time the day before. There
    >>> is nothing unusal about my system. No background processes that could
    >>> be getting involved. No changes in it from day to day.

    >>
    >>I would recommend interposing a BufferredInputStream between
    >>your DataInputStream and your actual InputStream, and
    >>messing around with the BufferSize to see what happens.
    >>I suspect this wil "stir the pot".
    >>
    >>This sounds (horribly) like a buffer boundary problem
    >>somewhere in your layers of InputStream-nes

    >
    > Change to BufferedReader instead. Isn't DataInputStream old hat
    > anyway?
    >
    > Cheers,
    > Alex.


    No, he certainly doesn't want to use any kind of io.Reader. They are for
    reading character streams, which would be no use for binary data.
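
    A small sketch of why a Reader is the wrong tool here: it decodes bytes
    into characters, and byte values that are not valid in the charset do
    not survive the trip. UTF-8 is an assumption (the thread does not name
    a charset), and ReaderVsBinary is a hypothetical class name.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class: binary data pushed through a character Reader.
final class ReaderVsBinary {
    // Decodes the bytes with a Reader (UTF-8 assumed), then encodes them again.
    static byte[] roundTripThroughReader(byte[] raw) {
        try (Reader r = new InputStreamReader(
                new ByteArrayInputStream(raw), StandardCharsets.UTF_8)) {
            StringBuilder sb = new StringBuilder();
            int c;
            while ((c = r.read()) != -1) {
                sb.append((char) c);
            }
            return sb.toString().getBytes(StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Four bytes taken from the dump in the original post.
        byte[] raw = {0x00, 0x03, 0x07, (byte) 0xC0};
        byte[] back = roundTripThroughReader(raw);
        System.out.println("unchanged: " + Arrays.equals(raw, back));
    }
}
```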

    To the OP, could this be a problem with the underlying filesystem? What
    happens if you strip out everything apart from the FileInputStream and
    DataInputStream to read the data?

    --
    Nigel Wade, System Administrator, Space Plasma Physics Group,
    University of Leicester, Leicester, LE1 7RH, UK
    E-mail :
    Phone : +44 (0)116 2523548, Fax : +44 (0)116 2523555
    Nigel Wade, Apr 13, 2005
    #10
  11. Alex Buell <> writes:

    > Change to BufferedReader instead. Isn't DataInputStream old hat
    > anyway?


    Of course not: Readers are for character data, this is binary data.
    Tor Iver Wilhelmsen, Apr 13, 2005
    #11
  12. El wrote:
    > Thats just bad. You understand what the problem actually is?


    We can only guess. E.g. is the file still written while you start to
    read it (maybe the OS gets confused)? Do you (accidentally) share the
    reference to the reader (e.g. in another thread)? Does the problem
    happen at some magic position in the file (multiple of 512 bytes, 1k,
    2k, 1G, etc.)?

    Does it happen only if you read the file from the particular file
    system? Can you change the type of the file system? Is that by any
    chance a network mounted / shared file system? Does it happen with
    different hard drives, or only with one particular drive? Does it happen
    on different machines / different motherboards, or only a particular
    one, or ones of a particular type? Does it happen with other VMs, too?
    Are you absolutely sure the input data is correct?

    Your best bet would be to manage to create a stand-alone test case which
    reproduces the bug, at least most of the time, in an acceptable time
    frame. From that you have a much better chance to work.
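
    One possible shape for such a stand-alone test, as a sketch only: write
    a file with known contents, read it back through a DataInputStream
    placed directly on a FileInputStream (the setup described in the
    thread), and report the first value that differs. The file layout
    (consecutive ints), the class name StandaloneReadTest and the default
    count are all assumptions made for illustration.

```java
import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-alone test, independent of the original application.
final class StandaloneReadTest {
    // Writes 0..count-1 as ints to a temp file, reads them back unbuffered,
    // and returns the index of the first mismatch, or -1 if all values match.
    static int firstMismatch(int count) {
        try {
            File f = File.createTempFile("readint", ".bin");
            f.deleteOnExit();
            try (DataOutputStream out = new DataOutputStream(
                    new BufferedOutputStream(new FileOutputStream(f)))) {
                for (int i = 0; i < count; i++) {
                    out.writeInt(i);
                }
            }
            try (DataInputStream in = new DataInputStream(new FileInputStream(f))) {
                for (int i = 0; i < count; i++) {
                    if (in.readInt() != i) {
                        return i;
                    }
                }
            }
            return -1;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int count = args.length > 0 ? Integer.parseInt(args[0]) : 1000000;
        int idx = firstMismatch(count);
        if (idx < 0) {
            System.out.println("no mismatch in " + count + " ints");
        } else {
            // Each int is 4 bytes, so the byte offset shows any boundary pattern.
            System.out.println("first mismatch at byte offset " + (4L * idx));
        }
    }
}
```

    A mismatch offset that keeps landing on a multiple of 512 or 4096 would
    fit the buffer-boundary theory raised earlier in the thread.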

    /Thomas


    --
    The comp.lang.java.gui FAQ:
    ftp://ftp.cs.uu.nl/pub/NEWS.ANSWERS/computer-lang/java/gui/faq
    Thomas Weidenfeller, Apr 13, 2005
    #12
  13. bugbear Guest

    El wrote:
    > I threw a BufferredInputStream into the mix and it worked. I should add
    > that its only worked once ... the program is slow so itll take a while
    > to gain confidence in this result.
    >
    > Thats just bad. You understand what the problem actually is?
    >
    > Thanks for the suggestion.
    >


    It wasn't meant to be a fix - just part of an information
    gathering exercise leading to a diagnosis - with any luck!

    BugBear
    bugbear, Apr 13, 2005
    #13
  14. bugbear Guest

    El wrote:
    > I threw a BufferredInputStream into the mix and it worked. I should add
    > that its only worked once ... the program is slow so itll take a while
    > to gain confidence in this result.
    >
    > Thats just bad. You understand what the problem actually is?
    >
    > Thanks for the suggestion.
    >


    BTW, for *performance*, you should have a BufferedInputStream
    over your FileInputStream anyway. FileInputStream
    was never meant to be accessed a byte at a time.

    BugBear
    bugbear, Apr 13, 2005
    #14
  15. Chris Uppal Guest

    El wrote:

    > There really is nothing special about how the stream is created. A
    > FileInputStream is created and then turned to a DataInputStream.


    Well, that /shouldn't/ cause problems, but it obviously is. So it seems that
    /something/ in the IO stack between the lower levels of Java and the actual
    disk is flaky on your box -- it could be the disk, the file-system, the kernel,
    the IO libraries linked into the JVM, ....

    Whatever, I recommend ensuring that your disaster recovery plans are adequate,
    and that you have proper (validated) backups.

    However, you shouldn't be putting DataInputStream directly around a
    FileInputStream; there should be a layer of buffering in between, otherwise --
    for instance -- reading one int will cause 4 separate read()s to hit the
    kernel. That will be /killing/ your performance. It is also possible that
    such an unusual load is showing up bugs that would not affect an application
    that read data in "normal" sized chunks. If so then fixing the buffering may
    fix the problem. Unfortunately it may instead only /mask/ the problem, so that
    it seems as if it's fixed but it's really still waiting to happen again later
    (or even for the disk to crash...).
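
    The effect of the missing buffer can be made visible with a counting
    wrapper, sketched below. A ByteArrayInputStream stands in for the file,
    and calls on the wrapped stream stand in for reads that would reach the
    kernel; the class names are hypothetical. How many calls a single
    readInt() makes without buffering depends on the JDK version, so the
    sketch only compares the two totals.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical demo class: counts read calls that reach the bottom stream.
final class ReadCallCounter {
    // Wrapper that counts every read request passed down to the wrapped stream.
    static final class CountingStream extends FilterInputStream {
        int calls;

        CountingStream(InputStream in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            calls++;
            return super.read();
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            calls++;
            return super.read(b, off, len);
        }
    }

    // Reads `ints` ints and returns how many read calls hit the counting stream.
    static int callsFor(boolean buffered, int ints) {
        CountingStream counter =
                new CountingStream(new ByteArrayInputStream(new byte[ints * 4]));
        InputStream source = buffered ? new BufferedInputStream(counter) : counter;
        try (DataInputStream dis = new DataInputStream(source)) {
            for (int i = 0; i < ints; i++) {
                dis.readInt();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return counter.calls;
    }

    public static void main(String[] args) {
        System.out.println("unbuffered calls: " + callsFor(false, 1000));
        System.out.println("buffered calls:   " + callsFor(true, 1000));
    }
}
```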

    -- chris
    Chris Uppal, Apr 13, 2005
    #15
  16. El wrote:
    > I threw a BufferredInputStream into the mix and it worked. I should add
    > that its only worked once ... the program is slow so itll take a while
    > to gain confidence in this result.
    >
    > Thats just bad. You understand what the problem actually is?
    >
    > Thanks for the suggestion.
    >


    So I guess the system has errors when the IO is being hit real
    hard. You should worry about that.

    It occurs to me that if you are writing results to another file
    byte-by-byte as well (i.e. a FileOutputStream not wrapped in a
    BufferedOutputStream), then this will also be generating lots of
    unnecessary I/O calls and hurting performance.
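
    On the output side the same layering applies. A minimal sketch,
    assuming the results are written as ints (the thread does not say what
    the output format is); BufferedWrite and writeInts are hypothetical
    names.

```java
import java.io.BufferedOutputStream;
import java.io.DataOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical helper class for the writing side.
final class BufferedWrite {
    // Writes the values as big-endian ints through
    // FileOutputStream -> BufferedOutputStream -> DataOutputStream.
    static void writeInts(File target, int[] values) {
        try (DataOutputStream out = new DataOutputStream(
                new BufferedOutputStream(new FileOutputStream(target)))) {
            for (int v : values) {
                out.writeInt(v);
            }
            // Closing the outermost stream flushes the buffer to the file.
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        File f = new File(args[0]);
        writeInts(f, new int[] {0x08052418, 0x4D11E0A8});
        System.out.println(f.length() + " bytes written");
    }
}
```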

    Steve
    Steve Horsley, Apr 13, 2005
    #16
  17. wrote:
    > This one is just fantastic! I have a large binary file being processed
    > in Java. After insertion of much debugging code, and with a hex editor
    > I have discovered the following behaviour.
    >


    [snip]

    > Am running on Linux.
    >
    > $ java -version
    > java version "1.4.2_02"
    > Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.2_02-b03)
    > Java HotSpot(TM) Client VM (build 1.4.2_02-b03, mixed mode)
    >
    > Seriously! Whats that about? Anyone ever seen anything like this before?
    >


    I too am running on Linux, SuSE 9.2, kernel 2.6.8, and Java 1.4.2_06. I
    tried to replicate your symptoms in Java (code below), and used a C app
    (code below) to diff the outputs. The only oddity I could find is that
    Long.toHexString(long) does not want to convert bytes with values less
    than 16 correctly. It keeps dropping the '0', i.e. x02 is always printed
    as '2'.
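
    On the dropped '0': Integer.toHexString and Long.toHexString return the
    shortest representation and do not pad with leading zeros, so '2' for
    0x02 is their normal output. A sketch of fixed-width output using
    String.format, which needs Java 5 or later (newer than the 1.4.2 used
    in this thread); HexPad and its methods are hypothetical names.

```java
// Hypothetical helper class for fixed-width hex output.
final class HexPad {
    // Two hex digits for one byte; the mask removes sign extension.
    static String hexByte(int b) {
        return String.format("%02x", b & 0xff);
    }

    // Eight hex digits for an int.
    static String hexInt(int v) {
        return String.format("%08x", v);
    }

    // Sixteen hex digits for a long.
    static String hexLong(long v) {
        return String.format("%016x", v);
    }

    public static void main(String[] args) {
        System.out.println(Integer.toHexString(2)); // unpadded
        System.out.println(hexByte(2));             // padded to two digits
        System.out.println(hexLong(198592L));       // padded to sixteen digits
    }
}
```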

    You need to look elsewhere in your application for the cause of your
    failure. The alternating pattern is very interesting, and might be
    telling you where the problem is perhaps. As a general rule, and IMHO,
    software tends to fail the same way all the time, and not just part time.

    Hope this helps. Code below;

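    // tdis.c
    #include <stdio.h>
    #include <stdlib.h>
    #include <fcntl.h>
    #include <unistd.h>   /* read(), close() */

    int
    main(int argc, char *argv[])
    {
        /* Read sizes in bytes, cycled until EOF; the 0 marks the end of the list. */
        int fd, ii = 0, rdType[] = { 1, 8, 8, 4, 4, 0 };
        unsigned char byte[8];   /* unsigned, so %x does not sign-extend */

        if (argc < 2 || (fd = open(argv[1], O_RDONLY)) < 0)
            exit(1);

        while (1)
        {
            int jj;

            /* Stop on EOF, error, or a short read. */
            if (rdType[ii] != read(fd, byte, rdType[ii]))
            {
                close(fd);
                exit(0);
            }

            /* Two hex digits per byte, so leading zeros are kept. */
            for (jj = 0; jj < rdType[ii]; jj++)
                printf("%02x", byte[jj]);
            printf("\n");

            if (!(rdType[++ii])) ii = 0;
        }
    }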

    // tDis.java
    import java.io.DataInputStream;
    import java.io.EOFException;
    import java.io.FileInputStream;

    public class tDis
    {
        public static void main(String[] args) throws Exception
        {
            DataInputStream dis = null;

            try {
                dis = new DataInputStream(new FileInputStream(args[0]));

                // Same read pattern as tdis.c: 1, 8, 8, 4, 4 bytes, repeated.
                // Note: toHexString() does not zero-pad its result.
                while (true)
                {
                    System.out.println(Integer.toHexString(dis.readUnsignedByte()));
                    System.out.println(Long.toHexString(dis.readLong()));
                    System.out.println(Long.toHexString(dis.readLong()));
                    System.out.println(Integer.toHexString(dis.readInt()));
                    System.out.println(Integer.toHexString(dis.readInt()));
                }
            } catch (EOFException ex) {
                // End of file reached: the normal way out of the loop.
            } finally {
                if (null != dis) dis.close();
            }
        }
    }

    joseph
    Joseph Dionne, Apr 13, 2005
    #17
  18. El Guest

    Thanks for all the replies. I'm going to leave this for the time being.
    Inserting the BufferedInputStream is consistently working (don't know
    why I didn't have it in there anyway).

    I may come back to this when I have more time and see if I can
    pinpoint the problem, so keep your eyes peeled for a future post.

    Thanks again.
    El, Apr 15, 2005
    #18