reading file, ascii 161 (meta-space) converted to question mark

Discussion in 'Java' started by Michael Muller, Sep 17, 2003.

  1. I'm trying to read an HTML file that has been generated by MS Excel.
    When I use od -c to examine this file, I see lots of octal 240
    (decimal 161) chars. This is supposedly a "meta space", whatever that
    means. When I read the file in on Windows, everything works OK (the
    characters stay as 240), but when I read the file in on Linux (RH9),
    the "meta-spaces" are converted to question marks, rendering the HTML
    unreadable.

    My LANG environment variable on Unix is set to us_ENG.UTF-8. On
    Windows, it's not set. I tried unsetting and exporting LANG on
    Linux -- no joy.

    Help! I'm using Java 1.4.2 on Linux and 1.4.1 on Windows. I sure hope
    that's not the issue. The code that reads the file is appended.

    Thanks in advance for any help anyone can offer,

    -- Mike

    private static String slurp(File file)
        throws IOException
    {
        StringBuffer sb = new StringBuffer();
        char[] buf = new char[1024 * 4];
        BufferedReader br = new BufferedReader(new FileReader(file));
        int bytesRead;
        while ((bytesRead = br.read(buf, 0, buf.length)) != -1)
        {
            sb.append(buf, 0, bytesRead);
        }

        return sb.toString();
    }
    Michael Muller, Sep 17, 2003
    #1

  2. Roedy Green (Guest)

    Roedy Green, Sep 18, 2003
    #2

  3. Neomorph (Guest)

    On 17 Sep 2003 15:58:26 -0700, Michael Muller two-finger typed:

    >I'm trying to read an HTML file that has been generated by MS excel.
    >When I use od -c to examine this file, I see lots of octal 240
    >(decimal 161) chars.


    That should be 160 decimal.
    An even octal number can't become an odd decimal number ;-)
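
    The arithmetic can be checked quickly: octal 240 is 2*64 + 4*8 + 0 = 160,
    and because 8 is even, only the last octal digit decides whether the value
    is even or odd. A minimal sketch (the class and method names are made up
    for illustration):

```java
final class OctalCheck {
    // Parses a string of octal digits into its integer value.
    static int fromOctal(String octalDigits) {
        return Integer.parseInt(octalDigits, 8);
    }

    public static void main(String[] args) {
        int value = fromOctal("240");
        System.out.println(value);                      // prints 160
        System.out.println(Integer.toHexString(value)); // prints a0
        System.out.println(value % 2 == 0);             // prints true
    }
}
```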

    > This is supposedly a "meta space", whatever that
    >means.


    It is usually used to 'connect' two words so that they are not split
    when text is re-wrapped, like the non-breaking space in HTML (coded
    as &nbsp;).

    >When I read the file in on Windows, everything works ok (the
    >characters stay as 240), but when I read the file in on linux (RH9),
    >the "meta-spaces" are converted to question marks, rendering the html
    >unreadable.


    HTML should either contain only US-ASCII (32-127) or declare its
    codepage/encoding explicitly.
    You should replace the 0240 (octal) with the entity &nbsp; as long as
    it's not part of a parameter value.
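
    A minimal sketch of that replacement, assuming the file has already been
    decoded so that octal 240 arrives as the character U+00A0. The class and
    method names are hypothetical, and this version replaces every occurrence;
    it makes no attempt to skip parameter values.

```java
final class NbspEscaper {
    // Replaces every U+00A0 (no-break space) with the HTML entity &nbsp;.
    static String escapeNbsp(String html) {
        return html.replace("\u00A0", "&nbsp;");
    }

    public static void main(String[] args) {
        String cell = "<td>10\u00A0kg</td>";
        System.out.println(escapeNbsp(cell)); // prints <td>10&nbsp;kg</td>
    }
}
```

    Text without the character passes through unchanged, so the call is safe
    to apply to the whole document string.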

    >My LANG envar on unix is set to us_ENG.UTF-8. On windows, it's not
    >set. I tried unsetting and exporting LANG on linux -- no joy.


    Either way, the Linux font probably has no glyph mapped to that
    character code.

    >Help! I'm using 1.4.2 on Linux and 1.4.1 on windows. I sure hope
    >that's not the issue. The code that reads the file is appended.
    >
    >Thanks in advance for any help anyone can offer,
    >
    > -- Mike
    >
    > private static String slurp(File file)
    >     throws IOException
    > {
    >     StringBuffer sb = new StringBuffer();
    >     char[] buf = new char[1024 * 4];
    >     BufferedReader br = new BufferedReader(new FileReader(file));
    >     int bytesRead;
    >     while ((bytesRead = br.read(buf, 0, buf.length)) != -1)
    >     {
    >         sb.append(buf, 0, bytesRead);
    >     }
    >
    >     return sb.toString();
    > }
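
    One plausible explanation, offered as an assumption rather than something
    confirmed in this thread: FileReader decodes bytes with the platform
    default charset. Under a UTF-8 locale a lone byte 0xA0 is not valid UTF-8,
    so the decoder substitutes the replacement character U+FFFD, which usually
    prints as a question mark. A Windows default such as windows-1252 maps
    0xA0 to the no-break space instead. The sketch below passes the charset
    explicitly. The class name CharsetSlurp and the choice of ISO-8859-1 are
    illustrative assumptions, and it uses Java 7+ features; on a 1.4 JVM one
    would pass the charset name as a string to InputStreamReader and close the
    reader in a finally block.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class CharsetSlurp {
    // Reads the whole stream into a String, decoding bytes with the given
    // charset instead of the platform default that FileReader would use.
    static String slurp(InputStream in, Charset charset) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[4 * 1024];
        try (Reader reader = new BufferedReader(new InputStreamReader(in, charset))) {
            int charsRead;
            while ((charsRead = reader.read(buf, 0, buf.length)) != -1) {
                sb.append(buf, 0, charsRead);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = { 'a', (byte) 0xA0, 'b' }; // 0xA0 is octal 240

        String latin1 = slurp(new ByteArrayInputStream(data), StandardCharsets.ISO_8859_1);
        String utf8 = slurp(new ByteArrayInputStream(data), StandardCharsets.UTF_8);

        System.out.println((int) latin1.charAt(1)); // prints 160 (U+00A0)
        System.out.println((int) utf8.charAt(1));   // prints 65533 (U+FFFD)
    }
}
```

    For a real file, the same method could be called with
    new FileInputStream(file) in place of the in-memory stream.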



    Cheers.
    Neomorph, Sep 18, 2003
    #3