Re: number of bytes for each (uni)code point while using utf-8 as encoding ...

Discussion in 'Java' started by Roedy Green, Jul 12, 2012.

    On 10 Jul 2012 10:21:30 GMT, lbrt chx _ gemale wrote, quoted or
    indirectly quoted someone who said :

    >number of bytes for each (uni)code point while using utf-8 as encoding ...


    Let's assume there is something not quite right in the UTF-8 encoding
    of the file (or possibly the file is not even UTF-8).
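
    One way to test that assumption first is to decode the raw bytes
    strictly, so malformed input raises an error instead of being
    silently replaced. This is only a sketch; the class and method
    names (Utf8Check, isValidUtf8) are made up for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper, not from the original post.
final class Utf8Check {

    /** Returns true if the bytes decode as UTF-8 without any error. */
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            // The decoder met a byte sequence that is not legal UTF-8.
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // args[0] is the path of the file to check.
        byte[] bytes = Files.readAllBytes(Paths.get(args[0]));
        System.out.println(isValidUtf8(bytes)
                ? "well-formed UTF-8"
                : "not well-formed UTF-8");
    }
}
```

    A file that passes can still be in the wrong encoding by accident
    (pure ASCII passes, for example), but a file that fails is
    definitely not clean UTF-8.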

    Read the file with a Reader, specifying UTF-8 as the encoding.
    See http://mindprod.com/applet/fileio.html for the code.

    Then write the decoded characters back out to another file, again
    as UTF-8. See the code at the same place.
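
    A minimal sketch of that round trip, assuming the output should also
    be UTF-8 so the two files are comparable. It is not the code from
    the page linked above, and the names (Utf8RoundTrip, roundTrip) are
    made up. An InputStreamReader built from a Charset replaces
    malformed input with U+FFFD rather than throwing, so damaged spots
    show up as differences in the output file.

```java
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not from the original post.
final class Utf8RoundTrip {

    /** Decodes 'in' as UTF-8 and re-encodes the characters to 'out' as UTF-8. */
    static void roundTrip(Path in, Path out) throws IOException {
        try (Reader reader = new InputStreamReader(
                     Files.newInputStream(in), StandardCharsets.UTF_8);
             Writer writer = new OutputStreamWriter(
                     Files.newOutputStream(out), StandardCharsets.UTF_8)) {
            char[] buffer = new char[8192];
            int count;
            // Copy characters, not lines, so line endings are left alone.
            while ((count = reader.read(buffer)) != -1) {
                writer.write(buffer, 0, count);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the input file, args[1] the copy to create.
        roundTrip(Paths.get(args[0]), Paths.get(args[1]));
    }
}
```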

    Compare the files byte by byte till you figure out what is going on.
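
    For the comparison, something along these lines would report where
    the two files first diverge. Again just a sketch with made-up names
    (FirstDifference, firstMismatch); it loads both files into memory,
    so it suits small test files only.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper, not from the original post.
final class FirstDifference {

    /** Returns the offset of the first differing byte, or -1 if identical. */
    static int firstMismatch(byte[] a, byte[] b) {
        int common = Math.min(a.length, b.length);
        for (int i = 0; i < common; i++) {
            if (a[i] != b[i]) {
                return i;
            }
        }
        // One array is a prefix of the other: they differ where the shorter ends.
        return a.length == b.length ? -1 : common;
    }

    private static String hexAt(byte[] bytes, int index) {
        return index < bytes.length
                ? String.format("%02X", bytes[index] & 0xFF)
                : "<end of file>";
    }

    public static void main(String[] args) throws IOException {
        // args[0] and args[1] are the two files to compare.
        byte[] a = Files.readAllBytes(Paths.get(args[0]));
        byte[] b = Files.readAllBytes(Paths.get(args[1]));
        int at = firstMismatch(a, b);
        if (at < 0) {
            System.out.println("files are identical");
        } else {
            System.out.printf("first difference at offset %d: %s vs %s%n",
                    at, hexAt(a, at), hexAt(b, at));
        }
    }
}
```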

    Unicode has two ways of representing an accented letter: as a single
    precomposed character, or as a base letter followed by a separate
    combining accent. That may be what is biting you. You might also be
    adding or losing a BOM (byte order mark). See
    http://mindprod.com/jgloss/bom.html though I have never seen Java
    insert or remove one.
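
    To make both effects concrete, here is a small sketch that counts
    the UTF-8 bytes per code point for the two spellings of an accented
    e, normalizes one form to the other with java.text.Normalizer, and
    tests a byte array for a leading UTF-8 BOM (EF BB BF). The class and
    method names (AccentsAndBom, utf8Length, startsWithUtf8Bom) are
    made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.text.Normalizer;

// Hypothetical helper, not from the original post.
final class AccentsAndBom {

    /** Number of bytes UTF-8 needs for one code point (not valid for lone surrogates). */
    static int utf8Length(int codePoint) {
        String s = new String(Character.toChars(codePoint));
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    /** True if the bytes begin with the UTF-8 encoding of U+FEFF. */
    static boolean startsWithUtf8Bom(byte[] bytes) {
        return bytes.length >= 3
                && (bytes[0] & 0xFF) == 0xEF
                && (bytes[1] & 0xFF) == 0xBB
                && (bytes[2] & 0xFF) == 0xBF;
    }

    /** Prints each code point of the text with its UTF-8 byte count. */
    static void dump(String label, String text) {
        System.out.println(label + ":");
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            System.out.printf("  U+%04X -> %d byte(s)%n", cp, utf8Length(cp));
            i += Character.charCount(cp);
        }
    }

    public static void main(String[] args) {
        String precomposed = "\u00E9";  // e with acute, one code point
        String decomposed = "e\u0301";  // plain e plus combining acute accent

        dump("precomposed", precomposed);
        dump("decomposed", decomposed);

        // The two strings look alike on screen but are not equal until normalized.
        String nfc = Normalizer.normalize(decomposed, Normalizer.Form.NFC);
        System.out.println("equal before normalizing: " + precomposed.equals(decomposed));
        System.out.println("equal after NFC:          " + precomposed.equals(nfc));
    }
}
```

    If the byte counts of your two files differ by one per accented
    letter, normalization is a likely suspect; a difference of exactly
    three bytes at offset 0 points at a BOM.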

    --
    Roedy Green Canadian Mind Products
    http://mindprod.com
    Mathematicians and computer scientists are far more interested
    in impressing you than informing you. If this were not
    so, the tutorials on building a robots.txt file, for example,
    would consist primarily of an annotated example. What you get
    instead are nothing but inscrutable abstract fragments in some
    obscure dialect of BNF.
    Roedy Green, Jul 12, 2012