character encoding problem

Discussion in 'Java' started by bj, Jun 15, 2007.

  1. bj

    bj Guest

    Hi,

    I have the following problem: I am getting strings in which the Polish
    characters show up as artifacts. How can I change these back to the
    proper Polish letters? I tried something like this:

    new String(b.getDescription().getBytes("UTF-8"),"ISO-8859-1")

    but it didn't work; I still get garbled characters in the string.

    Please help.
     
    bj, Jun 15, 2007
    #1

  2. bj

    Roedy Green Guest

    On Fri, 15 Jun 2007 01:31:51 +0200, bj
    <_TOonet.pl> wrote, quoted or indirectly quoted
    someone who said :

    >i've got the following problem :
    >i'am getting some string with polish characters as some artifacts how
    >can i change these letters back to normal polish letters??
    > i've tried something like that
    >
    >new String(b.getDescription().getBytes("UTF-8"),"ISO-8859-1")
    >
    >but it didn't work, i've got still some dirty characters in string


    You have to be careful with terms. A Java String is a sequence of
    16-bit chars (UTF-16 code units). You probably have some bytes in
    some sort of 8-bit encoding.
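To make the distinction between chars and bytes concrete, here is a minimal sketch. The class name `CharsVersusBytes`, its helper methods, and the sample word are made up for illustration; it also assumes the JRE ships the ISO-8859-2 charset, which the Java specification does not guarantee.

```java
import java.nio.charset.Charset;

class CharsVersusBytes {
    /** Encodes text in the named charset and returns how many bytes that takes. */
    public static int encodedLength(String text, String charsetName) {
        return text.getBytes(Charset.forName(charsetName)).length;
    }

    /** Decodes raw bytes into a String; the caller must know the bytes' real charset. */
    public static String decode(byte[] raw, String charsetName) {
        return new String(raw, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        // Polish word for "turtle", written with Unicode escapes so the
        // source file's own encoding cannot interfere.
        String word = "\u017C\u00F3\u0142w";

        System.out.println(word.length());                     // 4 chars
        System.out.println(encodedLength(word, "UTF-8"));      // 7 bytes
        System.out.println(encodedLength(word, "ISO-8859-2")); // 4 bytes

        // The same four bytes give different text depending on the charset
        // used to decode them.
        byte[] raw = {(byte) 0xBF, (byte) 0xF3, (byte) 0xB3, 0x77};
        System.out.println(decode(raw, "ISO-8859-2").equals(word)); // true
        System.out.println(decode(raw, "ISO-8859-1").equals(word)); // false
    }
}
```

The point of the sketch is that a String has no encoding of its own; an encoding only comes into play at the moment bytes are turned into chars or back.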

    In any case, my essay on encoding should tell you more than you wanted
    to know about translating back and forth.

    http://mindprod.com/jgloss/encoding.html
    --
    Roedy Green Canadian Mind Products
    The Java Glossary
    http://mindprod.com
     
    Roedy Green, Jun 15, 2007
    #2

  3. bj wrote:
    > i've got the following problem :
    > i'am getting some string with polish characters as some artifacts how
    > can i change these letters back to normal polish letters??
    > i've tried something like that
    >
    > new String(b.getDescription().getBytes("UTF-8"),"ISO-8859-1")
    >
    > but it didn't work, i've got still some dirty characters in string

    Are you really sure that you want ISO-8859-1?
    I doubt it, because ISO-8859-1 cannot represent some Polish characters:
    for example '\u0104' (A with ogonek), '\u0141' (L with stroke).
    Polish text is often encoded in ISO-8859-2, but not in ISO-8859-1.
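This can be checked in code. The sketch below uses a hypothetical class `PolishCharsetCheck` with two illustrative helpers. The `redecode` helper rests on an assumption about the original poster's data that the thread does not confirm: that the bytes were really ISO-8859-2 and were decoded as ISO-8859-1 somewhere upstream. If the damage happened differently, or characters were already replaced by '?', this will not recover them.

```java
import java.nio.charset.Charset;

class PolishCharsetCheck {
    /** Returns true if every character of text can be encoded in the named charset. */
    public static boolean canEncode(String text, String charsetName) {
        return Charset.forName(charsetName).newEncoder().canEncode(text);
    }

    /**
     * Undoes a wrong decode: turns the garbled chars back into the bytes they
     * came from (using the charset that was wrongly applied), then decodes
     * those bytes with the charset they were actually written in.
     */
    public static String redecode(String garbled, String wrongCharset, String actualCharset) {
        byte[] originalBytes = garbled.getBytes(Charset.forName(wrongCharset));
        return new String(originalBytes, Charset.forName(actualCharset));
    }

    public static void main(String[] args) {
        String polish = "\u0104\u0141"; // A with ogonek, L with stroke

        System.out.println(canEncode(polish, "ISO-8859-1")); // false
        System.out.println(canEncode(polish, "ISO-8859-2")); // true

        // ISO-8859-2 bytes BF F3 B3 77 misread as ISO-8859-1 yield this text.
        String garbled = "\u00BF\u00F3\u00B3w";
        String repaired = redecode(garbled, "ISO-8859-1", "ISO-8859-2");
        System.out.println(repaired.equals("\u017C\u00F3\u0142w")); // true
    }
}
```

The round trip through ISO-8859-1 is lossless only because that charset maps every byte value to exactly one char. Where possible, it is more robust to fix the place where the bytes are first decoded than to repair the String afterwards.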

    --
    Thomas
     
    Thomas Fritsch, Jun 15, 2007
    #3
