How do I decode an ISO-8859-1 encoded text file?

Discussion in 'Java' started by SHIRE, Jan 19, 2004.

  1. SHIRE

    SHIRE Guest

    Hi,

    I want to decode the content of the text file which looks like this:

    Subject: =?iso-8859-1?Q?Sek=2Dm=F6te?=
    Content-Type: text/plain; charset=iso-8859-1
    Content-Transfer-Encoding: quoted-printable

    Vill bara s=E4ga att du hade r=E4tt.

    I tried to do it like this:

    import java.io.*;
    import java.util.*;
    import java.net.*;
    import java.text.SimpleDateFormat;

    public class MessageHandler {

        BufferedReader _reader = null;

        /**
         * Constructor
         */
        public MessageHandler(String msgFile) {
            try {
                _reader = new BufferedReader(new InputStreamReader(
                        new FileInputStream(msgFile), "ISO-8859-1"));
            } catch (IOException e) {
                System.out.println(e.toString());
            }
        }

        //---------------------------------------------------------------
        public void close() {
            try {
                _reader.close();
            } catch (IOException e) {
                System.out.println(e.toString());
            }
        }

        //-------------------------------------------------------------
        public String getFileContent() {
            String msg = "";
            String line = "";
            try {
                while ((line = _reader.readLine()) != null) {
                    msg += line + "\r\n";
                }
            } catch (IOException e) {
                System.out.println(e.toString());
            }
            return msg;
        }

        //-------------------------------------------------------------------
        public static void main(String args[]) {
            MessageHandler msgHandler = new MessageHandler("data.txt");
            System.out.println("Content:" + msgHandler.getFileContent());
            msgHandler.close();
        }

    }//End of MessageHandler


    But it didn't work.
    Please advise.
    Thanks for your help.

    Mohamud Jama
    SHIRE, Jan 19, 2004
    #1

  2. SHIRE wrote:

    > Hi,
    >
    > I want to decode the content of the text file which looks like this:
    >
    > Subject: =?iso-8859-1?Q?Sek=2Dm=F6te?=
    > Content-Type: text/plain; charset=iso-8859-1
    > Content-Transfer-Encoding: quoted-printable
    >
    > Vill bara s=E4ga att du hade r=E4tt.


    The key here is not ISO-8859-1, it's "quoted printable".
    To decode it, replace each equal sign that is followed by two
    hexadecimal digits with the byte whose value those digits give
    (then interpret the bytes in the declared charset, here ISO-8859-1),
    and whenever an equal sign ends a line, remove both the equal sign
    and that line break.
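
    A minimal sketch of the rule described above, using only the standard
    library. The class and method names (QuotedPrintableDecoder, decode) are
    made up for illustration, and the sketch assumes the input is plain
    US-ASCII text as quoted-printable requires. It is not a complete
    RFC-conformant decoder.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of any library.
class QuotedPrintableDecoder {

    /** Decodes quoted-printable text; charset is the one named in Content-Type. */
    static String decode(String encoded, Charset charset) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int n = encoded.length();
        int i = 0;
        while (i < n) {
            char c = encoded.charAt(i);
            if (c != '=') {
                out.write(c);            // literal character (assumed US-ASCII)
                i++;
            } else if (i + 2 < n && isHex(encoded.charAt(i + 1))
                    && isHex(encoded.charAt(i + 2))) {
                // "=XX": emit the byte whose value is the two hex digits
                out.write(Integer.parseInt(encoded.substring(i + 1, i + 3), 16));
                i += 3;
            } else if (encoded.startsWith("\r\n", i + 1)) {
                i += 3;                  // soft line break: drop "=" and CRLF
            } else if (encoded.startsWith("\n", i + 1)) {
                i += 2;                  // soft line break with a bare LF
            } else {
                out.write('=');          // malformed escape: keep it as-is
                i++;
            }
        }
        // Only now are the raw bytes interpreted in the declared charset.
        return new String(out.toByteArray(), charset);
    }

    private static boolean isHex(char ch) {
        return Character.digit(ch, 16) >= 0;
    }

    public static void main(String[] args) {
        String body = "Vill bara s=E4ga att du hade r=E4tt.";
        // expected: Vill bara säga att du hade rätt. (console encoding permitting)
        System.out.println(decode(body, StandardCharsets.ISO_8859_1));
    }
}
```

    The bytes are collected first and converted to a String at the end, so
    the same loop would work for any charset named in the Content-Type
    header, not just ISO-8859-1.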
    Michael Borgwardt, Jan 19, 2004
    #2

  3. SHIRE

    Oscar Kind Guest

    SHIRE <> wrote:
    > Hi,
    >
    > I want to decode the content of the text file which looks like this:
    >
    > Subject: =?iso-8859-1?Q?Sek=2Dm=F6te?=
    > Content-Type: text/plain; charset=iso-8859-1
    > Content-Transfer-Encoding: quoted-printable
    >
    > Vill bara s=E4ga att du hade r=E4tt.
    >
    > I tried to do it like this:
    >

    [code that AFAIK correctly uses character encoding]

    > But it didn't work.


    You didn't decode the quoted-printable text (7-bit text; US-ASCII) into
    its original form (8-bit text; ISO-8859-1).


    Oscar

    --
    No trees were harmed in creating this message.
    However, a large number of electrons were terribly inconvenienced.
    Oscar Kind, Jan 19, 2004
    #3
  4. SHIRE wrote:

    > Subject: How do i decode an ISO-8859-1 encoded text file?


    For the record: That mail is NOT ISO-8859-1 encoded. You get ISO-8859-1
    if you manage to decode it. ISO-8859-1 is a common 8-bit character set
    (aka Latin-1), but in order to get it down to 7 bits for mail transfer,
    it needs encoding. In your case, quoted printable has been used for
    encoding. You have to reverse that encoding to get an ISO-8859-1 text back.

    > I want to decode the content of the text file which looks like this:
    >
    > Subject: =?iso-8859-1?Q?Sek=2Dm=F6te?=
    > Content-Type: text/plain; charset=iso-8859-1
    > Content-Transfer-Encoding: quoted-printable
    >
    > Vill bara s=E4ga att du hade r=E4tt.


    So you can either get the RFC that describes quoted-printable encoding
    (RFC 2045) and implement your own decoder, or get the JavaMail package
    from Sun and use the decoder in that package, once you have figured out
    how to use it. JavaMail will also be able to decode the subject line
    (that encoding is, AFAIR, defined in yet another RFC, RFC 2047).
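
    For the subject line, JavaMail offers MimeUtility.decodeText (and
    MimeUtility.decode for the body). If you would rather not pull in the
    library, the header form can also be decoded by hand. The following is
    only a rough sketch under a few assumptions: the names
    EncodedWordDecoder and decodeHeader are invented for illustration, the
    charset label must be one the JVM knows, and the RFC's rule about
    dropping whitespace between adjacent encoded words is not implemented.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.util.Base64;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of JavaMail or the JDK.
class EncodedWordDecoder {

    // Matches =?charset?Q?text?= or =?charset?B?text?=
    private static final Pattern ENCODED_WORD =
            Pattern.compile("=\\?([^?]+)\\?([QqBb])\\?([^?]*)\\?=");

    /** Replaces every encoded word in a header value with its decoded text. */
    static String decodeHeader(String value) {
        Matcher m = ENCODED_WORD.matcher(value);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            Charset charset = Charset.forName(m.group(1)); // throws if unknown
            byte[] bytes = m.group(2).equalsIgnoreCase("Q")
                    ? decodeQ(m.group(3))
                    : Base64.getDecoder().decode(m.group(3));
            m.appendReplacement(sb,
                    Matcher.quoteReplacement(new String(bytes, charset)));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    // "Q" encoding: like quoted-printable, plus "_" standing for a space.
    private static byte[] decodeQ(String text) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '_') {
                out.write(' ');
            } else if (c == '=' && i + 2 < text.length()
                    && Character.digit(text.charAt(i + 1), 16) >= 0
                    && Character.digit(text.charAt(i + 2), 16) >= 0) {
                out.write(Character.digit(text.charAt(i + 1), 16) * 16
                        + Character.digit(text.charAt(i + 2), 16));
                i += 2;                  // skip the two hex digits
            } else {
                out.write(c);            // literal character (assumed US-ASCII)
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        String header = "Subject: =?iso-8859-1?Q?Sek=2Dm=F6te?=";
        // expected: Subject: Sek-möte (console encoding permitting)
        System.out.println(decodeHeader(header));
    }
}
```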

    /Thomas
    Thomas Weidenfeller, Jan 19, 2004
    #4
  5. SHIRE

    SHIRE Guest

    Thank you all. I used the JavaMail package for decoding.
    Thanks Thomas!

    /Mohamud


    SHIRE, Feb 4, 2004
    #5