How do I decode an ISO-8859-1 encoded text file?

SHIRE

Hi,

I want to decode the content of the text file which looks like this:

Subject: =?iso-8859-1?Q?Sek=2Dm=F6te?=
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: quoted-printable

Vill bara s=E4ga att du hade r=E4tt.

I tried to do it like this:

import java.io.*;

public class MessageHandler {

    BufferedReader _reader = null;

    /**
     * Constructor
     */
    public MessageHandler(String msgFile) {
        try {
            _reader = new BufferedReader(new InputStreamReader(
                    new FileInputStream(msgFile), "ISO-8859-1"));
        } catch (IOException e) {
            System.out.println(e.toString());
        }
    }

    //---------------------------------------------------------------
    public void close() {
        try {
            _reader.close();
        } catch (IOException e) {
            System.out.println(e.toString());
        }
    }

    //-------------------------------------------------------------
    public String getFileContent() {
        String msg = "";
        String line = "";
        try {
            while ((line = _reader.readLine()) != null) {
                msg += line + "\r\n";
            }
        } catch (IOException e) {
            System.out.println(e.toString());
        }
        return msg;
    }

    //-------------------------------------------------------------------
    public static void main(String args[]) {
        MessageHandler msgHandler = new MessageHandler("data.txt");
        System.out.println("Content:" + msgHandler.getFileContent());
        msgHandler.close();
    }

}//End of MessageHandler


But it didn't work.
Please advise.
Thanks for your help.

Mohamud Jama
 
Michael Borgwardt

SHIRE said:
Hi,

I want to decode the content of the text file which looks like this:

Subject: =?iso-8859-1?Q?Sek=2Dm=F6te?=
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: quoted-printable

Vill bara s=E4ga att du hade r=E4tt.

The key here is not ISO-8859-1, it's "quoted-printable".
To decode it, replace every equals sign that is followed by two
hexadecimal digits with the byte whose value those digits give (that
byte is then interpreted in the declared charset, here ISO-8859-1,
since values above 7F are outside ASCII), and whenever an equals sign
ends a line, remove both the equals sign and that line break.
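
The steps described above can be sketched in Java roughly as follows. This is only a minimal illustration, not a complete RFC-compliant decoder: the class name QuotedPrintableSketch and the method decode are made up for this example, the input is assumed to be the already-extracted message body, and malformed sequences such as "=ZZ" are simply passed through unchanged.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
final class QuotedPrintableSketch {

    /** Decodes a quoted-printable body and interprets the bytes in the given charset. */
    static String decode(String encoded, Charset charset) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int i = 0;
        while (i < encoded.length()) {
            char c = encoded.charAt(i);
            if (c != '=') {
                // Literal character; quoted-printable input is expected to be US-ASCII.
                out.write(c);
                i++;
            } else if (i + 2 < encoded.length()
                    && isHex(encoded.charAt(i + 1)) && isHex(encoded.charAt(i + 2))) {
                // "=XX": two hex digits give the value of one byte.
                out.write(Integer.parseInt(encoded.substring(i + 1, i + 3), 16));
                i += 3;
            } else if (encoded.startsWith("\r\n", i + 1)) {
                // Soft line break: drop the '=' and the CRLF.
                i += 3;
            } else if (encoded.startsWith("\n", i + 1)) {
                // Soft line break with a bare LF.
                i += 2;
            } else {
                // Malformed sequence: keep the '=' as it is.
                out.write('=');
                i++;
            }
        }
        return new String(out.toByteArray(), charset);
    }

    private static boolean isHex(char ch) {
        return (ch >= '0' && ch <= '9') || (ch >= 'A' && ch <= 'F') || (ch >= 'a' && ch <= 'f');
    }

    public static void main(String[] args) {
        String body = "Vill bara s=E4ga att du hade r=E4tt.";
        System.out.println(decode(body, StandardCharsets.ISO_8859_1));
    }
}
```

Called with the body line from the question and ISO-8859-1, decode should give back "Vill bara säga att du hade rätt."; whether the ä shows up correctly on the console depends on the console's own encoding.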
 
Oscar Kind

SHIRE said:
Hi,

I want to decode the content of the text file which looks like this:

Subject: =?iso-8859-1?Q?Sek=2Dm=F6te?=
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: quoted-printable

Vill bara s=E4ga att du hade r=E4tt.

I tried to do it like this:
[code that AFAIK correctly uses character encoding]
But it didn't work.

You didn't decode the quoted-printable text (7-bit text; US-ASCII) into
its original (8-bit text; ISO-8859-1).


Oscar
 
Thomas Weidenfeller

SHIRE said:
Subject: How do i decode an ISO-8859-1 encoded text file?

For the record: That mail is NOT ISO-8859-1 encoded. You get ISO-8859-1
if you manage to decode it. ISO-8859-1 is a common 8-bit character set
(aka Latin-1), but in order to get it down to 7 bits for mail transfer,
it needs encoding. In your case, quoted-printable has been used for
encoding. You have to reverse that encoding to get an ISO-8859-1 text back.
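
To make the direction of that encoding concrete, here is a rough Java sketch of what the sending side does to get 8-bit text down to 7 bits. The names QuotedPrintableEncodeSketch and encode are invented for this example, and it is deliberately simplified: it does not wrap long lines with soft line breaks and does not treat trailing whitespace specially, both of which a real encoder has to do.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
final class QuotedPrintableEncodeSketch {

    /** Encodes text as quoted-printable (simplified: no line wrapping). */
    static String encode(String text, Charset charset) {
        StringBuilder sb = new StringBuilder();
        for (byte b : text.getBytes(charset)) {
            int v = b & 0xFF; // treat the byte as unsigned
            boolean literal = (v >= 33 && v <= 126 && v != '=')
                    || v == ' ' || v == '\t' || v == '\r' || v == '\n';
            if (literal) {
                // Printable US-ASCII (except '=') is passed through unchanged.
                sb.append((char) v);
            } else {
                // Everything else becomes '=' plus two uppercase hex digits.
                sb.append('=').append(String.format("%02X", v));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // In ISO-8859-1 the letter ä is the single byte E4.
        System.out.println(encode("Vill bara s\u00E4ga att du hade r\u00E4tt.",
                StandardCharsets.ISO_8859_1));
    }
}
```

With ISO-8859-1 as the charset, the ä in "säga" is the single byte E4, which is why the body in the question contains "s=E4ga".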

I want to decode the content of the text file which looks like this:

Subject: =?iso-8859-1?Q?Sek=2Dm=F6te?=
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: quoted-printable

Vill bara s=E4ga att du hade r=E4tt.

E.g., you either get the RFC which describes quoted-printable encoding
(RFC 2045) and implement your own decoder, or you get the JavaMail
package from Sun and use the decoder in that package - once you have
figured out how to use it. JavaMail will also be able to decode the
subject line (whose encoding is AFAIR defined in yet another RFC,
RFC 2047).
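
If you go the do-it-yourself route for the subject line as well, a minimal sketch using only the standard library might look like the following. The names EncodedWordSketch, decodeHeader and decodeQ are hypothetical, and the sketch makes several simplifying assumptions: it handles the "Q" and "B" forms of the =?charset?encoding?text?= syntax, does not remove whitespace between adjacent encoded words, and lets Charset.forName, Integer.parseInt and the Base64 decoder throw on unknown charsets or malformed input.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.util.Base64;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
final class EncodedWordSketch {

    // Matches =?charset?Q?text?= and =?charset?B?text?=
    private static final Pattern ENCODED_WORD =
            Pattern.compile("=\\?([^?]+)\\?([bBqQ])\\?([^?]*)\\?=");

    /** Replaces every encoded word in a header value with its decoded text. */
    static String decodeHeader(String value) {
        Matcher m = ENCODED_WORD.matcher(value);
        StringBuilder result = new StringBuilder();
        int last = 0;
        while (m.find()) {
            result.append(value, last, m.start()); // plain text before the match
            Charset charset = Charset.forName(m.group(1));
            byte[] bytes = m.group(2).equalsIgnoreCase("Q")
                    ? decodeQ(m.group(3))
                    : Base64.getDecoder().decode(m.group(3));
            result.append(new String(bytes, charset));
            last = m.end();
        }
        result.append(value, last, value.length()); // remaining plain text
        return result.toString();
    }

    /** "Q" form: like quoted-printable, but '_' stands for a space. */
    private static byte[] decodeQ(String s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '_') {
                out.write(' ');
            } else if (c == '=' && i + 2 < s.length()) {
                // Assumes two valid hex digits follow; parseInt throws otherwise.
                out.write(Integer.parseInt(s.substring(i + 1, i + 3), 16));
                i += 2;
            } else {
                out.write(c);
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(decodeHeader("Subject: =?iso-8859-1?Q?Sek=2Dm=F6te?="));
    }
}
```

For the subject line in the question this should yield "Subject: Sek-möte" (=2D is the hyphen, =F6 is ö in ISO-8859-1).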

/Thomas
 
