Problem converting euro from windows-1252 to UTF-8 !!

nevosa

Hi Folks,

I am trying to convert RFC 2047 encoded MIME headers to UTF-8. It is
working fine so far using the MIME::WordDecoder and Unicode::MapUTF8
CPAN packages. I have done some unit testing and it seems to work fine,
except when I try to convert the Windows-1252 encoded euro symbol (0x80)
to UTF-8.

Subject: This sub has non-ascii chars.Pound:
?windows-1252?Q?Euro_=80_?=

In this case the conversion simply fails, and what I see as output is
two spaces (0x20 0x20).

Has anyone encountered this before? Any immediate help is appreciated.

Rgds.
Naveen
 
Bart Van der Donck

nevosa said:
I am trying to convert RFC 2047 encoded MIME headers to UTF-8. It is
working fine so far using the MIME::WordDecoder and Unicode::MapUTF8
CPAN packages. I have done some unit testing and it seems to work fine,
except when I try to convert the Windows-1252 encoded euro symbol (0x80)
to UTF-8.

Subject: This sub has non-ascii chars.Pound:
?windows-1252?Q?Euro_=80_?=

In this case the conversion simply fails, and what I see as output is
two spaces (0x20 0x20).

Windows-1252 differs from "default" ISO-8859-1 by using displayable
characters rather than control characters in the 0x80-0x9F range. If
you're running *nix, Windows-1252 might not be available.
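
To make the difference concrete, here is a minimal sketch that decodes the same byte 0x80 under both charsets. It assumes Perl 5.8 or later with the core Encode module, which is not what the original poster is running; the variable names are only illustrative.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

my $byte = "\x80";

# Same input byte, two different code points depending on the charset.
my $as_cp1252 = decode('cp1252',     $byte);
my $as_latin1 = decode('iso-8859-1', $byte);

printf "cp1252:     U+%04X\n", ord $as_cp1252;   # U+20AC (euro sign)
printf "iso-8859-1: U+%04X\n", ord $as_latin1;   # U+0080 (C1 control)

# UTF-8 byte sequence of the Windows-1252 interpretation, as hex.
my $utf8_hex = join ' ',
    map { sprintf '%02X', ord } split //, encode('UTF-8', $as_cp1252);
print "UTF-8 bytes: $utf8_hex\n";                # E2 82 AC
```

If the first line prints U+20AC, the Windows-1252 table is available on that system.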

To make sure, see

http://en.wikipedia.org/wiki/Windows-1252

And then try to display the characters in the yellow squares. If
they're not correctly converted, you've found the culprit.

Hope this helps,
 
nevosa

Thanks, Bart, for taking the time to look into my problem.
But my problem is not the display part. I don't mind it not being
displayed properly on my PuTTY screen. What I want is the correct UTF-8
encoding for the euro (the Unicode::MapUTF8 package claims to be capable
of converting from any charset to UTF-8 and vice versa). But it fails to
produce the correct UTF-8 value for the euro. The euro in UTF-8 should
have a hex value of 0xE2-0x82-0xAC, but what I see as output is
0x20-0x20, which is wrong. I also tried other chars in the range
0x80-0x9F, and all the rest seem to get converted to their correct UTF-8
values. The problem is only with the euro (the most frequently used of
them all !!).
To me it seems that there is a missing mapping for the euro in the map
files used by the Perl Unicode::MapUTF8 package. What can I do if that
is the case?
I can't edit the .map files either, as they are binary.
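
One possible workaround that avoids the map files altogether is a small hand-written table for the bytes where Windows-1252 differs from ISO-8859-1. The sketch below is only an illustration, not part of Unicode::MapUTF8: the names %CP1252, utf8_bytes and cp1252_to_utf8 are made up here, and it works purely on byte strings, so it should not depend on Encode.

```perl
use strict;
use warnings;

# Windows-1252 bytes 0x80-0x9F mapped to Unicode code points.
# Bytes with no Windows-1252 assignment (0x81, 0x8D, 0x8F, 0x90, 0x9D)
# are left out and fall through unchanged, as in ISO-8859-1.
my %CP1252 = (
    0x80 => 0x20AC, 0x82 => 0x201A, 0x83 => 0x0192, 0x84 => 0x201E,
    0x85 => 0x2026, 0x86 => 0x2020, 0x87 => 0x2021, 0x88 => 0x02C6,
    0x89 => 0x2030, 0x8A => 0x0160, 0x8B => 0x2039, 0x8C => 0x0152,
    0x8E => 0x017D, 0x91 => 0x2018, 0x92 => 0x2019, 0x93 => 0x201C,
    0x94 => 0x201D, 0x95 => 0x2022, 0x96 => 0x2013, 0x97 => 0x2014,
    0x98 => 0x02DC, 0x99 => 0x2122, 0x9A => 0x0161, 0x9B => 0x203A,
    0x9C => 0x0153, 0x9E => 0x017E, 0x9F => 0x0178,
);

# Encode one code point (up to U+FFFF) as a string of UTF-8 bytes.
sub utf8_bytes {
    my ($cp) = @_;
    return chr($cp) if $cp < 0x80;
    return chr(0xC0 | ($cp >> 6)) . chr(0x80 | ($cp & 0x3F)) if $cp < 0x800;
    return chr(0xE0 | ($cp >> 12))
         . chr(0x80 | (($cp >> 6) & 0x3F))
         . chr(0x80 | ($cp & 0x3F));
}

# Convert a Windows-1252 byte string to a UTF-8 byte string.
sub cp1252_to_utf8 {
    my ($bytes) = @_;
    return join '', map {
        my $byte = ord;
        utf8_bytes(exists $CP1252{$byte} ? $CP1252{$byte} : $byte);
    } split //, $bytes;
}

# Hex dump of the converted euro byte.
print join(' ', map { sprintf '%02X', ord } split //, cp1252_to_utf8("\x80")), "\n";
# E2 82 AC
```

The result is a plain byte string, so it can be written to a file or socket as is.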

Rgds,
Naveen
 
Dr.Ruud

nevosa schreef:

Don't top-post. Get rid of the double exclamation marks. Better, get rid
of any exclamation marks. Phrases like "immediate help" are
counterproductive too.

A googlegroups search on "utf-8 mime" in this group, sorted by date,
would have brought you to this message:
http://groups.google.co.uk/[email protected]
(click on View thread)

it fails to convert to correct UTF-8 value for euro. Euro in UTF-8
should have a hex value of 0xE2-0x82-0xAC but what I see as output
for euro is 0x20-0x20 which is wrong. I also tried for other chars in
range 0x80-0x9F and rest all seem to get converted to their correct
UTF-8 values. Problem is only with euro (the most frequently used of
them all !!).
To me it seems that there is a missing mapping for euro in the map
files being used by the Perl Unicode::MapUTF8 package. What can I do if
that is the case?

The two spaces come from the two underscores.
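
In the RFC 2047 "Q" encoding, an underscore stands for a space and "=XX" stands for one raw byte. A minimal sketch of just that step, with a made-up helper name q_decode and no charset conversion at all:

```perl
use strict;
use warnings;

# Decode the payload of a Q-encoded word (the text between "?Q?" and "?=").
sub q_decode {
    my ($payload) = @_;
    $payload =~ tr/_/ /;                               # "_" stands for a space
    $payload =~ s/=([0-9A-Fa-f]{2})/chr(hex($1))/ge;   # "=XX" is one raw byte
    return $payload;
}

my $raw = q_decode('Euro_=80_');
print join(' ', map { sprintf '%02X', ord } split //, $raw), "\n";
# 45 75 72 6F 20 80 20
```

The byte 0x80 sits between two 0x20 bytes, so if the charset conversion drops it, two spaces are all that is left after "Euro".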

In your OP, there was an equal sign missing before the first question
mark.

perl -MEncode -le '
print decode("MIME-Header", "Subject: {=?windows-1252?Q?_Euro_=80_?=}")
'
 
nevosa

Thanks.
I agree that there was a missing equal sign; it just got lost in
copy and paste. I forgot to mention in my last post that I am using
Perl 5.6. I guess if I upgrade to Perl 5.8, I can use Encode and decode
the way you suggested. But do you know why the statement below does not
work in Perl 5.6? I would prefer to stay on Perl 5.6.

to_utf8({ -string => "€", -charset => "windows-1252"});

Rgds,
Naveen
 
David Squire

nevosa said:
Thanks.
I agree that there was a missing equal sign - that just got missed out

[snip]

As Dr. Ruud said below, please don't top-post. Read the posting
guidelines for this group, which are posted here several times a week.

DS


[snip]
 
