URL encoding of non-ASCII characters

Hugo

Hi,

How should I encode non-ASCII characters such as German umlauts in a URL? I
know how to encode the usual problematic characters like space and &, but
what do I do with non-ASCII characters?

Thanks.
 
Jukka K. Korpela

Hugo wrote:
how do I have to encode non-ASCII characters like German Umlaute? I
know how to encode "normal problematic" characters like space and &.
But what do I have to do with these non-ASCII characters?

Some browsers may support a URL encoding based on ISO-8859-1 or some other
assumed default. In that scheme you represent an umlaut letter as a single
octet (byte) in ISO-8859-1 and then encode that octet as %xx, where xx is its
value in hexadecimal. For example, ü is octet FC in ISO-8859-1, giving %FC.
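
To illustrate the legacy scheme, here is a minimal Python sketch using the standard library's urllib.parse.quote with encoding="latin-1" (Python's name for ISO-8859-1). The helper name percent_encode_latin1 is made up for this example, and whether a given server or browser accepts this form is not guaranteed.

```python
from urllib.parse import quote, unquote

def percent_encode_latin1(text: str) -> str:
    # Hypothetical helper: each character becomes one ISO-8859-1 octet,
    # and every octet outside the unreserved set is written as %XX.
    # safe="" means "/" is escaped too, so use this on a single component.
    return quote(text, safe="", encoding="latin-1")

encoded = percent_encode_latin1("Müller")
print(encoded)                               # M%FCller
print(unquote(encoded, encoding="latin-1"))  # Müller
```

Characters that do not exist in ISO-8859-1 make this call raise UnicodeEncodeError, which is one practical reason the scheme is limited.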

However, the modern and official method is based on UTF-8. You first
represent an umlaut letter as two octets in UTF-8, then encode each octet as
%xx. For example, ü is the octets C3 BC in UTF-8, giving %C3%BC.
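
As a rough sketch of the UTF-8 method in Python: urllib.parse.quote uses UTF-8 by default, so it performs both steps (UTF-8 octets, then %xx for each). The helper name percent_encode_utf8 and the sample strings are only for illustration.

```python
from urllib.parse import quote, unquote

def percent_encode_utf8(text: str) -> str:
    # Hypothetical helper: quote() encodes the string as UTF-8 by default
    # and writes every octet outside the unreserved set as %XX.
    # safe="" means "/" is escaped too, so use this on a single component.
    return quote(text, safe="")

# Each umlaut letter (and ß) is two octets in UTF-8, hence two %XX groups.
for ch in "äöüß":
    print(ch, percent_encode_utf8(ch))

# Space and & are escaped in the same pass.
encoded = percent_encode_utf8("Grüße & Küsse")
print(encoded)
print(unquote(encoded))  # decodes back to the original string
```

The same call handles characters outside ISO-8859-1 as well, since UTF-8 can represent any Unicode character.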

References:
http://www.apps.ietf.org/rfc/rfc3986.html#sec-2.5
http://www.w3.org/International/O-URL-and-ident.html
 
