Non-Western characters in links and JavaScript?

namemattersnot

re,

I have the following problem: links containing Cyrillic characters are
not displayed by the JavaScript.

The server-side PHP script encodes them with the rawurlencode() function, and
everything works in Firefox. In IE, however, the encoded link does not work,
but only when I try to open it from within JavaScript.

Here's the example: http://tmd.df.ru/test

Any ideas? :)
 
VK

What encoding is set in Content-Type by your server?
 
namemattersnot

honestly - i don't know. i've tested it on 3 different servers (one
located in Russia, 1 in the States, and one installed on my own box)
and regardless of the content-type that I specify in the <meta tag>,
the problem still persists.

i don't have a single other problem displaying links encoded in KOI8-R,
CP1251, etc.
 
Gérard Talbot

1-
Re-edit your document so that <title> is inside the <head> part:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="ru" lang="ru">

<head>

<meta http-equiv="Content-Type" content="text/html; charset=windows-1251"></meta>
<meta http-equiv="Content-Language" content="ru"></meta>
<meta http-equiv="Content-Style-Type" content="text/css"></meta>

<title>test</title>

(...)

</head>

<body>

2-
If you are going to serve your document as text/html, then why do you
choose XHTML 1.0 strict? HTML 4.01 strict would do just fine.

3-
HTML 4.01 section B.2.1 Non-ASCII characters in URI attribute values
http://www.w3.org/TR/html401/appendix/notes.html#non-ascii-chars

I would escape the non-ascii characters as suggested by the HTML 4.01
spec: "Escape these bytes with the URI escaping mechanism (i.e., by
converting each byte to %HH, where HH is the hexadecimal notation of the
byte value)."
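
As a minimal sketch of that escaping rule in JavaScript: section B.2.1 says to represent each non-ASCII character as UTF-8 bytes and to write each byte as %HH. The helper name `escapeNonAscii` is made up for illustration, and the sketch assumes UTF-8 rather than the windows-1251 bytes that the PHP script currently emits.

```javascript
// Hypothetical helper: escape only the non-ASCII characters of a URI
// attribute value and leave every ASCII character (including "/", "?", "#")
// untouched.
function escapeNonAscii(value) {
  let out = "";
  for (const ch of value) {            // iterate by code point, not by UTF-16 unit
    if (ch.codePointAt(0) < 0x80) {
      out += ch;                       // ASCII: keep as is
    } else {
      out += encodeURIComponent(ch);   // UTF-8 bytes, each written as %HH
    }
  }
  return out;
}

console.log(escapeNonAscii("русский.gif"));
// → %D1%80%D1%83%D1%81%D1%81%D0%BA%D0%B8%D0%B9.gif
```

Note that encodeURIComponent() throws a URIError for a lone surrogate, so malformed input would need separate handling.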

Gérard
 
VK

This works just fine (presuming pan-European language support is installed):

The string in all three places should be the word "Russian" written in
Russian. Fix it if your newsreader has mangled it. Also check that the declared
encoding corresponds to the *actual encoding used to type the text*.
That is the only encoding issue to worry about here: the encoding
used to type the text and the meta Content-Type must match.


<html>
<head>
<title>Russian</title>
<meta http-equiv="Content-Type"
content="text/html; charset=windows-1251">
<script type="text/javascript" charset="windows-1251">
function test(v) {
alert(v);
alert('Русский');
}
</script>
</head>

<body>

<p>
<a href="foo.html" onclick="test('Русский');return false;">Русский</a>
</p>

</body>
</html>
 
namemattersnot

Gérard, thanks for the answer. this doesn't seem to be a
content-type problem, nor a matter of my compliance with XHTML or HTML
standards :) just in case, i've modified the file as per your suggestion;
alas, to no avail.

non-ascii characters are escaped/converted using PHP's
rawurlencode() function. as you can see, it works without the
javascript in IE but doesn't when i pass the link to my script.

odd.
 
namemattersnot

VK,

your little example works just fine. still.. i don't have an answer to
my original post :)

this has been bugging me a couple of days now.
 
VK

Indicate the right charset in the page head and in <script> (if needed).
If you want to have extra chars in the page text, use the &...; form.
If you want to have extra chars in your script, use the \uFFFF form.

The %FF form is *not* legal in either HTML text or script. Do not
confuse it with a urlencoded GET request. Bring your page into the right form
and be happy ever after :)
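
A minimal sketch of the two forms mentioned above, for anyone who wants to generate them rather than type them by hand. The helper names `toJsEscapes` and `toHtmlReferences` are made up for illustration; they only rewrite non-ASCII characters and do not escape quotes, backslashes, or HTML markup characters.

```javascript
// Hypothetical helper: "\uXXXX" form for use inside a JavaScript string literal.
function toJsEscapes(text) {
  let out = "";
  for (let i = 0; i < text.length; i++) {
    const unit = text.charCodeAt(i);   // UTF-16 code unit
    out += unit < 0x80
      ? text[i]
      : "\\u" + unit.toString(16).toUpperCase().padStart(4, "0");
  }
  return out;
}

// Hypothetical helper: "&#NNNN;" numeric character references for HTML text.
function toHtmlReferences(text) {
  let out = "";
  for (const ch of text) {             // iterate by code point
    const codePoint = ch.codePointAt(0);
    out += codePoint < 0x80 ? ch : "&#" + codePoint + ";";
  }
  return out;
}

console.log(toJsEscapes("Русский"));
// → \u0420\u0443\u0441\u0441\u043A\u0438\u0439
console.log(toHtmlReferences("Русский"));
// → &#1056;&#1091;&#1089;&#1089;&#1082;&#1080;&#1081;
```

Both outputs are pure ASCII, so they survive any mismatch between the declared and the actual file encoding.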
 
namemattersnot

the right charset is already indicated in the head section. russian
characters, therefore, are properly displayed. why wouldn't links work?
how do i encode them then?
 
VK

Again: the %FF format is not valid either in the HTML page or within
<script> literals.

This part:
<a href="test_%20%F0%F3%F1%F1%EA%E8%E9%201.gif">
is not valid.

Make sure that it's plain Russian text in the encoding that matches
the declared Content-Type.
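
For reference, the escaped bytes in that href are the windows-1251 codes of the Russian word in the file name. The sketch below assumes the name really was windows-1251 text before rawurlencode() was applied. `decodeCp1251Escapes` is a made-up helper that covers only ASCII and the basic Cyrillic range (bytes 0xC0 to 0xFF, plus Ё and ё), not the whole code page. It also shows that JavaScript's own decodeURIComponent() expects UTF-8 and rejects this byte sequence, which may matter when a script handles the link.

```javascript
// Hypothetical helper: decode %HH escapes as windows-1251 bytes.
// Only ASCII, 0xC0-0xFF (А..я), 0xA8 (Ё) and 0xB8 (ё) are mapped.
function decodeCp1251Escapes(escaped) {
  return escaped.replace(/%([0-9A-Fa-f]{2})/g, (match, hex) => {
    const byte = parseInt(hex, 16);
    if (byte < 0x80) return String.fromCharCode(byte);                    // ASCII
    if (byte >= 0xC0) return String.fromCharCode(0x0410 + byte - 0xC0);   // А..я
    if (byte === 0xA8) return "\u0401";                                   // Ё
    if (byte === 0xB8) return "\u0451";                                   // ё
    return "\uFFFD";                        // byte not covered by this sketch
  });
}

const href = "test_%20%F0%F3%F1%F1%EA%E8%E9%201.gif";
console.log(decodeCp1251Escapes(href));     // → test_ русский 1.gif

// The built-in decoder assumes UTF-8, and these bytes are not valid UTF-8.
let utf8Error = null;
try {
  decodeURIComponent(href);
} catch (e) {
  utf8Error = e;
}
console.log(utf8Error instanceof URIError); // → true
```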
 
Gérard Talbot

> Gérard, thanks for the answer. this doesn't seem to be a
> content-type problem, nor a matter of my compliance with XHTML or HTML
> standards :) just in case, i've modified the file as per your suggestion;
> alas, to no avail.

Well, this should always be your first step whenever a page has a
problem: validate the markup and the CSS. Either the problem disappears,
or it gets easier to figure out because you have eliminated a potential
source of it.

> non-ascii characters are escaped/converted using PHP's
> rawurlencode() function. as you can see, it works without the
> javascript in IE but doesn't when i pass the link to my script.
>
> odd.

You do not need that rawurlencode function. You can escape the href
value while not escaping the anchor node.

Maybe I misunderstand your problem. I still don't see why you need
javascript in there.
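
A minimal sketch of that idea, escaping only the href value while the visible link text stays readable. The helper names `escapeHtml` and `buildImageLink` are made up for illustration. The sketch assumes the file name is a single path segment and that the server accepts UTF-8 percent-encoded names, which is not confirmed anywhere in this thread.

```javascript
// Hypothetical helper: escape characters that are special in HTML text.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Hypothetical helper: percent-encode the target, keep the link text as typed.
function buildImageLink(fileName) {
  const href = encodeURIComponent(fileName);   // UTF-8 percent-encoding
  return '<a href="' + href + '">' + escapeHtml(fileName) + "</a>";
}

console.log(buildImageLink("test_ русский 1.gif"));
// → <a href="test_%20%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%B8%D0%B9%201.gif">test_ русский 1.gif</a>
```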

Gérard
 
namemattersnot

Javascript makes some neat effect with the picture. take a look in
Firefox.

in any case, i am grateful for all of your comments. thank you! I will
take a look at it in a couple of days -- too annoyed at the moment :)
 
Gérard Talbot

> Javascript makes some neat effect with the picture. take a look in
> Firefox.

What has the visual effect on the image to do with non-ascii characters
in the href value? This is what I do not understand.

> in any case, i am grateful for all of your comments. thank you! I will
> take a look at it in a couple of days -- too annoyed at the moment :)

This is how I would code your link:

<a href="test_%20%F0%F3%F1%F1%EA%E8%E9%201.gif">test_ русский 1.gif</a>

and this is what browsers like Mozilla and Firefox do: they
convert/escape non-ascii characters accordingly, in the way indicated in
section B.2.1. In the above example, you do it for MSIE 6+.

This post was windows-1251 encoded.

Gérard
 
Thomas 'PointedEars' Lahn

namemattersnot wrote:
^^^^^^^^^^^^^^
Well, it matters to me.

> the right charset is already indicated in the head section.
            ^^^^^^^
What matters is if the same _encoding_ is declared with the `charset' label
of the `Content-Type' HTTP header because that takes precedence over any

<meta http-equiv="Content-Type" content="...; charset=...">

in the (X)HTML source code if the resource is served via HTTP.
See RFC1945 [1], 3.6.1, and RFC2616 [2], section 3.4.1. Use services like

> russian characters, therefore, are properly displayed.

Maybe. Maybe not.

> why wouldn't links work? how do i encode them then?

You do not encode (visual hyper)links [3], you encode their target URIs as
described in RFC3986 [4], and, after that, as described in HTML 4.01 [5].
This has nothing to do with J(ava)Script/ECMAScript, though. [6]
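
The precedence rule described above (the charset in the HTTP Content-Type header wins over the one in the meta element) can be sketched roughly as follows. The function names `charsetOf` and `effectiveCharset` are made up for illustration, and the sketch is a simplification: it ignores byte order marks, browser auto-detection, and user overrides.

```javascript
// Hypothetical helper: extract the charset parameter from a Content-Type
// value such as "text/html; charset=windows-1251". Returns null if absent.
function charsetOf(contentType) {
  if (!contentType) return null;
  const match = /;\s*charset\s*=\s*"?([^\s;"]+)"?/i.exec(contentType);
  return match ? match[1].toLowerCase() : null;
}

// The HTTP header value is consulted first; the <meta> value is a fallback.
function effectiveCharset(httpHeaderValue, metaContentValue) {
  return charsetOf(httpHeaderValue) || charsetOf(metaContentValue);
}

console.log(effectiveCharset("text/html; charset=KOI8-R",
                             "text/html; charset=windows-1251"));
// → koi8-r
console.log(effectiveCharset("text/html",
                             "text/html; charset=windows-1251"));
// → windows-1251
```

So a page that declares windows-1251 in its meta element can still be treated as KOI8-R if the server says so in the header.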


PointedEars
___________
[1] <URL:http://www.rfc-editor.org/rfc/rfc1945.txt>
[2] <URL:http://www.rfc-editor.org/rfc/rfc2616.txt>
[3] <URL:http://www.w3.org/TR/html4/struct/links.html>
[4] <URL:http://www.rfc-editor.org/rfc/rfc3986.txt>
[5] <URL:http://www.w3.org/TR/html4/struct/links.html#h-12.2>
<URL:http://www.w3.org/TR/html4/types.html#h-6.4>
<URL:http://www.w3.org/TR/html4/appendix/notes.html#non-ascii-chars>
[6] <URL:http://validator.w3.org/>
 
