Print Cyrillic (UTF-8) characters with PERL on an HTML page

andipfaff

Hi there,

Beginner's question: I have written a small website with a PERL script
which can be viewed in 4 Western European languages. Each text to be
displayed is stored in arrays like
$text1[1] = qq(Hallo);
$text1[2] = qq(Salut);
$text1[3] = qq(Saluti);
$text1[4] = qq(Hello);
and printed to STDOUT. Now I want to extend this with Cyrillic
characters (5th language: Russian). Unfortunately I have no idea how to
do that. I simply tried to cut and paste Cyrillic letters from websites
like Wikipedia's Cyrillic alphabet page, but in my editor (DZSoft) I just
get a question mark when pasting. In Notepad pasting works, because it
supports UTF-8. DZSoft does not seem to.

What are the minimum requirements to print a Cyrillic character with
PERL into a website? I am working with Windows 2k (English or German),
IIS 5 on a W2k Server, ActiveState PERL, and a MySQL database.

The PERL script is using CGI, but HTML code is written directly:
print qq(Content-type: text/html\n\n
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html>
<head> etc.

Can anybody give me a simple script which I can test with Notepad
as an editor?

Thanks in advance
Andi Pfaff
 
Peter J. Holzer

> Beginner's question: I have written a small website with a PERL script

The language is called "Perl", not "PERL".

> which can be viewed in 4 Western European languages. Each text to be
> displayed is stored in arrays like
> $text1[1] = qq(Hallo);
> $text1[2] = qq(Salut);
> $text1[3] = qq(Saluti);
> $text1[4] = qq(Hello);
> and printed to STDOUT. Now I want to extend this with Cyrillic
> characters (5th language: Russian). Unfortunately I have no idea how to
> do that. I simply tried to cut and paste Cyrillic letters from websites
> like Wikipedia's Cyrillic alphabet page, but in my editor (DZSoft) I just
> get a question mark when pasting. In Notepad pasting works, because it
> supports UTF-8.

So what's your problem? (Well, being forced to use Notepad may count as
a problem, but it should be good enough for a short test script.)

> What are the minimum requirements to print a Cyrillic character with
> PERL into a website?

I don't think I understand the question.

> The PERL script is using CGI, but HTML code is written directly:
> print qq(Content-type: text/html\n\n

You need to declare the charset in the content-type:

print qq(Content-type: text/html; charset=utf-8\n\n

or you need to convert all non-Latin-1 characters to entities.

> Can anybody give me a simple script which I can test with Notepad
> as an editor?


#!/usr/bin/perl
use utf8;
use warnings;
use strict;

my $txt = "экранопла́н";

binmode STDOUT, ":utf8";
print "Content-Type: text/html; charset=utf-8\n";
print "\n";
print "<p>$txt</p>\n";
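
The entity alternative mentioned above can be sketched roughly like this. It is only an illustration: the helper name to_entities, the sample word, and the iso-8859-1 charset in the header are assumptions made for the example, not something taken from the original script.

```perl
#!/usr/bin/perl
use utf8;       # the source file itself is assumed to be saved as UTF-8
use warnings;
use strict;

# Hypothetical helper: replace every character outside Latin-1
# (code point above U+00FF) with a decimal numeric character reference.
sub to_entities {
    my ($text) = @_;
    $text =~ s/([^\x{00}-\x{FF}])/sprintf('&#%d;', ord $1)/ge;
    return $text;
}

# After the conversion the Cyrillic word is plain ASCII, so the page
# can be served without declaring UTF-8.
print "Content-Type: text/html; charset=iso-8859-1\n";
print "\n";
print "<p>", to_entities("Привет"), "</p>\n";
```

The browser turns the numeric references back into Cyrillic letters, so this route does not depend on the output encoding, at the cost of less readable HTML source.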

hp
 
andipfaff

[Translated from German]
One really ought to refrain from commenting on such an arrogant reply!
- Why do you ask what the problem is when you haven't even read to the
end?
- If you don't understand the question about the minimum, why do you
answer it anyway at the end of your comment?
- Why should Notepad be a problem? After all, it is free and handles
Unicode. None of the other three editors I tried can do that.
- I couldn't care less whether you write Perl or PERL; evidently you
understood what this is about despite the wrong spelling, didn't you?
- I don't need "binmode STDOUT, ":utf8";"; it works without it too.
 
anno4000

Please don't top-post. Put your comments after the quoted text you're
replying to. Also, this isn't private mail, it's Usenet. Even if
your partner seems to have a German address your reply should be in
the preferred language of the group, which is English.
> One really ought to refrain from commenting on such an arrogant reply!

That would have been a good idea.

> - Why do you ask what the problem is when you haven't even read to the
> end?

Your problem was poorly stated.

> - If you don't understand the question about the minimum, why do you
> answer it anyway at the end of your comment?

Just as poorly stated. "Printing into a website" can be done
in many different ways. Which do you intend to use?

[...]

> - I couldn't care less whether you write Perl or PERL; evidently you
> understood what this is about despite the wrong spelling, didn't you?

If you don't know how to spell the language you're using, don't expect to be
taken very seriously. Just good advice for the future. For the moment
I believe you have forfeited your chances for a useful reply.

Anno
 
Jürgen Exner

andipfaff wrote:
[Unfortunately you have chosen a quoting style that makes it very difficult
to identify what you are referring to; trying to mitigate]

> One really ought to refrain from commenting on such an arrogant reply!

Considering how poorly the original question was stated, IMHO it was a pretty
moderate response.

> - Why do you ask what the problem is when you haven't even read to the
> end?

I can only guess that you are referring to:

> So what's your problem?

Please note that just before that the OP stated:

> In Notepad pasting is OK, because he is capable of UTF-8.

So obviously there is no problem here except for him being forced to use
Notepad, or is there?
Besides: it is beyond me how the editor used could have any impact on how to
print something from a program.

> - If you don't understand the question about the minimum, why do you
> answer it anyway at the end of your comment?

I can only guess that you are referring to the OP's question

> What are the minimum requirements to print a cyrillic character with
> PERL into a website?

and the program sample that was provided by Peter.
Well, Peter didn't answer the OP's question at all. The correct answer to the
OP's question would have been:
Your question is nonsensical because the Perl print() command always writes
to a file handle. It cannot print to a web site (whatever that is supposed
to mean!). If you are asking about which minimal format/content a web page
must have to display Cyrillic characters then please ask in a newsgroup that
deals with HTML.

Peter was polishing his crystal ball and reading tea leaves to guess what
the OP may have been talking about and kindly provided that HTML
format/content although it has nothing to do with Perl. Whether it solves
the OP's problem is unclear because the OP really didn't say if the
content/format of the HTML page was his problem in the first place.

> - Why should Notepad be a problem? After all, it is free and handles
> Unicode. None of the other three editors I tried can do that.

Your choice of editor has no bearing on the execution of a Perl program or
its output. Therefore even mentioning them in the original posting was a red
herring.

> - I couldn't care less whether you write Perl or PERL; evidently you
> understood what this is about despite the wrong spelling, didn't you?

Well, then I guess you don't care about "Only perl can parse Perl" (please
note the capitalization!). Your problem, because it does have an important
meaning.

> - I don't need "binmode STDOUT, ":utf8";"; it works without it too.
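
Whether the binmode line is needed depends on whether "use utf8" is in effect. Without the pragma, a literal in a UTF-8 encoded source file is just the raw bytes, and print passes them through unchanged. With it, the literal is a string of characters that has to be encoded on output. A rough sketch of the difference, using the core Encode module; the variable names and the sample letter are only for illustration:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode encode);

# The two UTF-8 bytes for the Cyrillic letter Zhe (U+0416): what a string
# literal holds when the file is UTF-8 but "use utf8" is NOT in effect.
my $bytes = "\xD0\x96";

# The same text as a single character: what the literal holds when
# "use utf8" IS in effect.
my $chars = decode('UTF-8', $bytes);

printf "without use utf8: length %d\n", length $bytes;
printf "with use utf8:    length %d\n", length $chars;

# An output layer such as binmode STDOUT, ":utf8" performs this encoding
# step, turning the character back into the original bytes.
print "round trip ok\n" if encode('UTF-8', $chars) eq $bytes;
```

So a script without "use utf8" can appear to work without binmode, but functions such as length and regular expressions then operate on bytes rather than characters.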

jue
 
peter_emmerich

Hi Andi,

assuming you are working with a script running on a web server (you
mentioned print qq(Content-type: text/html;.....)), here is the minimum
Perl code you need to print Unicode/UTF-8 from a script:

#!/usr/bin/perl -w
use CGI;
$my_string = qq(ä ö ü à é è Ä Ö Ü Ж ж З з И и Й й);
print qq(Content-type: text/html; charset=utf-8\n\n
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<META HTTP-EQUIV="CONTENT-TYPE" content="text/html; charset=utf-8">
<title>UTF-8 Testpage</title>
</head>
<body>
$my_string
</body>
</html>
);

Save the file in UTF-8 format with Windows Notepad, Notepad++, or another
Unicode-capable editor; otherwise it will not work!
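
Applied to the per-language arrays from the original question, a minimal sketch could look like the following. The Russian greeting at index 5 and the $lang variable are assumptions for the example; in a real script the language index would presumably come from a CGI parameter. This variant uses "use utf8" together with an explicit output layer rather than passing raw bytes through.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use utf8;   # this source file is assumed to be saved as UTF-8

# Indices 1..4 as in the original question; index 5 (Russian) is new.
my @text1;
$text1[1] = qq(Hallo);
$text1[2] = qq(Salut);
$text1[3] = qq(Saluti);
$text1[4] = qq(Hello);
$text1[5] = qq(Привет);

# Hypothetical language selection, hard-coded here for the test page.
my $lang = 5;

# Encode characters as UTF-8 on output and declare that in the header.
binmode STDOUT, ':encoding(UTF-8)';
print "Content-Type: text/html; charset=utf-8\n\n";
print "<p>$text1[$lang]</p>\n";
```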

Best regards
PE
 
andipfaff

Thanks,

that's exactly the answer I was waiting for!

Have a nice weekend
Andi

 