Printing to XLS with Unicode-encoded characters

sam

Hi,

I want to write the output of a Perl script to an Excel file. The
output contains Chinese characters. What can I use to handle this
in Perl?

Thanks
Sam
 
peter pilsl

sam said:
Hi,

I want to convert the result from the perl script to excel file. The
result contains chinese characters. What can I use to do handle this
function in Perl?

What is the exact problem?

I use a script here that reads Unicode from a database and prints out
CSV data on a UTF-8 terminal. Then I import this CSV into OpenOffice
(because I run Linux), specify UTF-8 as the encoding, and all the
characters come out just fine.
Then I saved it as xls and mailed it to the translators, and everything was
fine. (I didn't have Chinese, but I had a lot of German umlauts, Spanish,
slovensky, Hungarian, French and some eastern languages I don't recognize :)

The code of my script (which uses a private module and accesses parts of its
internal data structure, but you get the idea!) is below.

I think the locale part is not important, because no collating takes place
in this script, but it's my standard header when dealing with Unicode.
Maybe the utf8::encode is what you are missing in your script?

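----------------------------
#!/usr/bin/perl -w

use strict;
use goldfisch::tt2;          # private, project-specific module (not on CPAN)
use POSIX qw(locale_h);
use locale;
# Standard Unicode header; no collation happens here, so this is optional.
setlocale(LC_COLLATE, "de_AT.UTF-8");
$| = 1;                      # unbuffered output

my $db   = $ARGV[0];         # database name
my $lang = $ARGV[1];         # language code (unused below; 'de' is hard-coded)

my $var = goldfisch::tt2->new();

# Print each key and its German text as one tab-separated line.
my $pdata = $var->{g_ptr}->{db}->{$db}->{pdata}->{de};
foreach my $key (keys %$pdata) {
    my $x = $pdata->{$key};
    utf8::encode($x);        # character string -> UTF-8 bytes, in place
    print $key, "\t", $x, "\n";
}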
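
To make the utf8::encode step concrete, here is a minimal sketch of what it does to a string. The sample text and variable names are made up for illustration and are not from the script above; the binmode line shows a common alternative, not something this thread prescribes.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Two Chinese characters as a Perl character string (illustrative sample).
my $chars = "\x{4E2D}\x{6587}";

# utf8::encode works in place, so encode a copy and keep the original.
my $bytes = $chars;
utf8::encode($bytes);        # now six UTF-8 octets instead of two characters

printf "%d characters, %d bytes\n", length($chars), length($bytes);

# Alternative: leave the string alone and let an output layer encode it.
binmode(STDOUT, ':encoding(UTF-8)');
print $chars, "\n";
```

Either approach puts UTF-8 bytes on the output; the difference is only whether the encoding happens per string or once on the filehandle.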
 
sam

peter said:
What is the exact problem?

I use Spreadsheet::WriteExcel to save the result of the print statements to an
external file, which is then opened in MS Excel. But the characters are not
readable in Chinese MS Excel; they all show up as garbage.
But if I highlight the Chinese text on the webpage with the mouse and paste it
into Excel, it is displayed as Chinese.

peter said:
Maybe the utf8::encode is what you are missing in your script?

Thanks for the code. May I ask what goldfisch::tt2 is?

Thanks
Sam
 
Alan J. Flavell

sam said:
I use Spreadsheet::WriteExcel save the result from print statement
to an external file, then opened by MS Excel.

You're doing this from a Windows implementation of Perl?
Are you using the -C option on the command line (or the equivalent
environment setting) to get wide system calls?

sam said:
But the characters are not readable by chinese MS Excel. They are
all printed as garbage in Excel.

I understand what you're saying, although some details (a hex dump of
the bytes, with some mention of what they were supposed to be) would
be more useful for a diagnosis than "garbage".

sam said:
But if I use mouse highlight the chinese from the webpage

"the webpage"? What is this "webpage", and where can we see it,
please?
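
On the hex dump requested earlier in this post: a minimal sketch of one way to produce it in Perl. The helper name hex_dump and the sample bytes are hypothetical, chosen only for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the bytes of a string as space-separated two-digit hex values.
# Intended for byte strings; characters above 0xFF would print as wider values.
sub hex_dump {
    my ($bytes) = @_;
    return join ' ', map { sprintf '%02X', ord($_) } split //, $bytes;
}

# Sample: six bytes, which happen to be the UTF-8 encoding of U+4E2D U+6587.
my $sample = "\xE4\xB8\xAD\xE6\x96\x87";
print hex_dump($sample), "\n";    # prints: E4 B8 AD E6 96 87
```

Posting a dump like this next to the characters you expected makes it much easier to tell which encoding the data is really in.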
 
jmcnamara

Alan J. Flavell

jmcnamara said:
The more recent versions of Spreadsheet::WriteExcel support utf8
transparently with Perl 5.8. For older versions of perl you can use
UTF-16.

See this example of Chinese in Big5 format from the standard
Spreadsheet::WriteExcel distro:

http://search.cpan.org/src/JMCNAMARA/Spreadsheet-WriteExcel-2.11/examples/unicode_big5.pl

Thanks for posting this. Just for my own edification - if this is run
on Windows, *does* it need the -C wide system calls option, or was
that a red-herring?
 
sam

jmcnamara said:
The more recent versions of Spreadsheet::WriteExcel support utf8
transparently with Perl 5.8. For older versions of perl you can use
UTF-16.

See this example of Chinese in Big5 format from the standard
Spreadsheet::WriteExcel distro:

http://search.cpan.org/src/JMCNAMARA/Spreadsheet-WriteExcel-2.11/examples/unicode_big5.pl

http://search.cpan.org/src/JMCNAMARA/Spreadsheet-WriteExcel-2.11/examples/unicode_big5.txt

There are also several other encoding examples in the same distro.

John.

Hi John, thank you very much for this info. This bit of code is
apparently what is missing from my Perl code.

Thanks
Sam
 
jmcnamara

Alan J. Flavell said:
if this is run on Windows, *does* it need the -C wide system calls
option, or was that a red-herring?

The program will work on Windows without -C. So I guess that it is a
red herring (at least on more recent perls).

From perlrun:

    In Perls earlier than 5.8.1 the -C switch was a Win32-only
    switch that enabled the use of Unicode-aware "wide system
    call" Win32 APIs. This feature was practically unused,
    however, and the command line switch was therefore "recycled".

http://perldoc.perldrunks.org/perlrun.html#Command-Switches

John.
--
 
peter pilsl

sam said:
Thanks for the code. May I ask what is goldfisch::tt2?

It's a template engine based on TemplateToolkit2 that runs under mod_perl
and provides a lot of project-specific data (which is stored in an
SQL database, read when the module is loaded by Apache on startup, and
then shared between all the threads). Some of this data needs to be
exported to Excel for translation, so I use goldfisch::tt2 for
reading from the database, because it's very convenient.

The module is very project-specific and strictly private :)
You can take a sneak preview of the whole project at

http://tt2.adulteducation.at/6m/
or the almost-ready English version at
http://tt2.adulteducation.at/6m/en

Comments welcome, and completely OT here ;)

best,
peter
 
Alan J. Flavell

jmcnamara said:
The program will work on Windows without -C. So I guess that it is a
red-herring (at least on more recent perls):

Thanks - then I offer my apologies to the questioner for giving
a misleading answer.

jmcnamara said:
From perlrun:
In Perls earlier than 5.8.1 the -C switch was a Win32-only
switch that enabled the use of Unicode-aware "wide system
call" Win32 APIs. This feature was practically unused,
however, and the command line switch was therefore "recycled".

Oh yes, so it does. But that prompts the inevitable question as to
how Win32 now controls the choice of Win32 APIs ...?
 
sam

sam said:
Hi John, thank you very much for this info, this bit of coding is
aparently missing from my perl code.

Looking through the code, I found that it only converts an external
file to an Excel file.
How can I take the output of the print statements in my Perl script and
write it to an Excel file? The output consists of Chinese characters.

Thanks
Sam
 
jmcnamara

The example relates to reading encoded input from a file, but that
doesn't mean that it is the only way to write encoded text to a
Spreadsheet::WriteExcel file.

If you have a string of characters in some known encoding then you
should convert it to UTF-16BE and use the write_unicode() method or,
better still, to UTF-8 and use the write() method.

You can use the standard (with perl 5.8) Encode module to convert
between almost any encoding.
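
A minimal sketch of the conversion described above, not a definitive recipe. It assumes the script's output is Big5-encoded bytes (the thread never states the actual encoding); the sample bytes, the variable names and the output file name chinese.xls are made up for illustration. The Spreadsheet::WriteExcel part runs only if that module is installed.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode encode);

# ASSUMPTION: the data arrives as Big5 bytes. These four bytes are the
# Big5 encoding of U+4E2D U+6587; they stand in for your script's output.
my $big5_bytes = "\xA4\xA4\xA4\xE5";

# Decode the bytes into a Perl character string (two characters).
my $chars = decode('big5', $big5_bytes);

# For older setups: the same text as UTF-16BE bytes, the form that
# write_unicode() expects.
my $utf16be = encode('UTF-16BE', $chars);

# With Perl 5.8 and a recent Spreadsheet::WriteExcel, pass the decoded
# string straight to write(). Skipped if the module is not installed.
if (eval { require Spreadsheet::WriteExcel; 1 }) {
    my $workbook = Spreadsheet::WriteExcel->new('chinese.xls')
        or die "Cannot create chinese.xls: $!";
    my $worksheet = $workbook->add_worksheet();
    $worksheet->write(0, 0, $chars);    # row 0, column 0 (cell A1)
    $workbook->close();
}
```

If the text comes from a database or a web form rather than a literal, the decode step stays the same; only the encoding name passed to decode() changes to match the real source encoding.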

If you still have a problem then post a small example program to the
Spreadsheet::WriteExcel Google Group:

http://groups-beta.google.com/group/spreadsheet-writeexcel

John.
--
 
