How to read Chinese filenames?

ckyang74

I have many files named with Chinese characters. When I read in
filenames, the Chinese characters become question marks. Here is the
code:

opendir(my $dir, ".") or die "Cannot open directory: $!";
foreach my $name (readdir($dir)) {
    print $name, "\n";    # Chinese characters come out as "?"
}
closedir($dir);

I use Windows XP. The Regional and Language Options setting for
non-Unicode programs is English. Changing it to Chinese probably won't
solve the problem, because I also have filenames with Japanese characters.

Also, I remember filenames in Windows XP are encoded in utf8, is that
correct?

Clint
 
Ben Bullock

Also, I remember filenames in Windows XP are encoded in utf8, is that
correct?

In that case, could you use

binmode(DIR, ":utf8");

? You probably also need to say

binmode(STDOUT, ":utf8");

to get the names to print correctly.
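One caveat on the suggestion above: binmode sets I/O layers on file handles, and readdir does not go through those layers, so the names come back as raw bytes and have to be decoded explicitly. Here is a minimal sketch of that, assuming the bytes really are UTF-8 (the thread has not established that for Windows XP). The helper name decoded_names is made up for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: read a directory and decode each raw byte name.
# ASSUMPTION: the names are encoded in $encoding; that is unverified on XP.
sub decoded_names {
    my ($dir, $encoding) = @_;
    opendir(my $dh, $dir) or die "Cannot open $dir: $!";
    my @names = map { decode($encoding, $_) } readdir($dh);
    closedir($dh);
    return @names;
}

# The output side is a real file handle, so a layer does apply here.
binmode(STDOUT, ':encoding(UTF-8)');
print "$_\n" for decoded_names('.', 'UTF-8');
```

If the names have already been turned into "?" by the time readdir returns them, no decoding step can recover the original characters.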
 
Alan J. Flavell

I have many files named with Chinese characters. When I read in
filenames, the Chinese characters become question marks.

As I understand it: you need to use Win32 wide system calls.

In an earlier version of Perl, this was implemented using the -C flag.
Then the developers changed their minds and took it out again, saying
they would reintroduce it in a different way. But I don't know
whether they did, nor how. Hope those clues are useful somehow. The
phrase "wide system calls" may be a useful term to search for.

Also, I remember filenames in Windows XP are encoded in utf8, is
that correct?

Internally, Windows uses utf-16 (at least, plane 0 of it), but it
ought not to be necessary for the Perl programmer to be concerned with
that. The Perl implementation (at least, the version of Perl which
implemented wide system calls, back when I needed it) automatically
converted characters from Perl's internal representation (which
happens to be based on utf-8, but the Perl programmer should not
normally be concerned with that) to the appropriate encoding for the
Win32 wide system calls.
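For anyone who ends up doing that conversion by hand, here is a rough sketch using the core Encode module. It assumes the wide functions expect NUL-terminated UTF-16LE strings; the helper names to_wide and from_wide are invented for this example and are not part of any module.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Perl character string -> NUL-terminated UTF-16LE bytes,
# the form a Win32 "W" function is assumed to expect.
sub to_wide {
    my ($text) = @_;
    return encode('UTF-16LE', $text . "\0");
}

# UTF-16LE bytes handed back by a "W" function -> Perl character string,
# cut at the first NUL code unit.
sub from_wide {
    my ($bytes) = @_;
    my @units;
    for my $unit (unpack('v*', $bytes)) {
        last if $unit == 0;
        push @units, $unit;
    }
    return decode('UTF-16LE', pack('v*', @units));
}

my $wide = to_wide("\x{4E2D}\x{6587}.txt");    # "中文.txt"
printf "%d bytes in the wide string\n", length $wide;
```

The point is only that the program deals in characters and the byte layout is confined to these two helpers.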

As I've made clear, I haven't kept up with how current Perl versions
are handling this. If no-one else steps in with better details, I
hope you'll report back on your findings, to help anyone else who is
trying to deal with this.
 
Ben Morrow

Quoth "Alan J. Flavell":
As I understand it: you need to use Win32 wide system calls.

In an earlier version of Perl, this was implemented using the -C flag.
Then the developers changed their minds and took it out again, saying
they would reintroduce it in a different way. But I don't know
whether they did, nor how. Hope those clues are useful somehow. The
phrase "wide system calls" may be a useful term to search for.

AFAICT, it is not implemented in 5.8.7 at all. You probably want to use
Win32::API to call the FindFirstFileW/FindNextFileW/FindClose Win32 API
functions directly: Win32API::File doesn't seem to wrap these, which is
a shame. This is likely to be at least slightly tricky: if you manage
it, you may want to consider publishing a module which does the work.
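Part of the tricky bit is getting the name back out of the WIN32_FIND_DATAW buffer that FindFirstFileW and FindNextFileW fill in. The sketch below covers only that step and runs without Windows; the Win32::API calls themselves are not shown. It assumes the documented structure layout (44 bytes of attributes, times and sizes, then a 260-character cFileName field). The function name name_from_find_data is made up for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode);

# ASSUMPTION: WIN32_FIND_DATAW layout as documented by Microsoft.
use constant {
    NAME_OFFSET => 44,     # bytes before cFileName
    NAME_BYTES  => 520,    # WCHAR cFileName[260]
};

# Hypothetical helper: pull the attributes and the long file name
# out of a buffer filled in by FindFirstFileW/FindNextFileW.
sub name_from_find_data {
    my ($buf) = @_;
    die "buffer too short\n" if length($buf) < NAME_OFFSET + NAME_BYTES;

    my ($attributes) = unpack('V', $buf);    # dwFileAttributes

    # Collect UTF-16 code units up to the terminating NUL, then decode.
    my @units;
    for my $unit (unpack('v*', substr($buf, NAME_OFFSET, NAME_BYTES))) {
        last if $unit == 0;
        push @units, $unit;
    }
    my $name = decode('UTF-16LE', pack('v*', @units));

    return ($name, $attributes);
}
```

The returned name is an ordinary Perl character string, so it still needs an output layer such as :encoding(UTF-8) on STDOUT before printing.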

Ben
 
Alan J. Flavell

AFAICT, it is not implemented in 5.8.7 at all.

Well, I've looked around again, since my last posting - but found
nothing to contradict you.

You probably want to use Win32::API to call the
FindFirstFileW/FindNextFileW/FindClose Win32 API functions directly:

So it seems!

I fooled around for a while with cygwin perl, to see if I could find
some backdoor way into this, but - so far - without success. I've
produced some truly bizarre file names - but none of them were what I
had intended! Fortunately, with placeholders like "?" for the
unspeakable characters, I /have/ been able to delete them again.

best regards

(this does seem to be rather a minority sport in Perl, doesn't it? A
pity, as I thought things were going rather well for a while...)
 
