Ruby on win32 cannot handle certain filenames

David Barri

Hello all!

This is my first post in any of the Ruby forums :)

I have a serious ruby windows problem. When I use IO related calls (such
as Dir.glob, File.open, etc) on my machine (WinXP), filenames are always
returned in the shift_jis charset. I've been using iconv to convert to
utf8 but then I came across a much more serious problem: because
File.open is using the shift_jis charset for filenames, it is NOT
POSSIBLE (!) for Ruby to open files that have say European chars in the
filename!! In this day and age SURELY it cannot be the case that it's
not possible in Ruby! It must be my inexperience :) There must be some
way that I don't know about. Any ideas/opinions/suggestions?
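
For reference, the shift_jis to utf8 conversion step mentioned above can be sketched roughly as follows. This assumes Ruby 1.9 or later, where String#encode covers what iconv did in 1.8, and it uses a made-up katakana filename rather than anything from a real directory listing.

```ruby
# Bytes for the made-up name "テスト.txt" as a Shift_JIS (Windows-31J) file
# API would return them. Built from byte values so the example does not
# depend on the encoding of this source file.
raw_name = [0x83, 0x65, 0x83, 0x58, 0x83, 0x67].pack("C*") + ".txt".b

# force_encoding only labels the bytes; encode does the actual conversion.
sjis_name = raw_name.force_encoding("Windows-31J")
utf8_name = sjis_name.encode("UTF-8")

puts utf8_name            # prints テスト.txt
puts utf8_name.bytesize   # prints 13 (three 3-byte characters plus ".txt")
```

This only converts names that the API already returned; it does not help with names the byte string could not represent in the first place.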

Also, I've tried changing $KCODE, but it has absolutely no effect on
Dir.glob or File.open.

Also, when I used Dir.glob, Japanese filenames worked fine, but one file
that had an ë in it (e with umlaut) had been converted somewhere in
Ruby's internals to just a plain ASCII e. There must be some way to
disable this internal charset conversion.

Golly
 
Austin Ziegler

> This is my first post in any of the Ruby forums :)
>
> I have a serious ruby windows problem. When I use IO related calls (such
> as Dir.glob, File.open, etc) on my machine (WinXP), filenames are always
> returned in the shift_jis charset. I've been using iconv to convert to
> utf8 but then I came across a much more serious problem: because
> File.open is using the shift_jis charset for filenames, it is NOT
> POSSIBLE (!) for Ruby to open files that have say European chars in the
> filename!! In this day and age SURELY it cannot be the case that it's
> not possible in Ruby! It must be my inexperience :) There must be some
> way that I don't know about. Any ideas/opinions/suggestions?
>
> Also, I've tried changing $KCODE, but it has absolutely no effect on
> Dir.glob or File.open.
>
> Also, when I used Dir.glob, Japanese filenames worked fine, but one file
> that had an ë in it (e with umlaut) had been converted somewhere in
> Ruby's internals to just a plain ASCII e. There must be some way to
> disable this internal charset conversion.

You're correct.

This is a limitation of Ruby as it is currently built. I haven't had
time to check whether Matz has m17n Strings in, but this will not be
fixable in Ruby 1.8 without using a Unicode string extension and a
significant overhaul to the Windows code.

-austin
--
Austin Ziegler * (e-mail address removed) * http://www.halostatue.ca/
* (e-mail address removed) * http://www.halostatue.ca/feed/
* (e-mail address removed)
 
Nobuyoshi Nakada

Hi,

At Tue, 14 Nov 2006 09:28:58 +0900,
David Barri wrote in [ruby-talk:224879]:
> I have a serious ruby windows problem. When I use IO related calls (such
> as Dir.glob, File.open, etc) on my machine (WinXP), filenames are always
> returned in the shift_jis charset.

Are you using a Japanese version of Windows? Those methods use the "OEM
string" but have no shift_jis-specific code.

> File.open is using the shift_jis charset for filenames, it is NOT
> POSSIBLE (!) for Ruby to open files that have say European chars in the
> filename!! In this day and age SURELY it cannot be the case that it's
> not possible in Ruby! It must be my inexperience :) There must be some
> way that I don't know about. Any ideas/opinions/suggestions?

If your system runs with a European 8-bit charset, it should work
if you set $KCODE to "N".

> Also, I've tried changing $KCODE, but it has absolutely no effect on
> Dir.glob or File.open.

$KCODE is for internal use, typically by Regexp.

> Also, when I used Dir.glob, Japanese filenames worked fine, but one file
> that had an ë in it (e with umlaut) had been converted somewhere in
> Ruby's internals to just a plain ASCII e. There must be some way to
> disable this internal charset conversion.

Not Ruby's internals; it's done in the Windows kernel.
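
To illustrate why the charset of the system matters here, the following sketch (Ruby 1.9 or later assumed) checks whether ë has a byte representation in a Western European code page and in Windows-31J. The `best_fit` hash is a hypothetical stand-in for a lossy substitution and is not Windows' actual mapping table; "noël.txt" is a made-up filename.

```ruby
e_umlaut = "\u00EB"  # "ë", written as an escape so source encoding does not matter

# In a Western European 8-bit code page the character is a single byte.
latin = e_umlaut.encode("Windows-1252")
puts latin.bytes.inspect   # prints [235], i.e. 0xEB

# Windows-31J (Shift_JIS as used on Japanese Windows) has no code for it.
begin
  e_umlaut.encode("Windows-31J")
  representable = true
rescue Encoding::UndefinedConversionError
  representable = false
end
puts representable         # prints false

# Hypothetical stand-in for a lossy substitution: fall back to a plain "e".
best_fit = { "\u00EB" => "e" }
lossy = "no#{e_umlaut}l.txt".encode("Windows-31J", fallback: best_fit)
puts lossy                 # prints noel.txt
```

Once the name has been flattened to "noel.txt", nothing in the byte string says that the original character was ë, so the file cannot be found again under that name.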

--
Nobu Nakada
 
Austin Ziegler

> Not Ruby's internals; it's done in the Windows kernel.

Sort of. FindFirstFile is used, not FindFirstFileW (which returns a
UTF-16 string). The Windows kernel does the translation "safely", but
if Ruby used FindFirstFileW (explicitly!), it would work. Sadly, this
is very difficult without an encoding-aware String.
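
As a rough sketch of the difference the wide call would make (Ruby 1.9 or later assumed), the snippet below takes a UTF-16LE byte buffer of the kind a "W" function would fill in and converts it to UTF-8. No Windows API is actually called; `wide_buffer` and the filename "noël.txt" are made up for illustration.

```ruby
# Hypothetical stand-in for the UTF-16LE buffer a wide ("W") Windows call
# would fill in for a file named "noël.txt".
wide_buffer = "no\u00EBl.txt".encode("UTF-16LE").b

# Label the raw bytes as UTF-16LE, then transcode to UTF-8 for use in Ruby.
utf8_name = wide_buffer.force_encoding("UTF-16LE").encode("UTF-8")

puts utf8_name                      # prints noël.txt
puts utf8_name.include?("\u00EB")   # prints true
```

Because every character in a Windows filename has a UTF-16 representation, this path does not depend on the system code page, which is the point about FindFirstFileW above.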

-austin
 
