win32 ruby1.9 regexp and cyrillic string

Nikolay Khodyunya

#coding: utf-8
str2 = "asdfМикимаус"
p str2.encoding #<Encoding:UTF-8>
p str2.scan(/\p{Cyrillic}/) #finds all cyrillic characters
str2.gsub!(/\w/u,'') #removes only latin characters
puts str2

The question is: why does /\w/ ignore Cyrillic characters?

I have installed the latest Ruby package from http://rubyinstaller.org/.
Here is the output of ruby -v:
ruby 1.9.1p378 (2010-01-10 revision 26273) [i386-mingw32]
 
Roger Pack

Nikolay said:
str2.gsub!(/\w/u,'') #removes only latin characters
The question is: why does /\w/ ignore Cyrillic characters?

Are Cyrillic characters supposed to count as "word characters" (\w)?
If so, then it looks like a bug to me. Ping core.
-rp
 
Roger Pack

Nikolay said:
#coding: utf-8
str2 = "asdfМикимаус"
p str2.encoding #<Encoding:UTF-8>
p str2.scan(/\p{Cyrillic}/) #finds all cyrillic characters
str2.gsub!(/\w/u,'') #removes only latin characters
puts str2

The question is: why does /\w/ ignore Cyrillic characters?

I have installed the latest Ruby package from http://rubyinstaller.org/.
Here is the output of ruby -v:
ruby 1.9.1p378 (2010-01-10 revision 26273) [i386-mingw32]

http://redmine.ruby-lang.org/issues/show/3181
http://redmine.ruby-lang.org/issues/show/3202

might be related. If you think the behavior is wrong, bring it up on core.
-rp
 
Caleb Clausen

Nikolay said:
#coding: utf-8
str2 = "asdfМикимаус"
p str2.encoding #<Encoding:UTF-8>
p str2.scan(/\p{Cyrillic}/) #finds all cyrillic characters
str2.gsub!(/\w/u,'') #removes only latin characters
puts str2

The question is: why does /\w/ ignore Cyrillic characters?

I think that \w (and similar shortcuts) is supposed to match ASCII
characters only, so it's equivalent to [a-zA-Z0-9_].

Isn't there some kind of unicode character class you can use?
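
For what it's worth, here is a minimal sketch of what such a class could look like. It assumes Ruby 1.9.2 or later, where as far as I know \w is ASCII-only while \p{Word} and the POSIX bracket [[:word:]] are Unicode-aware; 1.9.1 may behave differently. The variable names are mine, not from the original post.

```ruby
# coding: utf-8
str = "asdfМикимаус"

# \w only covers [a-zA-Z0-9_], so only the Latin letters are removed.
ascii_stripped = str.gsub(/\w/, '')

# \p{Word} is Unicode-aware, so Latin and Cyrillic letters are both removed.
# The POSIX bracket form [[:word:]] should behave the same way.
unicode_stripped = str.gsub(/\p{Word}/, '')
posix_stripped   = str.gsub(/[[:word:]]/, '')

# \p{Cyrillic} targets the script itself, as in the original scan call.
cyrillic_only = str.scan(/\p{Cyrillic}/).join

puts ascii_stripped           # Микимаус
puts unicode_stripped.empty?  # true
puts cyrillic_only            # Микимаус
```

So if the goal is to strip every word character regardless of script, replacing /\w/u with /\p{Word}/ in the gsub! call is probably the smallest change.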
 
