Andreas S.
I found that, unlike in Ruby 1.8, the word character class (\w) in Ruby 1.9
regexes does not match German umlauts (or any other non-ASCII letters).
According to the Oniguruma documentation
(http://www.geocities.jp/kosako3/oniguruma/doc/RE.txt), it should match
everything from the Unicode "Letter" category, which includes umlauts.
test.rb (also attached):
# encoding: utf-8
$KCODE='u'
s = "ü"
puts s.match(/\w/u).inspect
Result with ruby 1.8:
#<MatchData "ü">
Result with ruby 1.9.2:
nil
Is that a bug, or is there any reason behind this behavior?
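For comparison, here is a minimal sketch of how the same string behaves against a few other character classes on Ruby 1.9 or later. It assumes a UTF-8 source file, and the variable names are only illustrative. $KCODE is left out because it has no effect on 1.9.

```ruby
# encoding: utf-8
s = "ü"

# \w is ASCII-only on Ruby 1.9+ ([a-zA-Z0-9_]), so it does not match here.
ascii_word = s.match(/\w/)

# Unicode property and POSIX bracket classes are Unicode-aware for UTF-8 strings.
unicode_letter = s.match(/\p{L}/)
posix_alpha    = s.match(/[[:alpha:]]/)
unicode_word   = s.match(/\p{Word}/)

puts ascii_word.inspect      # nil
puts unicode_letter.inspect  # #<MatchData "ü">
puts posix_alpha.inspect     # #<MatchData "ü">
puts unicode_word.inspect    # #<MatchData "ü">
```

If this holds, \p{Word} or [[:alpha:]] may be usable as a replacement wherever the 1.8 behavior of \w was relied on.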
Attachments:
http://www.ruby-forum.com/attachment/5113/test.rb