Transform non-English text

Hu Ma

Hello,
I have a Ruby application that deals with non-English text, and I want to
transform some of that text so that it contains only [a-zA-Z0-9].
Examples:
búsqueda -> busqueda
presenças -> presencas
für -> fur
avião1 -> aviao1

Can anyone help me?

Thanks.
Best regards,
Migrate
 
Martin Boese

I don't think there is a unified mapping table for transforming characters
outside [a-zA-Z0-9] into specific ones inside it. But if you would consider
writing a map yourself, try something like:


class String
  # Pairs of [pattern, ASCII replacement]; extend as needed.
  MAP = [[/ü/, 'u'],
         [/ö/, 'o']]

  # Returns a copy with every mapped character replaced.
  def eng_char
    res = String.new(self)
    MAP.each { |r| res = res.gsub(r[0], r[1]) }
    return res
  end

end

s = "abücüöö"
puts s + " => " + s.eng_char

----------
Will output:

abücüöö => abucuoo
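
For what it's worth, the same hand-written map idea can probably be done in a single pass by handing gsub a Hash instead of looping over regexps. This is only a sketch: ACCENT_MAP, ACCENT_PATTERN and to_plain are made-up names, the table covers just the characters from the examples in this thread, and it assumes Ruby 1.9 or later with UTF-8 strings.

```ruby
# Hypothetical table; extend it with whatever characters your text contains.
ACCENT_MAP = {
  "á" => "a", "ã" => "a", "ç" => "c", "é" => "e",
  "ö" => "o", "ú" => "u", "ü" => "u"
}.freeze

# One regexp that matches any key of the table.
ACCENT_PATTERN = Regexp.union(ACCENT_MAP.keys)

# Returns a copy of text with every mapped character replaced.
# Characters that are not in the table are left untouched.
def to_plain(text)
  text.gsub(ACCENT_PATTERN, ACCENT_MAP)
end

puts to_plain("presenças")
```

This avoids reopening String and scans the input once, however many entries the table has.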


Martin
 
F. Senault

On 19 January 2007 at 10:19, Hu Ma wrote:
Hello,
I have a Ruby application that deals with non-English text, and I want to
transform some of that text so that it contains only [a-zA-Z0-9].

You could try Iconv to convert from your encoding to ASCII. Quick
example:
require "iconv" => true
Iconv.iconv("ascii//translit", "iso-8859-1", "aéioù") => ["a'eio`u"]
Iconv.iconv("ascii//translit", "iso-8859-1", "aéiou")[0].tr('^a-z', '')
=> "aeiou"
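
As far as I know, Iconv is no longer bundled with Ruby 2.0 and later, so here is a rough sketch of a similar idea using only String#unicode_normalize from the standard library. It assumes Ruby 2.2 or later and UTF-8 input; strip_accents and ascii_alnum are made-up helper names. It only handles letters that decompose into a base letter plus combining marks, so characters such as "ß" or "ø" are not transliterated (ascii_alnum simply drops them).

```ruby
# Decompose accented letters (NFD), then delete the combining marks.
def strip_accents(text)
  text.unicode_normalize(:nfd).gsub(/\p{Mn}/, "")
end

# Additionally drop everything outside a-z, A-Z and 0-9.
def ascii_alnum(text)
  strip_accents(text).gsub(/[^a-zA-Z0-9]/, "")
end

%w[búsqueda presenças für avião1].each do |word|
  puts "#{word} -> #{ascii_alnum(word)}"
end
```

Unlike the translit output above, this leaves no stray quote or backtick characters to clean up afterwards.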

Fred
 
Hu Ma

Hello,

Thanks for your help.

I will try both approaches to see what fits best.

Best regards,
Migrate


 
Daniel DeLorme

F. Senault said:
require "iconv" => true
Iconv.iconv("ascii//translit", "iso-8859-1", "aéioù") => ["a'eio`u"]
Iconv.iconv("ascii//translit", "iso-8859-1", "aéiou")[0].tr('^a-z', '')
=> "aeiou"

iconv translit is really nice... when it works. It works on our FreeBSD
server but not on my Ubuntu dev machine. Your mileage may vary.

Daniel
 
