REXML & Extended characters - newbie question


Ralph Mason

I am doing a quick and dirty automatic translation from English to
Spanish of some text in an XML document.

However, the translation returns characters outside the 7-bit range,
which seems to create an invalid XML document. I need those strings
UTF-8 encoded before I set the text of an element, but I can't see how to
do this.

Thanks for any help

Regards
Ralph


A test doc looks like

<?xml version='1.0' encoding='UTF-8'?>
<text>Vehicle</text>

Full code.

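require 'net/http'
require 'cgi'
require 'rexml/document'

# Fetch the translation page for +text+ and return the translated string.
def translate(text)
  puts "translating #{text}"
  ret = ""
  Net::HTTP.start('translate.google.com') { |session|
    session.get("/translate_t?langpair=en|es&hl=en&text=#{CGI.escape(text)}") { |result|
      ret << result
    }
  }
  # Pull the translated text out of the returned HTML.
  # The pattern depends on the markup of the result page.
  ret =~ /(name=q.*?>)(.*?)</
  $2
end

# Translate the text of a node, then recurse into its child elements.
def process(node)
  puts node.name
  # node.text is nil for elements that have no text of their own.
  node.text = translate(node.text) if node.text && node.text.strip != ""
  node.elements.each { |x| process x }
end

doc = REXML::Document.new(File.new("lang_eng.xml"))
doc.elements.each { |x| process x }
doc.write(File.new("lang_spn.xml", "w"), 0)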
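For reference, a minimal sketch (not part of the original post) of the REXML side of the problem: when the string handed to `text=` is already valid UTF-8, the document serializes with the character intact. The helper name `set_root_text`, the sample word, and the use of `to_s` rather than writing to a file are assumptions for illustration.

```ruby
require 'rexml/document'

# Same shape as the test document shown in the post.
SOURCE = "<?xml version='1.0' encoding='UTF-8'?>\n<text>Vehicle</text>"

# Hypothetical helper: replace the root element's text and
# return the serialized document as a string.
def set_root_text(xml, new_text)
  doc = REXML::Document.new(xml)
  doc.root.text = new_text
  doc.to_s
end

# "\u00ED" is i-acute; this literal is a UTF-8 encoded string.
result = set_root_text(SOURCE, "Veh\u00EDculo")
puts result
```

If the string is ISO-8859-1 bytes instead, it has to be converted before this step.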
 

Josef 'Jupp' SCHUGT

Hi!

* Ralph Mason:
I need those strings UTF-8 encoded before I set the text of an
element.

IIRC, the encoding used by Google defaults to ISO-8859-1, while adding
an explicit 'en=utf-8' to the query part of the URL makes it use
UTF-8.

Josef 'Jupp' SCHUGT
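A rough sketch of how that suggestion could be applied to the request path from the script. The helper name `translate_path` is made up for illustration, and the parameter `en=utf-8` is taken as written in the post above; it is not verified against the service here. Note that `URI.encode_www_form` also percent-encodes the `|` in `langpair`, which the original script sent unescaped.

```ruby
require 'uri'

# Hypothetical helper: build the request path with an extra
# encoding parameter appended to the query string.
# 'en' => 'utf-8' is the parameter quoted in the post (unverified).
def translate_path(text, extra_params = { 'en' => 'utf-8' })
  params = { 'langpair' => 'en|es', 'hl' => 'en', 'text' => text }
  "/translate_t?" + URI.encode_www_form(params.merge(extra_params))
end

puts translate_path("Vehicle")
```

The resulting string can be passed to `session.get` in place of the hand-built path.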
 

Ralph Mason

Thanks for that, I'll give it a go. I had a workaround with:

node.text = str.unpack("C*").pack("U*")

It would be good if there were some documentation somewhere about text
conversions and REXML, or some kind of encoding-aware string class
that could act as an intermediary.

Ralph
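A small sketch of the byte-level conversion discussed in this thread, assuming the incoming data really is ISO-8859-1. Each ISO-8859-1 byte value equals its Unicode code point, so unpacking bytes with "C*" and repacking with "U*" yields UTF-8. The second function uses `String#encode`, which only exists in Ruby 1.9 and later. Both function names are made up for illustration.

```ruby
# Reinterpret each byte as a code point and re-encode it as UTF-8.
# Only correct when the input really is ISO-8859-1.
def latin1_to_utf8_pack(bytes)
  bytes.unpack("C*").pack("U*")
end

# Ruby 1.9+ alternative using the built-in transcoding API.
def latin1_to_utf8_encode(bytes)
  bytes.dup.force_encoding("ISO-8859-1").encode("UTF-8")
end

# "\xED" is i-acute in ISO-8859-1; .b gives a binary (byte) string.
latin1 = "veh\xEDculo".b
puts latin1_to_utf8_pack(latin1)
puts latin1_to_utf8_encode(latin1)
```

Applying either function to text that is already UTF-8 would double-encode it, so the conversion belongs at the point where the response is read.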
 
