Lyrics and Chinese in Ruby?

NewtonApple

Hello, has anyone used Chinese with Ruby? I'm trying to write a
script that would import/export my lyrics from/to lyricwiki.org.
Since they don't have a lot of Chinese lyrics, I thought I'd help them
out by exporting my collection. Looking at their SOAP API wiki
(http://lyricwiki.org/LyricWiki:SOAP), they seem to have some trouble
encoding and decoding Unicode characters using Ruby's
soap/wsdlDriver. I tried running my songs' metadata through RubyOSA,
but I've had no luck.

Here are a few songs I've tried, copy-pasting the artist and song
names into the source and fetching the lyrics:
http://lyricwiki.org/Category:Language/Cantonese

And here is a simple script I've been testing with (w/ RubyOSA):
http://pastie.caboo.se/141454

My system is running Mac OS X Leopard with Ruby 1.8.6. Thanks in advance
for all your help!

David
 
has

> Hello, has anyone used Chinese with Ruby? I'm trying to write a
> script that would import/export my lyrics from/to lyricwiki.org.
> Since they don't have a lot of Chinese lyrics, I thought I'd help them
> out by exporting my collection. Looking at their SOAP API wiki
> (http://lyricwiki.org/LyricWiki:SOAP), they seem to have some trouble
> encoding and decoding Unicode characters using Ruby's
> soap/wsdlDriver.

Looks like something's hosed somewhere:

require 'soap/wsdlDriver'
driver = SOAP::WSDLDriverFactory.new("http://lyricwiki.org/server.php?wsdl").create_rpc_driver

# Artist and song title are passed in as UTF-8 byte strings.
p driver.getSong("La Mosca Ts\303\251-Ts\303\251", "Madrid Amaneci\303\263").artist
# "La Mosca Ts\303\203\302\203\303\202\302\251-Ts\303\203\302\203\303\202\302\251" (!)

The same problem seems to happen in Python, which suggests the problem
might be on the server side:

import LyricWiki_services

soap = LyricWiki_services.LyricWikiBindingSOAP('http://lyricwiki.org/server.php')
song = LyricWiki_services.getSongRequest()
song.Artist = unicode('La Mosca Ts\xc3\xa9-Ts\xc3\xa9', 'utf8')
song.Song = unicode('Madrid Amaneci\xc3\xb3', 'utf8')
result = soap.getSong(song)

print repr(result.Return.Artist.encode('utf8'))
# 'La Mosca Ts\xc3\x83\xc2\x83\xc3\x82\xc2\xa9-Ts\xc3\x83\xc2\x83\xc3\x82\xc2\xa9'

You might want to speak to the LyricWiki folks about that.
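
For what it's worth, the garbled bytes above are exactly what you get if the UTF-8 bytes are treated as Latin-1 characters and re-encoded as UTF-8, twice over. Where that happens on the server is only a guess on my part, but here is a minimal sketch that reproduces the output and undoes it on the client. The helper names misencode and repair are made up for illustration; the sketch uses only pack/unpack, so it should not depend on any newer string encoding API:

```ruby
# Treat every byte of the input as a Latin-1 character and write it back
# out as UTF-8. Applied to a string that is already UTF-8, this is the
# classic "double encoding" mistake.
def misencode(str)
  str.unpack('C*').pack('U*')
end

# Undo one round of the mistake: decode the UTF-8 into code points, then
# write each code point back as a single byte.
# Assumes every code point is below 256, which holds for misencoded text.
def repair(str)
  str.unpack('U*').pack('C*')
end

original = "La Mosca Ts\303\251-Ts\303\251"
observed = "La Mosca Ts\303\203\302\203\303\202\302\251-Ts\303\203\302\203\303\202\302\251"

garbled  = misencode(misencode(original))  # two rounds, as the output suggests
restored = repair(repair(observed))        # two rounds back

# Compare raw bytes so the check does not depend on string encodings.
puts garbled.unpack('C*') == observed.unpack('C*')
puts restored.unpack('C*') == original.unpack('C*')
```

Repairing on the client is only a stopgap: if the server is fixed later, applying repair to correct text will either raise on malformed UTF-8 or corrupt it.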

> And here is a simple script I've been testing with (w/ RubyOSA):
> http://pastie.caboo.se/141454

Note that this script won't work as-is for non-English names, since
RubyOSA uses ASCII by default, although this can be changed.
Alternatively, use rb-appscript, which uses UTF-8 by default.

HTH

has
 
