Mac OS Roman to UTF-8 (Kconv | Iconv)


Une bévue

I have a small test using rubyaeosa-0.2.3; it works great if I leave the
output "as is".

In that case, according to SubEthaEdit (a Mac OS X editor), the output
string is encoded in Mac OS Roman.

Because I'll use the output in XML form, I need to convert it to UTF-8.

If I use Kconv#toutf8, I get (presumably) Japanese characters:

René (true char "é", output "as is" = Mac OS Roman)
Ren (if I use Kconv#toutf8)
Anaïs (true char "ï", output "as is" = Mac OS Roman)
Ana<"ï" and "s" replaced by a "Japanese" char> (if I use Kconv#toutf8)

Also, if I use Iconv.new('MACROMAN', 'UTF-8').iconv(str)

I get an error message:
AddressBook2vCardXml.rb:32:in `iconv': "\216" (Iconv::IllegalSequence)

for the first accented string (the "é" of René).


Here is my script:
<code>
require 'osx/aeosa'
require 'kconv'
require 'iconv'


# Returns an array of [first name, last name] pairs read from Address Book.
def album_list
  result = OSX.do_osascript %{
    tell application "Address Book"
      set a to first name of people
      set b to last name of people
      {a,b}
    end tell
  }
  firstName = result[0].map {|i| i.to_rbobj }
  lastName = result[1].map {|i| i.to_rbobj }
  return firstName.map {|i| [ i,lastName.shift ] }
end

aFile = File.new("AddressBook.xml", "w")
album_list.each do |f,l|
  aFile.puts "#{f} #{l}" # output "as is"
  # aFile.puts "#{f.toutf8} #{l.toutf8}" # use Kconv#toutf8
  # fu = Iconv.new('MACROMAN', 'UTF-8').iconv(f) # use of Iconv
  # lu = Iconv.new('MACROMAN', 'UTF-8').iconv(l) # use of Iconv
  # aFile.puts "#{fu} #{lu}" # use of Iconv
end
aFile.close
</code>

Note also that if I do the encoding conversion from the command line
with:
iconv -f MACROMAN -t UTF-8 AddressBook.xml > AddressBook-UTF-8.xml

("AddressBook.xml" being the output of my Ruby script), I get
"AddressBook-UTF-8.xml" correctly encoded!


Maybe that's the only solution for the time being?
 

Paul Battley

> Because I'll use the output in XML form, I need to convert it to UTF-8.

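Not necessarily. Just make sure that the encoding is specified in your
XML prolog, i.e. in the encoding attribute of the
<?xml version="1.0" encoding="..."?> declaration on the first line.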
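As a rough sketch of what that could look like from the Ruby side: the hypothetical helper `xml_with_prolog` below (not part of any library) prepends a declaration naming the encoding to text that is left in Mac OS Roman. It assumes `macintosh`, the IANA-registered name for Mac OS Roman. XML processors are only required to support UTF-8 and UTF-16, so whether a given parser accepts that name would need checking.

```ruby
# Hypothetical helper, for illustration only: puts an XML declaration that
# names the encoding in front of a body that is already in that encoding.
# "macintosh" is an assumption (the IANA name for Mac OS Roman); swap in
# whatever name your XML parser actually recognises.
def xml_with_prolog(body, encoding_name = "macintosh")
  %(<?xml version="1.0" encoding="#{encoding_name}"?>\n) + body
end

doc = xml_with_prolog("<people/>\n")
puts doc
```

Converting to UTF-8 is still the more portable choice, since every XML parser has to understand it.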

> Also, if I use Iconv.new('MACROMAN', 'UTF-8').iconv(str)
>
> I get an error message:
> AddressBook2vCardXml.rb:32:in `iconv': "\216" (Iconv::IllegalSequence)
>
> for the first accented string (the "é" of René).

That's because the parameters are in the wrong order. They should be
given as (to, from). Your example is therefore trying to convert
*from* UTF-8 *to* Mac Roman, which is why the é is illegal.

Try this instead:

utf8_str = Iconv.new('UTF-8', 'MacRoman').iconv(mac_str)

e.g.
$KCODE = 'u'
require 'iconv'
Iconv.new('UTF-8', 'MacRoman').iconv("Ren\216") # => "René" [in UTF-8]
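
On newer Rubies the same conversion can be sketched without Iconv, which was dropped from the standard library in Ruby 2.0. This assumes Ruby 1.9 or later, where strings carry an encoding and `String#encode` is built in. `mac_str` and `utf8_str` are just illustrative variable names holding the same bytes as in the Iconv example.

```ruby
# The raw bytes as they come back from the script: "Ren" followed by
# byte 0x8E (octal \216), which is "é" in Mac OS Roman.
mac_str = "Ren\x8E".b

# Label the bytes with their real encoding (no bytes are changed here)...
mac_str.force_encoding("macRoman")

# ...then transcode to UTF-8.
utf8_str = mac_str.encode("UTF-8")
puts utf8_str # => "René"

# One-step form: as with Iconv.new, the destination comes first,
# then the source.
puts "Ana\x95s".b.encode("UTF-8", "macRoman") # => "Anaïs"
```

Kconv, for what it's worth, is meant for Japanese encodings, so it is not the right tool for Mac OS Roman input.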

Paul.
 
