Output UTF-16LE BOM to file - 1.9


Chris Morris


ruby 1.9.1p0 (2009-01-30 revision 21907) [i386-mswin32]

With this code:

File.open('zz.txt', 'w:UTF-16LE') do |f|
  f.print "Hello Uni-world"
end

...I get no BOM

guts = File.read('zz.txt')
puts guts.bytes.to_a.inspect

#=> [72, 0, 101, 0, 108, 0, 108, 0, 111, 0, 32, 0,...

...and my brain can't concoct a way to insert it myself, though I know
it must be simple...
 

James Gray


Yeah, it's easy stuff.

A Unicode BOM is just the character U+FEFF encoded at the beginning of
the document. You can insert that character yourself with Ruby 1.9's
Unicode escape and it will be transcoded into the proper byte order
based on the external_encoding() you are writing to:

$ cat utf16_bom.rb
# encoding: UTF-8
File.open("utf16_bom.txt", "w:UTF-16LE") do |f|
  f.puts "\uFEFFThis is UTF-16LE with a BOM."
end
$ ruby -v utf16_bom.rb
ruby 1.9.1p0 (2009-01-30 revision 21907) [i386-darwin9.6.0]
$ ruby -e 'p File.binread(ARGV.shift)[0..9]' utf16_bom.txt
"\xFF\xFET\x00h\x00i\x00s\x00"
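The same idea can be wrapped in a small helper and checked by reading the raw bytes back. This is only a sketch: write_utf16le_with_bom and the temp file name are made up for illustration and are not part of Ruby or of the script above. It relies on the same transcoding behaviour, where U+FEFF becomes the bytes FF FE when the external encoding is UTF-16LE.

```ruby
# encoding: UTF-8
require "tmpdir"

# Hypothetical helper (an assumption for this sketch, not a Ruby API):
# writes text to path as UTF-16LE, with U+FEFF as the first character
# so that the file starts with a byte order mark.
def write_utf16le_with_bom(path, text)
  File.open(path, "w:UTF-16LE") do |f|
    # "\uFEFF" is transcoded along with the text on the way out.
    f.print "\uFEFF", text
  end
end

# Illustrative file name in the system temp directory.
path = File.join(Dir.tmpdir, "utf16_bom_check.txt")
write_utf16le_with_bom(path, "Hi")

# Read the file back as raw bytes: BOM first, then "H" as 72, 0.
first_bytes = File.binread(path).bytes.first(4)
p first_bytes  #=> [255, 254, 72, 0]
```

Writing to "w:UTF-16BE" instead should put the BOM bytes in the opposite order (FE FF), since the same character is simply transcoded to the other byte order.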

Hope that helps.

James Edward Gray II
 
