Windows 2008 Server: Reading Text File with Ruby.

Angelo NN

Hello -
New Ruby user here.
I was wondering if you could point me in the right direction:

- I save the output of a WMIC query to a temp text file on Windows 2008
using Ruby:

system("wmic MEMORYCHIP get CAPACITY /VALUE > tmp")

- Next, I try to read this file back in order to extract a value from
the output:

contents = File.open('tmp', 'r:') { |f| f.read }

However, when I run this in irb, the file is imported with the following
leading characters: \x00 and others. If I open the actual tmp file in
Windows, it displays the correct text.

Is there a way to just import that text only? Is this something to do
with encoding?

Thank you for your help.

Here is the entire session in irb:

irb(main):002:0> system("wmic MEMORYCHIP get CAPACITY /VALUE > tmp")
=> true
irb(main):003:0> contents = File.open('tmp', 'r:') { |f| f.read }
=> "\xFF\xFE\n\x00\n\x00N\x00o\x00d\x00e\x00,\x00C\x00a\x00p\x00a\x00c\x00i\x00t\x00y\x00\n\x00\n\x00P\x00S\x00,\x008\x005\x008\x009\x009\x003\x004\x005\x009\x002\x00"
irb(main):004:0>
 
Jonathan Hudson

As it starts with a UTF-16 LE Byte order marker, that's a pretty good
clue as to the encoding.

-jh
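
The BOM check described above can be sketched in Ruby. This is only an illustration: the `BOMS` table and the `guess_encoding_from_bom` helper are hypothetical names, not part of Ruby or of anything posted in this thread, and `String#b` assumes Ruby 2.0 or later.

```ruby
# Byte order marks and the Ruby encoding names they suggest.
# Longer marks come first so UTF-32LE is not mistaken for UTF-16LE.
BOMS = {
  "\xFF\xFE\x00\x00".b => 'UTF-32LE',
  "\x00\x00\xFE\xFF".b => 'UTF-32BE',
  "\xEF\xBB\xBF".b     => 'UTF-8',
  "\xFF\xFE".b         => 'UTF-16LE',
  "\xFE\xFF".b         => 'UTF-16BE'
}.freeze

# Returns an encoding name if the bytes start with a known BOM, else nil.
# (Hypothetical helper, named here for illustration only.)
def guess_encoding_from_bom(bytes)
  head = bytes.b
  BOMS.each { |bom, name| return name if head.start_with?(bom) }
  nil
end

# First bytes of the irb output shown in the question.
sample = "\xFF\xFE\n\x00\n\x00N\x00o\x00d\x00e\x00".b
puts guess_encoding_from_bom(sample).inspect
```

A BOM is only a hint: a file without one can still be UTF-16, so `nil` here means "unknown", not "plain ASCII".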
 
Angelo NN

Jonathan Hudson wrote in post #991289:
As it starts with a UTF-16 LE Byte order marker, that's a pretty good
clue as to the encoding.

-jh

Thank you.

Can you suggest where I can read/etc. about how to change the encoding
for the imported file?
 
Jonathan Hudson

I'm not at all familiar with dealing with encodings on Windows, but
assuming you're using a 1.9.x Ruby,

contents = File.open('tmp', 'r:utf-16') { |f| f.read }

or perhaps

contents = File.open('tmp', 'r:utf-16le') { |f| f.read }

Given the BOM, I'd hope that the former might work.

-jh
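
For reference, a minimal sketch of reading such a file through the mode string, assuming a Ruby recent enough to accept the `BOM|` prefix and `Tempfile.create`. The `read_utf16` helper and the sample file contents are made up for the demonstration; they are not from the thread.

```ruby
require 'tempfile'

# Read a UTF-16 file and return its text as a UTF-8 string.
#   'rb'            binary mode: needed for UTF-16, and skips newline translation
#   'BOM|UTF-16LE'  use the BOM if present (and strip it), else assume UTF-16LE
#   ':UTF-8'        transcode to UTF-8 while reading
# (Hypothetical helper, named here for illustration only.)
def read_utf16(path)
  File.open(path, 'rb:BOM|UTF-16LE:UTF-8') { |f| f.read }
end

# Stand-in for the wmic output file; the contents are invented for the demo.
Tempfile.create('wmic') do |tmp|
  tmp.binmode
  tmp.write("\xFF\xFE".b + "Capacity=8589934592\r\n".encode('UTF-16LE').b)
  tmp.flush
  p read_utf16(tmp.path)
end
```

Because the file is opened in binary mode, Windows line endings come through as "\r\n" and may need stripping afterwards.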
 
Angelo NN

Jonathan Hudson wrote in post #991291:
I'm not at all familiar with dealing with encodings on Windows, but
assuming you're using a 1.9x ruby,

contents = File.open('tmp', 'r:utf-16') { |f| f.read }

or perhaps

contents = File.open('tmp', 'r:utf-16le') { |f| f.read }

Given the BOM, I'd hope that the former might work.

-jh

Thanks - I tried utf-16. Unfortunately it gives an "Unsupported encoding
utf-16 ignored" message. Maybe it's time for me to switch to another
operating system :)

(irb):15: warning: Unsupported encoding utf-16 ignored
=> "\xFF\xFE\n\x00\n\x00\n\x00\n\x00C\x00a\x00p\x00a\x00c\x00i\x00t\x00y\x00=\x008\x005\x008\x009\x009\x003\x004\x005\x009\x002\x00\n\x00\n\x00\n\x00\n\x00\n\x00\n\x00"
 
Jonathan Hudson

I think the old ways still work:

require 'iconv'
content = File.binread('tmp')
# Iconv.conv takes (TO, FROM, string); set TO to your native encoding
text = Iconv.conv('utf-8', 'utf-16', content)
puts text

-jh
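
On later Rubies, where the iconv library is no longer bundled, the same conversion can probably be done with the built-in `String#force_encoding` and `String#encode`. A minimal sketch, assuming the input is UTF-16LE as the BOM suggests; `utf16le_to_utf8` is a hypothetical helper name.

```ruby
# Convert raw UTF-16LE bytes to a UTF-8 string without iconv.
# force_encoding only relabels the bytes; encode does the actual conversion.
# (Hypothetical helper, named here for illustration only.)
def utf16le_to_utf8(bytes)
  bytes.dup.force_encoding('UTF-16LE').encode('UTF-8')
end

raw  = "\xFF\xFEC\x00a\x00p\x00".b  # BOM followed by "Cap" in UTF-16LE
text = utf16le_to_utf8(raw)
p text           # the BOM survives as a leading U+FEFF character
p text.encoding
```

With a real file, `raw` would come from `File.binread('tmp')` as in the iconv version.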
 
Angelo NN

Jonathan Hudson wrote in post #991297:
I think the old ways still work:

require 'iconv'
content=File.binread('tmp')
# TO FROM (set TO to 'native encoding')
text = Iconv::conv("utf-8",'utf-16', content)
puts text

-jh

Wow - Awesome.
That worked.

Thanks Jonathan!
 
F. Senault

On 6 April 2011 at 21:58, Angelo NN wrote:
Thanks - I tried utf-16. Unfortunately it gives an "Unsupported encoding
utf-16 ignored" message. Maybe it's time for me to switch to another
operating system :)

If you want the list of encodings that Ruby supports, try:

ruby -e 'puts Encoding.list'

I think you are looking for 'UTF-16LE'. Reading the file with that
encoding gives:

=> "\uFEFF\n\nNode,Capacity\n\nPS,8589934592"

If you want to ignore the BOM (byte order mark), just skip the first two
bytes:
=> "\n\nNode,Capacity\n\nPS,8589934592"

HTH,

Fred
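
The two steps Fred describes (decode as UTF-16LE, then skip the two BOM bytes) might look roughly like the sketch below, which also pulls out the number the original question was after. `decode_utf16le` and `capacity_from` are hypothetical helper names, and the sample bytes are invented to resemble the `/VALUE` output shown earlier in the thread.

```ruby
# Decode UTF-16LE bytes, dropping a leading BOM (FF FE) if there is one.
# (Hypothetical helper, named here for illustration only.)
def decode_utf16le(bytes)
  bytes = bytes.b                                   # work on a binary copy
  bytes = bytes.byteslice(2..-1) if bytes.start_with?("\xFF\xFE".b)
  bytes.force_encoding('UTF-16LE').encode('UTF-8')
end

# Return the integer after the first "Capacity=", or nil if none is found.
# (Hypothetical helper, named here for illustration only.)
def capacity_from(text)
  match = text.match(/Capacity=(\d+)/)
  match && match[1].to_i
end

# Invented sample; with a real file this would be File.binread('tmp').
raw  = "\xFF\xFE".b + "\r\n\r\nCapacity=8589934592\r\n".encode('UTF-16LE').b
text = decode_utf16le(raw)
puts capacity_from(text)
```

A machine with several memory modules would presumably yield several `Capacity=` lines; `text.scan(/Capacity=(\d+)/)` would collect all of them.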
 
