Reading binary data with String or StringIO?

Gianni Jacklone

I need to read a binary string into memory and parse it using methods
similar to Java's DataInputStream
(http://java.sun.com/j2se/1.4.2/docs/api/java/io/DataInputStream.html)

I stumbled upon StringIO, which I think is the closest thing to what I
need. Does anyone know of any good documentation for using StringIO, or
for dealing with binary data and parsing in general?

Many thanks in advance, hope someone can shed some light...
Gianni
 
Ara.T.Howard

If you're on *nix, look into Guy's mmap module, the built-in strscan, and
String#unpack. mmap is blindingly fast.
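
For the String#unpack route, here is a minimal sketch of how it can stand in for DataInputStream-style reads, either on the whole string at once or field by field through StringIO. The record layout (a 16-bit version, a signed 32-bit count, and a NUL-terminated name) is made up for the example and is not from the original question.

```ruby
require 'stringio'

# Hypothetical record layout, invented for this example:
#   uint16 big-endian version, int32 big-endian count, NUL-terminated name
data = [1, -2, "abc"].pack('nl>Z*')

# 1. Decode everything in one call with String#unpack.
version, count, name = data.unpack('nl>Z*')

# 2. Decode field by field, DataInputStream style, with StringIO.
io = StringIO.new(data)
io.binmode
io_version = io.read(2).unpack1('n')       # like readUnsignedShort
io_count   = io.read(4).unpack1('l>')      # like readInt
io_name    = io.gets("\0").chomp("\0")     # read up to the NUL terminator
```

The `>` modifier forces big-endian for the signed directive, so no manual byte swapping is needed on a reasonably recent Ruby.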

-a
 
Michael Neumann

Something like this (appended)?

Regards,

Michael

Attachment: ext2.rb

module ByteOrder
  Native = :Native
  Big = BigEndian = Network = :BigEndian
  Little = LittleEndian = :LittleEndian

  # examines the local byte order on the running machine
  def byte_order
    if [0x12345678].pack("L") == "\x12\x34\x56\x78"
      BigEndian
    else
      LittleEndian
    end
  end
  alias byteorder byte_order
  module_function :byte_order, :byteorder

  def little_endian?
    byte_order == LittleEndian
  end

  def big_endian?
    byte_order == BigEndian
  end

  alias little? little_endian?
  alias big? big_endian?
  alias network? big_endian?

  module_function :little_endian?, :little?
  module_function :big_endian?, :big?, :network?
end

def assert(cond=nil, &block)
  raise "assertion failed" unless (block ? block.call : cond)
end

# Requires method read(n) to be defined
module BinaryReaderMixin

  # == 8 bit

  # no byteorder for 8 bit!

  def read_word8
    ru(1, 'C')
  end

  def read_int8
    ru(1, 'c')
  end

  # == 16 bit

  # === Unsigned

  def read_word16_native
    ru(2, 'S')
  end

  def read_word16_little
    ru(2, 'v')
  end

  def read_word16_network
    ru(2, 'n')
  end

  # === Signed

  def read_int16_native
    ru(2, 's')
  end

  def read_int16_little
    str = readn(2)
    str.reverse! if ByteOrder.network? # swap bytes as native=network (and we want little)
    str.unpack('s').first
  end

  def read_int16_network
    str = readn(2)
    str.reverse! if ByteOrder.little? # swap bytes as native=little (and we want network)
    str.unpack('s').first
  end

  # == 32 bit

  # === Unsigned

  def read_word32_native
    ru(4, 'L')
  end

  def read_word32_little
    ru(4, 'V')
  end

  def read_word32_network
    ru(4, 'N')
  end

  # === Signed

  def read_int32_native
    ru(4, 'l')
  end

  def read_int32_little
    str = readn(4)
    str.reverse! if ByteOrder.network? # swap bytes as native=network (and we want little)
    str.unpack('l').first
  end

  def read_int32_network
    str = readn(4)
    str.reverse! if ByteOrder.little? # swap bytes as native=little (and we want network)
    str.unpack('l').first
  end

  # == Aliases

  # add some short-cut functions
  %w(word16 int16 word32 int32).each do |typ|
    eval %{
      alias read_#{typ}_big read_#{typ}_network
      def read_#{typ}(byte_order = ByteOrder::Native)
        case byte_order
        when ByteOrder::Native then read_#{typ}_native
        when ByteOrder::Little then read_#{typ}_little
        when ByteOrder::Network then read_#{typ}_network
        else raise ArgumentError
        end
      end
    }
  end

  # == Template

  # A template is a list of type names (e.g. :word16_little), each optionally
  # preceded by an Integer repeat count.
  def read_template(*tmpl, &block)
    arr = [] if block.nil?
    rep = 1
    tmpl.each { |spec|
      case spec
      when Integer # was Fixnum, which was folded into Integer in Ruby 2.4
        rep = spec
      when Symbol, String
        rep.times do
          val = send("read_#{spec}")
          if block.nil?
            arr << val
          else
            block.call(val)
          end
        end
        rep = 1
      else
        raise ArgumentError, "invalid template element: #{spec.inspect}"
      end
    }
    arr
  end

  def read_template_to_hash(*tmpl)
    hash = {}
    tmpl.each do |key, typ|
      hash[key] = send("read_#{typ}")
    end
    hash
  end

  def read_cstring
    str = String.new # mutable, binary-encoded buffer
    while (c=readn(1)) != "\0"
      str << c
    end
    str
  end

  # read exactly n bytes, otherwise raise an exception.
  def readn(n)
    str = read(n)
    raise "couldn't read #{n} bytes" if str.nil? || str.bytesize != n
    str
  end

  private

  # shortcut method
  def ru(size, template)
    readn(size).unpack(template).first
  end

end


# Requires method write(str) to be defined
module BinaryWriterMixin

  # == 8 bit

  # no byteorder for 8 bit!

  def write_word8(val)
    pw(val, 'C')
  end

  def write_int8(val)
    pw(val, 'c')
  end

  # == 16 bit

  # === Unsigned

  def write_word16_native(val)
    pw(val, 'S')
  end

  def write_word16_little(val)
    str = [val].pack('S')
    str.reverse! if ByteOrder.network? # swap bytes as native=network (and we want little)
    write(str)
  end

  def write_word16_network(val)
    str = [val].pack('S')
    str.reverse! if ByteOrder.little? # swap bytes as native=little (and we want network)
    write(str)
  end

  # === Signed

  def write_int16_native(val)
    pw(val, 's')
  end

  def write_int16_little(val)
    pw(val, 'v')
  end

  def write_int16_network(val)
    pw(val, 'n')
  end

  # == 32 bit

  # === Unsigned

  def write_word32_native(val)
    pw(val, 'L')
  end

  def write_word32_little(val)
    str = [val].pack('L')
    str.reverse! if ByteOrder.network? # swap bytes as native=network (and we want little)
    write(str)
  end

  def write_word32_network(val)
    str = [val].pack('L')
    str.reverse! if ByteOrder.little? # swap bytes as native=little (and we want network)
    write(str)
  end

  # === Signed

  def write_int32_native(val)
    pw(val, 'l')
  end

  def write_int32_little(val)
    pw(val, 'V')
  end

  def write_int32_network(val)
    pw(val, 'N')
  end

  # add some short-cut functions
  %w(word16 int16 word32 int32).each do |typ|
    eval %{
      alias write_#{typ}_big write_#{typ}_network
      def write_#{typ}(val, byte_order = ByteOrder::Native)
        case byte_order
        when ByteOrder::Native then write_#{typ}_native(val)
        when ByteOrder::Little then write_#{typ}_little(val)
        when ByteOrder::Network then write_#{typ}_network(val)
        else raise ArgumentError
        end
      end
    }
  end

  # == Other methods

  private

  # shortcut for pack and write
  def pw(val, template)
    write([val].pack(template))
  end
end

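class String
  # Works on a copy, so the read position never sticks to the original string.
  def unpackv(*template)
    self.dup.read_template(*template)
  end

  private

  include BinaryReaderMixin

  def read(n)
    @pos ||= 0
    # byteslice rather than self[@pos, n]: since Ruby 1.9, String#[] indexes
    # characters, not bytes.
    str = byteslice(@pos, n) # nil once @pos is past the end
    @pos += n
    str && str.b # binary copy, so reverse! and unpack operate on single bytes
  end
end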
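
The mixin above only requires that the host object respond to read(n), so the same idea also applies to StringIO, which was the original question. The following is a standalone sketch of that pattern, not Michael's module: `TinyBinaryReader` and its method names are hypothetical, and the sample payload is invented for the example.

```ruby
require 'stringio'

# Hypothetical mini-reader (not the BinaryReaderMixin above): it relies only
# on the host object providing read(n), as any IO-like object does.
module TinyBinaryReader
  # Read exactly n bytes or raise, mirroring the readn idea.
  def read_exactly(n)
    str = read(n)
    raise EOFError, "wanted #{n} bytes" if str.nil? || str.bytesize != n
    str
  end

  def read_u16_be
    read_exactly(2).unpack1('n')   # unsigned 16-bit, network order
  end

  def read_i32_be
    read_exactly(4).unpack1('l>')  # signed 32-bit, big-endian
  end

  def read_u32_le
    read_exactly(4).unpack1('V')   # unsigned 32-bit, little-endian
  end
end

# Invented sample payload: magic number, signed value, little-endian counter.
io = StringIO.new([0xCAFE, -7, 42].pack('nl>V'))
io.extend(TinyBinaryReader)

magic  = io.read_u16_be
offset = io.read_i32_be
count  = io.read_u32_le
```

Extending a single StringIO instance keeps the helper methods off every other IO object; including the module in a class of your own works the same way.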


 
Gianni Jacklone

My sincere thanks for this. I came across your BinaryReader class last
night googling for help on this, and I was hoping you might see my post.
Your attachment was very enlightening, I am already putting it to good use.

Thanks to everyone else for the other suggestions as well.

Gianni
 
