confusing string parsing problem

Matt Brooks

I don't understand how to do the following; any help is much
appreciated. I have been programming in Ruby 1.8.7 for about a month.

I have a string that has in it:
2 byte UINT16
2 byte UINT16
1 byte UINT8 representing how many pairs of the next 2 strings there are
variable length string1 <- Although for now I know this always equals BFO
variable length string2
2 byte UINT16

I want to display each value from the fields, while also removing it
from the string as I read it out. I believe the strings are terminated
with a 0, so the null character also needs to be removed to correctly
parse the other fields. Part of my problem is that I don't understand
how to parse until a null character is found. Is it \0 like in C? Or
is it \000, as I have seen in some places? Or 0.to_chr?
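On the null-character question, here is a minimal sketch, assuming the buffer is a plain byte-oriented string. "\0", "\000" and 0.chr all denote the same one-byte string (Integer has chr, but no to_chr). The helper name take_cstring! is made up for illustration; it returns the text before the first NUL and removes both from the front of the buffer.

```ruby
# "\0" and "\000" are octal escapes for byte 0; 0.chr builds the same byte.
NUL = "\0"

# Hypothetical helper: returns the text before the first NUL in buf and
# removes that text plus its terminator from the front of buf.
def take_cstring!(buf)
  nul_at = buf.index(NUL)
  raise ArgumentError, "no NUL terminator found" if nul_at.nil?
  text = buf.slice!(0, nul_at + 1)  # text and terminator leave buf together
  text.chomp(NUL)                   # drop the terminator from the result
end

# String#b (Ruby 2.0+) returns a binary copy; on 1.8.7 strings are already
# byte-oriented, so the call would simply be dropped there.
buf = "BFO\0-8000\0".b
p NUL == "\000", NUL == 0.chr
p take_cstring!(buf)
p take_cstring!(buf)
```

After the two calls the buffer is empty, because each call consumed one string and its terminator.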

The string comes from a "rest" variable-length field of the bit-struct
class. There was data earlier in the rest field that I have already
processed and taken off the front of the string, and after this point
there is more data, but for now I am only interested in this block of data.

Something else that could help: the total length of the whole block is
known, i.e. the 7 fixed bytes plus however many bytes the
variable-length strings take up.
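Because the block's total length is known, one option is to cut the whole block off the front of the buffer first and parse it on its own. A minimal sketch, assuming a byte-oriented string and assuming the known length counts the NUL terminators; take_block! and its parameter names are hypothetical:

```ruby
# Hypothetical helper: removes the first block_length bytes from rest and
# returns them, so the block can be parsed without touching later data.
def take_block!(rest, block_length)
  if rest.bytesize < block_length
    raise ArgumentError, "buffer holds #{rest.bytesize} bytes, need #{block_length}"
  end
  rest.slice!(0, block_length)
end

rest = ("\x00\x03\x00\x00\x01BFO\x00-8000\x00\x00\x00" + "later data").b
# 7 fixed bytes, plus each string with its NUL terminator
block = take_block!(rest, 7 + ("BFO".length + 1) + ("-8000".length + 1))
p block.bytesize
p rest
```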

My attempt is horrible, I know, and I don't think it really runs.
writelog() is my method for displaying a value...

iterations_to_do = (bin_msg.element_length - 7) - 4
temparray = []
j = 0
if /^BFO/ =~ bin_msg.optional_sub_elements
  writelog("\n\n Parameter Name = BFO")
  bin_msg.optional_sub_elements.sub!(/^BFO/, '')

  while j < iterations_to_do
    bin_msg.optional_sub_elements.each_byte do |character|
      unless character == 0.chr
        temparray << character
        j += 1
      end
    end
  end
  #here print first j number of elements in temparray?

Thanks again,
Matt
 
Matt Brooks

I also forgot to mention: the 2nd variable-length string holds a value
from -8000 to 8000, so I have to read a sign and a variable number of
digits, stored as ASCII characters rather than as an int.
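Once that ASCII field has been pulled out of the buffer, turning it into a number is a separate, small step. A minimal sketch, assuming the field holds only an optional minus sign and decimal digits; parse_signed_field is a hypothetical name:

```ruby
# Hypothetical helper: converts the ASCII field (e.g. "-8000") to an Integer,
# rejecting anything that is not an optionally signed decimal in range.
def parse_signed_field(text)
  unless text =~ /\A-?\d+\z/
    raise ArgumentError, "not a signed decimal: #{text.inspect}"
  end
  value = text.to_i
  unless (-8000..8000).include?(value)
    raise ArgumentError, "out of range: #{value}"
  end
  value
end

p parse_signed_field("-8000")  # prints -8000
p parse_signed_field("15")     # prints 15
```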

thanks again,
Matt
 
brabuhr

I don't understand how to do the following, any help is very
appreciated. I have been programming in ruby 1.8.7 for about a month.

I have a string that has in it:
2 byte UINT16
2 byte UINT16
1 byte UINT8 representing how many pairs of the next 2 strings there are
variable length string1 <- Although for now I know this always equals
BFO
variable length string2
2 byte UINT16

I want to display each value from the fields, while also removing it
from the string as I read it out...

My attempt is horrible I know, and I don't think it really runs.

Can you post an example of your data and code with the incorrect
output and expected output?
string = ""
string << 0.chr << 1.chr
string << 0.chr << 2.chr
string << 2.chr
string << "foo" << "BFO"
string << "bar" << "BFO"
string << 0.chr << 3.chr

p string

p string[0..1], string[0..1].unpack('n')
p string[2..3], string[2..3].unpack('n')
p string[0..3].unpack('n2')

p string[4..4], string[4..4].unpack('C')
p string[0..4].unpack('n2C')

p string[5..-2].split(/BFO/)
ruby -v z.rb
ruby 1.8.7 (2008-08-11 patchlevel 72) [i486-linux]
"\000\001\000\002\002fooBFObarBFO\000\003"
"\000\001"
[1]
"\000\002"
[2]
[1, 2]
"\002"
[2]
[1, 2, 2]
["foo", "bar", "\000"]
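The unpack calls above read the fields but leave the string untouched. To also remove what was read, one approach is to slice the fixed-size header off the front and unpack the removed piece. A minimal sketch; take_header! is a hypothetical name, and "n2C" is the same format used above (two big-endian 16-bit values, then one unsigned byte):

```ruby
# Hypothetical helper: removes the 5 header bytes from the front of buf and
# returns them decoded as [uint16, uint16, uint8] in network byte order.
def take_header!(buf)
  raise ArgumentError, "need at least 5 bytes" if buf.bytesize < 5
  buf.slice!(0, 5).unpack("n2C")
end

buf = [1, 2, 2].pack("n2C") + "fooBFObarBFO"
p take_header!(buf)  # prints [1, 2, 2]
p buf                # prints "fooBFObarBFO"
```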
 
Matt Brooks


Thanks a lot for your reply. I will look into it and see if I can
implement this using unpack. In the meantime, here is some sample output...



My current output is:
1st 2 byte field = 3
2nd 2 byte field = 0
Num Pairs = 1
sub_elements =
"BFO\0000\000\000\000\001\000\000\0005\000\003\000\000\000\000\000\000\000\000\000\001\002\000\000\312+\335V\353\262\000\001\001Internal...
and so on for hundreds of bytes more...
I, [2009-09-02 11:56:10 #4864] INFO -- : Pairs to get: 1
I, [2009-09-02 11:56:10 #4864] INFO -- : Param Value Name: BFO


What I want the output to be:
I, [2009-09-02 11:56:10 #4864] INFO -- : Pairs to get: 1
I, [2009-09-02 11:56:10 #4864] INFO -- : Param Value Name: BFO
I, [2009-09-02 11:56:10 #4864] INFO -- : String Value: whatever the string is till it is null terminated
I, [2009-09-02 11:56:10 #4864] INFO -- : Last 2 byte field: 0


At this point I need the sub_elements string to no longer contain any of
the parameters printed above, i.e. whatever I display is done with
and needs to be deleted from the front of the string.


My code:
I was redoing my code and have only gotten up to this point.
#prints first 5 bytes formatted nicely
writelog("#{bin_msg.inspect_detailed}")

#How many pairs to expect
writelog("Pairs to get: #{bin_msg.num_optional_mod_params}")

#Remove BFO while printing it as well
writelog("Param Value Name: #{bin_msg.sub_elements.slice!(/^BFO/)}")


#THIS DOESN'T WORK!!! Needs to scan for a possible minus sign, plus a
#variable number of digits, up to the null termination of the string
bin_msg.sub_elements.scan(/-?[0-9]+\Z/)

#Needs to then print the next 2 bytes while taking them out of the
#string as well
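A likely reason the scan finds nothing: \Z anchors the match at the end of the whole string, but here the digits are followed by a NUL and hundreds more bytes. scan also only reads; it never removes anything. A minimal sketch of one alternative, anchoring at the front with \A and letting slice! remove the match. take_number! is a hypothetical name, and the sketch assumes the NUL that terminates "BFO" has already been removed, so the digits sit at the very front:

```ruby
# Hypothetical helper: removes a leading "-?digits" field and its NUL
# terminator from buf, returning the digits as an Integer (nil if absent).
def take_number!(buf)
  field = buf.slice!(/\A-?[0-9]+\x00/)  # \A pins the match to the front
  field && field.chomp("\x00").to_i
end

buf = "-8000\x00\x00\x00more".b  # value string, then the final 2-byte field
p take_number!(buf)               # prints -8000
p buf.bytesize                    # prints 6: two field bytes plus "more"
```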
 
Matt Brooks

Also, I never made it clear, but the string is in network byte order
(big-endian).
 
7stud --

Matt said:
Also, I never made it clear, but the string is in network byte order
(big-endian).

data = ""

data << 0.chr << 3.chr #2 bytes
data << 0.chr << 0.chr #2 bytes
data << 1.chr #1 byte
data << "BFO" << 0.chr #null terminated string
data << "-8000" << "\000" #null terminated string
data << 0.chr << 0.chr #2 bytes

p data

--output:--
"\000\003\000\000\001BFO\000-8000\000\000\000"

results = data.unpack("n2CZ*Z*n")
p results

--output:--
[3, 0, 1, "BFO", "-8000", 0]
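For reference, in that format string n is a 16-bit unsigned integer in network (big-endian) byte order, C is an 8-bit unsigned integer, and Z* is a null-terminated string. A minimal sketch that attaches labels to the unpacked values; the labels in FIELD_NAMES and the name decode_record are hypothetical, chosen only for illustration:

```ruby
# Hypothetical labels for the six values produced by "n2CZ*Z*n".
FIELD_NAMES = [:first, :second, :pairs, :name, :value, :last]

# Returns a Hash mapping each label to its decoded value.
def decode_record(data)
  Hash[FIELD_NAMES.zip(data.unpack("n2CZ*Z*n"))]
end

# pack with the same format rebuilds the sample bytes (Z* appends the NUL).
data = [3, 0, 1, "BFO", "-8000", 0].pack("n2CZ*Z*n")
record = decode_record(data)
p record[:name], record[:value].to_i, record[:last]
```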
 
Matt Brooks


7stud, wow, that is an elegant way! Thank you for your time. I
appreciate both responses very much. The first one clued me in to using
unpack... correctly.

I was just now excited to get my very verbose way to work, but I will
probably switch to how you are doing it. Just for comment, here is what
I just got working...


#prints first 5 bytes formatted nicely
writelog("Sub Element Message: #{bin_msg.inspect_detailed}")

#How many pairs to iterate through
writelog("Pairs to get: #{bin_msg.pairs}")

times_iterated = 0
while times_iterated < bin_msg.pairs

  #Get Value Name, variable length
  current_unpacked_char = bin_msg.optional_sub_elements.unpack("Z*")
  #Print it
  writelog("\n\nParam Value Name = #{current_unpacked_char}\n")
  #Remove Name from front of string
  bin_msg.optional_sub_elements =
    bin_msg.optional_sub_elements.sub(/^#{current_unpacked_char}\000/, '')

  #Get Value, variable length
  current_unpacked_char = bin_msg.optional_sub_elements.unpack("Z*")
  #Print it
  writelog("\n\nParam Value = #{current_unpacked_char}\n")
  #Remove Value from front of string
  bin_msg.optional_sub_elements =
    bin_msg.optional_sub_elements.sub(/^#{current_unpacked_char}\000/, '')

  #Get Last two byte value
  current_unpacked_char = bin_msg.optional_sub_elements.unpack("n")
  #Print it
  writelog("\n\nAudio Inversion Frequency = #{current_unpacked_char}\n")
  current_packed_char = current_unpacked_char.pack("n")
  #Remove Last two byte value
  bin_msg.optional_sub_elements =
    bin_msg.optional_sub_elements.sub(/^#{current_packed_char}/, '')

  times_iterated += 1
end
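The loop above can be condensed by unpacking one whole record at a time and removing it by byte count instead of by regex, which also sidesteps any regex metacharacters that happen to appear in the data. A minimal sketch, assuming (as the loop does) that every pair is followed by its own 2-byte field; read_pairs! and the hash keys are hypothetical names:

```ruby
# Hypothetical helper: reads `pairs` records of (name, value, uint16) from
# the front of buf, removing each record's bytes as it goes.
def read_pairs!(buf, pairs)
  Array.new(pairs) do
    name, value, trailer = buf.unpack("Z*Z*n")
    raise ArgumentError, "truncated record" if trailer.nil?
    # Each string occupies its own bytes plus one NUL; the trailer is 2 bytes.
    buf.slice!(0, name.bytesize + 1 + value.bytesize + 1 + 2)
    { :name => name, :value => value, :trailer => trailer }
  end
end

buf = ["BFO", "-8000", 0].pack("Z*Z*n") + "remaining data"
read_pairs!(buf, 1).each do |rec|
  puts "Param Value Name = #{rec[:name]}"
  puts "Param Value = #{rec[:value]}"
  puts "Last 2 byte field = #{rec[:trailer]}"
end
p buf  # prints "remaining data"
```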
 
7stud --

Matt said:
Last 2 byte field: 0


At this point I need the sub_elements string to no longer contain any of
the parameters printed above, i.e. whatever I display is done with
and needs to be deleted from the front of the string.

You could do that simply like this:

chop_size = 2 +
2 +
1 +
results[3].length + 1 +
results[4].length + 1 +
1

puts chop_size

--output:--
16


For example:


data = ""
data << 0.chr << 3.chr
data << 0.chr << 0.chr
data << 1.chr
data << "BFO" << 0.chr
data << "-8000" << "\000"
data << 0.chr << 0.chr
data << "the rest"


results = data.unpack("n2CZ*Z*n")

chop_size = 2 +
2 +
1 +
results[3].length + 1 +
results[4].length + 1 +
1

puts chop_size

--output:--
16


puts "-->#{data[chop_size..-1]}<---"

--output:--
-->the rest<---
 
7stud --

7stud said:
You could do that simply like this:

chop_size = 2 +
2 +
1 +
results[3].length + 1 +
results[4].length + 1 +
1

Whoops. The last field is 2 bytes.

chop_size = 2 +
            2 +
            1 +
            results[3].length + 1 +
            results[4].length + 1 +
            2
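As a cross-check on that arithmetic, the byte count can also be derived by packing the unpacked fields again with the same format, since pack's Z* puts each NUL terminator back. A minimal sketch, assuming the record is complete (a truncated record yields nil fields and pack would raise); RECORD_FORMAT and split_record are hypothetical names:

```ruby
RECORD_FORMAT = "n2CZ*Z*n"

# Hypothetical helper: returns [fields, rest], where rest is whatever
# follows the record. Re-packing the fields reproduces the record's bytes,
# so the packed size is the number of bytes the record occupied.
def split_record(data)
  fields = data.unpack(RECORD_FORMAT)
  consumed = fields.pack(RECORD_FORMAT).bytesize
  [fields, data[consumed..-1]]
end

data = [3, 0, 1, "BFO", "-8000", 0].pack(RECORD_FORMAT) + "the rest"
fields, rest = split_record(data)
p fields  # prints [3, 0, 1, "BFO", "-8000", 0]
p rest    # prints "the rest"
```

For this sample the record is 17 bytes (2 + 2 + 1 + 4 + 6 + 2), which matches the corrected chop_size.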
 
Matt Brooks

Thanks 7stud, I like that way. Very nice, I will be using this!!!
Slowly Ruby is growing on me!
-Matt
 
