confusing string parsing problem

Matt Brooks · Sep 2, 2009

I don't understand how to do the following; any help is much
appreciated. I have been programming in Ruby 1.8.7 for about a month.

I have a string that has in it:
2 byte UINT16
2 byte UINT16
1 byte UINT8 representing how many pairs of the next 2 strings there are
variable length string1 <- Although for now I know this always equals
BFO
variable length string2
2 byte UINT16

I want to display each value from the fields, while also removing it
from the string as I read it out. I believe the strings are terminated
with a 0, so the null character also needs to be removed to parse the
following fields correctly. Part of my problem is that I don't
understand how to parse until a null character is found. Is this \0 like
in C? Or is it \000, as I have seen in some places? Or 0.to_chr?

The string comes from a "rest" variable-length field from the bit-struct
class. There was data earlier in the rest field that I have already
processed and taken off the front of the string, and after this point
there is more data, but for now I am interested in this block of data.

Something else that could help: the total length of the whole block is
known, i.e. the 7 bytes plus however many bytes the variable-length
strings take.

My attempt is horrible, I know, and I don't think it really runs.
writelog() is a method that displays my value...

iterations_to_do = (bin_msg.element_length - 7) - 4
temparray = []
j = 0
if /^BFO/ =~ bin_msg.optional_sub_elements
writelog("\n\n Parameter Name = BFO")
bin_msg.optional_sub_elements.sub!(/^BFO/, '')

while j < iterations_to_do
bin_msg.optional_sub_elements.each_byte do |character|
unless character == 0.chr
temparray << character
j += 1
end
end
end
#here print first j number of elements in temparray?

Thanks again,
Matt

Matt Brooks · Sep 2, 2009

Also forgot to mention: the 2nd variable-length string happens to be
formatted as -8000 to 8000, so I have to read the sign and a variable
number of digits, as ASCII characters rather than as an int.

Thanks again,
Matt

brabuhr · Sep 2, 2009

I don't understand how to do the following, any help is very
appreciated. I have been programming in ruby 1.8.7 for about a month.

I have a string that has in it:
2 byte UINT16
2 byte UINT16
1 byte UINT8 representing how many pairs of the next 2 strings there are
variable length string1 <- Although for now I know this always equals
BFO
variable length string2
2 byte UINT16

I want to display each value from the fields, while also removing it
from the string as I read it out...

My attempt is horrible I know, and I don't think it really runs.

Can you post an example of your data and code with the incorrect
output and expected output?

cat z.rb

string = ""
string << 0.chr << 1.chr
string << 0.chr << 2.chr
string << 2.chr
string << "foo" << "BFO"
string << "bar" << "BFO"
string << 0.chr << 3.chr

p string

p string[0..1], string[0..1].unpack('n')
p string[2..3], string[2..3].unpack('n')
p string[0..3].unpack('n2')

p string[4..4], string[4..4].unpack('C')
p string[0..4].unpack('n2C')

p string[5..-2].split(/BFO/)

ruby -v z.rb

ruby 1.8.7 (2008-08-11 patchlevel 72) [i486-linux]
"\000\001\000\002\002fooBFObarBFO\000\003"
"\000\001"
[1]
"\000\002"
[2]
[1, 2]
"\002"
[2]
[1, 2, 2]
["foo", "bar", "\000"]
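
On the null-character question from the original post: in a double-quoted Ruby string, "\0" and "\000" are escapes for the same single byte, and 0.chr builds that byte from an integer (there is no 0.to_chr). A minimal sketch, assuming the fields really are null terminated; the sample bytes and the name buf are made up for illustration:

```ruby
# Three spellings of one and the same zero byte.
nul  = "\0"
same = (nul == "\000") && (nul == 0.chr) && (nul.length == 1)

# Hypothetical buffer: two null-terminated strings followed by other data.
buf = "BFO\0-8000\0tail".b

# Cut the first field off the front, up to (not including) the terminator.
cut   = buf.index("\0")      # position of the first zero byte
first = buf.slice!(0, cut)   # => "BFO", now removed from buf
buf.slice!(0, 1)             # drop the terminator itself

p same, first, buf
```

After this runs, buf starts at the second string, so the same steps can be repeated for the next field.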

Matt Brooks · Sep 2, 2009

Thanks a lot for your reply. I will look into it and see if I can
implement it using unpack. In the meantime, this is a sample output...

My current output is:
1st 2 byte field = 3
2nd 2 byte field = 0
Num Pairs = 1
sub_elements =
"BFO\0000\000\000\000\001\000\000\0005\000\003\000\000\000\000\000\000\000\000\000\001\002\000\000\312+\335V\353\262\000\001\001Internal...
and so on for hundreds of bytes more...
I, [2009-09-02 11:56:10 #4864] INFO -- : Pairs to get: 1
I, [2009-09-02 11:56:10 #4864] INFO -- : Param Value Name: BFO

What I want the output to be:
I, [2009-09-02 11:56:10 #4864] INFO -- : Pairs to get: 1
I, [2009-09-02 11:56:10 #4864] INFO -- : Param Value Name: BFO
I, [2009-09-02 11:56:10 #4864] INFO -- : String Value: whatever the string is till it is null terminated
I, [2009-09-02 11:56:10 #4864] INFO -- : Last 2 byte field: 0

At this point I need the sub_elements string to no longer contain any
of the parameters printed above, i.e. whatever I display is done with
and needs to be deleted from the front of the string.

My code (I was redoing it, and have only got up to this point):
#prints first 5 bytes formatted nicely
writelog("#{bin_msg.inspect_detailed})")

#How many pairs to expect
writelog("Pairs to get: #{bin_msg.num_optional_mod_params}")

#Remove BFO while printing it as well
writelog("Param Value Name: #{bin_msg.sub_elements.slice!(/^BFO/)}")

#THIS DOESN'T WORK!!! Needs to scan for possible minus sign, plus up to
#variable length number of digits, up to the null termination of the
#string
bin_msg.sub_elements.scan(/-?[0-9]+\Z/)

#Needs to then print the next 2 bytes while taking that out of the
#string as well

Matt Brooks · Sep 2, 2009

Also, I never made it clear, but the string is in network byte order
(big-endian).

7stud -- · Sep 2, 2009

Matt said:
Also, I never made it clear but the string is in network byte order (big
Endian)

data = ""

data << 0.chr << 3.chr #2 bytes
data << 0.chr << 0.chr #2 bytes
data << 1.chr #1 byte
data << "BFO" << 0.chr #null terminated string
data << "-8000" << "\000" #null terminated string
data << 0.chr << 0.chr #2 bytes

p data

--output:--
"\000\003\000\000\001BFO\000-8000\000\000\000"

results = data.unpack("n2CZ*Z*n")
p results

--output:--
[3, 0, 1, "BFO", "-8000", 0]
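
For reference, the format string reads left to right: n2 is two big-endian 16-bit unsigned integers, C is one unsigned byte, each Z* is a string up to the next null (the null is consumed), and the final n is another 16-bit unsigned integer. A sketch that names the fields and turns the ASCII value into a number; the variable names are made up for illustration, and it assumes exactly one name/value pair:

```ruby
# Same sample bytes as above, built with pack instead of repeated 0.chr.
data = [3, 0, 1].pack("n2C") + "BFO\0" + "-8000\0" + [0].pack("n")

# Hypothetical variable names; the thread does not say what the fields mean.
first, second, pairs, name, value, last = data.unpack("n2CZ*Z*n")

# The value arrives as ASCII text (-8000 to 8000), so convert it explicitly.
# Integer() raises ArgumentError on anything that is not a plain number.
number = Integer(value, 10)

p [first, second, pairs, name, number, last]
```

Integer() is used here rather than to_i so that a malformed value fails loudly instead of silently becoming 0.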

Matt Brooks · Sep 2, 2009

7stud, wow, that is an elegant way! Thank you for your time. I do
appreciate both responses very much. The first one clued me into using
unpack... correctly.

I was just now excited to get my very verbose way to work; I will
probably switch to how you are doing it. Just for comment, here is what
I just got working...

#prints first 5 bytes formatted nicely
writelog("Sub Element Message: #{bin_msg.inspect_detailed})")

#How many pairs to iterate through
writelog("Pairs to get: #{bin_msg.pairs}")

times_iterated = 0
while times_iterated < bin_msg.pairs

  #Get Value Name, variable length
  current_unpacked_char = bin_msg.optional_sub_elements.unpack("Z*")
  #Print it
  writelog("\n\nParam Value Name = #{current_unpacked_char}\n")
  #Remove Name from front of string
  bin_msg.optional_sub_elements =
    bin_msg.optional_sub_elements.sub(/^#{current_unpacked_char}\000/, '')

  #Get Value, variable length
  current_unpacked_char = bin_msg.optional_sub_elements.unpack("Z*")
  #Print it
  writelog("\n\nParam Value = #{current_unpacked_char}\n")
  #Remove Value from front of string
  bin_msg.optional_sub_elements =
    bin_msg.optional_sub_elements.sub(/^#{current_unpacked_char}\000/, '')

  #Get Last two byte value
  current_unpacked_char = bin_msg.optional_sub_elements.unpack("n")
  #Print it
  writelog("\n\nAudio Inversion Frequency = #{current_unpacked_char}\n")
  current_packed_char = current_unpacked_char.pack("n")
  #Remove Last two byte value
  bin_msg.optional_sub_elements =
    bin_msg.optional_sub_elements.sub(/^#{current_packed_char}/, '')

  times_iterated += 1
end
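
One caveat with the sub(/^#{...}\000/, '') approach: the interpolated text is treated as a regular expression, so a value containing characters such as + or . could match differently than intended. A hedged alternative is to remove each field by its byte length. The sketch below works on a plain string rather than the bit-struct object, and take_cstring and take_uint16 are hypothetical helper names, not part of any library:

```ruby
# Remove one null-terminated string from the front of buf and return it.
def take_cstring(buf)
  str = buf.unpack("Z*").first
  buf.slice!(0, str.bytesize + 1)   # the string plus its terminator
  str
end

# Remove one big-endian 16-bit unsigned integer from the front of buf.
def take_uint16(buf)
  buf.slice!(0, 2).unpack("n").first
end

# Made-up sample: one name/value pair, a 2-byte field, then more data.
buf = ("BFO\0" + "-8000\0" + [0].pack("n") + "more").b

name  = take_cstring(buf)
value = take_cstring(buf)
freq  = take_uint16(buf)

p name, value, freq, buf
```

Because each helper shortens buf in place, the calls can sit inside the same while loop, once per pair.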

7stud -- · Sep 2, 2009

Matt said:
Last 2 byte field: 0

And At this point I need the sub_elements string to not contain any of
the above parameters printed out, ie. whatever I display is done with
and needs to be deleted out of the front of the string.

You could do that simply like this:

chop_size = 2 +
2 +
1 +
results[3].length + 1 +
results[4].length + 1 +
1

puts chop_size

--output:--
16

For example:

data = ""
data << 0.chr << 3.chr
data << 0.chr << 0.chr
data << 1.chr
data << "BFO" << 0.chr
data << "-8000" << "\000"
data << 0.chr << 0.chr
data << "the rest"

results = data.unpack("n2CZ*Z*n")

chop_size = 2 +
2 +
1 +
results[3].length + 1 +
results[4].length + 1 +
1

puts chop_size

--output:--
16

puts "-->#{data[chop_size..-1]}<---"

--output:--
-->the rest<---

7stud -- · Sep 2, 2009

7stud said:
You could do that simply like this:

chop_size = 2 +
2 +
1 +
results[3].length + 1 +
results[4].length + 1 +
1

Whoops. The last field is 2 bytes.

chop_size = 2 +
2 +
1 +
results[3].length + 1 +
results[4].length + 1 +
2
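
Putting the corrected size together with the earlier unpack call, here is a sketch that returns both the parsed fields and whatever follows them. parse_block is a hypothetical name, and it assumes exactly one name/value pair as in the example:

```ruby
# Parse one block from the front of data; return [fields, remaining bytes].
def parse_block(data)
  fields = data.unpack("n2CZ*Z*n")
  chop_size = 2 + 2 + 1 +
              fields[3].length + 1 +   # name plus its null
              fields[4].length + 1 +   # value plus its null
              2                        # trailing 16-bit field
  [fields, data[chop_size..-1]]
end

data = [3, 0, 1].pack("n2C") + "BFO\0" + "-8000\0" + [0].pack("n") + "the rest"
fields, rest = parse_block(data)

p fields   # => [3, 0, 1, "BFO", "-8000", 0]
p rest     # => "the rest"
```

With the 2-byte final field counted, chop_size comes to 17 for this sample, so nothing from the block is left at the front of the remainder.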

Matt Brooks · Sep 2, 2009

Thanks 7stud, I like that way. Very nice, I will be using this!!!
Slowly Ruby is growing on me!
-Matt
