problem with logic in reading a binary file

Bryan.Fodness

Hello,

I am having trouble writing the code to read a binary string. I would
like to extract the values for use in a calculation.

Any help would be great.

Here is my function that takes in a string.

def parseSequence(data, start):

    group_num = data[start:start+2]
    element_num = data[start+2:start+4]
    vl_field = data[start+4:start+8]
    length = struct.unpack('hh', vl_field)[0]
    value = data[start+8:(start+8+length)]
    pos = start+8+length
    element = (group_num+element_num)

    if element == '\xfe\xff\x00\xe0':
        data = value

        while start < length:
            group_num = data[start:start+2]
            element_num = data[start+2:start+4]
            vl_field = data[start+4:start+8]
            length = struct.unpack('hh', vl_field)[0]
            value = data[start+8:(start+8+length)]
            start = start+8+length
            element = (group_num+element_num)

            if element == '\xfe\xff\x00\xe0':
                data = value

                while start < length:
                    group_num = data[start:start+2]
                    element_num = data[start+2:start+4]
                    vl_field = data[start+4:start+8]
                    length = struct.unpack('hh', vl_field)[0]
                    value = data[start+8:(start+8+length)]
                    start = start+8+length
                    element = (group_num+element_num)
                    return element, start, value

            else:
                return element, start, value

    else:
        return element, pos, value

And, here is a sample string (I have split up and indented for
readability). There is an identifier (\xfe\xff\x00\xe0) followed by
the length of the nested values.


'\xfe\xff\x00\xe0\x18\x02\x00\x00     -length=536
     \n0q\x00\x02\x00\x00\x001
     \n0x\x00\x02\x00\x00\x0010
     \n0\x80\x00\x02\x00\x00\x004
     \n0\xa0\x00\x02\x00\x00\x000
     \x0c0\x04\x00\xe8\x01\x00\x00
     \xfe\xff\x00\xe0p\x00\x00\x00     -length=112
          \n0\x82\x002\x00\x00\x0042.9068704277562\\-392.3545926477\\189.182112099444
          \n0\x84\x00\x0c\x00\x00\x008.9617062e-1
          \n0\x86\x00\x10\x00\x00\x00127.378510918301
          \x0c0\x06\x00\x02\x00\x00\x001
     \xfe\xff\x00\xe0p\x00\x00\x00     -length=112
          \n0\x82\x002\x00\x00\x0042.9068704277562\\-392.3545926477\\189.182112099444
          \n0\x84\x00\x0c\x00\x00\x001.629998e-1
          \n0\x86\x00\x10\x00\x00\x0023.159729257873
          \x0c0\x06\x00\x02\x00\x00\x004
     \xfe\xff\x00\xe0t\x00\x00\x00     -length=116
          \n0\x82\x002\x00\x00\x0042.9068704277562\\-392.3545926477\\189.182112099444
          \n0\x84\x00\x10\x00\x00\x001.26285318894435
          \n0\x86\x00\x10\x00\x00\x00227.690980638769
          \x0c0\x06\x00\x02\x00\x00\x003
     \xfe\xff\x00\xe0t\x00\x00\x00     -length=116
          \n0\x82\x002\x00\x00\x0042.9068704277562\\-392.3545926477\\189.182112099444
          \n0\x84\x00\x10\x00\x00\x001.52797639111557
          \n0\x86\x00\x10\x00\x00\x00263.433384670643
          \x0c0\x06\x00\x02\x00\x00\x002 ')
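
For illustration, the 8-byte header that opens the sample string can be decoded in one call. This is only a sketch: it assumes the data is little-endian and that each header is two unsigned 16-bit numbers followed by an unsigned 32-bit length, which is what the "-length=536" annotation suggests. The format string '<HHL' is a choice made for this example, not something taken from the function above.

```python
import struct

# First eight bytes of the sample string: identifier plus length field.
header = b'\xfe\xff\x00\xe0\x18\x02\x00\x00'

# '<' = little-endian with standard sizes, 'H' = unsigned 16-bit,
# 'L' = unsigned 32-bit. The whole header is read in a single call.
group, element, length = struct.unpack('<HHL', header)

print(hex(group), hex(element), length)  # 0xfffe 0xe000 536
```

Reading the length as one 32-bit value avoids depending on the first of two signed 16-bit halves, which would only hold lengths below 32768.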
 
Gary Herron

Without having looked at your code in any detail, may I humbly suggest
that you throw it all out and use the struct module:

http://docs.python.org/lib/module-struct.html

It is meant to solve this kind of problem, and it is quite easy to use.
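
A minimal sketch of what a struct-based reader for this kind of data might look like. It assumes little-endian headers of the form (group, element, 32-bit length), explicit lengths only, and it uses a heuristic that is not from the thread: a value is treated as nested if it is an item (\xfe\xff\x00\xe0) or if it starts with that identifier. The names parse_elements and parse_nested are hypothetical.

```python
import struct

ITEM_MARKER = b'\xfe\xff\x00\xe0'   # identifier from the original post
ITEM_TAG = (0xFFFE, 0xE000)          # the same identifier as two numbers

def parse_elements(data):
    """Yield (group, element, value) for each element laid end to end in data."""
    pos = 0
    while pos + 8 <= len(data):
        # Read the 8-byte header at the current offset.
        group, element, length = struct.unpack_from('<HHL', data, pos)
        pos += 8
        value = data[pos:pos + length]
        pos += length                # step past the value to the next header
        yield group, element, value

def parse_nested(data):
    """Return a list of (group, element, value); nested values become lists."""
    result = []
    for group, element, value in parse_elements(data):
        # Heuristic (an assumption): recurse into items and into values
        # that themselves begin with the item identifier.
        if (group, element) == ITEM_TAG or value[:4] == ITEM_MARKER:
            result.append((group, element, parse_nested(value)))
        else:
            result.append((group, element, value))
    return result
```

Because the recursion handles any depth, there is no need to repeat the header-reading code once per nesting level, and the loop keeps going until the data is exhausted instead of returning after the first element. A value that merely happens to start with the identifier bytes would be misread as nested, so a real reader would need a better rule.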

Gary Herron
 
castironpi

Binaries can come from computers as from people. Synth sound &
graphics. Start structuring primitive binaries: What operation can
you run in real-time?

I would probably have to learn natural language to make any sense of
his keystrokes. Designing interface-first, you want another person to
be pressing keys. Can we get Simon to teach us a couple distinct
patterns? (That's we teach it; (that means:); no words: that's
faster.) Get a couple ring tones, customiz-ing-, and you play a
game.)

Multi-pad consolled the PC. Can we give keystrokes its own thread?
Sadly, our first one: get spatial delay timing up to speed. The
sturdy keys (the discretes) have whisper roger that. Watch moving
target? over raise riggings.
 
hdante

I'm too lazy to debug your binary string, but I suggest that you
completely throw away the binary file and restart with a database or
structured text. See, for example:

http://pyyaml.org/wiki/PyYAML

If you have some legacy binary file that you need to process, try
creating a C program that freads the binary file and printfs a text
equivalent.

If the decision of using binary files is not yours, then

 
John Machin

.... and that couldn't be done faster and better in Python??
 
hdante

No. A C struct is done faster and better than python (thus, the
correctness check is faster in C). Also, chances are high that there's
already an include file with the binary structure.
 
Diez B. Roggisch

hdante said:
No. A C struct is done faster and better than python (thus, the
correctness check is faster in C). Also, chances are high that there's
already an include file with the binary structure.

That is utter nonsense. There is no "correctness check" in C, and using
printf and thus creating strings that you then need to parse in Python
just doubles the effort needlessly.

The standard-lib module "struct" is exactly what you need, nothing else.
It sure is faster than any parsing of preprocessed data, doesn't
introduce a language mixture, and is prototyped and tested much faster
because it is Python and needs no C compiler or C debugger.

Alternatively, *IF* there were C-structure-declarations available for
the binary format, the usage of ctypes would allow for roughly the same,
even reducing the effort to create the structure definition a great deal.
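
As a rough illustration of the ctypes route, here is a sketch that describes the 8-byte header from the original post as a structure. The class name ElementHeader and its field names are made up for this example, and the little-endian layout is an assumption based on the sample data, not on any real include file.

```python
import ctypes

class ElementHeader(ctypes.LittleEndianStructure):
    # Hypothetical layout: two 16-bit numbers and a 32-bit length.
    # These fields align naturally, so no padding is inserted.
    _fields_ = [
        ("group", ctypes.c_uint16),
        ("element", ctypes.c_uint16),
        ("length", ctypes.c_uint32),
    ]

raw = b'\xfe\xff\x00\xe0\x18\x02\x00\x00'

# Copy the bytes into a new structure instance and read the fields by name.
header = ElementHeader.from_buffer_copy(raw)
print(hex(header.group), hex(header.element), header.length)
```

The structure only covers the fixed-size header; the variable-length value that follows still has to be sliced out using header.length.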

Diez
 
hdante

Whatever you say.
 
Jorgen Grahn

hdante said:
No. A C struct is done faster and better than python (thus, the
correctness check is faster in C). Also, chances are high that there's
already an include file with the binary structure.

If a C struct defines the file format, he is probably screwed already.
There are no guarantees that even different compilers on the same
machine have the same struct layout.
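
Python's struct module can be used to get a feel for the padding issue. This sketch compares its "native" mode, which follows the local C compiler's sizes and alignment, with its "standard" mode, which has fixed sizes and no padding. It is only an analogy for the point about compilers, not a test of any particular one.

```python
import struct

# '@' = native sizes and alignment, as the local C compiler would lay out
# a struct holding an unsigned short followed by an unsigned long.
native = struct.calcsize('@HL')

# '<' = little-endian, standard sizes, no padding: always 2 + 4 bytes.
standard = struct.calcsize('<HL')

# The native size depends on the platform; the standard size is always 6.
print(native, standard)
```

A file format defined in terms of the standard layout reads the same everywhere, while one defined by dumping a native struct depends on whatever padding the writing platform chose.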

I have never seen this done by a serious program.

/Jorgen
 
