Reading unformatted big-endian files

Andrea Gavana

Hello NG,

This may sound like a silly question, but I couldn't find anything really
clear about reading unformatted big-endian files with Python. Until now
I have been reading those files in Fortran and compiling the Fortran
extension with F2PY. Now that it seems that no possible combination of
Fortran/C compilers actually *works* with Python 2.4 on Windows XP, I am
trying to translate the Fortran subroutine to Python. Basically, what I
do (in Fortran; I hope the comments explain the code clearly) is:

! Declare an integer
integer number

! Declare a 4-chars character
character*4 keytype

! Declare a 8-chars character
character*8 keyword

! feof is not very important here
logical feof
feof = .false.

! Open the file as unformatted big-endian
open(unit = 1, file = filename, form = 'UNFORMATTED', convert = 'BIG_ENDIAN')

! loop until you find a particular keyword
! here "end=10" means that if the routine finds the EOF, it should go to
! the label "10 continue". "err=8" means that, if an error occurs in
! reading the file, it should go to the label "8 continue" and continue
! reading the file

do while(.not.feof)

    ! Read the 3 variables keyword, number and keytype
    read(1, end=10, err=8) keyword, number, keytype

    ! If the keyword is 'DIMENS', break the loop and go to the end
    if (keyword == 'DIMENS') then
        read(1, end=10, err=8) dimens
        goto 10
    endif
8   continue
enddo

10 continue

! Close the file
close(1)


Well, does anyone have suggestions about what kind of material/tutorial
on similar things I should read? How can I deal in Python with variables
that must be 8 or 4 characters long in order to read the file correctly?
Am I missing something else?

Thank you very much for every suggestion.

Andrea.
 
John Machin

Andrea said:
"err=8" means that, if an error occurs in reading the file,
it should go to the label "8 continue" and continue reading the file

Silently ignoring errors when reading a file doesn't sound like a good
idea to me at all, especially if different records have different
formats.

Andrea said:
Well, does anyone have suggestions about what kind of material/tutorial
on similar things I should read? How can I deal in Python with variables
that must be 8 or 4 characters long in order to read the file correctly?

(a) read the docs on the struct module
(b) eyeball this rough untested translation:
8<---
def filereader(filename):
    import struct
    f = open(filename, 'rb')  # 'rb' is read binary, very similar to C stdio
    fmt = '>8si4s'
    # Assuming unformatted means binary,
    # and integer means integer*4, which is signed.
    # Also assuming that the 3-variable records are fixed-length.
    fmtsz = struct.calcsize(fmt)
    while True:
        buff = f.read(fmtsz)
        if len(buff) < fmtsz:  # EOF (or a truncated final record)
            break
        keyword, number, keytype = struct.unpack(fmt, buff)
        keyword = keyword.rstrip()  # remove trailing spaces
        keytype = keytype.rstrip()
        if keyword == 'DIMENS':
            # 'dimens' is neither declared nor initialised in the FORTRAN
            # so I'm just guessing here ...
            buff2 = f.read(4)
            dimens = struct.unpack('>i', buff2)[0]  # unpack returns a tuple
            break
        print keyword, number, keytype  # or whatever
    # reached end of file (dimens *NOT* defined),
    # or gave up (dimens should have a value)
    f.close()  # not absolutely necessary especially when only reading

if __name__ == "__main__":
    import sys
    filereader(sys.argv[1])
8<---
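
One caveat on the translation above: it assumes the 16-byte
keyword/number/keytype records sit back to back in the file. Many
Fortran compilers instead frame every record of a sequential unformatted
file with a length marker before and after the payload, commonly a
4-byte integer. Whether this particular file is framed that way, and
whether the markers are big-endian like the data, are assumptions that a
dump of the file would confirm. Under those assumptions, a record-aware
reader could be sketched as follows. It is written for Python 3 (where
read() returns bytes); the names read_records, parse_header and frame
are made up for illustration, as are the sample values and the idea that
"number" counts the integers in the next record.

```python
import io
import struct

HEADER_FMT = '>8si4s'   # 8-char keyword, integer*4, 4-char keytype (big-endian)
MARKER_FMT = '>i'       # assumed: 4-byte big-endian record-length marker


def read_records(f):
    """Yield the payload of each record, checking both length markers."""
    marker_size = struct.calcsize(MARKER_FMT)
    while True:
        head = f.read(marker_size)
        if not head:
            return                      # clean end of file
        if len(head) < marker_size:
            raise ValueError('truncated leading marker')
        (length,) = struct.unpack(MARKER_FMT, head)
        if length < 0:
            raise ValueError('negative record length')
        payload = f.read(length)
        tail = f.read(marker_size)
        if len(payload) < length or len(tail) < marker_size:
            raise ValueError('truncated record')
        if struct.unpack(MARKER_FMT, tail)[0] != length:
            raise ValueError('leading and trailing markers disagree')
        yield payload


def parse_header(payload):
    """Split a 16-byte header record into (keyword, number, keytype)."""
    keyword, number, keytype = struct.unpack(HEADER_FMT, payload)
    return keyword.rstrip(), number, keytype.rstrip()


def frame(payload):
    """Wrap a payload in markers; used only to build the made-up sample."""
    marker = struct.pack(MARKER_FMT, len(payload))
    return marker + payload + marker


# Made-up sample: a DIMENS header record followed by a record of 3 integers.
sample = (frame(struct.pack(HEADER_FMT, b'DIMENS  ', 3, b'INTE')) +
          frame(struct.pack('>3i', 10, 20, 5)))

records = read_records(io.BytesIO(sample))
keyword, number, keytype = parse_header(next(records))
if keyword == b'DIMENS':
    # Assumption: 'number' is the count of integers in the next record.
    dimens = struct.unpack('>%di' % number, next(records))
    print(keyword, number, keytype, dimens)
```

Unlike err=8 in the Fortran, this raises on a short or inconsistent
record instead of skipping it, so a wrong guess about the layout shows
up immediately. For a real file, pass open(filename, 'rb') in place of
the io.BytesIO object.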

If this doesn't work, and it's not obvious how to fix it, it might be a
good idea when you ask again to supply a FORTRAN-independent layout of
the file, and/or a dump of a short test file that includes the
DIMENS/dimens caper. You can get such a dump readily with the *nix od
command or, failing that, use Python:

>>> repr(open('thetestfile', 'rb').read(100))  # yes, I said *short*
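
If the repr() output is hard to read, a dump with offsets, hex bytes and
printable characters side by side can be produced in a few lines. This
is only a sketch for Python 3; hexdump is a made-up helper name, not a
standard library function.

```python
def hexdump(data, width=16):
    """Return an offset / hex / ASCII dump of a bytes object as a string."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        # Each byte as two hex digits, separated by spaces.
        hex_part = ' '.join('%02x' % b for b in chunk)
        # Printable ASCII as-is, everything else as a dot.
        text_part = ''.join(chr(b) if 32 <= b < 127 else '.' for b in chunk)
        # Pad the hex column so the text column lines up on short rows.
        lines.append('%08x  %-*s  %s' % (offset, width * 3 - 1,
                                         hex_part, text_part))
    return '\n'.join(lines)


if __name__ == '__main__':
    import sys
    with open(sys.argv[1], 'rb') as f:
        print(hexdump(f.read(100)))   # short, as requested
```

In such a dump, a repeated 4-byte value just before and after each
16-byte group would suggest that the file carries record-length markers.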

HTH,
John
 
