Reading binary data

David M

OK so here is my task. I want to get at the data stored in
/var/account/pacct, which stores process accounting data, so that I can
make it into a more human-readable format than what the program
sa can do. The thing is, it's in a binary format, and an example program
that reads some data from the file is done in C using a struct defined
in sys/acct.h.

http://www.linuxjournal.com/articles/lj/0104/6144/6144l2.html

So I was wondering how I can do the same thing, but in Python? I'm
still learning, so please be gentle.

David
 
Fredrik Lundh

outline:

1. load the data

f = open(filename, "rb")

data = f.read()

2. parse it:

http://docs.python.org/lib/module-struct.html
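
The two steps above can be sketched roughly as follows. This is only an illustration: the helper name `iter_records` and the tiny `"<BH"` format are made up for the demo and are not the real accounting record layout.

```python
import io
import struct

def iter_records(fileobj, fmt):
    """Yield one unpacked tuple per fixed-size record in a binary file."""
    size = struct.calcsize(fmt)
    while True:
        chunk = fileobj.read(size)
        if len(chunk) < size:
            # End of file (or a truncated trailing record): stop.
            break
        yield struct.unpack(fmt, chunk)

# Demo on in-memory data: two records of (unsigned char, unsigned short),
# little-endian, no padding. The format is hypothetical.
sample = struct.pack("<BH", 1, 2) + struct.pack("<BH", 3, 4)
for record in iter_records(io.BytesIO(sample), "<BH"):
    print(record)
```

For a real file you would pass `open(filename, "rb")` instead of the `BytesIO` object, along with a format string that matches the C struct.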

</F>
 
Grant Edwards

use the struct module

http://www.python.org/doc/current/lib/module-struct.html
 
David M

Thanks but the C Struct describing the data doesn't match up with the
list on the module-struct page.

this is the acct.h file

#ifndef _SYS_ACCT_H
#define _SYS_ACCT_H 1

#include <features.h>

#define __need_time_t
#include <time.h>
#include <sys/types.h>

__BEGIN_DECLS

#define ACCT_COMM 16

/*
comp_t is a 16-bit "floating" point number with a 3-bit base 8
exponent and a 13-bit fraction. See linux/kernel/acct.c for the
specific encoding system used.
*/

typedef u_int16_t comp_t;

struct acct
{
  char ac_flag; /* Accounting flags. */
  u_int16_t ac_uid; /* Accounting user ID. */
  u_int16_t ac_gid; /* Accounting group ID. */
  u_int16_t ac_tty; /* Controlling tty. */
  u_int32_t ac_btime; /* Beginning time. */
  comp_t ac_utime; /* Accounting user time. */
  comp_t ac_stime; /* Accounting system time. */
  comp_t ac_etime; /* Accounting elapsed time. */
  comp_t ac_mem; /* Accounting average memory usage. */
  comp_t ac_io; /* Accounting chars transferred. */
  comp_t ac_rw; /* Accounting blocks read or written. */
  comp_t ac_minflt; /* Accounting minor pagefaults. */
  comp_t ac_majflt; /* Accounting major pagefaults. */
  comp_t ac_swaps; /* Accounting number of swaps. */
  u_int32_t ac_exitcode; /* Accounting process exitcode. */
  char ac_comm[ACCT_COMM+1]; /* Accounting command name. */
  char ac_pad[10]; /* Accounting padding bytes. */
};

enum
{
  AFORK = 0x01, /* Has executed fork, but no exec. */
  ASU = 0x02, /* Used super-user privileges. */
  ACORE = 0x08, /* Dumped core. */
  AXSIG = 0x10 /* Killed by a signal. */
};

#define AHZ 100


/* Switch process accounting on and off. */
extern int acct (__const char *__filename) __THROW;

__END_DECLS

#endif /* sys/acct.h */

What are u_int16_t and comp_t? And what about the enum section?
 
Grant Edwards

David M said:
Thanks but the C Struct describing the data doesn't match up with the
list on the module-struct page.

Then you're going to have to do some bit-bashing.

David M said:
comp_t is a 16-bit "floating" point number with a 3-bit base 8
exponent and a 13-bit fraction. See linux/kernel/acct.c for the
specific encoding system used.

Yoinks! That's just a bit too clever.

You'll have to implement that yourself.

David M said:
What are u_int16_t and comp_t?

Dunno, I guess you'll have to look at the sources and find the
typedefs.

David M said:
And what about the enum section?

It's probably a 32-bit integer, but it may take some
experimentation to confirm that.
 
Fredrik Lundh

David M said:
What are u_int16_t and comp_t?

comp_t is explained in the file you posted:
/*
comp_t is a 16-bit "floating" point number with a 3-bit base 8
exponent and a 13-bit fraction. See linux/kernel/acct.c for the
specific encoding system used.
*/

typedef u_int16_t comp_t;

as the comment says, comp_t is a 16-bit value. you can read it in as
an integer, but you have to convert it to a floating point according to
the encoding mentioned above.

the typedef says that comp_t is stored as a u_int16_t, which means
that it's a 16-bit value too. judging from the name, and the fields using
it, it's safe to assume that it's an unsigned 16-bit integer.
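
A minimal sketch of that conversion, assuming the 3-bit exponent sits in the top bits of the 16-bit value and the 13-bit fraction in the low bits (that is how the header comment reads, but do check linux/kernel/acct.c for your kernel). The function name `decode_comp_t` is made up here.

```python
def decode_comp_t(value):
    """Expand a 16-bit comp_t into a plain integer.

    Assumed layout: high 3 bits = base-8 exponent, low 13 bits = fraction.
    """
    mantissa = value & 0x1FFF        # low 13 bits
    exponent = (value >> 13) & 0x7   # high 3 bits
    # Each exponent step multiplies by 8, i.e. shifts left by 3 bits.
    return mantissa << (3 * exponent)

print(decode_comp_t(0x0005))  # exponent 0, fraction 5
print(decode_comp_t(0x2001))  # exponent 1, fraction 1
```

The result is still in whatever unit the kernel used. For the time fields that is presumably clock ticks, so dividing by AHZ (100 in the posted header) should give seconds.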

David M said:
And what about the enum section?

it just defines a bunch of symbolic values; AFORK is 1, ASU is 2, etc.
enum
{
  AFORK = 0x01, /* Has executed fork, but no exec. */
  ASU = 0x02, /* Used super-user privileges. */
  ACORE = 0x08, /* Dumped core. */
  AXSIG = 0x10 /* Killed by a signal. */
};
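
Since these are bit values, they can be tested against ac_flag with a bitwise AND. A small sketch (the names `ACCT_FLAGS` and `flag_names` are invented for the example; the values are copied from the enum):

```python
# Bit values copied from the enum in the posted acct.h.
ACCT_FLAGS = {
    0x01: "AFORK",  # has executed fork, but no exec
    0x02: "ASU",    # used super-user privileges
    0x08: "ACORE",  # dumped core
    0x10: "AXSIG",  # killed by a signal
}

def flag_names(ac_flag):
    """Return the names of all flag bits set in ac_flag, lowest bit first."""
    return [name for bit, name in sorted(ACCT_FLAGS.items()) if ac_flag & bit]

print(flag_names(0x03))
```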

at this point, you should be able to do a little experimentation. read in
a couple of bytes (64 bytes should be enough), print them out, and try
to see if you can match the bytes with the description above.

import struct

filename = "/var/account/pacct"  # path from the original question

f = open(filename, "rb")

data = f.read(64)

# hex dump
print(data.hex())

# list of decimal byte values
print(list(data))

# struct test (keep adding type codes until you've sorted everything out)
fmt = "BHHHHHH"
# unpack needs the bytes themselves, cut to the size the format expects
print(struct.unpack(fmt, data[:struct.calcsize(fmt)]))
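
Once the bytes line up, the whole record might be unpacked in one go. The sketch below is an assumption, not a verified layout: the format string is derived from the posted struct with typical C alignment on little-endian Linux ("x" marks padding bytes the compiler would insert), which gives 64 bytes per record. The names `ACCT_FORMAT`, `COMP_FIELDS` and `parse_record` are made up, and your kernel may write a different record version.

```python
import struct

# ASSUMED layout: char + pad, 3 x u16, u32, 9 x comp_t, 2 pad bytes,
# u32, char[17], char[10], 1 trailing pad byte.
ACCT_FORMAT = "<BxHHHI9H2xI17s10sx"
ACCT_SIZE = struct.calcsize(ACCT_FORMAT)

COMP_FIELDS = ("ac_utime", "ac_stime", "ac_etime", "ac_mem", "ac_io",
               "ac_rw", "ac_minflt", "ac_majflt", "ac_swaps")

def parse_record(raw):
    """Unpack one raw record into a dict keyed by the C field names."""
    values = struct.unpack(ACCT_FORMAT, raw)
    record = {
        "ac_flag": values[0],
        "ac_uid": values[1],
        "ac_gid": values[2],
        "ac_tty": values[3],
        "ac_btime": values[4],
    }
    # The nine comp_t fields are left as raw 16-bit integers here.
    record.update(zip(COMP_FIELDS, values[5:14]))
    record["ac_exitcode"] = values[14]
    # The command name is NUL-terminated inside its 17-byte buffer.
    record["ac_comm"] = values[15].split(b"\0", 1)[0].decode("ascii", "replace")
    return record

# Round-trip a synthetic record, since no real pacct data is assumed here.
raw = struct.pack(ACCT_FORMAT, 2, 1000, 100, 0, 1100000000,
                  *range(9), 0, b"ls", b"")
print(ACCT_SIZE, parse_record(raw)["ac_comm"])
```

If the printed uid, gid and command name look wrong on real data, the padding or byte-order assumptions are the first things to revisit.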

</F>
 
Thomas Heller

David M said:
Thanks but the C Struct describing the data doesn't match up with the
list on the module-struct page.

this is the acct.h file
[...]

Tooting my ctypes horn (sorry for that):

thomas@linux:~/ctypes> locate acct.h
/usr/include/linux/acct.h
/usr/include/sys/acct.h
thomas@linux:~/ctypes> python ctypes/wrap/h2xml.py sys/acct.h -o acct.xml
creating xml output file ...
running: gccxml /tmp/tmpSWogJs.cpp -fxml=acct.xml
thomas@linux:~/ctypes> python ctypes/wrap/xml2py.py acct.xml -o acct.py
thomas@linux:~/ctypes> python
Python 2.4.1a0 (#1, Oct 23 2004, 15:48:15)
[GCC 3.3.1 (SuSE Linux)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
thomas@linux:~/ctypes>

But it won't help you to decode/encode the comp_t fields into floats.
Note that the h2xml.py script requires gccxml.
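
For readers without gccxml, a hand-written ctypes equivalent could look roughly like this. It is a sketch transcribed from the posted header, not the actual output of xml2py.py, and the class name `Acct` is made up. ctypes applies the platform's native alignment, so the size should come out as 64 bytes on common platforms.

```python
import ctypes

ACCT_COMM = 16  # from the posted header

class Acct(ctypes.Structure):
    # Hand-written mirror of struct acct; NOT generated by xml2py.py.
    _fields_ = [
        ("ac_flag", ctypes.c_ubyte),   # char in C; c_ubyte so it reads as an int
        ("ac_uid", ctypes.c_uint16),
        ("ac_gid", ctypes.c_uint16),
        ("ac_tty", ctypes.c_uint16),
        ("ac_btime", ctypes.c_uint32),
        ("ac_utime", ctypes.c_uint16),   # comp_t is a u_int16_t
        ("ac_stime", ctypes.c_uint16),
        ("ac_etime", ctypes.c_uint16),
        ("ac_mem", ctypes.c_uint16),
        ("ac_io", ctypes.c_uint16),
        ("ac_rw", ctypes.c_uint16),
        ("ac_minflt", ctypes.c_uint16),
        ("ac_majflt", ctypes.c_uint16),
        ("ac_swaps", ctypes.c_uint16),
        ("ac_exitcode", ctypes.c_uint32),
        ("ac_comm", ctypes.c_char * (ACCT_COMM + 1)),
        ("ac_pad", ctypes.c_char * 10),
    ]

# Build a record from raw bytes (all zeros here, as a stand-in for file data).
record = Acct.from_buffer_copy(bytes(ctypes.sizeof(Acct)))
print(ctypes.sizeof(Acct), record.ac_uid, record.ac_comm)
```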

Thomas
 
