Reading binary data

Discussion in 'Python' started by David M, Nov 23, 2005.

  1. David M

    David M Guest

    OK, so here is my task. I want to get at the data stored in
    /var/account/pacct, which stores process accounting data, so that I can
    turn it into a more human-readable format than what the sa program
    produces. The thing is, it's in a binary format, and the example program
    that reads some data from the file is written in C using a struct defined
    in sys/acct.h.

    http://www.linuxjournal.com/articles/lj/0104/6144/6144l2.html

    So I was wondering how I can do the same thing, but in Python? I'm
    still learning, so please be gentle.

    David
     
    David M, Nov 23, 2005
    #1

  2. David M wrote:

    > OK so here is my task. I want to get at the data stored in
    > /var/account/pacct, which stores process accounting data, so that I can
    > make it into a more human understandable format than what the program
    > sa can do. The thing is, it's in a binary format and an example program
    > that reads some data from the file is done in C using a struct defined
    > in sys/acct.h.
    >
    > http://www.linuxjournal.com/articles/lj/0104/6144/6144l2.html
    >
    > So I was wondering how can I do the same thing, but in python? I'm
    > still learning so please be gentle.


    outline:

    1. load the data

    f = open(filename, "rb")

    data = f.read()

    2. parse it:

    http://docs.python.org/lib/module-struct.html
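
A minimal sketch of that outline, assuming the file is a sequence of fixed-size records. The helper name `iter_records` and the `"<HH"` demo format are made up for illustration and are not the pacct layout:

```python
import io
import struct

def iter_records(fileobj, fmt):
    """Yield one unpacked tuple per fixed-size record read from fileobj."""
    record = struct.Struct(fmt)
    while True:
        chunk = fileobj.read(record.size)
        if len(chunk) < record.size:
            break  # end of file, or a truncated trailing record
        yield record.unpack(chunk)

# Demo on in-memory data; "<HH" is a placeholder format, not the acct layout.
demo = io.BytesIO(struct.pack("<HH", 1, 2) + struct.pack("<HH", 3, 4))
records = list(iter_records(demo, "<HH"))
```

For a real file you would pass `open(filename, "rb")` instead of the `BytesIO` object, plus a format string that matches the C struct.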

    </F>
     
    Fredrik Lundh, Nov 23, 2005
    #2

  3. On 2005-11-23, David M wrote:
    > OK so here is my task. I want to get at the data stored in
    > /var/account/pacct, which stores process accounting data, so that I can
    > make it into a more human understandable format than what the program
    > sa can do. The thing is, it's in a binary format and an example program
    > that reads some data from the file is done in C using a struct defined
    > in sys/acct.h.
    >
    > http://www.linuxjournal.com/articles/lj/0104/6144/6144l2.html
    >
    > So I was wondering how can I do the same thing, but in python? I'm
    > still learning so please be gentle.


    use the struct module

    http://www.python.org/doc/current/lib/module-struct.html

    --
    Grant Edwards                   grante at visi.com
    Yow! O.K.! Speak with a PHILADELPHIA ACCENT!! Send out for CHINESE FOOD!! Hop a JET!
     
    Grant Edwards, Nov 23, 2005
    #3
  4. David M

    David M Guest

    Thanks, but the C struct describing the data doesn't match up with the
    list of type codes on the module-struct page.

    This is the acct.h file:

    #ifndef _SYS_ACCT_H
    #define _SYS_ACCT_H 1

    #include <features.h>

    #define __need_time_t
    #include <time.h>
    #include <sys/types.h>

    __BEGIN_DECLS

    #define ACCT_COMM 16

    /*
    comp_t is a 16-bit "floating" point number with a 3-bit base 8
    exponent and a 13-bit fraction. See linux/kernel/acct.c for the
    specific encoding system used.
    */

    typedef u_int16_t comp_t;

    struct acct
    {
        char ac_flag;               /* Accounting flags. */
        u_int16_t ac_uid;           /* Accounting user ID. */
        u_int16_t ac_gid;           /* Accounting group ID. */
        u_int16_t ac_tty;           /* Controlling tty. */
        u_int32_t ac_btime;         /* Beginning time. */
        comp_t ac_utime;            /* Accounting user time. */
        comp_t ac_stime;            /* Accounting system time. */
        comp_t ac_etime;            /* Accounting elapsed time. */
        comp_t ac_mem;              /* Accounting average memory usage. */
        comp_t ac_io;               /* Accounting chars transferred. */
        comp_t ac_rw;               /* Accounting blocks read or written. */
        comp_t ac_minflt;           /* Accounting minor pagefaults. */
        comp_t ac_majflt;           /* Accounting major pagefaults. */
        comp_t ac_swaps;            /* Accounting number of swaps. */
        u_int32_t ac_exitcode;      /* Accounting process exitcode. */
        char ac_comm[ACCT_COMM+1];  /* Accounting command name. */
        char ac_pad[10];            /* Accounting padding bytes. */
    };

    enum
    {
        AFORK = 0x01,   /* Has executed fork, but no exec. */
        ASU = 0x02,     /* Used super-user privileges. */
        ACORE = 0x08,   /* Dumped core. */
        AXSIG = 0x10    /* Killed by a signal. */
    };

    #define AHZ 100


    /* Switch process accounting on and off. */
    extern int acct (__const char *__filename) __THROW;

    __END_DECLS

    #endif /* sys/acct.h */

    What are u_int16_t and comp_t? And what about the enum section?
     
    David M, Nov 23, 2005
    #4
  5. In comp.lang.python, "David M" wrote:

    > Thanks but the C Struct describing the data doesn't match up with the
    > list on the module-struct page.


    Then you're going to have to do some bit-bashing.

    > comp_t is a 16-bit "floating" point number with a 3-bit base 8
    > exponent and a 13-bit fraction. See linux/kernel/acct.c for the
    > specific encoding system used.


    Yoinks! That's just a bit too clever.

    You'll have to implement that yourself.

    > What are u_int16_t and comp_t?


    Dunno, I guess you'll have to look at the sources and find the
    typedefs.

    > And what about the enum section?


    It's probably a 32-bit integer, but it may take some
    experimentation to confirm that.

    --
    Grant Edwards                   grante at visi.com
    Yow! I invented skydiving in 1989!
     
    Grant Edwards, Nov 23, 2005
    #5
  6. "David M" wrote:

    > What are u_int16_t and comp_t?


    comp_t is explained in the file you posted:

    > /*
    > comp_t is a 16-bit "floating" point number with a 3-bit base 8
    > exponent and a 13-bit fraction. See linux/kernel/acct.c for the
    > specific encoding system used.
    > */
    >
    > typedef u_int16_t comp_t;


    as the comment says, comp_t is a 16-bit value. you can read it in as
    an integer, but you have to convert it to a floating point value
    according to the encoding mentioned above.

    the typedef says that comp_t is stored as a u_int16_t, which means
    that u_int16_t is a 16-bit value too. judging from the name, and the
    fields using it, it's safe to assume that it's an unsigned 16-bit integer.
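
A rough sketch of that conversion, assuming the exponent sits in the top three bits and the fraction in the low thirteen bits. That is one reading of the header comment, so check it against linux/kernel/acct.c before relying on it. The function names are made up for illustration; AHZ is taken from the header and is assumed to be the tick rate for the time fields:

```python
AHZ = 100  # ticks per second, as defined in the header above

def decode_comp_t(value):
    """Expand a 16-bit comp_t into the integer it approximates."""
    exponent = (value >> 13) & 0x7     # top 3 bits: power of 8
    mantissa = value & 0x1FFF          # low 13 bits: fraction
    return mantissa << (3 * exponent)  # same as mantissa * 8**exponent

def comp_t_to_seconds(value):
    """Convert a comp_t time field from AHZ ticks to seconds."""
    return decode_comp_t(value) / float(AHZ)
```

Values below 8192 come back unchanged. Larger ones lose their low bits, which is why the header puts "floating" in quotes.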

    > And what about the enum section?


    it just defines a bunch of symbolic values; AFORK is 1, ASU is 2, etc.

    > enum
    > {
    > AFORK = 0x01, /* Has executed fork, but no exec. */
    > ASU = 0x02, /* Used super-user privileges. */
    > ACORE = 0x08, /* Dumped core. */
    > AXSIG = 0x10 /* Killed by a signal. */
    > };
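
Since those values are single bits, they can be tested against ac_flag with a bitwise AND. A small sketch, with the helper name `flag_names` made up for illustration:

```python
# Bit values copied from the enum in the header.
AFORK, ASU, ACORE, AXSIG = 0x01, 0x02, 0x08, 0x10

FLAG_NAMES = [
    (AFORK, "AFORK"),
    (ASU, "ASU"),
    (ACORE, "ACORE"),
    (AXSIG, "AXSIG"),
]

def flag_names(ac_flag):
    """Return the names of all flag bits set in ac_flag."""
    return [name for bit, name in FLAG_NAMES if ac_flag & bit]
```

For example, a flag byte of 0x09 has both the AFORK and ACORE bits set.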


    at this point, you should be able to do a little experimentation. read in
    a couple of bytes (64 bytes should be enough), print them out, and try
    to see if you can match the bytes with the description above.

    import struct

    f = open(filename, "rb")

    data = f.read(64)

    # hex dump
    print data.encode("hex")

    # list of decimal byte values
    print map(ord, data)

    # struct test (keep adding type codes until you've sorted everything out)
    format = "BHHHHHH"
    # unpack needs exactly calcsize(format) bytes of data, not the size itself
    print struct.unpack(format, data[:struct.calcsize(format)])
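
One thing to watch for while matching bytes to fields: a C compiler usually inserts a pad byte after the one-byte ac_flag so that the following 16-bit field is aligned. A small sketch of how the struct format prefix affects that. The native-mode size depends on the platform, so only the explicit little-endian cases are certain:

```python
import struct

# "<" means little-endian, standard sizes, and no alignment padding.
packed_size = struct.calcsize("<BH")    # 1 + 2 bytes

# An explicit "x" adds the pad byte by hand.
padded_size = struct.calcsize("<BxH")   # 1 + 1 + 2 bytes

# No prefix means native sizes and alignment, so struct pads like the
# local C compiler would. The result is platform dependent.
native_size = struct.calcsize("BH")
```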

    </F>
     
    Fredrik Lundh, Nov 23, 2005
    #6
  7. "David M" writes:

    > Thanks but the C Struct describing the data doesn't match up with the
    > list on the module-struct page.
    >
    > this is the acct.h file

    [...]

    Tooting my ctypes horn (sorry for that):

    thomas@linux:~/ctypes> locate acct.h
    /usr/include/linux/acct.h
    /usr/include/sys/acct.h
    thomas@linux:~/ctypes> python ctypes/wrap/h2xml.py sys/acct.h -o acct.xml
    creating xml output file ...
    running: gccxml /tmp/tmpSWogJs.cpp -fxml=acct.xml
    thomas@linux:~/ctypes> python ctypes/wrap/xml2py.py acct.xml -o acct.py
    thomas@linux:~/ctypes> python
    Python 2.4.1a0 (#1, Oct 23 2004, 15:48:15)
    [GCC 3.3.1 (SuSE Linux)] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import acct
    >>> acct.acct
    <class 'acct.acct'>
    >>> acct.AFORK
    1
    >>> acct.ASU
    2
    >>> acct.acct.ac_comm
    <Field type=c_char_Array_17, ofs=36, size=17>
    >>> acct.acct.ac_utime
    <Field type=c_ushort, ofs=12, size=2>
    >>> from ctypes import sizeof
    >>> sizeof(acct.acct)
    64
    >>> acct.acct.ac_flag
    <Field type=c_char, ofs=0, size=1>
    >>>

    thomas@linux:~/ctypes>

    But it won't help you to decode/encode the comp_t fields into floats.
    Note that the h2xml.py script requires gccxml.
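
For comparison, the same layout can be written by hand for the struct module. This is only a sketch: it assumes a little-endian machine and places pad bytes to match the offsets reported above (ac_utime at 12, ac_comm at 36, total size 64). The names `ACCT`, `AcctRecord` and `parse_acct` are made up for illustration, and the comp_t fields are left as raw 16-bit integers:

```python
import struct
from collections import namedtuple

# B=ac_flag, x=pad, HHH=uid/gid/tty, I=btime, 9H=comp_t fields,
# 2x=pad, I=exitcode, 17s=ac_comm, 11x=ac_pad plus trailing padding.
ACCT = struct.Struct("<BxHHHI9H2xI17s11x")

AcctRecord = namedtuple(
    "AcctRecord",
    "flag uid gid tty btime utime stime etime mem io rw "
    "minflt majflt swaps exitcode comm",
)

def parse_acct(raw):
    """Unpack one 64-byte record; comp_t fields are not decoded."""
    record = AcctRecord(*ACCT.unpack(raw))
    # ac_comm is NUL-terminated, so keep only the bytes before the first NUL
    return record._replace(comm=record.comm.split(b"\0", 1)[0])

# Synthetic record for a quick check; the values are invented.
sample = ACCT.pack(1, 1000, 100, 0, 1132700000,
                   5, 6, 7, 8, 9, 10, 11, 12, 13, 0, b"ls")
parsed = parse_acct(sample)
```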

    Thomas
     
    Thomas Heller, Nov 23, 2005
    #7