C question

Discussion in 'C Programming' started by Kenny McCormack, Mar 2, 2013.

  1. I'm thinking of writing a tool to analyze C structs - specifically to
    generate a mapping between each struct member and its offset (from the
    beginning of the struct). The reason for this is that I need to access C
    structs from another language that doesn't have structs - it only has
    offsets. I can use the offsetof(3) macro to generate the offsets; the
    actual problem is generating the list of all the members. In particular, if
    members are themselves structs, then you will need to recursively expand
    them out.

    This doesn't look to be a very difficult project, but I'm curious if there
    is already something out there that does it. I.e., to avoid wheel
    re-invention...

    --
    The motto of the GOP "base": You can't *be* a billionaire, but at least you
    can vote like one.
     
    Kenny McCormack, Mar 2, 2013
    #1

  2. Kenny McCormack

    BartC Guest

    "Kenny McCormack" <> wrote in message
    news:kgtq6m$bhm$...
    > I'm thinking of writing a tool to analyze C structs - specifically to
    > generate a mapping between each struct member and its offset (from the
    > beginning of the struct). The reason for this is that I need to access C
    > structs from another language that doesn't have structs - it only has
    > offsets. I can use the offsetof(3) macro to generate the offsets; the
    > actual problem is generating the list of all the members. In particular,
    > if
    > members are themselves structs, then you will need to recursively expand
    > them out.
    >
    > This doesn't look to be a very difficult project,


    On the contrary, it seems to me that it *is* difficult, since you'd have to
    create half a compiler for it to do the job (for example, needing to analyse
    and expand a few thousand lines of headers in order to work out the size of
    a typedef used for a particular struct member).

    Slightly simpler, but not much, is for the analyser to take the C struct
    definition, and to create a program using it which when run, prints out all
    the members and offsets involved. It depends also on how these definitions
    are presented: whether an arbitrary C program is an input, and the output is
    a detailed list of all the structs encountered.

    > but I'm curious if there
    > is already something out there that does it. I.e., to avoid wheel
    > re-invention...


    I'd be interested too because I have a similar problem, but in my case the
    other language does have structs. I solve it by manually constructing a
    matching struct in this other language which exactly corresponds, in member
    types, sizes, order and offsets, to the original C version. That's not so
    simple; it can involve running a dummy C program to display the sizes and
    offsets where they are not obvious.

    (I guess gcc doesn't have an option to print out such a list, or you would
    have known of it.)

    --
    Bartc
     
    BartC, Mar 2, 2013
    #2

  3. Kenny McCormack

    Ian Collins Guest

    Kenny McCormack wrote:
    > I'm thinking of writing a tool to analyze C structs - specifically to
    > generate a mapping between each struct member and its offset (from the
    > beginning of the struct). The reason for this is that I need to access C
    > structs from another language that doesn't have structs - it only has
    > offsets. I can use the offsetof(3) macro to generate the offsets; the
    > actual problem is generating the list of all the members. In particular, if
    > members are themselves structs, then you will need to recursively expand
    > them out.


    If they're your structs, generate both the C and other language code
    from an alternative source.

    > This doesn't look to be a very difficult project, but I'm curious if there
    > is already something out there that does it. I.e., to avoid wheel
    > re-invention...


    It would probably involve a good percentage of gcc....

    --
    Ian Collins
     
    Ian Collins, Mar 2, 2013
    #3
  4. Kenny McCormack

    Shao Miller Guest

    On 3/2/2013 16:16, Kenny McCormack wrote:
    > I'm thinking of writing a tool to analyze C structs - specifically to
    > generate a mapping between each struct member and its offset (from the
    > beginning of the struct). The reason for this is that I need to access C
    > structs from another language that doesn't have structs - it only has
    > offsets. I can use the offsetof(3) macro to generate the offsets; the
    > actual problem is generating the list of all the members. In particular, if
    > members are themselves structs, then you will need to recursively expand
    > them out.
    >
    > This doesn't look to be a very difficult project, but I'm curious if there
    > is already something out there that does it. I.e., to avoid wheel
    > re-invention...
    >


    I started tackling something possibly related to this:


    http://git.zytor.com/?p=users/sha0/...7;hb=95e4b5dedc01d6392ea4401fa1016a4c1418c467

    But haven't yet finished it. The goal was to use macro magic to both
    generate the structure type definitions as well as to generate
    "descriptors" that could be used to serialize and deserialize such
    structures.

    Unfortunately, it means that IDEs or tools which try to parse such
    source code in order to offer auto-completion information like:

    foo.
        ^ - bar
          - baz

    are less likely to be able to, due to the macro obfuscation. If I
    recall correctly, several people have suggested to me that they would
    actually prefer a tool which:

    1. Parses a file whose format is _your_ design
    2. Generates C source code from that file
       - Including the structure type definition, in a header
       - And maybe even a non-header file for serializing/deserializing
         such structures

    That way, the C source looks like C source, instead of a submission for
    IOCCC, and that way, tools which can process mundane C source can
    stomach it.

    --
    - Shao Miller
    --
    "Thank you for the kind words; those are the kind of words I like to hear.

    Cheerily," -- Richard Harter
     
    Shao Miller, Mar 2, 2013
    #4
  5. In article <kgtq6m$bhm$>,
    (Kenny McCormack) wrote:

    > I'm thinking of writing a tool to analyze C structs - specifically to
    > generate a mapping between each struct member and its offset (from the
    > beginning of the struct). The reason for this is that I need to access C
    > structs from another language that doesn't have structs - it only has
    > offsets. I can use the offsetof(3) macro to generate the offsets; the
    > actual problem is generating the list of all the members. In particular, if
    > members are themselves structs, then you will need to recursively expand
    > them out.
    >
    > This doesn't look to be a very difficult project, but I'm curious if there
    > is already something out there that does it. I.e., to avoid wheel
    > re-invention...


    Are you sure you're not reinventing the wheel? Because C is so
    ubiquitous, most other languages have tools for "foreign calling", so
    that they can be linked to C libraries.

    What's the other language?

    --
    Barry Margolin,
    Arlington, MA
    *** PLEASE post questions in newsgroups, not directly to me ***
     
    Barry Margolin, Mar 2, 2013
    #5
  6. In article <>,
    Ian Collins <> wrote:
    ....
    >If they're your structs, generate both the C and other language code
    >from an alternative source.


    I'm primarily interested in the system structs - the ones used in system
    calls - e.g., like "stat()".

    >> This doesn't look to be a very difficult project, but I'm curious if there
    >> is already something out there that does it. I.e., to avoid wheel
    >> re-invention...

    >
    >It would probably involve a good percentage of gcc....


    The overall plan is to use cc (specifically, gcc, although there's probably
    not much specific to gcc here) to do as much of the work as possible. I
    can use something like "gcc -E /usr/include/whatever.h" to get me the fully
    preprocessed version of the struct. Then I can use a scripting language (e.g.,
    AWK, Perl, etc) to parse that into a C program that uses offsetof(3) on each
    struct member and prints the result. Then, finally, compile and run the C
    program to generate a table of "member, offset" for each member.

    It is, of course, then the middle part that requires work on my part.

    --
    Windows 95 n. (Win-doze): A 32 bit extension to a 16 bit user interface for
    an 8 bit operating system based on a 4 bit architecture from a 2 bit company
    that can't stand 1 bit of competition.

    Modern day upgrade --> Windows XP Professional x64: Windows is now a 64 bit
    tweak of a 32 bit extension to a 16 bit user interface for an 8 bit
    operating system based on a 4 bit architecture from a 2 bit company that
    can't stand 1 bit of competition.
     
    Kenny McCormack, Mar 3, 2013
    #6
  7. Kenny McCormack

    BartC Guest

    "Kenny McCormack" <> wrote in message
    news:kgu4op$q7q$...
    > In article <>,
    > Ian Collins <> wrote:
    > ...
    >>If they're your structs, generate both the C and other language code
    >>from an alternative source.

    >
    > I'm primarily interested in the system structs - the ones used in system
    > calls - e.g., like "stat()".


    How many system structs are there likely to be? It might be simpler to
    hardcode, by hand, a list of struct and member names, and use the scripting
    language on that. If you're not interested in member types and sizes, that
    simplifies it further.

    (I'm assuming system structs aren't going to change much.)

    --
    Bartc
     
    BartC, Mar 3, 2013
    #7
  8. (Kenny McCormack) writes:

    > I'm thinking of writing a tool to analyze C structs - specifically to
    > generate a mapping between each struct member and its offset (from the
    > beginning of the struct). The reason for this is that I need to access C
    > structs from another language that doesn't have structs - it only has
    > offsets. I can use the offsetof(3) macro to generate the offsets; the
    > actual problem is generating the list of all the members. In particular, if
    > members are themselves structs, then you will need to recursively expand
    > them out.
    >
    > This doesn't look to be a very difficult project, but I'm curious if there
    > is already something out there that does it. I.e., to avoid wheel
    > re-invention...


    I suggest starting with gcc -fdump-translation-unit. That will give you
    GCC’s parse tree. I’m not sure if the syntax is documented anywhere but
    it doesn’t look particularly unclear.

    It does actually include the offsets but those will be tied to the code
    generation target of the GCC you use; if you’re generating anything
    intended to be even slightly portable that won’t be of any use to you
    and your plan to use offsetof is better.

    You don’t want to write a GCC-compatible C parser if you can reasonably
    avoid it.

    --
    http://www.greenend.org.uk/rjk/
     
    Richard Kettlewell, Mar 3, 2013
    #8
  9. Kenny McCormack

    Guest

    The first step is to preprocess the code. Standalone C preprocessors are
    available, or you can use the compiler's own preprocessor. The second step
    is to analyze the structures, which can be tedious: you would have to parse
    the code against a BNF grammar, much as a compiler does. I am sure you want
    to avoid that. If you are not opposed to using GCC, it exposes the entire
    AST of a program to plugins, and from that AST you can write a plugin to
    find the offsets.

    Does not look like a very difficult task.


    Best regards,
    Shiv

    http://libreprogramming.org
     
    , Mar 3, 2013
    #9
  10. In article <>,
    <> wrote:
    >The first step you need to do is pre-process code. There are c
    >pre-processors available or you can use preprocessor of compiler itself.
    >Second is to analyze the structures. Now this can be tedious. You will
    >have to parse the code by BNF grammar much like a compiler. However, I
    >am sure you want to avoid that. If you are not opposed to using GCC then
    >GCC exposes entire AST to you of a program. From that AST you can write
    >a plugin to find offset.
    >
    >Does not look like a very difficult task.


    That was my take, as you can see from the OP. And it may come to that - to
    re-inventing this wheel, one more time (myself).

    But I'm sure that this task has been done thousands of times, by hundreds of
    people, for hundreds of reasons. The trick is finding their work. That was
    the reason for the NG post.

    --
    The motto of the GOP "base": You can't *be* a billionaire, but at least you
    can vote like one.
     
    Kenny McCormack, Mar 3, 2013
    #10
  11. In article <>,
    Richard Kettlewell <> wrote:
    ....
    >I suggest starting with gcc -fdump-translation-unit. That will give you
    >GCC’s parse tree. I’m not sure if the syntax is documented
    >anywhere but it doesn’t look particularly unclear.


    I see the length of the field, but not the offset. Does this mean I have to
    manually add up the lengths, as I go, to find the offset?

    @3140 identifier_node strg: XX_YYYY lngt: 7

    >It does actually include the offsets but those will be tied to the code
    >generation target of the GCC you use; if you’re generating anything
    >intended to be even slightly portable that won’t be of any use to you
    >and your plan to use offsetof is better.


    Not sure I get your drift here, but, no, portability is not an issue.

    >You don’t want to write a GCC-compatible C parser if you can reasonably
    >avoid it.


    Indeed not.

    --
    Religion is regarded by the common people as true,
    by the wise as foolish,
    and by the rulers as useful.

    (Seneca the Younger, 65 AD)
     
    Kenny McCormack, Mar 3, 2013
    #11
  12. Kenny McCormack

    Jorgen Grahn Guest

    ["Followup-To:" header set to comp.lang.c.]

    On Sat, 2013-03-02, Kenny McCormack wrote:
    > I'm thinking of writing a tool to analyze C structs - specifically to
    > generate a mapping between each struct member and its offset (from the
    > beginning of the struct). The reason for this is that I need to access C

    ....
    > This doesn't look to be a very difficult project, but I'm curious if there
    > is already something out there that does it. I.e., to avoid wheel
    > re-invention...


    Look at c2ph/pstruct which comes with Perl.

    /Jorgen

    --
    // Jorgen Grahn <grahn@ Oo o. . .
    \X/ snipabacken.se> O o .
     
    Jorgen Grahn, Mar 3, 2013
    #12
  13. (Kenny McCormack) writes:
    > Richard Kettlewell <> wrote:


    >>I suggest starting with gcc -fdump-translation-unit. That will give you
    >>GCC’s parse tree. I’m not sure if the syntax is documented
    >>anywhere but it doesn’t look particularly unclear.

    >
    > I see the length of the field, but not the offset. Does this mean I have to
    > manually add up the lengths, as I go, to find the offset?
    >
    > @3140 identifier_node strg: XX_YYYY lngt: 7


    That’s just the name, you need to go up one level to the corresponding
    field_decl and follow the bpos link to find the offset (in bits). Or
    rather, you walk the tree from the root down to the structure definition
    (record_type) and follow the chain of fields through. (But as discussed
    below you’re better off only extracting the names and treating the
    compiler’s idea of the offset as hearsay anyway.)

    >>It does actually include the offsets but those will be tied to the code
    >>generation target of the GCC you use; if you’re generating anything
    >>intended to be even slightly portable that won’t be of any use to you
    >>and your plan to use offsetof is better.

    >
    > Not sure I get your drift here, but, no, portability is not an issue.


    Suppose your struct is this:

    struct foo {
        void *a;
        int b;
    };

    If you ask a compiler targeting a 64-bit ISA then the offset of ‘b’ is
    8; for a 32-bit ISA it’ll be 4. If you build those values directly into
    your code then it’ll only work on platforms with the same bitness (and
    possibly more narrowly than that). Any numbers you extract from
    compiler intermediate output will suffer from this problem.

    --
    http://www.greenend.org.uk/rjk/
     
    Richard Kettlewell, Mar 3, 2013
    #13
  14. In article <>,
    Jorgen Grahn <> wrote:
    >["Followup-To:" header set to comp.lang.c.]
    >
    >On Sat, 2013-03-02, Kenny McCormack wrote:
    >> I'm thinking of writing a tool to analyze C structs - specifically to
    >> generate a mapping between each struct member and its offset (from the
    >> beginning of the struct). The reason for this is that I need to access C

    >...
    >> This doesn't look to be a very difficult project, but I'm curious if there
    >> is already something out there that does it. I.e., to avoid wheel
    >> re-invention...

    >
    >Look at c2ph/pstruct which comes with Perl.


    Thank you for this. This seems (*) to be exactly what I am looking for and
    precisely in line with my reason for posting. I was sure that this had been
    done before. Note that the Perl need is pretty much in line with my own
    need - that is, a need to access structs by offset rather than by name.

    (*) I say "seems" because, unfortunately, in the test case that I did, it
    didn't work right. Which is strange, because I think the Perl guys (TC in
    particular) do good work. Strange that it would break (give wrong results)
    in my first and only test case. Oh well, this is a QOI issue; I may or may
    not investigate further.

    In any case, if anyone has any other pointers-to-existing-work, please
    continue to send them in. I.e., the Perl solution *is* the sort of thing
    I'm looking for - but I'd like something that actually works correctly.

    --
    Religion is regarded by the common people as true,
    by the wise as foolish,
    and by the rulers as useful.

    (Seneca the Younger, 65 AD)
     
    Kenny McCormack, Mar 4, 2013
    #14
  15. In article <4e3Zs.16616$>,
    Scott Lurndal <> wrote:
    ....
    >>This doesn't look to be a very difficult project, but I'm curious if there
    >>is already something out there that does it. I.e., to avoid wheel
    >>re-invention...

    >
    >pahole(1)
    >
    >http://lwn.net/Articles/365844/


    pahole does indeed look good. As I've said all along, I'm sure someone has
    done this - so it just needed to be found. Thanks for pointing me to it.

    Two comments:

    1) I wish there were more documentation. The man page, in true Unix style,
    doesn't tell you anything unless you already understand what most of the
    options do.

    2) It doesn't do the one thing I had most hoped for - namely, recursively
    expanding out the structs. I.e., my struct contains some other struct and
    all I get in the output of pahole is a reference to the other struct (and
    its correct length) - not an explicit enumeration of the elements of the
    sub-struct. Oh well. I can live with this.

    --
    (This discussion group is about C, ...)

    Wrong. It is only OCCASIONALLY a discussion group
    about C; mostly, like most "discussion" groups, it is
    off-topic Rorsharch [sic] revelations of the childhood
    traumas of the participants...
     
    Kenny McCormack, Mar 7, 2013
    #15
