Stuffing bytes into structs and endians

Discussion in 'C++' started by Steven T. Hatton, Aug 5, 2005.

  1. I'm trying to parse an ELF file to build a human readable representation. I
    know it's been done before, and there are tools such as objdump, nm,
    readelf, c++filt, etc. I can look at the source to see how they are
    implemented. In most cases, that means reading C, not C++. I do have an
    example of C++ code that works. I thought it might be a good idea to take
    a different approach than that code takes.

    Rather than processing the data from a std::ifstream as ELFIO does, I
    wanted to read it into a buffer and work on it from there. One option is
    to use a std::stringstream, treat it like a stream, and do basically the
    same thing ELFIO does with the std::ifstream. What I currently have is a
    std::vector<char>, which I thought would be a good thing to use. But if I
    try to do that, I have to find a different way to do things such as

    m_pStream->read( reinterpret_cast<char*>( &m_header ), sizeof( m_header ) );

    My intent was to work on the file from the point of view of it just being
    data, rather than a stream. Is that simply "not the way it's done"?
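
    One way to get the effect of the read() call above from a std::vector<char>
    is to copy the bytes out with std::memcpy. The following is only a sketch:
    read_at is a hypothetical helper name, not something ELFIO provides, and it
    assumes the target type is trivially copyable. It only moves raw bytes, so
    multi-byte fields come out right only if the struct layout and the host
    byte order happen to match the file.

```cpp
#include <cstddef>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical helper (not part of ELFIO): copy sizeof(T) raw bytes from
// buf, starting at offset, into out. Returns false if the buffer is too
// short, in which case out is left untouched.
template <typename T>
bool read_at(const std::vector<char>& buf, std::size_t offset, T& out) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "T must be trivially copyable to be filled with memcpy");
    if (offset > buf.size() || buf.size() - offset < sizeof(T)) {
        return false;
    }
    std::memcpy(&out, buf.data() + offset, sizeof(T));
    return true;
}
```

    With a buffer holding the whole file, read_at(buffer, 0, m_header) would
    then play the role of the stream read, with the bounds check taking the
    place of checking the stream state.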

    Another question I have is this: The ELF file starts with an identification
    tag which carries such information as encoding type (big endian, little
    endian), the class of the machine (32 or 64-bit), etc. That's all in
    specifically ordered bytes. That leaves a question open. Doesn't the
    host encoding determine which byte is treated as 0? IOW, the first four
    bytes are specified as '0x7f', 'E', 'L', 'F'. If that were read by a
    machine that uses the opposite endian, would that not come across as
    'F','L','E','0x7f'? Even more perplexing is this: the rest of the ELF
    header is formed of heterogeneous data types. See below. Does that mean
    some of the data might be put into the wrong location by the
    reinterpret_cast<> above?

    // Platform specific definitions.
    typedef unsigned long Elf32_Addr;
    typedef unsigned short Elf32_Half;
    typedef unsigned long Elf32_Off;
    typedef signed long Elf32_Sword;
    typedef unsigned long Elf32_Word;

    // ELF file header
    struct Elf32_Ehdr {
    unsigned char e_ident[EI_NIDENT];
    Elf32_Half e_type;
    Elf32_Half e_machine;
    Elf32_Word e_version;
    Elf32_Addr e_entry;
    Elf32_Off e_phoff;
    Elf32_Off e_shoff;
    Elf32_Word e_flags;
    Elf32_Half e_ehsize;
    Elf32_Half e_phentsize;
    Elf32_Half e_phnum;
    Elf32_Half e_shentsize;
    Elf32_Half e_shnum;
    Elf32_Half e_shstrndx;
    };
    Steven T. Hatton, Aug 5, 2005
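
    Whether a raw copy puts each field in the right place depends on the struct
    having exactly the on-disk layout. As a hedged sketch, the same header can
    be declared with fixed-width types from <cstdint> and the layout checked at
    compile time. The name Elf32_Ehdr_fixed is made up for this example. The
    expected size of 52 bytes follows from the field widths in the 32-bit ELF
    header, and the check assumes a compiler that adds no padding here, which
    holds for common ABIs but is not guaranteed by the language.

```cpp
#include <cstddef>
#include <cstdint>

constexpr std::size_t EI_NIDENT = 16;  // size of e_ident per the ELF spec

// Hypothetical fixed-width variant of the header above. Unlike
// "unsigned long", these types have the same width on every platform.
struct Elf32_Ehdr_fixed {
    unsigned char e_ident[EI_NIDENT];
    std::uint16_t e_type;
    std::uint16_t e_machine;
    std::uint32_t e_version;
    std::uint32_t e_entry;
    std::uint32_t e_phoff;
    std::uint32_t e_shoff;
    std::uint32_t e_flags;
    std::uint16_t e_ehsize;
    std::uint16_t e_phentsize;
    std::uint16_t e_phnum;
    std::uint16_t e_shentsize;
    std::uint16_t e_shnum;
    std::uint16_t e_shstrndx;
};

// Fail the build if the compiler's layout differs from the file layout.
static_assert(sizeof(Elf32_Ehdr_fixed) == 52,
              "struct layout does not match the 32-bit ELF header");
static_assert(offsetof(Elf32_Ehdr_fixed, e_entry) == 24,
              "e_entry is not at its on-disk offset");
```

    Even with a matching layout, the multi-byte fields still hold the file's
    byte order after a raw copy, so they need converting when the host differs.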

  2., Aug 6, 2005

  3. OK, now I know what my mistake was. Byte order only matters for how the
    individual bytes are interpreted as part of a larger number. Byte [0] is
    still byte [0] regardless of endianness. So byte [5] will always contain
    the data encoding byte that tells me which endianness to use.
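
    A minimal sketch of that idea, working directly on the byte buffer: the
    magic bytes are checked by index, and a 32-bit field is assembled from
    individual bytes according to the encoding byte. The constant values
    (EI_DATA = 5, ELFDATA2LSB = 1, ELFDATA2MSB = 2) come from the ELF
    specification. The function names has_elf_magic and read_u32 are
    hypothetical, and read_u32 assumes the caller has already checked that
    four bytes are available at the offset.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Values from the ELF specification.
constexpr std::size_t EI_DATA = 5;          // index of the data encoding byte
constexpr unsigned char ELFDATA2LSB = 1;    // little endian
constexpr unsigned char ELFDATA2MSB = 2;    // big endian

// Byte-wise check of the magic number; independent of host byte order.
bool has_elf_magic(const std::vector<char>& buf) {
    return buf.size() >= 4 &&
           static_cast<unsigned char>(buf[0]) == 0x7f &&
           buf[1] == 'E' && buf[2] == 'L' && buf[3] == 'F';
}

// Hypothetical helper: build a 32-bit value from four bytes at offset,
// using the file's encoding rather than the host's. Anything other than
// ELFDATA2MSB is treated as little endian in this sketch.
// Precondition: offset + 4 <= buf.size().
std::uint32_t read_u32(const std::vector<char>& buf, std::size_t offset,
                       unsigned char encoding) {
    std::uint32_t value = 0;
    for (std::size_t i = 0; i < 4; ++i) {
        const std::uint32_t byte = static_cast<unsigned char>(buf[offset + i]);
        const std::size_t shift =
            (encoding == ELFDATA2MSB) ? 8 * (3 - i) : 8 * i;
        value |= byte << shift;
    }
    return value;
}
```

    Because the value is built with shifts instead of a cast, the result is
    the same on big and little endian hosts.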
    Steven T. Hatton, Aug 6, 2005