Stuffing bytes into structs and endians

  • Thread starter Steven T. Hatton

Steven T. Hatton

I'm trying to parse an ELF file to build a human-readable representation. I
know it's been done before, and there are tools such as objdump, nm,
readelf, c++filt, etc. I can look at the source to see how they are
implemented. In most cases, that means reading C, not C++. I do have an
example of C++ code that works (ELFIO). I thought it might be a good idea
to take a different approach than that code takes.

Rather than processing the data from a std::ifstream as ELFIO does, I
wanted to read it into a buffer and work on it from there. One option is
to use a std::stringstream and do basically the same thing ELFIO does with
the std::ifstream. What I currently have is a std::vector<char>, which I
thought would be a good thing to use. But if I try to do that, I have to
find a different way to do things such as
m_pStream->read( reinterpret_cast<char*>( &m_header ), sizeof( m_header ) );

My intent was to work on the file from the point of view of it just being
data, rather than a stream. Is that simply "not the way it's done"?
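
For what it's worth, a std::vector<char> can stand in for the stream here. A minimal sketch, assuming the whole file has already been read into the vector; read_at is a hypothetical helper name, not something ELFIO provides:

```cpp
#include <cstddef>
#include <cstring>
#include <type_traits>
#include <vector>

// Copy sizeof(T) bytes starting at `offset` in `buf` into `out`.
// Returns false (leaving `out` untouched) if the buffer is too short.
template <typename T>
bool read_at(const std::vector<char>& buf, std::size_t offset, T& out) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "read_at only makes sense for plain-data types");
    if (offset > buf.size() || sizeof(T) > buf.size() - offset) {
        return false;
    }
    // memcpy is the buffer equivalent of stream.read(reinterpret_cast<char*>(&out), ...)
    std::memcpy(&out, buf.data() + offset, sizeof(T));
    return true;
}
```

With that, the stream call above would become something like read_at(buffer, 0, m_header), where buffer is the vector holding the file. It carries the same caveats about byte order and struct layout as the original read().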


Another question I have is this: The ELF file starts with an identification
tag which carries such information as encoding type (big endian, little
endian), the class of the machine (32 or 64-bit), etc. That's all in
specifically ordered bytes. That leaves a question open. Doesn't the
host encoding determine which byte is treated as byte 0? IOW, the first four
bytes are specified as '0x7f', 'E', 'L', 'F'. If that were read by a
machine that uses the opposite endianness, would that not come across as
'F', 'L', 'E', '0x7f'? Even more perplexing is this: the rest of the ELF
header is formed of heterogeneous data types. See below. Does that mean
some of the data might be put into the wrong location by the
reinterpret_cast<> above?

// Platform specific definitions.
typedef unsigned long Elf32_Addr;
typedef unsigned short Elf32_Half;
typedef unsigned long Elf32_Off;
typedef signed long Elf32_Sword;
typedef unsigned long Elf32_Word;


// ELF file header
struct Elf32_Ehdr {
    unsigned char e_ident[EI_NIDENT];
    Elf32_Half    e_type;
    Elf32_Half    e_machine;
    Elf32_Word    e_version;
    Elf32_Addr    e_entry;
    Elf32_Off     e_phoff;
    Elf32_Off     e_shoff;
    Elf32_Word    e_flags;
    Elf32_Half    e_ehsize;
    Elf32_Half    e_phentsize;
    Elf32_Half    e_phnum;
    Elf32_Half    e_shentsize;
    Elf32_Half    e_shnum;
    Elf32_Half    e_shstrndx;
};
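
On the layout part of the question: the reinterpret_cast copies the file's bytes onto the struct as-is, so each field lands in the right place only if the struct's member sizes and padding match the on-disk layout. The ELF32 header is 52 bytes on disk, and unsigned long is 8 bytes on many 64-bit platforms, which would change the offsets of most fields. A sketch using fixed-width types with compile-time checks; Ehdr32 and kEiNident are hypothetical names chosen so they do not clash with the definitions above:

```cpp
#include <cstddef>
#include <cstdint>

// Size of the identification array (EI_NIDENT in the ELF specification).
const std::size_t kEiNident = 16;

// Hypothetical fixed-width mirror of the ELF32 file header.
struct Ehdr32 {
    unsigned char e_ident[kEiNident];
    std::uint16_t e_type;
    std::uint16_t e_machine;
    std::uint32_t e_version;
    std::uint32_t e_entry;
    std::uint32_t e_phoff;
    std::uint32_t e_shoff;
    std::uint32_t e_flags;
    std::uint16_t e_ehsize;
    std::uint16_t e_phentsize;
    std::uint16_t e_phnum;
    std::uint16_t e_shentsize;
    std::uint16_t e_shnum;
    std::uint16_t e_shstrndx;
};

// Fail the build if the compiler's layout differs from the file format.
static_assert(sizeof(Ehdr32) == 52, "ELF32 header must be 52 bytes");
static_assert(offsetof(Ehdr32, e_type) == 16, "e_type follows e_ident");
static_assert(offsetof(Ehdr32, e_entry) == 24, "unexpected padding before e_entry");
static_assert(offsetof(Ehdr32, e_shstrndx) == 50, "unexpected padding in the header");
```

These checks only cover where the bytes land. The values of the multi-byte fields still depend on whether the file's byte order matches the host's.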
 
Steven T. Hatton

EventHelix.com said:
The following article should answer your questions about byte alignment
and ordering:

http://www.eventhelix.com/RealtimeMantra/ByteAlignmentAndOrdering.htm

OK, now I know what my mistake was. The only ordering significance has to
do with how the individual bytes are interpreted as part of a larger number.
Byte [0] is still byte [0] no matter which endianness I have. So byte [5]
will always contain the data encoding byte that tells me which byte order
to use. Thanks.
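
To make that concrete, here is a sketch of decoding multi-byte fields by hand once byte [5] (EI_DATA in the ELF specification) is known. The value 1 means little-endian (ELFDATA2LSB) and 2 means big-endian (ELFDATA2MSB). The function and constant names are made up for illustration, and the buffer is assumed to be a std::vector<char> holding the file:

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Index and values of the data-encoding byte, per the ELF specification
// (EI_DATA, ELFDATA2LSB, ELFDATA2MSB). The k-prefixed names are hypothetical.
const std::size_t kEiData = 5;
const unsigned char kElfData2Lsb = 1;  // little-endian file
const unsigned char kElfData2Msb = 2;  // big-endian file

// Fetch one byte as an unsigned value; at() throws std::out_of_range if i is past the end.
inline std::uint32_t byte_at(const std::vector<char>& buf, std::size_t i) {
    return static_cast<unsigned char>(buf.at(i));
}

// Assemble a 16-bit value from two bytes in the file's byte order.
inline std::uint16_t read_u16(const std::vector<char>& buf, std::size_t off,
                              unsigned char encoding) {
    const std::uint32_t b0 = byte_at(buf, off);
    const std::uint32_t b1 = byte_at(buf, off + 1);
    if (encoding == kElfData2Msb) return static_cast<std::uint16_t>((b0 << 8) | b1);
    if (encoding == kElfData2Lsb) return static_cast<std::uint16_t>(b0 | (b1 << 8));
    throw std::invalid_argument("unknown ELF data encoding");
}

// Assemble a 32-bit value from four bytes in the file's byte order.
inline std::uint32_t read_u32(const std::vector<char>& buf, std::size_t off,
                              unsigned char encoding) {
    const std::uint32_t b0 = byte_at(buf, off);
    const std::uint32_t b1 = byte_at(buf, off + 1);
    const std::uint32_t b2 = byte_at(buf, off + 2);
    const std::uint32_t b3 = byte_at(buf, off + 3);
    if (encoding == kElfData2Msb) return (b0 << 24) | (b1 << 16) | (b2 << 8) | b3;
    if (encoding == kElfData2Lsb) return b0 | (b1 << 8) | (b2 << 16) | (b3 << 24);
    throw std::invalid_argument("unknown ELF data encoding");
}
```

Because the values are built with shifts rather than by copying bytes onto an integer, the result does not depend on the host's byte order. For an ELF32 file, e_type would be read_u16(buf, 16, enc) and e_version would be read_u32(buf, 20, enc), where enc is the byte at index kEiData.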
 
