Discussion in 'C Programming' started by lancer6238, Nov 16, 2009.

1. ### lancer6238 Guest

Hi all,

I have a file that I'm trying to read into memory. I want to store the
hexadecimal values of the file content into a buffer.

Say the hexadecimal value is 12 00 00 27 09 10, I want to read the 3rd
and 4th bytes as the length, i.e. 0x0027 = 39 in decimal and assign
length = 39.

So far, I'm using

char clength[3];
sprintf(clength, "%x%x", file_buffer[2], file_buffer[3]);
length = strtol(clength, NULL, 16);

And I am able to get the correct results length = 39.

Then I want to add 14 to the length, and put the hexadecimal value of
the result into a character array (storing hexadecimal values).

This character array is initialized to

array[] = {0x00, 0x00, 0xff, 0x00, 0x00, 0x00, 0xff, 0x00, 0x00,
0x00};

I want the hexadecimal value of (length+14 = 53 = 0x35) to replace the
2 0xff in array[], so I tried

sprintf(clength, "%x", length+14);
array[2] = clength[0];
array[6] = clength[0];

But I got 0x33 instead of 0x35.

But
array[2] = clength[1];
array[6] = clength[1];
gets me the correct 0x35 value.

Why is that?

Also, when I printed out the contents of the array[] before any
modifications were done, using

for (i = 0 ; i < 10 ; i++)
printf("%x ", array[i]);

I get "0 0 ffffffff 0 0 0 ffffffff 0 0 0". Why didn't I get "0 0 ff 0
0 0 ff 0 0 0"?

Thank you.

Regards,
Rayne

lancer6238, Nov 16, 2009

2. ### Seebs Guest

No, you don't want to store "hexadecimal values" in a buffer.

You are becoming confused and thinking that you care about representation.
You don't.

The sprintf/strtol code is wrong in several ways, and getting 39 out
of it was lucky!

Here is the thing. First off, consider what happens if the two
values are 0x02 and 0x03. You will get "23", so you'll treat
0x0203 as if it were 0x23. You want %02x%02x.

Secondly, "%x" may often produce at least two characters, and
you need another byte for the trailing null. Assuming that you
never see values outside the range 0..255, you still need at
least 5 bytes.
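
To make the first two points concrete, here is a rough sketch. The function names parse_unpadded and parse_padded are made up for illustration, and the text round trip is kept only to show the pitfall, not because it is a good idea:

```c
#include <stdio.h>
#include <stdlib.h>

/* "%x%x" drops leading zeros, so 0x02 0x03 becomes the text "23". */
long parse_unpadded(unsigned char hi, unsigned char lo)
{
    char text[5];   /* up to 4 hex digits plus the trailing null */
    snprintf(text, sizeof text, "%x%x", (unsigned int)hi, (unsigned int)lo);
    return strtol(text, NULL, 16);
}

/* "%02x%02x" always writes two digits per byte: "0203". */
long parse_padded(unsigned char hi, unsigned char lo)
{
    char text[5];   /* exactly 4 hex digits plus the trailing null */
    snprintf(text, sizeof text, "%02x%02x", (unsigned int)hi, (unsigned int)lo);
    return strtol(text, NULL, 16);
}
```

With hi = 0x02 and lo = 0x03 the unpadded version yields 0x23 and the padded one yields 0x0203. With 0x00 and 0x27 both happen to yield 39, which is why the original code looked correct.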

Finally, WHY WHY WHY WHY are you carefully converting the
integer values you want into a string, then converting them back?

Try:
length = (file_buffer[2] * 256) + file_buffer[3];
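
Wrapped up as a function, that might look like the sketch below. It assumes file_buffer holds unsigned char and that the length field sits at offsets 2 and 3 with the more significant byte first; read_length is a hypothetical name.

```c
/* Hypothetical helper: combine two adjacent bytes, most significant
 * first, into one number. With unsigned char each byte is in 0..255
 * (assuming 8-bit bytes), so no sign surprises. */
unsigned int read_length(const unsigned char *file_buffer)
{
    return ((unsigned int)file_buffer[2] * 256u) + file_buffer[3];
}
```

For the bytes 12 00 00 27 09 10 this gives 39, with no string in between.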
Right.

You got 0x33 because you have populated clength with the STRING "35".

So you're stashing the CHARACTER '3' in your array. And since you
appear to be using ASCII, it happens that '3' is the same value
as '\x33'.
And clength[1] gives you 0x35 because '5' == '\x35' in your environment.

In short, pure coincidence. If your value had come out 0x40, then
you would be seeing 0x34 and 0x30 for the two values, and you'd
have a better guess as to what's wrong.
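
What you presumably wanted is to store the number itself, not the characters that spell it. A minimal sketch, assuming array is unsigned char and length + 14 fits in one byte (patch_length is a made-up name):

```c
/* Hypothetical helper: store the numeric value itself in both slots
 * that held the 0xff placeholders. */
void patch_length(unsigned char *array, unsigned int length)
{
    unsigned char value = (unsigned char)(length + 14);  /* 39 + 14 = 53 = 0x35 */
    array[2] = value;
    array[6] = value;
}
```

No sprintf is involved, so the result does not depend on what the digits look like as text.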

You get ffffffff because you're on a machine where plain char is
signed, so 0xff stored in a char is -1. When a signed -1 is promoted
to int, it stays -1, and on your machine int is 32 bits, and -1 is
0xffffffff.
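
Roughly, this is what printf ends up looking at. The sketch assumes the usual two's complement machine with 32-bit int, and the function names are invented for the example:

```c
/* What printf("%x ", array[i]) receives after the default argument
 * promotions, for a byte held in a signed vs. an unsigned char. */
unsigned int seen_by_printf_signed(signed char c)
{
    int promoted = c;               /* -1 stays -1 */
    return (unsigned int)promoted;  /* 0xffffffff when int is 32 bits */
}

unsigned int seen_by_printf_unsigned(unsigned char c)
{
    int promoted = c;               /* 255 stays 255 */
    return (unsigned int)promoted;  /* 0xff */
}
```

Declaring the array as unsigned char, or casting with (unsigned char)array[i] in the printf call, gets you the "ff" you expected.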

Summary:

1. If you wanna work with raw bit values, use unsigned char, don't
rely on plain char.
2. Don't convert to and from hex when what you have is raw data.

Print stuff in hex or decimal or whatever you want for human readers,
but if you have a value stored in two adjacent bytes, you do not
need to "convert" it.

-s

Seebs, Nov 16, 2009

3. ### Ben Bacarisse Guest

(and an excellent analysis of the problem, to boot)
I'd only add that you might need to convert it because it might be in
the wrong order. You may well be right in this case about the OP,
their system's integer byte order, and this length that is sitting in the
file, but sometimes you *do* have to convert and you always need to do
*something* (even if it is no-op on some systems) if the code is to be
fully portable.
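
As a sketch of that "something": decoding by arithmetic on individual bytes gives the same answer whatever the host's byte order is, so the only thing you need to know is the order used in the file. The helper names here are hypothetical.

```c
/* Hypothetical helpers: decode a 16-bit field whose byte order in the
 * file is known, independent of the host's own byte order. */
unsigned int get_u16_be(const unsigned char *p)  /* most significant byte first */
{
    return ((unsigned int)p[0] << 8) | p[1];
}

unsigned int get_u16_le(const unsigned char *p)  /* least significant byte first */
{
    return ((unsigned int)p[1] << 8) | p[0];
}
```

For the two bytes 00 27, get_u16_be gives 0x0027 (39) and get_u16_le gives 0x2700, so picking the wrong one is usually obvious from the result.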

Ben Bacarisse, Nov 16, 2009
4. ### Seebs Guest

Yeah, you can end up needing to do some funky stuff, but for the
most part, it comes down to some variant of
(msb * 256) + (lsb)
or
(msb * 256 * 256 * 256) + (nmsb * 256 * 256) + (nlsb * 256) + (lsb)

Which is to say, never needing to go to text for an intermediate representation.
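
The four-byte variant, spelled out as a sketch. It assumes 8-bit bytes stored most significant first, and uses unsigned long because that type is guaranteed to hold at least 32 bits; get_u32_be is a made-up name.

```c
/* Hypothetical helper: (msb * 256^3) + (nmsb * 256^2) + (nlsb * 256) + lsb */
unsigned long get_u32_be(const unsigned char *p)
{
    return ((unsigned long)p[0] * 256ul * 256ul * 256ul)
         + ((unsigned long)p[1] * 256ul * 256ul)
         + ((unsigned long)p[2] * 256ul)
         + (unsigned long)p[3];
}
```

The casts matter: without them the multiplication would be done in int, which may be too narrow for the top byte.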

I'm not arguing that the OP should just grab a pair of bytes from the
file and treat that as an int or anything -- but since the OP seems to
think that it's foo[2] that's the more significant byte and foo[3] that's
less significant...

-s
("nmsb" = "next most significant byte")

Seebs, Nov 16, 2009
5. ### lancer6238 Guest

Why would "%x" often produce at least two characters?

And thank you for the clear explanation!

lancer6238, Nov 16, 2009
6. ### Seebs Guest

char foo[5];
sprintf(foo, "%x", 0x23);

This will populate the first three characters of "foo" with:
0x32 0x33 0x00

It's actually writing the STRING "23" -- which is two characters.
Plus there's another one (the null byte at the end), but that's just
one per string, not one per character printed.

Note that the "0x32" there is a description of the value. 0x32 is a single
character -- and if you're using ASCII, it's a '2'. So that's three
characters, even though I took a bunch to write it.
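
You can check this yourself with something like the sketch below (format_example is just an illustrative name). sprintf returns the number of characters it wrote, not counting the null byte:

```c
#include <stdio.h>

/* Writes the text form of 0x23 into foo and returns how many characters
 * sprintf reports, which excludes the terminating null byte. */
int format_example(char foo[5])
{
    return sprintf(foo, "%x", 0x23u);
}
```

The return value is 2, foo[0] is '2', foo[1] is '3', and foo[2] is the null byte. On an ASCII system those first two are the values 0x32 and 0x33.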

It can be fussy trying to get at the difference between our depiction of data
and the raw data. When we dump the contents of memory, we tend to use hex
because it's nicely regular; you can display everything with two characters
and it looks pretty. (This all assumes 8-bit bytes, etcetera.) So it's easy
to think that the memory is "really" in hex, but it's not; it's just raw
numbers, we can format them however we want.

-s

Seebs, Nov 16, 2009