Two's Complement, Serialization, etc.


Beej Jorgensen

Here's what I'm trying to do: portably serialize signed 16-bit integers.
I'd like some ideas for improvement here.

 1 void pack16(unsigned char buf[2], int16_t v)
 2 {
 3     uint16_t v2 = v;
 4     buf[0] = v2 >> 8;
 5     buf[1] = v2;
 6 }

 7 int16_t unpack16(unsigned char buf[2])
 8 {
 9     uint16_t v2 = buf[0] << 8 | buf[1];
10
11     if (v2 > 0x7fffu) { // check for negative number
12         return -(0xffffu - v2 + 1u);
13     }
14
15     return v2;
16 }

Comments/Questions:

1. I believe the direct assignment to the unsigned variable in pack16()
line 3 should give us the two's complement negative by C99-6.3.1.3p2.

2. By the same paragraph, I think "&0xff"s on lines 4 and 5 are
unnecessary.

3. Are casts required on line 12 for any of the constants? I'm thinking
the answer is "no" by C99-6.4.4.1p5.

4. Are casts required on the return values? Or anywhere on the
expression on line 12? I'm thinking no, but can't find a passage to
cite.

5. In unpack16(), I cannot portably assign (buf[0]<<8)|buf[1] into an
int16_t as it might overflow (C99-6.3.1.3p3.) And in pack16(), I
cannot portably assign v>>8 into buf[0] if v is negative
(C99-6.5.7p5.)

6. In the code, I assume the implementation supports int16_t and
uint16_t, which it is not required to do. But I don't see why the
above code shouldn't work with the "least16" variants.

7. Endianness issues? sizeof-related issues? sizeof-related issues if
least16-type variables were used? CHAR_BIT-related issues? Integer
promotion-related issues?

8. Any way to take advantage of the fact that int16_t is two's
complement in unpack16()?

Anything else? Something feels wrong or oversimplified or
overcomplicated or lame.

-Beej
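
For comparison, here is a minimal sketch of how the two routines might be written so that nothing depends on an implementation-defined conversion or on the width of int. The names pack16_portable, unpack16_portable and check_roundtrip are made up for illustration, and the "least16" types are used as an assumption so the code does not require the exact-width types to exist. It also assumes -32768 is representable in int_least16_t, which holds on two's complement machines.

```c
#include <assert.h>
#include <stdint.h>

/* Store v as two big-endian octets (illustrative name, not from the thread). */
void pack16_portable(unsigned char buf[2], int_least16_t v)
{
    /* Signed-to-unsigned conversion is modulo arithmetic (C99 6.3.1.3p2);
       the mask discards extra high bits if the type is wider than 16. */
    uint_least16_t u = (uint_least16_t)v & 0xffffu;

    /* Masks keep each byte an octet even when CHAR_BIT > 8. */
    buf[0] = (unsigned char)((u >> 8) & 0xffu);
    buf[1] = (unsigned char)(u & 0xffu);
}

/* Rebuild the signed value without converting an out-of-range unsigned. */
int_least16_t unpack16_portable(const unsigned char buf[2])
{
    /* Cast before shifting so the shift happens in unsigned int. */
    unsigned int u = ((unsigned int)(buf[0] & 0xffu) << 8) | (buf[1] & 0xffu);

    if (u > 0x7fffu) {
        /* 0xffff - u is in 0..0x7fff, so it fits in int; negate it there.
           Assumes -32768 is representable (true for two's complement). */
        return (int_least16_t)(-(int)(0xffffu - u) - 1);
    }
    return (int_least16_t)u;
}

/* Self-check: every 16-bit value should survive a round trip. */
void check_roundtrip(void)
{
    unsigned char buf[2];
    long v;

    for (v = -32768L; v <= 32767L; v++) {
        pack16_portable(buf, (int_least16_t)v);
        assert(unpack16_portable(buf) == v);
    }
}
```

Calling check_roundtrip() from a test program exercises all 65536 values. The negative branch does its subtraction in unsigned arithmetic and only negates a value that is already known to fit in int.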
 

Ben Bacarisse

Beej Jorgensen said:
Here's what I'm trying to do: portably serialize signed 16-bit integers.
I'd like some ideas for improvement here.

1 void pack16(unsigned char buf[2], int16_t v)
2 {
3 uint16_t v2 = v;
4 buf[0] = v2 >> 8;
5 buf[1] = v2;
6 }

7 int16_t unpack16(unsigned char buf[2])
8 {
9 uint16_t v2 = buf[0] << 8 | buf[1];
10
11 if (v2 > 0x7fffu) { // check for negative number
12 return -(0xffffu - v2 + 1u);
13 }
14
15 return v2;
16 }

Comments/Questions:

Anything else? Something feels wrong or oversimplified or
overcomplicated or lame.

One thing that may be useful. I've tested code that does this sort of
thing using signed bit fields of odd sizes. The idea is you define

struct char9 { signed int v: 9; };
struct int17 { signed int v:17; };

and with some macro magic you can (on some compilers) try out signed
9-bit chars and 17-bit ints.
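
The "macro magic" is not shown in the thread, so the following is only a guess at the general idea: a rough sketch that uses a 9-bit signed bit-field as a stand-in for an odd-sized integer and round-trips it through its two's complement bit pattern. The names int9, encode9, decode9 and check_int9 are hypothetical. It assumes the bit-field can hold -256, which is true on two's complement implementations.

```c
#include <assert.h>

/* A signed 9-bit "integer type" modelled with a bit-field. */
struct int9 { signed int v : 9; };

/* Produce the 9-bit two's complement pattern of x as an unsigned value. */
unsigned encode9(struct int9 x)
{
    /* Conversion to unsigned is modulo arithmetic; keep the low 9 bits. */
    return (unsigned)x.v & 0x1ffu;
}

/* Rebuild the signed value from a 9-bit pattern. */
struct int9 decode9(unsigned bits)
{
    struct int9 r;

    bits &= 0x1ffu;
    if (bits > 0xffu)
        r.v = -(int)(0x1ffu - bits) - 1;   /* negative range: -256..-1 */
    else
        r.v = (int)bits;                   /* non-negative: 0..255 */
    return r;
}

/* Self-check over the whole 9-bit range. */
void check_int9(void)
{
    int i;

    for (i = -256; i <= 255; i++) {
        struct int9 x;
        x.v = i;
        assert(decode9(encode9(x)).v == i);
    }
}
```

The same pattern scales to a 17-bit field by changing the width and the two masks, which makes it a cheap way to test packing code against unusual sizes.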
 

Beej Jorgensen

Eric Sosman said:
Beej said:
1 void pack16(unsigned char buf[2], int16_t v)
2 {
3 uint16_t v2 = v;
4 buf[0] = v2 >> 8;
5 buf[1] = v2;
6 }

7 int16_t unpack16(unsigned char buf[2])
8 {
9 uint16_t v2 = buf[0] << 8 | buf[1];
10
11 if (v2 > 0x7fffu) { // check for negative number
12 return -(0xffffu - v2 + 1u);
13 }
14
15 return v2;
16 }
2. By the same paragraph, I think "&0xff"s on lines 4 and 5 are
unnecessary.

Unnecessary in line 4, but required in line 5 if CHAR_BIT > 8.

I wonder if since this is for networking where all bytes are octets it
wouldn't just be better to use uint8_t for the individual bytes and be
done with it. Sorry, CHAR_BIT==9!

Portable representations of binary data have received a lot of
study, and you might consider adopting an existing solution instead
of rolling your own freshly-smoothed wheel ...

I agree. I actually recommend a couple of them (which would, aside from
being correct, probably also be faster), but people are always curious,
so I like to provide them a simple example of what kind of mechanisms
are used in cases like this.

Packing up multibyte ints into an array is a fairly simple example, but
recently someone took the stuff and had it not work with negative
numbers on a different architecture, and so now I am compelled to make
it righter--there are several portability flaws in the current code.
(This is basically Exercise 9-1 in POP).

Or I could just go the other way and just make everything unsigned. But
where's the fun in that? ;)

Thanks for the feedback!

-Beej
 

Beej Jorgensen

Ben Bacarisse said:
struct char9 { signed int v: 9; };
struct int17 { signed int v:17; };

and with some macro magic you can (on some compilers) try out signed
9-bits chars and 17-bit ints.

Oh, cool!

-Beej
 

Mark

Eric said:
1 void pack16(unsigned char buf[2], int16_t v)
2 {
3 uint16_t v2 = v;
4 buf[0] = v2 >> 8;
5 buf[1] = v2;
6 }
1. I believe the direct assignment to the unsigned variable in
pack16() line 3 should give us the two's complement negative by
C99-6.3.1.3p2.

The conversion "stuff" still confuses me a lot. Am I right in assuming that in

int16_t v = 3;
uint16_t v2 = v;

the rules of integer promotions are applied? So, 'v' in this example is
promoted to uint16_t, because int16_t can't represent all the values of the
original type and should be converted to 'unsigned'. But it applies only to
'int' and 'unsigned int' operands.

In truth, it can't give a negative of any description, since
an unsigned type cannot express a negative value. However, you're
right that the bit pattern in `v2' will be the same that `v' would
have if `v' used two's complement (whether it actually does or not).

... and the bit pattern in `v2' remains because 6.3.1.1p3 says so, is that
correct?
 

Eric Sosman

Mark said:
Eric said:
1 void pack16(unsigned char buf[2], int16_t v)
2 {
3 uint16_t v2 = v;
4 buf[0] = v2 >> 8;
5 buf[1] = v2;
6 }
1. I believe the direct assignment to the unsigned variable in
pack16() line 3 should give us the two's complement negative by
C99-6.3.1.3p2.

The conversion "stuff" still confuses me a lot. Am I right in assuming that in

int16_t v = 3;
uint16_t v2 = v;

the rules of integer promotions are applied? So, 'v' in this example is
promoted to uint16_t, because int16_t can't represent all the values of
the original type and should be converted to 'unsigned'. But it applies
only to 'int' and 'unsigned int' operands.

No promotions appear to be called for, but there are two
conversions (promotions are conversions, but not all conversions
are promotions):

- The `int' value 3 is converted from `int' to `int16_t'.
Since 3 is within the range of the latter type, nothing
peculiar happens: `v' is initialized with the value three,
type `int16_t'.

- The `int16_t' value of `v' is converted to `uint16_t'.
Since the value of `v' (three) is within the range of
`uint16_t', nothing weird happens and `v2' is initialized
with the value three, type `uint16_t'.

... and the bit pattern in `v2' remains because 6.3.1.1p3 says so, is
that correct?

I don't think so, because that paragraph concerns the integer
promotions, and no promotions occur. I think the applicable text
is 6.3.1.3p1, which describes integer conversions where the "target"
type can represent the value of the "source."

The other applicable pieces are 6.2.6.2p1, which describes the
representation of unsigned integer types, and 7.18.1.1p2, which
forbids padding bits in exact-width unsigned integers.

(And 7.18.1.1p1 requires padding-free two's complement for
the exact-width signed integers, making my parenthetical remark
vacuous. I've already dope-slapped myself for that blunder; see
elsethread.)
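
A tiny sketch of what those paragraphs imply in practice, assuming the implementation provides int16_t and uint16_t. The helper names to_u16 and conversion_examples are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a signed 16-bit value to its unsigned counterpart. */
uint16_t to_u16(int16_t v)
{
    /* 6.3.1.3p1 applies when v >= 0 (value unchanged);
       6.3.1.3p2 applies when v < 0 (65536 is added to the value). */
    return (uint16_t)v;
}

void conversion_examples(void)
{
    assert(to_u16(3) == 3u);               /* in range: unchanged        */
    assert(to_u16(-1) == 0xffffu);         /* -1 + 65536 == 65535        */
    assert(to_u16(INT16_MIN) == 0x8000u);  /* -32768 + 65536 == 32768    */
}
```

Because uint16_t has no padding bits, the resulting values are exactly the two's complement bit patterns of the negative inputs.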
 

Mark

Eric said:
int16_t v = 3;
uint16_t v2 = v;
[skip]
No promotions appear to be called for, but there are two
conversions (promotions are conversions, but not all conversions
are promotions):

I carefully read the Standard, and it says:
"The integer promotions are applied only: as part of the usual arithmetic
conversions, to certain argument expressions, to the operands of the unary
+, -, and ~ operators, and to both operands of the shift operators, as
specified by their respective subclauses"

I understand that promotion occurs for function arguments, for operands of
+/-/~ and 'shift' operator. Any other "converting activity" would be a
*conversion*. Is that a more correct wording?

- The `int' value 3 is converted from `int' to `int16_t'.
Since 3 is within the range of the latter type, nothing
peculiar happens: `v' is initialized with the value three,
type `int16_t'.

- The `int16_t' value of `v' is converted to `uint16_t'.
Since the value of `v' (three) is within the range of
`uint16_t', nothing weird happens and `v2' is initialized
with the value three, type `uint16_t'.

Do the conversions you described have to do with usual arithmetic
conversion? Also I don't understand why 6.3.1.3 is separate from 6.3.1.8,
their purpose is very close.
 

Eric Sosman

Mark said:
Eric said:
int16_t v = 3;
uint16_t v2 = v;
[skip]
No promotions appear to be called for, but there are two
conversions (promotions are conversions, but not all conversions
are promotions):

I carefully read the Standard, and it says:
"The integer promotions are applied only: as part of the usual
arithmetic conversions, to certain argument expressions, to the operands
of the unary +, -, and ~ operators, and to both operands of the shift
operators, as specified by their respective subclauses"

I understand that promotion occurs for function arguments, for operands
of +/-/~ and 'shift' operator. Any other "converting activity" would be
a *conversion*. Is that a more correct wording?

The broad topic is "Conversions" (6.3). All these things
are conversions. Some subsets of all the possible conversions
are given special names:

1) The conversions described in 6.3.1.1p2 are called the
"integer promotions." They are still conversions, just
a specially designated subset of all conversions.

2) The integer promotions plus float-to-double are called
the "default argument promotions" (6.5.2.2p6). They,
too, are still conversions, just a slightly larger
designated subset.

3) The "usual arithmetic conversions" (6.3.1.8) is more
than just a subset of all possible conversions; it's
accompanied by an algorithm that chooses which conversions
to apply in particular circumstances. If we ignore the
algorithm, though, what's left are the conversions it can
choose from, and those are a subset of all conversions.

Do the conversions you described have to do with usual arithmetic
conversion?

Not as far as I can see. The purpose of the U.A.C. is to
find "common ground" for the operands of an operator, and the
examples have no operators (they are initializations). Even if
we think of them as assignments (initialization occurs "as if"
by assignment), the assignment operator doesn't use the U.A.C.
(see 6.5.16.1). So I don't think the U.A.C. enter in at all.

Also I don't understand why 6.3.1.3 is separate from
6.3.1.8, their purpose is very close.

6.3.1.3 describes the effect of conversion between integers
of various types; it says "what happens" when conversion is
performed. 6.3.1.8 describes some conversions that are done
automatically; it says "which" conversions are chosen.
6.3.1.8 tells you whether to bake potatoes or brownies;
6.3.1.3 has the brownie recipe.
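
As a rough illustration of the three cases (integer promotion, plain conversion on assignment, and the usual arithmetic conversions), here is a small sketch. The function name promotion_examples is hypothetical, and the checks are written so they should hold regardless of CHAR_BIT or the width of int.

```c
#include <assert.h>
#include <limits.h>

void promotion_examples(void)
{
    unsigned char c = 0xff;
    int i = -1;
    unsigned int u = 1u;

    /* Integer promotions: both shift operands are promoted, so the
       result has the promoted type rather than unsigned char. */
    assert(sizeof (c << 1) == sizeof (int));
    assert((c << 1) == 0x1fe);

    /* Assignment: no promotion and no U.A.C., only a conversion to the
       type of the left operand (6.3.1.3p2: reduced modulo UCHAR_MAX + 1). */
    c = (unsigned char)0x1fe;
    assert(c == (0x1fe & UCHAR_MAX));

    /* Usual arithmetic conversions: i is converted to unsigned int
       before the comparison, so -1 becomes UINT_MAX. */
    assert(!(i < u));
    assert((unsigned int)i == UINT_MAX);
}
```

The first check is the reason a shifted unsigned char can end up as a signed int, which is what matters for the unpacking code discussed in this thread.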
 

Jef Driesen

Eric said:
Unnecessary in line 4, but required in line 5 if CHAR_BIT > 8.

What happens if you try to write a character to disk (or socket, or
anything else that expects 8bit octets) on such a system? Are all bits
written, or only 8 of them? If everything is written, how can you even
communicate with 8bit based systems, given that even the smallest
datatype is larger?
 

Beej Jorgensen

Beej Jorgensen said:
7 int16_t unpack16(unsigned char buf[2])
8 {
9 uint16_t v2 = buf[0] << 8 | buf[1];

Looks like I also need to cast these to the result type when it's
possible they don't fit in an int, due to the usual arithmetic
conversions? Is that right?

uint32_t v2 = ((uint32_t)buf[0]<<24) | ((uint32_t)buf[1]<<16) |
(buf[2]<<8) | buf[3];

I get warnings if I don't when messing with 64-bit values on this
machine (Athlon64, gcc, 32-bit ints).

-Beej
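
A sketch of the 32-bit case with every byte cast before it is shifted, so that no shift is ever performed in a (possibly 16-bit, signed) int. The names unpack_u32 and check_unpack_u32 are invented for this example, and it assumes uint32_t is available.

```c
#include <assert.h>
#include <stdint.h>

/* Assemble four big-endian octets into an unsigned 32-bit value. */
uint32_t unpack_u32(const unsigned char buf[4])
{
    /* Mask first (in case CHAR_BIT > 8), then cast, then shift:
       the cast makes the shift happen in a 32-bit unsigned type
       instead of whatever buf[n] would promote to. */
    return ((uint32_t)(buf[0] & 0xffu) << 24)
         | ((uint32_t)(buf[1] & 0xffu) << 16)
         | ((uint32_t)(buf[2] & 0xffu) << 8)
         |  (uint32_t)(buf[3] & 0xffu);
}

void check_unpack_u32(void)
{
    const unsigned char a[4] = { 0x12, 0x34, 0x56, 0x78 };
    const unsigned char b[4] = { 0xff, 0xff, 0xff, 0xff };

    assert(unpack_u32(a) == 0x12345678UL);
    assert(unpack_u32(b) == 0xffffffffUL);
}
```

Casting all four bytes, including the low two, costs nothing and keeps the expression correct on a platform where int has only 16 bits.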
 

Beej Jorgensen

Eric Sosman said:
You're right: You need a cast on buf[0] before shifting it, to
make sure it doesn't promote to a signed 16-bit int.

Gah--thanks!

I'm not sure what "messing with" means. Warnings are at the
compiler's whim; some are important, some are not.

I just included the warnings as evidence I was doing something wrong in
this particular case.

-Beej
 
