Unsinged types

Jase Schick · Jul 19, 2012

Hi. Does C still need unsigned types? My preferred language, Java, manages
perfectly well without them. Do many people ever use unsigned types
nowadays, and if so, why? In a 64-bit world, the extra range is rarely
worth the hassle, it seems to me.

Jase

Ben Bacarisse · Jul 19, 2012

Jase Schick said:
Hi. Does C still need unsigned types?

Yes. C needs them because of how it defines signed integer types.
Signed integer values can be represented in one of three ways, and the
language allows signed integer arithmetic to overflow in an undefined
way. Also, some operations on signed types are undefined (some shifts
for example). All this is to permit the very widest possible range of
implementations without the need to simulate an arithmetic type that the
hardware does not have.

There is a language not entirely unlike C which would no longer need
them, but it would have more constrained signed integer types.
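
A minimal sketch of the overflow point, assuming nothing beyond standard C. The function names are hypothetical, chosen only for illustration: unsigned addition wraps in a defined way, whereas a signed addition has to be checked before it is performed.

```c
#include <limits.h>
#include <stdbool.h>

/* Unsigned arithmetic is defined to wrap modulo UINT_MAX + 1. */
unsigned int wrapping_add(unsigned int a, unsigned int b)
{
    return a + b;
}

/* Signed overflow is undefined, so test for it before adding.
   Returns true and stores the sum only when it is representable. */
bool checked_add(int a, int b, int *sum)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return false;
    *sum = a + b;
    return true;
}
```

For example, wrapping_add(UINT_MAX, 1u) yields 0, while checked_add(INT_MAX, 1, &s) reports failure instead of overflowing.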

My preferred language, Java, manages
perfectly well without them. Do many people ever use unsigned types
nowadays, and if so, why? In a 64-bit world, the extra range is rarely
worth the hassle, it seems to me.

It's very hard to avoid using them! malloc's argument is an unsigned
integer, the result of sizeof and strlen are unsigned values, etc.
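
As a small illustration of how quickly unsigned types turn up, here is a sketch of a string-copy helper. The name copy_string is hypothetical; strlen, malloc and memcpy are the standard library functions, all of which work in terms of size_t.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy a string into freshly allocated storage.
   Returns NULL if the allocation fails; the caller frees the result. */
char *copy_string(const char *s)
{
    size_t len = strlen(s);      /* strlen returns size_t, an unsigned type */
    char *p = malloc(len + 1);   /* malloc takes a size_t argument */
    if (p != NULL)
        memcpy(p, s, len + 1);   /* the count is a size_t as well */
    return p;
}
```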

Your argument about the 64-bit world will ring hollow to people using C
in the 16- and 32-bit worlds! Whilst C99 requires a 64-bit type, there
are many environments without a C99 compiler, and probably some where
64-bit arithmetic needs to be implemented in software.

jacob navia · Jul 19, 2012

Le 19/07/12 22:29, Ben Bacarisse a écrit :

Your argument about the 64-bit world will ring hollow to people using C
in the 16- and 32-bit worlds! Whilst C99 requires a 64-bit type, there
are many environments without a C99 compiler, and probably some where
64-bit arithmetic needs to be implemented in software.

Those are just efficiency details unworthy of the attention of a Java
programmer that will always use the biggest types combined with the
slowest possible programming. Java leads to that kind of mindset.

Java is another world.

Ben Pfaff · Jul 19, 2012

Jase Schick said:
Hi. Does C still need unsigned types? My preferred language, Java, manages
perfectly well without them. Do many people ever use unsigned types
nowadays, and if so, why? In a 64-bit world, the extra range is rarely
worth the hassle, it seems to me.

Many quantities are naturally unsigned; for example, counts and
sizes. These quantities are most naturally modeled with unsigned
types.

Eric Sosman · Jul 20, 2012

Many quantities are naturally unsigned; for example, counts and
sizes. These quantities are most naturally modeled with unsigned
types.

Indeed. If you use a singed type to hold an inherently
unsinged value, you're playing with fire.

Ben Bacarisse · Jul 20, 2012

Eric Sosman said:
Indeed. If you use a singed type to hold an inherently
unsinged value, you're playing with fire.

I agree with the "indeed", but I am not sure I see why you added that
one would be "playing with fire". Can you say a bit more about that?

Stephen Sprunk · Jul 20, 2012

I agree with the "indeed", but I am not sure I see why you added that
one would be "playing with fire". Can you say a bit more about that?

It's a clever reference to "singed" types. Note the spelling.

I was trying to work up something similar, but he beat me to it.

S

Eric Sosman · Jul 20, 2012

I agree with the "indeed", but I am not sure I see why you added that
one would be "playing with fire". Can you say a bit more about that?

It's a pnu.

Ben Bacarisse · Jul 20, 2012

Stephen Sprunk said:
It's a clever reference to "singed" types. Note the spelling.

I was trying to work up something similar, but he beat me to it.

Well, I'm dyslexic so that's just made my day!

Nick Keighley · Jul 20, 2012

Hi. Does C still need unsigned types? My preferred language, Java, manages
perfectly well without them. Do many people ever use unsigned types
nowadays, and if so, why? In a 64-bit world, the extra range is rarely
worth the hassle, it seems to me.

Raw bytes (or octets) are often emulated with "unsigned char". (I know
a char doesn't have to be 8 bits, but I've never actually used a system
where it wasn't; I understand some DSPs do this.) And raw bytes are
useful for comms stuff, encryption, compression, signal processing,
etc.
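
A minimal sketch of the raw-bytes use, with a hypothetical byte_sum function standing in for a real checksum. Reading the buffer through unsigned char means each byte is seen as a value from 0 to UCHAR_MAX, never as a negative number.

```c
#include <stddef.h>

/* Hypothetical toy checksum: the sum of the bytes in a buffer,
   reduced to its low eight bits. */
unsigned int byte_sum(const void *buf, size_t n)
{
    const unsigned char *p = buf;
    unsigned int sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += p[i];   /* p[i] is in 0..UCHAR_MAX, so no sign extension */
    return sum & 0xFFu;
}
```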

Andrew Cooper · Jul 20, 2012

Hi. Does C still need unsigned types? My preferred language, Java, manages
perfectly well without them. Do many people ever use unsigned types
nowadays, and if so, why? In a 64-bit world, the extra range is rarely
worth the hassle, it seems to me.

Jase

Yes. Without any doubt whatsoever.

A common but subtle cause of security bugs is to use a signed index into
an array rather than an unsigned one, then have said index based on user
input.

What the vast majority of programmers don't understand is that the
signed vs unsigned is not about "what's the largest number I can
represent", but with the different semantics that the two types have.
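
One way to picture the indexing problem, as a sketch with hypothetical names (TABLE_SIZE and both functions are made up for the example): a signed index needs both ends of the range checked, while an unsigned index needs only one comparison, because a negative input converted to size_t becomes a very large value.

```c
#include <stdbool.h>
#include <stddef.h>

#define TABLE_SIZE 16   /* hypothetical array length */

/* Signed index: omitting the "i >= 0" test would let a negative,
   user-supplied index through. */
bool signed_index_ok(int i)
{
    return i >= 0 && i < TABLE_SIZE;
}

/* Unsigned index: one comparison covers both ends of the range. */
bool unsigned_index_ok(size_t i)
{
    return i < TABLE_SIZE;
}
```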

~Andrew

Keith Thompson · Jul 20, 2012

Eric Sosman said:
It's a pnu.

So 24-bit systems are also subject to endianness issues. Good to know.

Seungbeom Kim · Jul 20, 2012

Many quantities are naturally unsigned; for example, counts and
sizes. These quantities are most naturally modeled with unsigned
types.

Things that you assume should be non-negative often turn out to be
not always so. For example, though no one thinks of a negative day
of the month, the tm_* members of struct tm are signed, because you
sometimes need to be able to represent out-of-range values.

In addition, "unsigned" in C does not only mean non-negativity,
but also implies modulo arithmetic; you can't even get the difference
between two unsigned values in the most natural way (a - b). And the
unsignedness is contagious, so (unsigned_value > -1) may not be true
and ((unsigned)negative_value > 100) may be true to your surprise.
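
These surprises can be sketched as follows. The function names are hypothetical, and the comments assume the usual case where int and unsigned int have the same width; some compilers will warn about the mixed comparison.

```c
#include <limits.h>
#include <stdbool.h>

/* -1 is converted to unsigned int (becoming UINT_MAX) before the
   comparison, so this returns false for every u. */
bool greater_than_minus_one(unsigned int u)
{
    return u > -1;
}

/* Converting a negative int to unsigned int yields a large value. */
bool exceeds_100_as_unsigned(int n)
{
    return (unsigned int)n > 100;
}

/* a - b wraps around instead of going negative when a < b. */
unsigned int difference(unsigned int a, unsigned int b)
{
    return a - b;
}
```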

For these reasons, I had the feeling that unsigned was dangerous and
should be limited to values which bitwise operations were intended for
(e.g. bitmasks) or values on nominal/ordinal scales (e.g. character
codes, TCP port numbers).

The reality is, however, that everyone thinks differently, similar
debates are endless, and that unsigned is widely used for other things
(e.g. counts and sizes) even in the standard library, so you have to
live with unsigned being everywhere, and learn to be careful when
mixing signed and unsigned.

Eric Sosman · Jul 20, 2012

Things that you assume should be non-negative often turn out to be
not always so. For example, though no one thinks of a negative day
of the month, the tm_* members of struct tm are signed, because you
sometimes need to be able to represent out-of-range values.

It's also a way to help with some calculations. For example,
you can start from a given date, subtract fourteen from tm_mday,
re-normalize, and easily find out "What date was a fortnight
before January 5?"
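
A sketch of that calculation, assuming the standard mktime re-normalization of out-of-range members. The function name and its parameters are hypothetical.

```c
#include <time.h>

/* Hypothetical helper: find the date 14 days before the given one.
   month is 0-based (0 = January). Returns 0 on success, -1 on failure. */
int fortnight_before(int year, int month, int mday, struct tm *out)
{
    struct tm t = {0};
    t.tm_year = year - 1900;
    t.tm_mon = month;
    t.tm_mday = mday - 14;   /* may go negative: tm_mday is a signed int */
    t.tm_hour = 12;          /* midday, away from any DST changeover */
    t.tm_isdst = -1;         /* let mktime work out DST */
    if (mktime(&t) == (time_t)-1)
        return -1;
    *out = t;                /* mktime has normalized the members */
    return 0;
}
```

Starting from January 5, 2012, tm_mday becomes -9, and after normalization the result is December 22, 2011.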

In addition, "unsigned" in C does not only mean non-negativity,
but also implies modulo arithmetic; you can't even get the difference
between two unsigned values in the most natural way (a - b).

You can't do that with signed integers, either. And in the
cases where naive subtraction doesn't work, you don't just get a
possibly surprising but predictable outcome: You get undefined
behavior. (Ever seen someone write a qsort() comparator that just
subtracts two `int' keys and returns the difference? Ever seen
the chaos that can result?)
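
For the comparator case, a sketch of one common way to avoid the subtraction altogether. The name compare_ints is hypothetical; qsort only requires a negative, zero or positive result.

```c
#include <limits.h>
#include <stdlib.h>

/* Returning a - b would overflow (undefined behavior) for pairs such
   as INT_MIN and 1. Comparing instead cannot overflow. */
int compare_ints(const void *pa, const void *pb)
{
    int a = *(const int *)pa;
    int b = *(const int *)pb;
    return (a > b) - (a < b);   /* -1, 0 or 1 */
}
```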

And the
unsignedness is contagious, so (unsigned_value > -1) may not be true
and ((unsigned)negative_value > 100) may be true to your surprise.

Oh, come on! You might just as well complain about double-ness
being "contagious."

Besides, why blame the surprises on the unsigned operand? They
don't arise from either one of the operands in isolation, but from
the combination of the two -- so the signed operand is every bit as
much to blame as the unsigned. If you blame one, you should blame
the other equally.[*]

[*] Okay, that doesn't always happen in real life: Doheny was
acquitted of offering the bribe that Fall was convicted of taking.
But Roaring Twenties jurisprudence is a poor model for programming!

For these reasons, I had the feeling that unsigned was dangerous and
should be limited to values which bitwise operations were intended for
(e.g. bitmasks) or values on nominal/ordinal scales (e.g. character
codes, TCP port numbers).

The reality is, however, that everyone thinks differently, similar
debates are endless, and that unsigned is widely used for other things
(e.g. counts and sizes) even in the standard library, so you have to
live with unsigned being everywhere, and learn to be careful when
mixing signed and unsigned.

... or when mixing signed integer with long double complex, or
when mixing unsigned long with pointer-to-pointer-to-T, or ... In
fact, your recommendation to "be careful when..." can be improved
by deleting "when" and everything after it. Just be careful, okay?

Malcolm McLean · Jul 21, 2012

On Friday, 20 July 2012 10:00:55 UTC+1, Andrew Cooper wrote:

A common but subtle cause of security bugs is to use a signed index into
an array rather than an unsigned one, then have said index based on user
input.

What the vast majority of programmers don't understand is that the
signed vs unsigned is not about 'what's the largest number I can
represent', but with the different semantics that the two types have.

The problem is it's a plugs and adapters system.

Consider this

void getcursorposition(unsigned int *x, unsigned int *y)

an x, y index into a raster is necessarily unsigned, right?

Now we want to draw an octagon round the cursor. So we'll build it on top of a function void drawpolygon(). What signature would you give drawpolygon, and how would you write this code?

Ben Bacarisse · Jul 21, 2012

Malcolm McLean said:
On Friday, 20 July 2012 10:00:55 UTC+1, Andrew Cooper wrote:
The problem is it's a plugs and adapters system.

Consider this

void getcursorposition(unsigned int *x, unsigned int *y)

an x, y index into a raster is necessarily unsigned, right?

Yes, but not a cursor position. I would be very unhappy with an API
that conflated these two concepts because, as your example shows, it's
natural to represent positions as signed quantities. (In fact I'd want
a position to be represented as some kind of "point" but that's another
matter.)

Now we want to draw an octagon round the cursor. So we'll build it on
top of a function void drawpolygon(). What signature would you give
drawpolygon, and how would you write this code?

struct point { int x, y; };

void drawpolygon(struct point *pt, size_t np, bool closed);

struct point octagon[8];
unsigned int cx, cy;
getcursorposition(&cx, &cy);
for (int v = 0; v < 8; v++) {
    octagon[v].x = cx;
    octagon[v].y = cy;
    move_point_polar(&octagon[v], 360/8 * v, radius);
}
drawpolygon(octagon, 8, true);

Having another prototype for getcursorposition would not make very much
difference, though I'd probably "correct" the API like this:

static inline void getcursorposition_as_point(struct point *pt)
{
    // Why is there not a function to do this already?
    unsigned int x, y;
    getcursorposition(&x, &y);
    pt->x = x;
    pt->y = y;
}

struct point center, octagon[8];
getcursorposition_as_point(&center);
for (int v = 0; v < 8; v++) {
    octagon[v] = center;
    move_point_polar(&octagon[v], 360/8 * v, radius);
}
drawpolygon(octagon, 8, true);

How would you write it with a "better" prototype for getcursorposition?

Malcolm McLean · Jul 21, 2012

Malcolm McLean writes:

struct point { int x, y; };

void drawpolygon(struct point *pt, size_t np, bool closed);

unsigned int cx, cy;
snip code

octagon[v].x = cx;
octagon[v].y = cy;

On many compilers this line will generate a warning. Not unreasonably, because unsigned int can go up to 4 billion, signed int only to 2 billion. The compiler has no way of knowing that a cursor position of 3 billion pixels is completely ridiculous.

You can of course simply add a cast. But once you start doing that you're working against the language instead of with it.

If getcursorposition writes to unsigned, there's no nice way of writing drawpolygon(). You can define an interface that takes unsigneds, because every drawable polygon will necessarily be described by unsigned x, y positions. But then you need horrible code in the caller to adjust the polygon to take care of the corner cases. Or you can say that negative points within the polygon are legitimate but undrawable. That's the sane solution. But then you no longer want any unsigned co-ordinates cluttering up the code.

Ben Bacarisse · Jul 21, 2012

Malcolm McLean said:
On Saturday, 21 July 2012 15:24:38 UTC+1, Ben Bacarisse wrote:

Malcolm McLean writes:

struct point { int x, y; };

void drawpolygon(struct point *pt, size_t np, bool closed);

unsigned int cx, cy;
snip code

octagon[v].x = cx;
octagon[v].y = cy;

On many compilers this line will generate a warning. Not unreasonably,
because unsigned int can go up to 4 billion, signed int only to 2
billion. The compiler has no way of knowing that a cursor position of
3 billion pixels is completely ridiculous.

You can of course simply add a cast. But once you start doing that
you're working against the language instead of with it.

That's why the API is wrong. You did not comment on my other solution
which is to fix the API with a point-filling version. The use of a cast
in such a function is not "working against the language" it's using the
language to fix a dubious API.
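
A sketch of what a conversion helper inside such a wrapper might look like, assuming one wants the range checked rather than trusted. The name unsigned_to_int is hypothetical and not part of any API discussed here.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: convert an unsigned int to int only when the
   value fits. The cast is confined to this one place. */
bool unsigned_to_int(unsigned int u, int *out)
{
    if (u > (unsigned int)INT_MAX)
        return false;   /* e.g. a "3 billion pixel" position */
    *out = (int)u;
    return true;
}
```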

"You can't do this neatly in C" posts should include all the "rules" and
all the language features that you want to arbitrarily exclude: "no
warnings from most compilers and no casts, please". That will have the
advantage that I probably won't reply to them.

If getcursorposition writes to unsigned, there's no nice way of
writing drawpolygon(). You can define an interface that takes
unsigneds, because every drawable polygon will necessarily be
described by unsigned x, y positions. But then you need horrible code
in the caller to adjust the polygon to take care of the corner cases. Or
you can say that negative points within the polygon are legitimate but
undrawable. That's the sane solution. But then you no longer want any
unsigned co-ordinates cluttering up the code.

These arguments are about code that no one has written, or should write.
It's a straw man. Why not just show how much simpler your code is than
mine when getcursorposition uses int *s rather than unsigned int *s?
That will make it clear just how much damage the use of unsigned has
introduced.

BartC · Jul 21, 2012

Malcolm McLean said:
On Saturday, 21 July 2012 15:24:38 UTC+1, Ben Bacarisse wrote:

Malcolm McLean writes:

struct point { int x, y; };

void drawpolygon(struct point *pt, size_t np, bool closed);

unsigned int cx, cy;
snip code

octagon[v].x = cx;
octagon[v].y = cy;

On many compilers this line will generate a warning. Not unreasonably,
because unsigned int can go up to 4 billion, signed int only to 2 billion.
The compiler has no way of knowing that a cursor position of 3 billion
pixels is completely ridiculous.

Not completely. A 1 pixel x 3 billion pixel black & white image only uses
375MB.

That would need 32-bit unsigned, or 64-bit signed, to address. (But I've
lost track of whether you're arguing for or against the use of unsigned
quantities.)

Malcolm McLean · Jul 21, 2012

Malcolm McLean writes:

These arguments are about code that no one has written, or should write.
It's a straw man. Why not just show how much simpler your code is than
mine when getcursorposition uses int *s rather than unsigned int *s?
That will make it clear just how much damage the use of unsigned has
introduced.

void drawoctogonroundcursor(void)
{
    int octx[8];
    int octy[8];
    int cx, cy;
    int d = 3; /* this gives the size of the octagon step */

    getcursorposition(&cx, &cy);
    octx[0] = cx-d;   octy[0] = cy-2*d;
    octx[1] = cx+d;   octy[1] = cy-2*d;
    octx[2] = cx+2*d; octy[2] = cy-d;
    octx[3] = cx+2*d; octy[3] = cy+d;
    octx[4] = cx+d;   octy[4] = cy+2*d;
    octx[5] = cx-d;   octy[5] = cy+2*d;
    octx[6] = cx-2*d; octy[6] = cy+d;
    octx[7] = cx-2*d; octy[7] = cy-d;

    drawpolygon(octx, octy, 8);
}

There's no messing about. We can focus completely on the drawing logic, which I might have got wrong.
