undefined behavior or not undefined behavior? That is the question

  • Thread starter Mantorok Redgormor

Dan Pop

In said:
Motorola/IBM PowerPC

Practically any new architecture designed for general computing purposes
is bi-endian: MIPS, SPARC(v9), Alpha, PPC, Itanium.

MIPS was/is used in big endian mode by SGI, but was used in little endian
mode by DEC.

Alpha was used in little endian mode by DEC and in big endian mode by
Cray.

Itanium is used in big endian mode by HP and in little endian mode by
everyone else.

PPC is used in big endian mode by IBM and Apple, but it was used in little
endian mode during the short lived Windows NT on PPC project.

Dan
 

Dan Pop

In said:
well if an lvalue does not designate an object
then it invokes undefined behavior.

In your example, the lvalue designated an object.
you need type information for a type to not be
an incomplete type.

OTOH, a pointer to an incomplete type is a complete type. Take pointers
to void as a canonical example.

Dan
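
A minimal sketch of the point above, assuming a hypothetical tag `struct foo` that is declared but never completed: the struct is incomplete, yet pointers to it (like pointers to void) are complete object types, so they can be measured, stored and compared. The function names are made up for illustration.

```c
#include <stddef.h>

struct foo;   /* incomplete type: no members are known here */

/* A pointer to an incomplete type is itself a complete type,
   so sizeof may be applied to it. */
size_t size_of_foo_pointer(void)
{
    return sizeof(struct foo *);
}

/* The canonical case: void is incomplete, void * is complete. */
size_t size_of_void_pointer(void)
{
    return sizeof(void *);
}

/* Such pointers can be copied and compared without ever
   looking at what they point to. */
int same_pointer(struct foo *a, struct foo *b)
{
    return a == b;
}

/* By contrast, sizeof(struct foo) would be rejected at this point,
   because the struct type itself is still incomplete. */
```

This is the usual basis for opaque handles: client code only ever holds `struct foo *` values and never needs the definition.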
 

Dan Pop

In said:
I suspect it was one of the few ASCII punctuation marks left over when
C (or one of its predecessors) was being designed.

I agree: they've probably used all the ones with obvious meanings
and tried to do the best with what was left: ^, ~, !, %, ?, :.

I've never understood why @ and $ were left unused, while & and * have
been overloaded. @ would have done nicely instead of unary & and $
instead of unary *.

` was used in the earliest C versions, for a purpose with little relevance
outside the BTL (GECOS string literals).

Dan
 

Arjan Kenter

Dan said:
I agree: they've probably used all the ones with obvious meanings
and tried to do the best with what was left: ^, ~, !, %, ?, :.

I've never understood why @ and $ were left unused, while & and * have
been overloaded. @ would have done nicely instead of unary & and $
instead of unary *.

Maybe K and R didn't want to use $ for fear that people outside the USA
would get into trouble (did traditional British keyboards have dollar signs
or pound signs? Dutch typewriters had both dollar and 'florin' signs).

Btw, Apollo C allowed $ in identifiers, library functions were often
named [...]

Dan said:
` was used in the earliest C versions, for a purpose with little relevance
outside the BTL (GECOS string literals).

--
ir. H.J.H.N. Kenter ^^
Electronic Design & Tools oo ) Philips Research Labs
Building WAY 3.23 =x= \ (e-mail address removed)
Prof. Holstlaan 4 (WAY31) | \ tel. +31 40 27 45334
5656 AA Eindhoven /|__ \ tfx. +31 40 27 44626
The Netherlands (____)_/ http://www.kenter.demon.nl/

Famous last words: Segmentation Fault (core dumped)
 

CBFalconer

Dan said:
Then, where is NULL defined? Have I ever recommended engaging
the brain before posting? ;-)

Cavil: I didn't say he didn't have to compensate. Replacing NULL
by 0 would suffice. I would have thought you capable of this
extension without detailed prompting :)
 

Mark McIntyre

Possibly because these characters didn't appear on the keyboard they were
using?
Also possibly because @ and $ were already overloaded with meaning in the
shell they were using.
Maybe K and R didn't want to use $ for fear that people outside the USA
would get into trouble (did traditional British keyboards have dollar signs
or pound signs? Dutch typewriters had both dollar and 'florin' signs).

For as long as I can recall, UK keyboards have had $ on shift-4. However,
it's certainly possible that other layouts might not have had it there.
 

Christian Bau

Dan Pop said:
The real world has excluded them, though, otherwise the terms big endian
and little endian wouldn't be that popular.

The most notable machine being neither little endian nor big endian
was the PDP-11. A long consisted of two words stored in big endian order.
Each word consisted of two bytes, stored in little endian order.

It is not just a matter of the architecture, it is also a matter of the
compiler. Many architectures in use today are based on 32 bit words. To
implement "long long" and "unsigned long long", the compiler typically
uses two 32 bit words. There is no technical reason why one of those
words should be first in memory and not the other, so 64 bit integers
could easily be mixed-endian on many current architectures.
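
To make the PDP-11 layout described above concrete, here is a small sketch that writes one 32-bit value into a byte buffer in each of the three orders. The function names are invented for this illustration; the "PDP" order follows the description in the post (high 16-bit word first, low byte first within each word).

```c
#include <stdint.h>

/* Little endian: least significant byte at the lowest address. */
void store_little(uint32_t v, unsigned char out[4])
{
    out[0] = (unsigned char)(v & 0xFF);
    out[1] = (unsigned char)((v >> 8) & 0xFF);
    out[2] = (unsigned char)((v >> 16) & 0xFF);
    out[3] = (unsigned char)((v >> 24) & 0xFF);
}

/* Big endian: most significant byte at the lowest address. */
void store_big(uint32_t v, unsigned char out[4])
{
    out[0] = (unsigned char)((v >> 24) & 0xFF);
    out[1] = (unsigned char)((v >> 16) & 0xFF);
    out[2] = (unsigned char)((v >> 8) & 0xFF);
    out[3] = (unsigned char)(v & 0xFF);
}

/* PDP-11 style: the two 16-bit words are in big endian order,
   but the two bytes inside each word are in little endian order. */
void store_pdp(uint32_t v, unsigned char out[4])
{
    out[0] = (unsigned char)((v >> 16) & 0xFF);  /* low byte of high word  */
    out[1] = (unsigned char)((v >> 24) & 0xFF);  /* high byte of high word */
    out[2] = (unsigned char)(v & 0xFF);          /* low byte of low word   */
    out[3] = (unsigned char)((v >> 8) & 0xFF);   /* high byte of low word  */
}
```

For the value 0x0A0B0C0D these give the byte sequences 0D 0C 0B 0A (little), 0A 0B 0C 0D (big) and 0B 0A 0D 0C (PDP).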
 

Chris Torek

I've never understood why @ and $ were left unused [in C]

"$" is perhaps harder to explain, but "@" was the V6 Unix line-kill
character (today usually ^U or ^X).
` was used in the earliest C versions, for a purpose with little relevance
outside the BTL (GECOS string literals).

Backquote for BCD literals for GECOS no doubt did not appear until
the GECOS compilers; I suspect it is debatable whether to call this
`earliest' versions of C. :)
 

Mantorok Redgormor

In your example, the lvalue designated an object.


OTOH, a pointer to an incomplete type is a complete type. Take pointers
to void as a canonical example.

Dan

I mixed them up. I guess what I really wanted to ask, without
providing any examples since I am unsure where this would
apply, is: where does one need lvalues that have
incomplete types?
 

Dan Pop

In said:
It is not just a matter of the architecture, it is also a matter of the
compiler. Many architectures in use today are based on 32 bit words. To
implement "long long" and "unsigned long long", the compiler typically
uses two 32 bit words. There is no technical reason why one of those
words should be first in memory and not the other, so 64 bit integers
could easily be mixed-endian on many current architectures.

Show us one concrete example.

C implementors are usually not the most stupid people around and they
understand quite well the need for a consistent byte order among all the
types.

All the realistic examples of mixed endianness are historical, from an
age when the byte order issues were not as well understood as they have
been for the last quarter of a century.

The PDP-11's successor, the VAX, had fixed that hardware design blunder
of its ancestor and was little endian over all the range of supported
types.

There are exactly two byte orders that have survived and, by no
coincidence, they are the two entirely consistent ones.

Dan
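
For anyone who wants to check what a given implementation actually does with its 64-bit integers, here is a rough sketch. It assumes `uint64_t` exists and occupies exactly eight 8-bit bytes; the enum and function names are invented for this illustration. It copies a known value into a byte buffer and classifies the layout as little endian, big endian, or anything else.

```c
#include <stdint.h>
#include <string.h>

enum byte_order { ORDER_LITTLE, ORDER_BIG, ORDER_OTHER };

/* Classify the 8-byte memory image of the value 0x0102030405060708. */
enum byte_order classify_u64_image(const unsigned char b[8])
{
    int little = 1, big = 1;
    for (int i = 0; i < 8; i++) {
        if (b[i] != 8 - i) little = 0;  /* little endian: 08 07 ... 01 */
        if (b[i] != i + 1) big = 0;     /* big endian:    01 02 ... 08 */
    }
    return little ? ORDER_LITTLE : big ? ORDER_BIG : ORDER_OTHER;
}

/* Inspect how the host implementation lays out a 64-bit integer.
   Assumption: sizeof(uint64_t) == 8 and CHAR_BIT == 8. */
enum byte_order host_u64_order(void)
{
    uint64_t v = UINT64_C(0x0102030405060708);
    unsigned char b[8];
    memcpy(b, &v, sizeof b);
    return classify_u64_image(b);
}
```

A compiler that built 64-bit integers from two little endian 32-bit words with the high word first would produce the image 04 03 02 01 08 07 06 05, which this sketch reports as ORDER_OTHER.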
 

Dan Pop

In said:
I mixed them up. I guess what I really wanted to ask, without
providing any examples since I am unsure where this would
apply, is: where does one need lvalues that have
incomplete types?

Unless C99 has completely screwed the concept of lvalue, lvalues
cannot have incomplete types.

Dan
 

Dan Pop

In said:
I've never understood why @ and $ were left unused [in C]

"$" is perhaps harder to explain, but "@" was the V6 Unix line-kill
character (today usually ^U or ^X).

And '#' was the erase character (IIRC), yet this didn't prevent its
usage in the language...
Backquote for BCD literals for GECOS no doubt did not appear until
the GECOS compilers; I suspect it is debatable whether to call this
`earliest' versions of C. :)

For easy to figure reasons, GECOS was one of the first porting targets.
By the time K&R1 went to print, ` was gone from the language.

Dan
 

Jeremy Yallop

Dan said:
Unless C99 has completely screwed the concept of lvalue, lvalues
cannot have incomplete types.

Unfortunately, they can:

An lvalue is an expression with an object type or an incomplete type
other than void; if an lvalue does not designate an object when it
is evaluated, the behavior is undefined.

I'm not sure why the second part is there: the only example of an
expression of incomplete type that I know is an array name where the
definition of the array is not in scope, and even that is
questionable.

Jeremy.
 

Dan Pop

In said:
Unfortunately, they can:

An lvalue is an expression with an object type or an incomplete type
other than void; if an lvalue does not designate an object when it
is evaluated, the behavior is undefined.

I'm not sure why the second part is there: the only example of an
expression of incomplete type that I know is an array name where the
definition of the array is not in scope, and even that is
questionable.

Here's another example:

struct foo;
struct foo *bar(void);
struct foo *p = bar();

*p is now an expression of incomplete type. But this still doesn't
explain why such an expression should qualify as an lvalue.

Dan
 

Chris Torek

[on "PDP-endian"-ness]

The PDP-11's successor, the VAX, had fixed that hardware design blunder
of its ancestor and was little endian over all the range of supported
types.

Well, except D-float -- but this is not an integral type and it has
"bit" as well as "byte" order issues, when one takes it apart
(because the bit-fields do not fall on "natural" byte boundaries).
There are exactly two byte orders that have survived and, by no
coincidence, they are the two entirely consistent ones.

I am willing to bet that at least one CPU architecture will resurrect
mixed-endian integers in its 32-to-64-bit transition. (I suspect
it will be an existing little-endian architecture that is heavily
stack-oriented, too.)
 

Jeremy Yallop

Dan said:
Here's another example:

struct foo;
struct foo *bar(void);
struct foo *p = bar();

*p is now an expression of incomplete type. But this still doesn't
explain why such an expression should qualify as an lvalue.

I thought that dereferencing a pointer to an incomplete type was a
constraint violation, but I can't find any text in the standard to
that effect. The only thing I can find, in fact, is the above text,
which means that there's no requirement to diagnose such things.

This seems wrong: there's no need to dereference such pointers and
doing so can trivially be detected at compile time. This seems to me
to be a glaring flaw in the standard; in fact, the whole lvalue thing
seems a horrible mess in C99.

Jeremy.
 

Arthur J. O'Dwyer

Even simpler:
extern struct foo F;
F is now an lvalue of incomplete type.

Because it's quite obviously not an rvalue; it has a well-defined
address, for example.
I thought that dereferencing a pointer to an incomplete type was a
constraint violation,

It is. However, the expression *p doesn't necessarily dereference
p; you could have written &(*p), or (*p)[0] (both of which are legal
operations in C), or you could have written sizeof(*p) or (*p)[42]
(both of which are constraint violations in C, but still don't
involve dereferencing p). Or a*p->b. ;-)
but I can't find any text in the standard to
that effect. The only thing I can find, in fact, is the above text,
which means that there's no requirement to diagnose such things.

I'm almost certain it *is* a constraint violation to do anything
to an incompletely-typed object other than take its address; but
I have not looked for textual support yet. I'll let you know in
a few hours, if nobody else does first.

HTH,
-Arthur
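
The `extern struct foo F;` case can be fleshed out into a small sketch. The member `value`, the initializer and the function names are hypothetical; in real code the completing definition would normally live in a different translation unit. It only shows that taking the address of an object declared with an incomplete type works, and that the object becomes usable once the type is completed.

```c
struct foo;            /* incomplete at this point */
extern struct foo F;   /* F has incomplete type, yet designates an object */

/* Taking the address requires no knowledge of the members. */
struct foo *address_of_F(void)
{
    return &F;
}

/* Later (here in the same file, for brevity) the type is completed
   and the object is defined. */
struct foo { int value; };
struct foo F = { 42 };

/* Member access is only possible where the type is complete. */
int read_through(struct foo *p)
{
    return p->value;
}
```

Moving `read_through` above the struct definition would draw a diagnostic, since `p->value` needs the complete type.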
 

Papadopoulos Giannis

Chris said:
[on "PDP-endian"-ness]

The PDP-11's successor, the VAX, had fixed that hardware design blunder
of its ancestor and was little endian over all the range of supported
types.


Well, except D-float -- but this is not an integral type and it has
"bit" as well as "byte" order issues, when one takes it apart
(because the bit-fields do not fall on "natural" byte boundaries).

There are exactly two byte orders that have survived and, by no
coincidence, they are the two entirely consistent ones.


I am willing to bet that at least one CPU architecture will resurrect
mixed-endian integers in its 32-to-64-bit transition. (I suspect
it will be an existing little-endian architecture that is heavily
stack-oriented, too.)

Keep in mind, though, that mixed-endian architectures make things more
difficult than they should be...

--
#include <stdio.h>
#define p(s) printf(#s" endian")
int main(void){int v=1;*(char*)&v?p(Little):p(Big);return 0;}

Giannis Papadopoulos
http://dop.users.uth.gr/
University of Thessaly
Computer & Communications Engineering dept.
 
