Puzzling program

Christopher Benson-Manica

Joe Wright said:
Please allow me..
Likewise...

#include <stdio.h>
int main(void) {
char *s = "abc";
int *i = (int *)s;
printf("%08x", *i);

printf("%08x\n", *i);
 
Christopher Benson-Manica

Lew Pitcher said:
The answer is that the program is poorly written and invokes behaviour that is
not defined by the C standard. As "undefined behaviour" means that anything
can happen, the output value of 636261 is just as valid (and just as likely)
as any other output.

Equally valid, yes, but hardly as likely, unless there are an infinite
number of DS9K implementations to smooth out the spike at 636261
produced by non-evil little endian ASCII implementations.
 
Richard Tobin

Christopher Benson-Manica said:
Equally valid, yes, but hardly as likely, unless there are an infinite
number of DS9K implementations to smooth out the spike at 636261
produced by non-evil little endian ASCII implementations.

It's not the number of implementations that counts. It's the enormous
silent majority of DS9K users.

-- Richard
 
Richard Tobin

But there are an infinite number of DS9K implementations. (They're
not required to exist physically, are they?)

I don't know about infinite, but I've never seen a DS9K that didn't
have at least a dozen different C compilers installed.

-- Richard
 
Keith Thompson

Christopher Benson-Manica said:
Equally valid, yes, but hardly as likely, unless there are an infinite
number of DS9K implementations to smooth out the spike at 636261
produced by non-evil little endian ASCII implementations.

But there are an infinite number of DS9K implementations. (They're
not required to exist physically, are they?)
 
Mark McIntyre

You could point out all the errors and nonportable assumptions in the code.
But you are not interviewing the interviewer, probably.

Euh, interviews are two-way. Do you really want to work for a company
which employs people who don't care about code errors?
He's deciding whether he wants you,

And you him.
Companies want "team players",

They also typically want competent people.
--
Mark McIntyre

"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it."
--Brian Kernighan
 
pete

Keith said:
6.2.5p9 is what I was thinking of (though I hadn't bothered to look it
up). It guarantees the same representation, but the statement that

The same representation and alignment requirements are meant to
imply interchangeability as arguments to functions, return values
from functions, and members of unions.

is relegated to a footnote; thus my use of the word "almost". I
hadn't been aware of 6.5.2.2p6, which makes the guarantee explicitly
in normative text.

The wording of the footnote is wishywashy.
Would the footnote mean something different
if it was written this way?:

The same representation and alignment requirements
imply interchangeability as arguments to functions, return values
from functions, and members of unions.
 
pete

Keith said:
For context, here are the relevant lines from the original cruddy
program:

char *s = "abc";
int *i = (int *) s;

It's certainly true that there's no guarantee that the string has any
particular alignment, but I think most implementations do tend to
store strings (and, more generally, arrays) at a stricter alignment
than is required. Doing so can have some performance advantages. For
example, an implementation of memcpy(), or even strcpy(), might use
word moves whenever both the source and the target are suitably
aligned.

It's yet another instance of undefined behavior that's very likely to
work as expected (unfortunately).

There's also a problem when it comes time to read *i,
if sizeof(*i) is greater than sizeof"abc".
 
Keith Thompson

pete said:
Keith said:
Army1987 said:
On Wed, 08 Aug 2007 16:18:04 -0700, Keith Thompson wrote: [...]
Also, calling printf with no prototype in scope invokes undefined
behavior (the fix is to add a '#include <stdio.h>' to the top of the
source file). And "%x" is the wrong format for printing an int;
probably 'i' should have been declared as an unsigned int* (the
standard *almost* says that it will work anyway, but it's not
something I'd be comfortable counting on).
For sufficiently large values of "almost". See 6.5.2.2p6, the part
which in n1124.pdf is at the beginning of the next page.
Also see the first sentence of 6.2.5p9.
You were thinking about the fact that *i could be negative?

6.2.5p9 is what I was thinking of (though I hadn't bothered to look it
up). It guarantees the same representation, but the statement that

The same representation and alignment requirements are meant to
imply interchangeability as arguments to functions, return values
from functions, and members of unions.

is relegated to a footnote; thus my use of the word "almost". I
hadn't been aware of 6.5.2.2p6, which makes the guarantee explicitly
in normative text.

The wording of the footnote is wishywashy.
Would the footnote mean something different
if it was written this way?:

The same representation and alignment requirements
imply interchangeability as arguments to functions, return values
from functions, and members of unions.
[...]

I'm not sure. Since it's a footnote, it's non-normative, so there's
not a whole lot of difference between "are meant to imply" and
"imply". On the other hand, I suppose a less ambiguous statement
would imply that the footnote is merely clarifying something that's
actually stated in normative text; the "are meant to" wording seems to
acknowledge that the normative text doesn't really make this claim.

A hypothetical implementation in which signed int and unsigned int
have exactly the same "representation and alignment requirements", but
are passed as arguments in different registers, would violate the
footnote but not, as far as I can tell, any normative requirement.

If the standard really intended to *require* interchangeability, it
should have said so in normative text. Since it doesn't, my guess is
that the intent is to allow the types *not* to be interchangeable in
some exotic implementation, but to suggest that they should be if
possible. (But even if my interpretation is correct, I'd be happier
if the standard were more explicit about it.)
 
Charlton Wilbur

MMcI> On Thu, 9 Aug 2007 00:26:19 +0100, in comp.lang.c , "Malcolm
MMcI> McLean"

MMcI> Euh, interviews are two-way. Do you really want to work for
MMcI> a company which employs people who don't care about code
MMcI> errors?

Given the quality of the code in his book, do you really want an
answer to that question?

Charlton
 
Kenneth Brody

Rajeet said:
Good day group.

I was asked in an interview to explain the behavior of this program.

void main()
{
char *s = "abc";
int *i = (int *) s;
printf("%x", *i);
}

The question is: why is the output 636261? I don't think I know enough
about C to understand how the conversions between pointer types are
occurring.

Short answer:

UB can do *anything*.

Answer the interviewer was probably looking for:

The implementation uses 32-bit integers, is little-endian, and
uses the ASCII character set.

Consider:

What are the raw bytes that s points to?

"abc" --> 0x61 0x62 0x63 0x00

Given the above criteria (little-endian, 32-bit ints), what
would be the value of *i?

0x00636261

However, the output could be different if:

The implementation is not ASCII.

sizeof(int) != 4

CHAR_BIT != 8

The system is not little-endian.

s is not properly aligned for an int.

The bit-pattern at *i happens to be a trap representation.

The system does not support the non-standard "void main()".

The system treats variadic functions differently, and doesn't
properly compile your printf() call without a prototype in
scope.

The program's output is discarded because it didn't end in '\n'.

--
+-------------------------+--------------------+-----------------------+
| Kenneth J. Brody | www.hvcomputer.com | #include |
| kenbrody/at\spamcop.net | www.fptech.com | <std_disclaimer.h> |
+-------------------------+--------------------+-----------------------+
Don't e-mail me at: <mailto:[email protected]>
 
Kenneth Brody

Mark said:
Euh, interviews are two-way. Do you really want to work for a company
which employs people who don't care about code errors?

Well, this may be more a case of "we're only coding for one specific
platform, and don't care about portability". (Not that that's
necessarily a good thing, but at least the code becomes "system
specific behavior" rather than "errors". Of course, what happens
when they decided to port to a similar, but 64-bit, platform?)

[...]

 
Keith Thompson

Kenneth Brody said:
Well, this may be more a case of "we're only coding for one specific
platform, and don't care about portability". (Not that that's
necessarily a good thing, but at least the code becomes "system
specific behavior" rather than "errors". Of course, what happens
when they decided to port to a similar, but 64-bit, platform?)

Here's the original program:

void main()
{
char *s = "abc";
int *i = (int *) s;
printf("%x", *i);
}


That's not just system-specific, it's *bad*. Even if you're assuming
a particular set of characteristics for the platform, I can think
of at least four corrections that should be made (add '#include
<stdio.h>', use 'int main(void)', use unsigned int rather than int,
and add a 'return 0;') Not caring about portability is no excuse
for these errors.
 
Charlton Wilbur

KB> Well, this may be more a case of "we're only coding for one
KB> specific platform, and don't care about portability". (Not
KB> that that's necessarily a good thing, but at least the code
KB> becomes "system specific behavior" rather than "errors". Of
KB> course, what happens when they decided to port to a similar,
KB> but 64-bit, platform?)

There are many valid reasons to write nonportable code, but by far the
most common reason I've seen, and it's *not* a valid reason, is
complete ignorance.

If the company doesn't even realize they're writing nonportable code,
it's not a company I want to work for.

Charlton
 
Alan Curry

Answer the interviewer was probably looking for:

The implementation uses 32-bit integers, is little-endian, and
uses the ASCII character set.

How optimistic of you. The answer they were looking for could have been:

Integers are laid out in memory with the least significant byte first.

In other words, not just "this code is running on a little-endian platform"
but "the whole world is little-endian, there is no such thing as big-endian,
and no need for us to use or understand the descriptive term little-endian
because there's no alternative from which it needs to be distinguished"
 
Keith Thompson

How optimistic of you. The answer they were looking for could have been:

Integers are laid out in memory with the least significant byte first.

In other words, not just "this code is running on a little-endian
platform" but "the whole world is little-endian, there is no such
thing as big-endian, and no need for us to use or understand the
descriptive term little-endian because there's no alternative from
which it needs to be distinguished"

Maybe so. If that's really their position, I don't want to work for
them. But if they seem willing to listen when I explain the facts, I
might conclude "Hey, these guys really need me!".
 
Francine.Neary

use unsigned int rather than int,

I believe it's undefined behavior to cast a char * to an unsigned *
and then dereference it, no less than casting to an int * and then
dereferencing.
 
Keith Thompson

I believe it's undefined behavior to cast a char * to an unsigned *
and then dereference it, no less than casting to an int * and then
dereferencing.

Sure, but it might happen to work for a given implementation. It
might even be guaranteed to work by the implementation's
documentation. For example, some systems have no strict alignment
requirements. You might choose to take advantage of that *if* you
don't care at all about portability.

On the other hand, printing an int value using a "%x" format is just
silly. For the program in question, using unsigned int rather than
int corrects a potential flaw and costs nothing.
 
Richard Heathfield

(e-mail address removed) said:

I believe it's undefined behavior to cast a char * to an unsigned *
and then dereference it,

Why do you believe this?

<snip>
 
