Puzzling program

jaysome

Because:
a) The illiterate return type 'void' for main makes all bets off.
b) The illiterate omission of the required prototype for the variadic
function printf makes all bets off.
c) The illiterate absence of an end-of-line character ending the
last line of output makes all bets off.

What do you mean by "all bets off"? Do you mean undefined behavior? I
don't think it is undefined behavior.

Consider:

#include <stdio.h>
int main(void)
{
printf("Hello world!");
return 0;
}

There is no end-of-line character ending the last line of output. Yet,
as far as the C standard is concerned, this is well-defined behavior.
The definition of the word "flushed" is found in Section 7.19.3.(4):

"A file may be disassociated from a controlling stream by closing the
file. Output streams are flushed (any unwritten buffer contents are
transmitted to the host environment) before the stream is
disassociated from the file."

Then, from Section 7.19.3.(5):

"The file may be subsequently reopened, by the same or another program
execution, and its contents reclaimed or modified (if it can be
repositioned at its start). If the main function returns to its
original caller, or if the exit function is called, all open files are
closed (hence all output streams are flushed) before program
termination."

From these two sections, we know that, before program termination, the
following statement:

printf("Hello world!");

results in any unwritten buffer contents being transmitted to the host
environment.

What the host environment does with a final buffer that does not
contain an "end-of-line" character is completely beyond the
jurisdiction of the C standard, and thusly there is no possibility of
undefined behavior.

Admittedly, it is not a good idea to omit an end-of-line character
ending the last line of output. That's because what you expect to be
output may not be output, and it all depends on the host environment
as to whether or not the program output behaves as expected. Again,
this is not undefined behavior, as far as the C standard is concerned.

Regards
 
Richard Bos

While all this is quite reasonable, it's still true that any
experienced C programmer should be able to explain why the output is,
in fact, quite likely to be 636261.

Should be able to, yes. Should be willing to without first explaining
why this output is only likely on modern desktop systems, and why it is
a bad idea to write code like that, no.

Richard
 
Keith Thompson

jaysome said:
On Wed, 08 Aug 2007 19:36:27 -0400, Martin Ambuhl


What do you mean by "all bets off"? Do you mean undefined behavior? I
don't think it is undefined behavior.

Consider:

#include <stdio.h>
int main(void)
{
printf("Hello world!");
return 0;
}
[...]

I think it's implementation-defined whether it's undefined or not.

C99 7.19.2p2:

Whether the last line requires a terminating new-line character is
implementation-defined.

There's no mention of what the behavior is if a new-line is required
but is not provided, so in that case the behavior is undefined by
omission.

On the other hand, if the implementation *doesn't* require a
terminating new-line character (it's required to document it either
way), then the behavior is well defined.
 
pete

Keith Thompson wrote:
I think it's implementation-defined whether it's undefined or not.

C99 7.19.2p2:

Whether the last line requires a terminating new-line character is
implementation-defined.

There's no mention of what the behavior is if a new-line is required
but is not provided, so in that case the behavior is undefined by
omission.

On the other hand, if the implementation *doesn't* require a
terminating new-line character (it's required to document it either
way), then the behavior is well defined.

I consider code like that to be most simply described as "undefined",
since the short answer to the question
"Is the behavior of that code limited by the C standard?" is "No."
 
Philip Potter

Lew said:
The answer is that the program is poorly written and invokes behaviour that is
not defined by the C standard. As "undefined behaviour" means that anything
can happen, the output value of 636261 is just as valid (and just as likely)
as any other output.

This is not true, and it doesn't help explain the danger of undefined
behaviour. Indeed, one of the problems of undefined behaviour is that it
may well compile and do what you expect, and in some cases is very
likely to do so. But because the behaviour is undefined, there are no
guarantees that porting to a different system, or even just getting a
new version of the same compiler, won't break your program.

Phil
 
stdazi

Good day group.

I was asked in an interview to explain the behavior of this program.

void main()
{
char *s = "abc";
int *i = (int *) s;
printf("%x", *i);

}

The question is: why is the output 636261? I don't think I know enough
about C to understand how the conversions between pointer types are
occurring.

Thanks for any help.

RD

I'd never like to work for a company where interviewers declare main
as a void function...
 
Army1987

Style point: 'i' is a lousy name for a pointer.


Also, calling printf with no prototype in scope invokes undefined
behavior (the fix is to add a '#include <stdio.h>' to the top of the
source file). And "%x" is the wrong format for printing an int;
probably 'i' should have been declared as an unsigned int* (the
standard *almost* says that it will work anyway, but it's not
something I'd be comfortable counting on).

For sufficiently large values of "almost". See 6.5.2.2p6 (the part
which in n1124.pdf is at the beginning of the next page).
Also see the first sentence of 6.2.5p9.
Were you thinking about the fact that *i could be negative?
 
Jalapeno

Good day group.

I was asked in an interview to explain the behavior of this program.

void main()
{
char *s = "abc";
int *i = (int *) s;
printf("%x", *i);

}

The question is: why is the output 636261?

When I compile your program I get this:

* * * * *   S O U R C E   * * * * *

 LINE  STMT
       *...+....1....+....2....+....3....+....4....+....5....+....6....+....7
    1       | void main()
===========> ..........a.......................................................
*=INFORMATIONAL===> a - CCN3450 Obsolete non-prototype-style function declaration.
    2       | {
    3     1 | char *s = "abc";
    4     2 | int *i = (int *) s;
===========> ............a.....................................................
*=INFORMATIONAL===> a - CCN3495 Pointer type conversion found.
*=INFORMATIONAL===> a - CCN3374 Pointer types "int*" and "char*" are not compatible.
    5     3 | printf("%x", *i);
===========> ...a..............................................................
*=WARNING=========> a - CCN3304 No function prototype given for "printf".
    6       | }
===========> .a................................................................
*=INFORMATIONAL===> a - CCN3470 Function "main" should return int, not void.
* * * * *   E N D   O F   S O U R C E   * * * * *



When I run your program I get the following output:

81828300

HTH.
 
Richard Tobin

The question is: why is the output 636261?
When I run your program I get the following output:

81828300

"Why is the output 636261" doesn't mean "why is everyone's output 636261".
Its natural interpretation in context is "if the output is 636261, what
can you tell me about the C implementation". Good answers include "you're
not using z/OS".

-- Richard
 
Rajeet Dalawal

I'll assume Intel CPU and ASCII characters.
You have s pointing to a four-byte string, in hex, 61 62 63 00.
Intel is little-endian. When you treat those four bytes as an int and
then print it as hex, the result is 00636261.

Here's what I'd expect to happen: s is a pointer to an array of chars,
but the char it actually points to is 'a'. This has value 0x61. So when
this gets typecast to an integer, the standard promotion char -> int
should mean that *i takes the value 0x61 as well.

I don't understand why 'b' and 'c' come into it, or why the order is
reversed.

Thanks for the replies.

RD
 
Steffen Buehler

Rajeet said:
Here's what I'd expect to happen: s is a pointer to an array of chars,
but the char it actually points to is 'a'. This has value 0x61. So
when this gets typecast to an integer,

This is not what your program did. You cast a char pointer to an int
pointer.

When you want to know to what value a pointer is pointing, the program
in your example takes *four* bytes starting at the address where "abc"
is stored. It does *not* just take one byte and cast the result to int,
you didn't code that.
I don't understand why 'b' and 'c' come into it, or why the order is
reversed.

As mentioned, google for little and big endians.

Regards
Steffen
 
Keith Thompson

Army1987 said:
For sufficiently large values of "almost". See 6.5.2.2p6 (the part
which in n1124.pdf is at the beginning of the next page).
Also see the first sentence of 6.2.5p9.
You were thinking about the fact that *i could be negative?

6.2.5p9 is what I was thinking of (though I hadn't bothered to look it
up). It guarantees the same representation, but the statement that

The same representation and alignment requirements are meant to
imply interchangeability as arguments to functions, return values
from functions, and members of unions.

is relegated to a footnote; thus my use of the word "almost". I
hadn't been aware of 6.5.2.2p6, which makes the guarantee explicitly
in normative text.

But after re-reading 6.5.2.2p6 a couple of times, I don't think it
actually applies here. That paragraph talks about calling a function
with no visible prototype. If the function is *not* variadic, you can
pass a signed int argument for an unsigned int parameter (or vice
versa) as long as the value is representable in both types. But the
example involves a call to printf(), which is variadic. Calling a
variadic function with no prototype is undefined behavior regardless
of the arguments; even 'printf("hello")' invokes UB if there's no
prototype.
 
Jens Thoms Toerring

Here's what I'd expect to happen: s is a pointer to an array of chars,
but the char it actually points to is 'a'. This has value 0x61. So when
this gets typecast to an integer, the standard promotion char -> int
should mean that *i takes the value 0x61 as well.
I don't understand why 'b' and 'c' come into it, or why the order is
reversed.

's' points to a location in memory where the first element of an array
of chars is stored. You then assign this pointer's value to the int
pointer 'i' (with a cast, probably to shut up the compiler). When 'i'
is dereferenced, the compiler has to assume that an int is stored at
that position in memory. On the system this program was obviously
meant for, an int has a size of 4 bytes, and whatever is stored in
those 4 bytes (the numerical representations of the characters 'a',
'b', 'c' and '\0') is then interpreted as an int. To get the result
the interviewer asked for, two more conditions must be satisfied:
a) the numerical representation of characters is ASCII (i.e. 'a' is
0x61 etc.), and b) the machine is little-endian (the least
significant byte is stored at the lowest address and the most
significant byte at the highest address). So we already have three
conditions that must be satisfied simultaneously to arrive at this
result: sizeof(int) == 4, an ASCII character set, and a little-endian
machine. But that's not all: you need another stroke of luck to get
this program to work, since there are a lot of machines where an int
can't start at an arbitrary address but only at, e.g., an even
address (or an address divisible by 4, etc.), while a char can reside
at any address. If, on such a machine, the string does not start at
an address where an int can also start (and there's no way you can
make sure of this), then the program will probably crash with a
SIGBUS error when 'i' gets dereferenced.
Regards, Jens
 
Army1987

6.2.5p9 is what I was thinking of (though I hadn't bothered to look it
up). It guarantees the same representation, but the statement that

The same representation and alignment requirements are meant to
imply interchangeability as arguments to functions, return values
from functions, and members of unions.

is relegated to a footnote; thus my use of the word "almost". I
hadn't been aware of 6.5.2.2p6, which makes the guarantee explicitly
in normative text.

But after re-reading 6.5.2.2p6 a couple of times, I don't think it
actually applies here. That paragraph talks about calling a function
with no visible prototype. If the function is *not* variadic, you can
pass a signed int argument for an unsigned int parameter (or vice
versa) as long as the value is representable in both types. But the
example involves a call to printf(), which is variadic. Calling a
variadic function with no prototype is undefined behavior regardless
of the arguments; even 'printf("hello")' invokes UB if there's no
prototype.
Right, I had misread it. (The standard also makes the guarantee
referring to va_arg, but doesn't require printf to be implemented
using it.)
 
Keith Thompson

But that's not all: you need another
stroke of luck to get this program to work since there are a lot
of machines where an int can't start at arbitrary addresses but
only e.g. at even addresses (or addresses that can be divided by
4 etc.), while a char can always reside at all addresses. If on
such a machine the string does not start at an address where also
an int can start (and there's no way you can make sure of this)
then the program probably will crash with a SIGBUS error when
'i' gets dereferenced.

For context, here are the relevant lines from the original cruddy
program:

char *s = "abc";
int *i = (int *) s;

It's certainly true that there's no guarantee that the string has any
particular alignment, but I think most implementations do tend to
store strings (and, more generally, arrays) at a stricter alignment
than is required. Doing so can have some performance advantages. For
example, an implementation of memcpy(), or even strcpy(), might use
word moves whenever both the source and the target are suitably
aligned.

It's yet another instance of undefined behavior that's very likely to
work as expected (unfortunately).
 
Jens Thoms Toerring

For context, here are the relevant lines from the original cruddy
program:
char *s = "abc";
int *i = (int *) s;
It's certainly true that there's no guarantee that the string has any
particular alignment, but I think most implementations do tend to
store strings (and, more generally, arrays) at a stricter alignment
than is required. Doing so can have some performance advantages. For
example, an implementation of memcpy(), or even strcpy(), might use
word moves whenever both the source and the target are suitably
aligned.

It's yet another instance of undefined behavior that's very likely to
work as expected (unfortunately).

Well, we can modify it a bit to increase the odds of it failing on a
machine with stricter alignment requirements;-) One way would be

char *s = "Aabc";
int *i = (int *) (s + 1);
Regards, Jens
 
Malcolm McLean

Charlton Wilbur said:
And while the interviewer is deciding whether or not he
wants to hire the candidate, the candidate is deciding whether or not
he wants to work for that company.
Sometimes you are in that fortunate position. But it is still better to get
the offer and turn them down at that stage.
For a lot of people professional fulfillment is low down the list. For
instance for personal reasons my last job had to be in Leeds. There were
exactly three games companies in the city. So I was glad to be offered a job
on my first interview, despite not really liking sports games. Otherwise I
would have been running out of options quite quickly.
 
Keith Thompson

Army1987 said:
Right, I had misread it. (The standard also makes the guarantee
referring to va_arg, but doesn't require printf to be implemented
using it.)

Right (C99 7.15.1.1p2), but that guarantee doesn't apply if the
function is called without a visible prototype.

Calling a variadic function with no visible prototype is undefined
behavior, regardless of the arguments.
 
Christopher Benson-Manica

Steffen Buehler said:
When you want to know to what value a pointer is pointing, the program
in your example takes *four* bytes starting at the address where "abc"

Not quite right; it takes sizeof(int) bytes, which on OP's machine
(and the interviewer's machine) is 4.
 
Christopher Benson-Manica

Rajeet Dalawal said:
Here's what I'd expect to happen: s is a pointer to an array of chars,
but the char it actually points to is 'a'. This has value 0x61. So when
this gets typecast to an integer, the standard promotion char -> int
should mean that *i takes the value 0x61 as well.

This program does what you just described:

#include <stdio.h>

int main(void)
{
char *s = "abc";
int i = *s;
printf("%08x\n", i);
return 0;
}

Note the differences between this and the original program.
 
