gcc initialization better or extra line of execution

anish singh · Nov 11, 2013

Thanks!

Seebs, first of all I am thankful.You seem to
have got the problem what I was trying to solve.
I am a new user of C google groups and was of
the opinion that: Let's just post it here and
see what happens without thinking through the whole
program and what kind of reactions that would
generate.However that perception has changed.
I am sure I will do better in next post.

#define ERROR 1

Click to expand...

#define OK 0

/*userspace wants us to copy size bytes*/

Click to expand...

int foo(char *buf, int size)

Click to expand...

This makes a huge difference, I'd point out: That the size

of the provided string is precomputed changes all sorts of

things. For instance, consider the earlier versions:

/* 1 */

memcpy(space, buf, strlen(buf));

/* 2 */

memcpy(space, buf, strlen(buf));

buf[strlen(buf)] = '\0';

Here, the second one makes two consecutive calls to strlen()

on the same string, so it's going to need computational time

on the close order of 3x the length of the string. The first

is only making one strlen call, and one memcpy call.

But if we change them to:

/* 1 */

memcpy(space, buf, size);

/* 2 */

memcpy(space, buf, size);

buf[size] = '\0';

Now each is only iterating over the string once. Very different!

{

Click to expand...

char string[128];

Click to expand...

if(size >= 127) {

Click to expand...

return -ERROR;

* what I want here is the null terminated string

Click to expand...

* eventhough the caller of foo has not provided

Click to expand...

* a null terminated string.If size is 126 then

Click to expand...

* I will set 127 byte to 0.

snprintf(string, size+1, "%s", buf);

Click to expand...

printf("%s\n", string);

Click to expand...

return OK;

Click to expand...

}

Click to expand...

My biggest concern about this is that typically, when kernel code

is using a local buffer, there is some possibility that the buffer may

later be exposed to user code. If you don't zero out the whole

buffer, you can be exposing stack contents to user code.

int main()

char *string;

Click to expand...

int size=0;

Click to expand...

string = malloc(256*sizeof(char));

Click to expand...

strcpy(string, "anish"); /*set in userspace*/

Click to expand...

/*set in userspace--anything can happen*/

foo(string, size); /* check the return value*/

strcpy(string, "anishkumar");

Click to expand...

foo(string, size);/* check the return value*/

Click to expand...

return OK;

Is it a good code?

Click to expand...

It makes a little more sense, anyway.

There are still some issues here.

int foo(char *buf, int size)

{

Click to expand...

char string[128];

Click to expand...

if(size >= 127) {

Click to expand...

return -ERROR;

* what I want here is the null terminated string

Click to expand...

* eventhough the caller of foo has not provided

Click to expand...

* a null terminated string.If size is 126 then

Click to expand...

* I will set 127 byte to 0.

snprintf(string, size+1, "%s", buf);

Click to expand...

printf("%s\n", string);

Click to expand...

return OK;

Click to expand...

}

Click to expand...

This has at least a couple of serious potential bugs lurking. The most

obvious is that "int" is a signed type, which means size can be negative.

This came up in code review.Sorry didn't
do that same change here.

A negative size isn't >= 127, but snprintf(string, (-2)+1, "%s", buf);

would be at risk of overrunning the buffer fairly dramatically, unless

the snprintf implementation is helpfully realizing you can't possibly

have meant that.

There's another, perhaps more subtle, bug which is that your test should

probably be ">127", not ">=127". You can copy 127 characters, plus a

null byte, into a 128-byte string.

I didn't know this.My assumption was this:
0 to 127 = 128 bytes
If we write anything in 128 location then
it might do something weird.May be a crash.

That said, I think you are misunderstanding the point of snprintf.

You should ALWAYS specify the amount of space available to you, not

the amount you think you have, for snprintf. So that line should be:

snprintf(string, sizeof(string), "%s", buf);

Great.I didn't know this either.

If you do this, you are guaranteed that you will not overrun "string",

no matter what value size has. So let's see:

int foo(char *buf, int size)

{

char string[128];

int copied;

if (size > (sizeof(string) - 1)) {

return -ERROR;

}

/* use .* to make snprintf stop immediately if it reaches the end

* of the buffer, so if the user hands us a 32kb string, we don't

* spend a lot of time not-copying it to determine how many bytes

* we'd need to copy.

*/

copied = snprintf(string, sizeof(string), "%.*s", sizeof(string), buf);

if (copied != size) {

/* user lied to us: string was shorter

* or longer than expected. */

return -ERROR;

}

printf("%s\n", string);

return OK;

} Thanks!!

That said... I'd still initialize the local variable. I don't know what

you plan to do with string after copying the user's data into it, but if

there is ANY chance that it would ever be part of something returned to

a user function, it should not have uninitialized stack contents in it.

Good point.

You'll note a lot of use of sizeof(string). That's because it is a really

bad habit to rewrite that number. Let's say you start with:

char string[128];

if(size >= 127) {

return -ERROR;

}

Well, somewhere along the line, someone will add an additional test:

char string[128];

if (!buf) {

return -ERROR;

}

if(size >= 127) {

return -ERROR;

}

Now the "127" and "128" aren't adjacent. Then someone else comes along

and says "hang on, we know what's in this string, and it's impossible

for it to ever exceed 60 bytes. Let's leave a little slop value just

in case, but not that much":

char string[80];

if (!buf) {

return -ERROR;

}

if(size >= 127) {

return -ERROR;

}

And now what happens when someone inserts an 87-byte string? *BOOM*.

I guess the key point I want to get at here is:

C does not care what you are trying to illustrate, or what you think is

important, or anything else. C does what you told it to, exactly, whether

or not that is what you want. If you are going to work in C, remember that

C *does not care*. It is not trying to help you. It is not going to catch

your harmless mistakes, it is not going to divine your intent, it is not

going to do what you meant.

It does EXACTLY what you said. Mostly. Unless it can do something totally

different and be sure that, if you are not cheating, you won't be able to

tell. Another good point.

Consider the following code:

int main(void) {

char buf[128];

strcpy(buf, "hello, world!");

printf("%s\n", buf);

return 0;

}

Do you think the code generated by the compiler will include a reference

to strcpy()? I think so.

I'll give you a hint: On the implementation I tried on, there was no

call to either strcpy or printf.

I'll give you two lines of assembly, and see whether you can figure out

why:

Sorry no idea.I tried gcc and it did as below:
3 char buf[128];
4
5 strcpy(buf, "hello, world!");
0x08048477 <+27>: lea 0x1c(%esp),%eax
0x0804847b <+31>: movl $0x6c6c6568,(%eax)
0x08048481 <+37>: movl $0x77202c6f,0x4(%eax)
0x08048488 <+44>: movl $0x646c726f,0x8(%eax)
0x0804848f <+51>: movw $0x21,0xc(%eax)

6
7 printf("%s\n", buf);
0x08048495 <+57>: lea 0x1c(%esp),%eax
0x08048499 <+61>: mov %eax,(%esp)
0x0804849c <+64>: call 0x8048340 <puts@plt>

I can see a call for printf but not for strcpy.
I don't have any remote idea about why?

movabsq $8583909746840200552, %rax ## imm = 0x77202C6F6C6C6568

[...]

callq _puts

What do you think that big number ($85839...) represents? Sorry, no idea.

-s

Well it is a serious issue with big companies

James Kuyper · Nov 11, 2013

int foo(char *buf, int size)
{
char string[128];
if(size >= 127) {
return -ERROR;
}
/*
* what I want here is the null terminated string
* eventhough the caller of foo has not provided
* a null terminated string.If size is 126 then
* I will set 127 byte to 0.
*/
snprintf(string, size+1, "%s", buf);
printf("%s\n", string);
return OK;
}

Click to expand...

....
There's another, perhaps more subtle, bug which is that your test should
probably be ">127", not ">=127". You can copy 127 characters, plus a

null byte, into a 128-byte string.

Click to expand...

I didn't know this.My assumption was this:
0 to 127 = 128 bytes
If we write anything in 128 location then
it might do something weird.May be a crash.

Perfectly true - the technical term from the C standard is "undefined
behavior". But if you made the indicated change to the condition, the
additional case that would be allowed (size==127) would NOT result in
anything being written to ch[128]. snprintf(string, size+1, ...) will
print a maximum of size bytes, followed by a null character, so if
size==127, at most 127 bytes of string will be copied, occupying, at
most, positions ch[0] through ch[126]. The null character will get
written at following position - in the most extreme case, that will be
ch[127], the 128th byte of ch, and therefore still permitted.

....

Consider the following code:
int main(void) {
char buf[128];
strcpy(buf, "hello, world!");
printf("%s\n", buf);
return 0;
}

Do you think the code generated by the compiler will include a reference
to strcpy()?

Click to expand...

I think so.

You're radically underestimating the optimizing capabilities of modern
compilers.

I'll give you a hint: On the implementation I tried on, there was no
call to either strcpy or printf.

I'll give you two lines of assembly, and see whether you can figure out
why:

Click to expand...

Sorry no idea.I tried gcc and it did as below:
3 char buf[128];
4
5 strcpy(buf, "hello, world!");
0x08048477 <+27>: lea 0x1c(%esp),%eax
0x0804847b <+31>: movl $0x6c6c6568,(%eax)
0x08048481 <+37>: movl $0x77202c6f,0x4(%eax)
0x08048488 <+44>: movl $0x646c726f,0x8(%eax)
0x0804848f <+51>: movw $0x21,0xc(%eax)

6
7 printf("%s\n", buf);
0x08048495 <+57>: lea 0x1c(%esp),%eax
0x08048499 <+61>: mov %eax,(%esp)
0x0804849c <+64>: call 0x8048340 <puts@plt>

I can see a call for printf but not for strcpy.
I don't have any remote idea about why?

Because the strcpy() call has been optimized in a series of movl
commands. That's a less extreme optimization than the one Seebs
demonstrated.

movabsq $8583909746840200552, %rax ## imm = 0x77202C6F6C6C6568
[...]
callq _puts

What do you think that big number ($85839...) represents?

Click to expand...

Sorry, no idea.

I haven't bothered to check, but I'd expect it to be the decimal number
corresponding to interpreting the memory containing the string "hello,
world!" as if it were sufficiently long unsigned integer.

Stephen Sprunk · Nov 11, 2013

Sorry no idea.I tried gcc and it did as below:
3 char buf[128];
4
5 strcpy(buf, "hello, world!");
0x08048477 <+27>: lea 0x1c(%esp),%eax
0x0804847b <+31>: movl $0x6c6c6568,(%eax)
0x08048481 <+37>: movl $0x77202c6f,0x4(%eax)
0x08048488 <+44>: movl $0x646c726f,0x8(%eax)
0x0804848f <+51>: movw $0x21,0xc(%eax)

6
7 printf("%s\n", buf);
0x08048495 <+57>: lea 0x1c(%esp),%eax
0x08048499 <+61>: mov %eax,(%esp)
0x0804849c <+64>: call 0x8048340 <puts@plt>

I can see a call for printf but not for strcpy.
I don't have any remote idea about why?

The compiler optimized it away; that's what those movl/movw instructions
are for:

0x6c6c6568 = 'lleh'
0x77202c6f = 'w ,o'
0x646c726f = 'dlro'
0x0021 = '\0!'

Also, the compiler replaced your printf() call with a puts() call, which
is generally much cheaper.

You seem to be seriously underestimating your compiler's ability to
optimize your code. This is a common newbie mistake, especially if
learning from an outdated textbook.

Focus on making your code as clear (to human readers) and correct as
possible; let the compiler worry about making it fast.

S

Seebs · Nov 11, 2013

Seebs, first of all I am thankful.You seem to
have got the problem what I was trying to solve.
I am a new user of C google groups and was of
the opinion that: Let's just post it here and
see what happens without thinking through the whole
program and what kind of reactions that would
generate.However that perception has changed.
I am sure I will do better in next post.
Cool.

I didn't know this.

MHO: If you don't know this, you probably shouldn't be writing kernel
code.

My assumption was this:
0 to 127 = 128 bytes

That is true.

If we write anything in 128 location then
it might do something weird.May be a crash.

That is true.

But if the string is 127 bytes, it will go into locations 0 through
126, leaving [127] for the null.

[my code sample for reference]

int main(void) {
char buf[128];
strcpy(buf, "hello, world!");
printf("%s\n", buf);
return 0;
}
Do you think the code generated by the compiler will include a reference
to strcpy()?

Click to expand...

I think so.

Sorry no idea.I tried gcc and it did as below:
3 char buf[128];
4
5 strcpy(buf, "hello, world!");
0x08048477 <+27>: lea 0x1c(%esp),%eax
0x0804847b <+31>: movl $0x6c6c6568,(%eax)
0x08048481 <+37>: movl $0x77202c6f,0x4(%eax)
0x08048488 <+44>: movl $0x646c726f,0x8(%eax)
0x0804848f <+51>: movw $0x21,0xc(%eax)

6
7 printf("%s\n", buf);
0x08048495 <+57>: lea 0x1c(%esp),%eax
0x08048499 <+61>: mov %eax,(%esp)
0x0804849c <+64>: call 0x8048340 <puts@plt>

I can see a call for printf but not for strcpy.

No, you see a call for puts, not printf. This, it turns out, is also a
big clue as to part of what's happening.

I don't have any remote idea about why?

Look closely at the movl and movew values. What do you think is the
significance of 0x68656c6c?

Contrast:
0x0804847b <+31>: movl $0x6c6c6568,(%eax)
0x08048481 <+37>: movl $0x77202c6f,0x4(%eax)

Notice anything similar between these? I'm betting you're on a 32-bit
system, so you're not using 64-bit values.

Sorry, no idea.

So think! What *might* it be?

What do you know about this program's behavior? What needs to have happened
by the time the code is done executing? Where is the string literal you're
supposedly copying from?

You have two unknowns here:
1. Where is the string literal?
2. What are these numbers?

Maybe you could look at them to see whether they are related in some
way.

If you really want to understand the code the compiler generates, you will
have to get good at figuring out what code does, and why it would be related
to the source it was working from.

-s

Seebs · Nov 11, 2013

On 11/10/2013 05:45 PM, anish singh wrote:
...

I apologize for causing this mess by giving a
lot of messed up code and asking something which
doesn't make sense.So I have decided to drop this
topic of deciding which is better code as it
doesn't make sense.However I have this below code
which i think will answer everyone's query.

#define ERROR 1
#define OK 0
/*userspace wants us to copy size bytes*/
int foo(char *buf, int size)
{
char string[128];
if(size >= 127) {
return -ERROR;
}

Click to expand...

If 1 means "ERROR", as suggested by the macro name, what does -1 mean?
If -1 is your actual error return, why not use

#define ERROR (-1)

I can answer this one!

The Linux kernel has an idiom wherein syscalls that want to "return -1, and
set errno", do so by returning a single value to userspace, which is -errno.
So for instance, instead of:
errno = EFAULT;
return -1;
the kernel code would do:
return -EFAULT;

Then the user-side syscall wrapper would do something like:
if (retval < 0) {
errno = -retval;
retval = -1;
}

The 'size' variable in main() is pretty pointless. This example code
would have been slightly simpler and a little bit clearer by just
directly passing 5 or 6 to foo().

I would guess it was abstracted from code that was using other strings
and computing their length or something.

-s

Keith Thompson · Nov 11, 2013

James Kuyper said:
In the following "you" refers to Anish, not me - right? In context, it
seems obvious, but I ordinarily would have expected the use of the third
person when referring to him in a reply to me.

Yes, sorry I didn't make that clearer.

[...]

Ike Naar · Nov 11, 2013

(snip on strncmp)

When I use strncmp(), I usually follow the call by a store of zero
(null) into the last array element. I suppose strncmp() could have been
designed to do that, but maybe there are some cases where it isn't
needed or wanted.

You mean strncpy?

Phil Carmody · Nov 11, 2013

Seebs said:
So here is the problem:

Click to expand...

Yes, yes it is.

Progarm 'A'
main()
{
char array[128] = {0};
memcpy(array, "testing", strlen("testing"));
}

Click to expand...

This program is wrong.

Progarm 'B'
main()
{
char array[128];
memcpy(array, "testing", strlen("testing"));
array[strlen("testing")] = '\0';
}

Click to expand...

This program is also wrong, and frankly sort of excessively so.

Which one is better program?

Click to expand...

They're both bad.

{
char array[128] = "testing";
}

Less is more...

{
char array[] = "testing";
}

(But of course, whoever said the whole thing is a NOP wins.)

Knowing what the issues are without reference to the assembly output is
not "guessing".

So much in agreement here. If your compiler is generating
significantly different code in order to perform what it knows is
exactly the same task, then your compiler is the problem, not the
source, get one with a better QoI, and conclude nothing from its
output. Good compilers are very good at performing obvious mechanical
optimisations, one should be permitted to guess that the compiler
makes use of all of the information to hand, and uses it wisely.

While I was in the recruiting loop recently, I gave more of a thumbs
up to the candidate who wrote simple dumb code with a comment that
there could be performance improvements than I did to the guy who
dived straight in with a seemingly more optimised routine without
stating what assumptions he was making. (obviously no bugs >
no obvious bugs.)

Phil

Phil Carmody · Nov 11, 2013

Seebs said:
int main()
{
/* I can rewrite in several ways but that is not the purpose here */
char ch[128];
char *string = "anish";
strncpy(ch, string, 5);
ch[6] = '\0';
printf("%s\n", ch);
return 0;
}

Click to expand...

The big distinction between the previous code and this is that
you're hand-entering the value 6 (which is wrong, btw, it should be 5).

It's not obvious why you're using strncpy and a terminator rather than
using memcpy, given that the string already has a terminator. For
instance:
memcpy(ch, string, 6);

Or you could look at:
char ch[128] = "anish";

And I guess my point is: Rewriting it in several ways SHOULD be your
purpose if you want to understand this, because as is, the results you're
seeing are entirely dominated by random coincidence and things which
are in no way part of what you're trying to compare.

As you can see that there is more code generated for initialized
code along with more instructions.From the above disassembled code
looks like initialized code is having more code + more cycles(assuming
each line is executing in one cycle which may not be the case).

Click to expand...

That's not even CLOSE to the case. Most noticably, you're ignoring the
function call. The function call into strncpy

.... might not even exist in the emitted code.

One tiny tweak of the compiler settings or defaults (or version, or ...)
might be the difference the code having an actual call, and not having
one. The output of one compilation run tells you nothing about the
general case. (Which is of course a fairly amorphous entity, which we
can only make guesses about!)

Oh, has anyone said "Don't even think about optimising this function
until you have *measured* it to be taking a significant proportion
of the total runtime?" yet?

Phil

Phil Carmody · Nov 11, 2013

Keith Thompson said:
Many programmers assume that strncpy() is just a "safer" version of
strcpy(). It isn't. It's a bizarre function that can easily leave
the target array without a terminating '\0' (i.e., not containing a
string) if you're not very careful. If you add an explicit check
to ensure that the target is null-terminated, you probably might
as well use strcpy() or memcpy().

http://the-flat-trantor-society.blogspot.com/2012/03/no-strncpy-is-not-safer-strcpy.html

Indeed. I used to call it "the cargo cult of 'n'". Any time I saw the
n (or a volatile object or equivalent), I knew that the patchset would
demand extra scrutiny, and almost certainly have a problem with it.

Phil

Keith Thompson · Nov 11, 2013

glen herrmannsfeldt said:
When I use strncmp(), I usually follow the call by a store of zero
(null) into the last array element. I suppose strncmp() could have been
designed to do that, but maybe there are some cases where it isn't
needed or wanted.

You mean strncpy().

When I use strncpy(), I use something else instead.

This:

strncpy(dest, src, sizeof dest);

can often be replaced by this:

dest[0] = '\0'
strncat(dest, src, sizeof dest);

which actually does what a lot of programmers *assume* strncpy() does.

Seebs · Nov 11, 2013

Less is more...

{
char array[] = "testing";
}

(But of course, whoever said the whole thing is a NOP wins.)

I was assuming that the fixed buffer size, coupled with talk of
initialization, meant that this would be handed to something that
wanted access to 128 bytes of stuff.

So much in agreement here. If your compiler is generating
significantly different code in order to perform what it knows is
exactly the same task, then your compiler is the problem, not the
source, get one with a better QoI, and conclude nothing from its
output. Good compilers are very good at performing obvious mechanical
optimisations, one should be permitted to guess that the compiler
makes use of all of the information to hand, and uses it wisely.

I would argue that that's not a problem with the compiler, it's a
problem with the assumption that the code generated is required
to relate to what you thought you asked for.

-s

Keith Thompson · Nov 11, 2013

Phil Carmody said:
{
char array[128] = "testing";
}

Click to expand...

Less is more...

{
char array[] = "testing";
}

It depends on how the array is being used later. If it needs to hold a
longer string later on, 128 might be a reasonable size. (I don't think
we can make any assumptions based on the code we've seen so far, which
is clearly just a snippet.)

Ken Brody · Nov 11, 2013

So here is the problem:

Progarm 'A'
main()
{
char array[128] = {0};
memcpy(array, "testing", strlen("testing"));
}

Progarm 'B'
main()
{
char array[128];
memcpy(array, "testing", strlen("testing"));
array[strlen("testing")] = '\0';
}

Which one is better program?Definition of better: A program is better if
it has less code size and executes in less cycle.

[...]

They both result in identical output on my compiler:

xor eax, eax
ret 0

Edward A. Falk · Nov 11, 2013

So here is the problem:

Progarm 'A'
main()
{
char array[128] = {0};
memcpy(array, "testing", strlen("testing"));
}

Progarm 'B'
main()
{
char array[128];
memcpy(array, "testing", strlen("testing"));
array[strlen("testing")] = '\0';
}

Ok, first of all, I'm going to assume that in a real-world application,
your input string would be some passed-in argument and just "testing".

Thus:

I would reject this out of hand because you're not confirming
that the input string fits in the array. That alone makes the code
completely unacceptable. If you wrote this on a white board
during an interview, you would not get the job.

But assuming you've already confirmed that the string will fit:

They're both pretty bad in terms of efficiency, and I'd either give you
a poor grade or ding you in code review, depending on circumstances.

In the first case, array[] is dynamically allocated,
which means the compiler will have to clear it at runtime.
(E.g. memset(array,0,sizeof(array).)

If you made array[] static, the first approach would be more efficient,
but poorer practice, since if you wind up using it again, with a shorter
input string, you'll have corrupted results.

The second example is poor because you're computing strlen() twice --
three times, actually, since I'm pretending you checked to see if the
string would fit. You should compute strlen() once and cache the value.

Edward A. Falk · Nov 11, 2013

They both result in identical output on my compiler:

xor eax, eax
ret 0

Yes, we all get that a decent compiler would optimize the whole
thing away. Consider it to be a code fragment and not a complete
program.

Seebs · Nov 11, 2013

Yes, we all get that a decent compiler would optimize the whole
thing away. Consider it to be a code fragment and not a complete
program.

While that much is moderately obvious, it turns out that without knowing
more about the larger program, we can't possibly do anything useful to
analyze the code fragment. In the real code later shown, the size is
passed in, so the calls to strlen() are a red herring, and the call to
strncpy() suddenly becomes relevant, because it really will have to
be called; there's no way to inline it, because the argument isn't a
string literal.

Although it turns out my initial evaluation stands: A reviewer who
complains about the performance cost of the initialization is probably
going to result in code that I would rather the kernel did not contain.

-s

Keith Thompson · Nov 11, 2013

Seebs said:
Less is more...

{
char array[] = "testing";
}

(But of course, whoever said the whole thing is a NOP wins.)

Click to expand...

I was assuming that the fixed buffer size, coupled with talk of
initialization, meant that this would be handed to something that
wanted access to 128 bytes of stuff.

So much in agreement here. If your compiler is generating
significantly different code in order to perform what it knows is
exactly the same task, then your compiler is the problem, not the
source, get one with a better QoI, and conclude nothing from its
output. Good compilers are very good at performing obvious mechanical
optimisations, one should be permitted to guess that the compiler
makes use of all of the information to hand, and uses it wisely.

Click to expand...

I would argue that that's not a problem with the compiler, it's a
problem with the assumption that the code generated is required
to relate to what you thought you asked for.

That depends on just what Phil meant (which I'm sure he can clarify
better than I can). My interpretation is that if a compiler
generates significantly different code given two different chunks
of C code that do exactly the same thing, that could indicate a
compiler problem.

For example, a novice programmer might worry about whether
`arr[0] = 42` is better or worse than `*arr = 42` in terms of the
generated code. A decent compiler, at least when asked to optimize,
should generate identical code for both. I don't much care what
that code looks like, or even whether it's completely identical,
as long as it gets the job done with reasonable efficiency. If the
code generated for one is much better (smaller and/or faster) than
the code generated for the other, then the compiler is missing
an opportunity.

Seebs · Nov 11, 2013

That depends on just what Phil meant (which I'm sure he can clarify
better than I can). My interpretation is that if a compiler
generates significantly different code given two different chunks
of C code that do exactly the same thing, that could indicate a
compiler problem.

Oh. That makes *way* more sense, yes.

For example, a novice programmer might worry about whether
`arr[0] = 42` is better or worse than `*arr = 42` in terms of the
generated code. A decent compiler, at least when asked to optimize,
should generate identical code for both. I don't much care what
that code looks like, or even whether it's completely identical,
as long as it gets the job done with reasonable efficiency. If the
code generated for one is much better (smaller and/or faster) than
the code generated for the other, then the compiler is missing
an opportunity.

That seems likely. And if we apply it to the various sample programs
here, it does make some sense. Of course, one of them doesn't zero out
the rest of the buffer, which I tend to regard as probably a bug, but
it produced "shorter" code, allegedly.

But as noted, if I build programs similar to these, I get much,
much, simpler code with no calls to the obvious functions.

I am sort of curious, now. I know that gcc is smart enough to optimize:
printf("%s\n", foo);
to:
puts(foo);

But let's say you have a slightly longer example:
printf("%s", "hello\n");

Will any compilers optimize this to
puts("hello");
?

-s

Ben Bacarisse · Nov 11, 2013

/*userspace wants us to copy size bytes*/
int foo(char *buf, int size)
{
char string[128];
if(size >= 127) {
return -ERROR;
}
/*
* what I want here is the null terminated string
* eventhough the caller of foo has not provided
* a null terminated string.If size is 126 then
* I will set 127 byte to 0.
*/
snprintf(string, size+1, "%s", buf);

This code seems to be been accepted as a solution, but it looks wrong to
me. The OP has stated that buf may not contain a null-terminated
string. In such a case, limiting the number of output bytes in the
snprintf call is not enough to prevent disaster. The %s format can, and
probably will, scan buf until a null byte is found. You certainly can't
assume it won't. For one thing, snprintf returns the number of bytes
that would have been needed had the output not been limited.

Solutions include using %.*s and passing snprintf a length limit for the
format, and using memcpy which just looks like the right way to do it
(and always has).

Adding adressing of IPv6 to program	1	Feb 16, 2023
Weird Behavior with Rays in C and OpenGL	4	Feb 13, 2024
help: gcc compilation difference	25	Feb 24, 2010
Remove extra blanks	34	Oct 15, 2006
Initialization of variable length arrays	3	Jun 15, 2007
Anything in the language to support better recognition of vector operations?	5	Nov 8, 2010
Help in understanding initialization of data members of class (esp. static)	6	Apr 3, 2013
In Search for a better Algorithm	10	Sep 28, 2007

gcc initialization better or extra line of execution

anish singh

James Kuyper

Stephen Sprunk

Seebs

Seebs

Keith Thompson

Ike Naar

Phil Carmody

Phil Carmody

Phil Carmody

Keith Thompson

Seebs

Keith Thompson

Ken Brody

Edward A. Falk

Edward A. Falk

Seebs

Keith Thompson

Seebs

Ben Bacarisse

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads