C performance

  • Thread starter Papadopoulos Giannis

Mark McIntyre

(misc stuff about x86 assembler, etc. )

Remind me what the hell this has to do with C?
 

pete

Christian said:
Tim Prince said:
[...] All the more reason for using a compiler which can
take portable C and choose the best instruction for the target architecture.
The OP assertion, that ++i could be more efficient (when used in subscript
context), was true of gcc on the Mac I once had.

Well ok, then the Mac port of the gcc compiler sucks ass -- but it
also means that the underlying PPC must be somewhat weak not to make
this irrelevant, which I am not sure I believe.

Not for ++i vs. i++, but for *++p vs. *p++ it made a difference:

*++p is semantically different from *p++

There's no way that the evaluation of those two expressions in code,
could generate the same machine instructions in translation.
 

Peter Nilsson

pete said:
Christian said:
[...] All the more reason for using a compiler which can
take portable C and choose the best instruction for the target architecture.
The OP assertion, that ++i could be more efficient (when used in subscript
context), was true of gcc on the Mac I once had.

Well ok, then the Mac port of the gcc compiler sucks ass -- but it
also means that the underlying PPC must be somewhat weak not to make
this irrelevant, which I am not sure I believe.

Not for ++i vs. i++, but for *++p vs. *p++ it made a difference:

*++p is semantically different from *p++

There's no way that the evaluation of those two expressions in code,
could generate the same machine instructions in translation.

I think what Christian Bau is talking about is the difference between the
following...

char *strcpy(char *s, const char *t)
{
    char *p = s;
    while (*p++ = *t++);
    return s;
}

char *strcpy(char *s, const char *t)
{
    char *p = s; /* more usual is: char *p = s - 1; */
    if (*p = *t) /* t--; */
        while (*++p = *++t);
    return s;
}

These two are semantically the same (hopefully! ;-), but you could (and
will) observe different optimisations from compilers targeting different
architectures. The top version targets 680x0, the bottom targets Power PC.
The former has fast post-increment, the latter has fast pre-increment.

Of course, you could argue that I should get a better optimising compiler
when needed, but these are quite simple cases. More difficult challenges for
a compiler are not too hard to come up with.
 

pete

Peter said:
pete said:
Christian said:
[...] All the more reason for using a compiler which can
take portable C and choose the best instruction for the target architecture.
The OP assertion, that ++i could be more efficient (when used in subscript
context), was true of gcc on the Mac I once had.

Well ok, then the Mac port of the gcc compiler sucks ass -- but it
also means that the underlying PPC must be somewhat weak not to make
this irrelevant, which I am not sure I believe.

Not for ++i vs. i++, but for *++p vs. *p++ it made a difference:

*++p is semantically different from *p++

There's no way that the evaluation of those two expressions in code,
could generate the same machine instructions in translation.

I think what Christian Bau is talking about is the difference between the
following...

char *strcpy(char *s, const char *t)
{
    char *p = s;
    while (*p++ = *t++);
    return s;
}

char *strcpy(char *s, const char *t)
{
    char *p = s; /* more usual is: char *p = s - 1; */
    if (*p = *t) /* t--; */
        while (*++p = *++t);
    return s;
}

These two are semantically the same (hopefully! ;-),

The functions are the same, but I think that it
would be asking a lot from a compiler, to see that.
If the values of p and t were supposed to be meaningful
after the loop, it would be different.
The loop semantics are not the same.
When t points to a zero length string,
the top version will increment and the bottom one won't.

In this version of strncpy, the bottom loop is my preferred
method of writing a loop that will execute as many times
as the initial value of n, when I'm not counting clock ticks.
But after the top loop executes,
I need n to represent the number of times still left to go,
so I can't write while(n-- && *s2 != '\0'), there.

char *strncpy(char *s1, const char *s2, size_t n)
{
    char *const p1 = s1;

    while (n != 0 && *s2 != '\0') {
        *s1++ = *s2++;
        --n;
    }
    while (n--) {
        *s1++ = '\0';
    }
    return p1;
}
 

Christian Bau

pete said:
In this version of strncpy, the bottom loop is my preferred
method of writing a loop that will execute as many times
as the initial value of n, when I'm not counting clock ticks.
But after the top loop executes,
I need n to represent the number of times still left to go,
so I can't write while(n-- && *s2 != '\0'), there.

char *strncpy(char *s1, const char *s2, size_t n)
{
    char *const p1 = s1;

    while (n != 0 && *s2 != '\0') {
        *s1++ = *s2++;
        --n;
    }
    while (n--) {
        *s1++ = '\0';
    }
    return p1;
}

The second loop would be an example of making your code unreadable in
the hope of saving a few nanoseconds (without success, for many
compilers). Why not

for (; n > 0; --n)
    *s1++ = '\0';
 

pete

Christian said:
The second loop would be an example of making your code unreadable
in the hope of saving a few nanoseconds

How do you figure there's a hope of saving a few nanoseconds ?

while (n--){;}
is easy for me to recognize
as a loop that's supposed to execute n times.
That's why I like it.
 

Nick

pete said:
Christian Bau wrote:



How do you figure there's a hope of saving a few nanoseconds ?

while (n--){;}
is easy for me to recognize
as a loop that's supposed to execute n times.
That's why I like it.

Unless there's a compelling reason to null out the remainder of s1,
wouldn't a single *s1 = '\0' suffice to null-terminate the string,
instead of the while (n--) loop?

I'm not sure what the spec for strncpy() says regarding whether a single
null is acceptable or whether the remainder of the string should be nulled out.

Nick L.

BTW - I also like the while (n--) construct, but that's because
in m680X0 assembler it was implemented as a single instruction
more or less.
 

Peter Nilsson

....
Unless there's a compelling reason to null out the remainder of s1,
a single *s1 = '\0' would suffice to null terminate the string,

The 'compelling reason' is supplied by both standards' specification of
strncpy.
 

pete

The second loop would be an example of making your code
unreadable in the hope of saving a few nanoseconds
(without success, for many compilers). Why not

for (; n > 0; --n)
    *s1++ = '\0';

As a general rule, I don't like using relational operators
to compare size_t objects against zero constants.

The only reason that I write library functions in C code,
is so that I can post examples to this newsgroup,
without having to explain what they're supposed to do.
 
