Does Casting Slow a Program Down?

PeterOut

If I had code like this.

unsigned short usLimit=10
int a[10], i;

for (i=0; i<(int)usLimit; ++i)
{
a[i]=(int)usLimit;
}

would it run slower than this?

int a[10], i, iLimit=10;

for (i=0; i<iLimit; ++i)
{
a[i]=iLimit;
}

Thanks,
Peter.
 
Zara

If I had code like this.

unsigned short usLimit=10

<<<<< Missing semicolon
int a[10], i;

for (i=0; i<(int)usLimit; ++i)
{
a[i]=(int)usLimit;
}

would it run slower than this?

int a[10], i, iLimit=10;

for (i=0; i<iLimit; ++i)
{
a[i]=iLimit;
}


It depends on the compiler, but the answer should usually be 'No'.

If the compiler detects that usLimit and iLimit are constants, then it
may detect that (int)usLimit is also constant, and it would probably
generate the exact same machine code for both samples.

But this behaviour depends on the optimization capabilities and
settings of the compiler, and on the code surrounding the snippet you
have shown us.

You may help the compiler by declaring either
const unsigned short usLimit=10;
or
const int iLimit=10;


Best regards,

Zara
 
Chris Dollin

PeterOut said:
If I had code like this.

unsigned short usLimit=10
int a[10], i;

for (i=0; i<(int)usLimit; ++i)
{
a[i]=(int)usLimit;
}

would it run slower than this?

int a[10], i, iLimit=10;

for (i=0; i<iLimit; ++i)
{
a[i]=iLimit;
}


It might, but it would have nothing to do with the casts,
since they're both unnecessary -- `usLimit` is
converted from `unsigned short` to `int` regardless.

In any case, the Standard is silent on the issue.
 
Richard Heathfield

PeterOut said:
If I had code like this.

unsigned short usLimit=10
int a[10], i;

for (i=0; i<(int)usLimit; ++i)
{
a[i]=(int)usLimit;
}

would it run slower than this?

int a[10], i, iLimit=10;

for (i=0; i<iLimit; ++i)
{
a[i]=iLimit;
}


To find out, code both versions, put them in separate functions, and call
them each a zillion times. Then check the profiler's output.

Repeat on each relevant implementation. (Don't expect A to be faster than B
on implementation Y just because it was faster on implementation X.)
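
A minimal sketch of such a test harness, for anyone who wants to try it. The function names, the volatile sink and the use of clock() are my own choices for illustration, not anything from the original post, and a real profiler will give better data than this. The limit is passed as a parameter so the compiler cannot simply treat it as a constant.

```c
#include <time.h>

#define N 10

/* Results are stored here so the optimiser cannot discard the calls. */
volatile int sink;

/* Version 1: limit held in an unsigned short, with the explicit casts. */
int fill_with_casts(int *a, unsigned short usLimit)
{
    int i;
    for (i = 0; i < (int)usLimit; ++i)
        a[i] = (int)usLimit;
    return usLimit ? a[0] : 0;
}

/* Version 2: limit held in an int, no casts. */
int fill_without_casts(int *a, int iLimit)
{
    int i;
    for (i = 0; i < iLimit; ++i)
        a[i] = iLimit;
    return iLimit > 0 ? a[0] : 0;
}

/* CPU seconds used by `reps` calls of version 1; limit must not exceed N. */
double time_with_casts(long reps, unsigned short limit)
{
    int a[N];
    long r;
    clock_t start = clock();
    for (r = 0; r < reps; ++r)
        sink = fill_with_casts(a, limit);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}

/* CPU seconds used by `reps` calls of version 2; limit must not exceed N. */
double time_without_casts(long reps, int limit)
{
    int a[N];
    long r;
    clock_t start = clock();
    for (r = 0; r < reps; ++r)
        sink = fill_without_casts(a, limit);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Call time_with_casts(100000000L, 10) and time_without_casts(100000000L, 10) from main and compare the two numbers, repeating the run several times and at each optimisation level you care about. Small differences are more likely to be noise than a real effect.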
 
santosh

PeterOut said:
If I had code like this.

unsigned short usLimit=10
int a[10], i;

for (i=0; i<(int)usLimit; ++i)
{
a[i]=(int)usLimit;
}

would it run slower than this?

int a[10], i, iLimit=10;

for (i=0; i<iLimit; ++i)
{
a[i]=iLimit;
}


Quality of code generation is a quality-of-implementation issue. A good
optimiser might be able to figure out that usLimit is not changed and
hence use a single, cached value. On some hardware, short and int might
be of the same size and representation. In such cases, the cast never
takes up any object code. But speed of execution is not specified by the
standard and no guarantees are made. An assignment statement that takes
ten days to complete is perfectly legal, as per the standard, as long
as all the side effects are resolved before the next sequence point.
 
PeterOut

Zara said:
If the compiler detects that usLimit and iLimit are constants, then it
may detect that (int)usLimit is also constant, and it would probably
generate the exact same machine code for both samples.

But this behaviour depends on the optimization capabilities and
settings of the compiler, and on the code surrounding the snippet you
have shown us.

You may help the compiler by declaring either
const unsigned short usLimit=10;
or
const int iLimit=10;

Thanks very much for your fast reply. I was thinking in general
terms. usLimit/iLimit are given a constant value in the example that
I posted but I was thinking in terms of their being variables that are
determined by input data and/or passed as arguments to a function.
Also, what about casting a (single precision) float to a double or
vice versa? Would that impact the run time or would it depend on the
compiler?

Thanks again,
Peter.
 
Christopher Layne

PeterOut said:
Thanks very much for your fast reply. I was thinking in general
terms. usLimit/iLimit are given a constant value in the example that
I posted but I was thinking in terms of their being variables that are
determined by input data and/or passed as arguments to a function.
Also, what about casting a (single precision) float to a double or
vice versa? Would that impact the run time or would it depend on the
compiler?

Thanks again,
Peter.

If you're using gcc, use the -S option and check the assembler output of both
versions. If you're using another compiler, look up how to produce the
intermediate assembler output (if it has that functionality).
 
santosh

PeterOut said:
Thanks very much for your fast reply. I was thinking in general
terms. usLimit/iLimit are given a constant value in the example that
I posted but I was thinking in terms of their being variables that are
determined by input data and/or passed as arguments to a function.
Also, what about casting a (single precision) float to a double or
vice versa? Would that impact the run time or would it depend on the
compiler?

Casting is an operation in C. It may or may not translate to actual
hardware instructions, depending on the size and bit representation of
the various types. Optimisers can also affect the final object code.
So the only way to find out whether casting incurs a runtime penalty
on a particular implementation, compiled at a particular
optimisation level, is to test and compare the results. On most modern
systems and compilers, it is unlikely to make a difference.

It is more important to ask yourself whether the cast is really needed
than to worry about its runtime performance.
 
Serve Laurijssen

santosh said:
Quality of code generation is a quality-of-implementation issue. A good
optimiser might be able to figure out that usLimit is not changed and
hence use a single, cached value. On some hardware, short and int might
be of the same size and representation. In such cases, the cast never
takes up any object code. But speed of execution is not specified by the
standard and no guarantees are made. An assignment statement that takes
ten days to complete is perfectly legal, as per the standard, as long
as all the side effects are resolved before the next sequence point.

I find the general question that (I think) he's asking interesting.

Can integer casts generate extra instructions?
How about floating point casts?
 
Ben Pfaff

Serve Laurijssen said:
Can integer casts generate extra instructions?

Yes; for example, to zero a register or to move data from a
register used for one size of data to a register used for a
different size of data.

Serve Laurijssen said:
How about floating point casts?

Yes.
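
To see this for a particular compiler, one option is to put each conversion in its own tiny function and read the generated assembler (with gcc, something like gcc -O2 -S conv.c). The file name and function names below are hypothetical, chosen only for this sketch. On x86-64, compilers often emit a zero-extending move for the first function and a dedicated convert instruction for each of the floating point ones, but that is an observation about common targets, not a guarantee.

```c
/* conv.c: one conversion per function, for inspecting generated code. */

/* unsigned short -> int: value-preserving widening. */
int widen_ushort(unsigned short us)
{
    return (int)us;
}

/* int -> unsigned char: reduces the value modulo UCHAR_MAX + 1. */
unsigned char narrow_int(int i)
{
    return (unsigned char)i;
}

/* float -> double: exact, every float value is representable as a double. */
double widen_float(float f)
{
    return (double)f;
}

/* double -> float: may round, since float usually has fewer bits. */
float narrow_double(double d)
{
    return (float)d;
}
```

Note that the casts in the return statements are redundant here; the conversion to the return type would happen anyway. Removing them should not change the generated code.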
 
David T. Ashley

PeterOut said:
If I had code like this.

unsigned short usLimit=10
int a[10], i;

for (i=0; i<(int)usLimit; ++i)
{
a[i]=(int)usLimit;
}

would it run slower than this?

int a[10], i, iLimit=10;

for (i=0; i<iLimit; ++i)
{
a[i]=iLimit;
}


It usually depends on the compiler. Compilers vary greatly in the quality
of the optimizations they can perform.

However, in this particular case, the answer would usually be "NO". I have
not seen a compiler so poor that it would treat the two cases you gave
differently.

I would expect more variability in a case like this:

int x,y;
x = 10;
y = (char)x;

A clever compiler will recognize that with x==10, the cast has no effect. A
less clever compiler would actually truncate to a character, then convert
back to an int.

In your example, most compilers will realize that the cast has no effect.
("Realize" is an inappropriate anthropomorphization of the compiler -- more
precisely, the expression trees it builds and the algorithms it applies to
the trees will weed this out.)
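
A small sketch of why the narrowing cast above cannot always be dropped. It uses unsigned char rather than char (my substitution, not what the post shows), because converting an out-of-range value to a signed char type gives an implementation-defined result, whereas the unsigned case is fully defined. The function name is hypothetical.

```c
#include <limits.h>

/* Round-trip an int through unsigned char and back to int.
 * Values in 0..UCHAR_MAX come back unchanged; anything else is
 * reduced modulo UCHAR_MAX + 1 by the conversion. */
int through_uchar(int x)
{
    return (unsigned char)x;
}
```

With 8-bit chars, through_uchar(10) gives 10 but through_uchar(300) gives 44, so the compiler may only remove the cast when it can prove the value is in range, as it can when x is known to be 10.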
 
Roland Pibinger

PeterOut said:
If I had code like this.

unsigned short usLimit=10
int a[10], i;

for (i=0; i<(int)usLimit; ++i)
{
a[i]=(int)usLimit;
}

would it run slower than this?

int a[10], i, iLimit=10;

for (i=0; i<iLimit; ++i)
{
a[i]=iLimit;
}


It might, but it would have nothing to do with the casts,
since they're both unnecessary -- `usLimit` is
converted from `unsigned short` to `int` regardless.


Right, 'unsigned short' is promoted to 'int' anyway. The casts are
probably in the code because the compiler (hopefully) issues a warning
otherwise. So the real question is whether it makes sense to use
'unsigned short' instead of 'int' in this case.

Best regards,
Roland Pibinger
 
Peter Nilsson

PeterOut said:
If I had code like this.

unsigned short usLimit=10
int a[10], i;

for (i=0; i<(int)usLimit; ++i)
{
a[i]=(int)usLimit;
}

would it run slower than this?

int a[10], i, iLimit=10;

for (i=0; i<iLimit; ++i)
{
a[i]=iLimit;
}


It might, but it would have nothing to do with the casts,
since they're both unnecessary -- `usLimit` is
converted from `unsigned short` to `int` regardless.


No. You'll find there's no shortage of implementations where
unsigned short promotes to unsigned int, not int.

The cast in the assignment is redundant since the conversion
is performed anyway. However, the cast in the condition could
preclude the conversion of i (and usLimit) to unsigned int on
some implementations.

Either way, the code has potential for problems if the limit
is larger than 32767.
 
Walter Roberson

Peter Nilsson said:
You'll find there's no shortage of implementations where
unsigned short promotes to unsigned int, not int.

ANSI X3.159-1989

3.2.1.1 Characters and Integers

A char, a short int, or an int bit-field, or their signed or
unsigned varieties, or an enumeration type, may be used
in any expression wherever an int or unsigned int may be
used. If an int can represent all values of the original type,
the value is converted to an int; otherwise it is converted
to an unsigned int. These are called the integral promotions.
All other arithmetic types are unchanged by the integral
promotions.

The integral promotions preserve value including sign. As
discussed earlier, whether a "plain" char is treated as signed
is implementation-defined.


Hence, implementations can only promote unsigned short to unsigned
int if the signed int cannot hold all the values of the unsigned
short -- which is to say, implementations on which unsigned short
and unsigned int are the same size. On implementations on which
int has more value bits than unsigned short does, the promotion
must be to signed int, not to unsigned int.
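
The rule can be checked on a given implementation with a short sketch like the following. It assumes a C11 compiler, since it relies on _Generic; the macro and function names are made up for illustration.

```c
#include <limits.h>

/* 1 if expr has type int, 0 if unsigned int, -1 for anything else. */
#define IS_INT(expr) _Generic((expr), int: 1, unsigned int: 0, default: -1)

/* What the compiler actually does: unary + applies the integer
 * promotions to its operand, so +us has the promoted type. */
int ushort_promotes_to_int(void)
{
    unsigned short us = 10;
    return IS_INT(+us);
}

/* What the quoted rule predicts: promotion is to int exactly when
 * int can represent every unsigned short value. */
int rule_predicts_int(void)
{
    return USHRT_MAX <= INT_MAX;
}
```

On a conforming implementation the two functions should return the same value. With 16-bit short and 32-bit int both return 1; where short and int have the same width both return 0.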

The only implementations I can think of in which unsigned int
and unsigned short are the same size, are ones that have 16 bit
int -- that or they are DSPs that use the same large size for
short and int and long. Are those the implementations you
were thinking of in your "no shortage" statement?
 
CBFalconer

Peter said:
.... snip ...

No. You'll find there's no shortage of implementations where
unsigned short promotes to unsigned int, not int.

If they do they are not standards compliant. I agree that they
should, but that is neither here nor there.

--
<http://www.cs.auckland.ac.nz/~pgut001/pubs/vista_cost.txt>

"A man who is right every time is not likely to do very much."
-- Francis Crick, co-discover of DNA
"There is nothing more amazing than stupidity in action."
-- Thomas Matthews
 
Keith Thompson

CBFalconer said:
If they do they are not standards compliant. I agree that they
should, but that is neither here nor there.

unsigned short promotes to unsigned int if signed int cannot represent
all possible values of type unsigned short. Consider, for example, a
system where short and int have the same range, as do unsigned short
and unsigned int.

I think you're confusing this with the controversy of "unsigned
preserving" vs. "value preserving" promotion rules. Prior to C89,
some implementations used "unsigned preserving" rules for promotion,
promoting unsigned short to unsigned int regardless of their ranges.
The committee chose to mandate "value preserving" promotion rules.

This is discussed in section 6.3.1.1 of the Rationale
<http://www.open-std.org/jtc1/sc22/wg14/www/C99RationaleV5.10.pdf>.
The choice of value preserving rules is intended to reduce the cases
where an operator has one unsigned int operand and one signed int
operand, yielding a "questionably signed" result.
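
A short illustration of the "questionably signed" case, and of why the cast in the loop condition could matter on an implementation where the limit ends up as unsigned int. The function names are hypothetical and the example uses unsigned int directly so that it behaves the same regardless of how unsigned short promotes.

```c
/* Mixed comparison: the usual arithmetic conversions turn i into
 * unsigned int, so a negative i becomes a very large value. */
int less_mixed(int i, unsigned int limit)
{
    return i < limit;
}

/* Comparison after casting the limit: both operands are int.
 * Assumes limit <= INT_MAX; beyond that the conversion to int is
 * implementation-defined. */
int less_cast(int i, unsigned int limit)
{
    return i < (int)limit;
}
```

less_mixed(-1, 10u) returns 0 because -1 is converted to UINT_MAX before the comparison, while less_cast(-1, 10u) returns 1. For non-negative i the two agree.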
 
Peter Nilsson

Walter Roberson said:
ANSI X3.159-1989

3.2.1.1 Characters and Integers
Hence, implementations can only promote unsigned short to unsigned
int if the signed int cannot hold all the values of the unsigned
short

s/can only/must/

Walter Roberson said:
-- which is to say, implementations on which unsigned short
and unsigned int are the same size. On implementations on which
int has more value bits than unsigned short does, the promotion
must be to signed int, not to unsigned int.

Yes.

Walter Roberson said:
The only implementations I can think of in which unsigned int
and unsigned short are the same size, are ones that have 16 bit
int

Similar 32 and 64-bit machines are plentiful.

Walter Roberson said:
-- that or they are DSPs that use the same large size for
short and int and long. Are those the implementations you
were thinking of in your "no shortage" statement?

Nope. Virtually every high end server box I've used has had an
implementation where short and int are the same size. Note that
gcc is often configurable in that regard.

[cf http://src.opensolaris.org/source/xref/sfw/usr/src/cmd/gcc/gcc-3.4.3/gcc/glimits.h]
 
Keith Thompson

Peter Nilsson said:
Walter Roberson wrote: [...]
The only implementations I can think of in which unsigned int
and unsigned short are the same size, are ones that have 16 bit
int

Similar 32 and 64-bit machines are plentiful.

-- that or they are DSPs that use the same large size for
short and int and long. Are those the implementations you
were thinking of in your "no shortage" statement?

Nope. Virtually every high end server box I've used has had an
implementation where short and int are the same size. Note that
gcc is often configurable in that regard.

Really?

In my experience (which is admittedly far from universal), most 32-bit
and 64-bit systems have 16-bit short and 32-bit int. I don't think
I've ever used a system with 16-bit int (before my time). The *only*
systems I've seen where short and int are the same size are Cray
vector systems, where both are 64 bits.

If int is 32 bits, making short 32 bits as well leaves a "hole" in the
type system; assuming CHAR_BIT==8, there's no 16-bit predefined type.
That's legal, of course, but implementations tend to avoid it.

It's quite common for int and long to be the same size; could that be
what you're thinking of?

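Peter Nilsson said:
[cf http://src.opensolaris.org/source/xref/sfw/usr/src/cmd/gcc/gcc-3.4.3/gcc/glimits.h]

That doesn't show any actual values; it's full of things like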

#define SHRT_MAX __SHRT_MAX__
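
Since a header like that only forwards to compiler built-ins, the quickest way to settle what a given implementation uses is to print the values. A minimal sketch; the function names are my own and the output will of course differ between systems.

```c
#include <limits.h>
#include <stdio.h>

/* Print the widths and limits that decide how unsigned short promotes. */
void print_integer_limits(void)
{
    printf("CHAR_BIT      = %d\n", CHAR_BIT);
    printf("sizeof(short) = %lu\n", (unsigned long)sizeof(short));
    printf("sizeof(int)   = %lu\n", (unsigned long)sizeof(int));
    printf("sizeof(long)  = %lu\n", (unsigned long)sizeof(long));
    printf("SHRT_MAX      = %d\n", SHRT_MAX);
    printf("USHRT_MAX     = %lu\n", (unsigned long)USHRT_MAX);
    printf("INT_MAX       = %d\n", INT_MAX);
    printf("UINT_MAX      = %lu\n", (unsigned long)UINT_MAX);
}

/* 1 if short and int cover the same ranges on this implementation,
 * which is the case in which unsigned short promotes to unsigned int. */
int short_and_int_same_range(void)
{
    return SHRT_MAX == INT_MAX && USHRT_MAX == UINT_MAX;
}
```

Call print_integer_limits() from main on each system in question. The unsigned values are cast to unsigned long for printing so the format string is correct whichever way USHRT_MAX promotes.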
 
