Problem with "restrict"

diegotorquemada · Mar 9, 2013

Hello all!

I am trying to learn to use the reserved word restrict. However, it seems that it is not working. The code with and without that keyword have the same size and running time, even though I am inducing pointer aliasing.

Can somebody please check my minimal example?

Thanks and kind regards!

Diego

/*
gcc -std=c99 -O3 -Wall -D USE_RESTRICT -o 08_restrict 08_restrict.c
ls -l
time -p ./08_restrict
*/

#include<stdio.h>
#include<stdlib.h>

#ifdef USE_RESTRICT
#else
#define restrict
#endif

const int N = 100000;

int *restrict x, *restrict y, *restrict z;

void fun(int *restrict a, int *restrict b, int *restrict c, int n);

int main()
{
x = malloc(N * sizeof(int));
y = malloc(N * sizeof(int));
z = malloc(N * sizeof(int));
if (x==NULL || y==NULL || z==NULL)
{
fprintf(stderr, "Error with malloc()!\n");
return EXIT_FAILURE;
}

// run this several times in order to have some average time
for (int i=0; i<10000; i++)
{
for (int j=0; j<N; j++) x[j] = y[j] = z[j] = j;
fun(x, y, z, N);
// fun(x, x, x, N);
}

free(x);
free(y);
free(z);

return EXIT_SUCCESS;
}

void fun(int *restrict a, int *restrict b, int *restrict c, int n)
{
for (int i=n; i<n; i++)
{
// The compiler has to read c in the second line because
// it may happen that either b points to a or c.
b = b + c;
a = a + b*c;
}

return;
}

Ike Naar · Mar 9, 2013

void fun(int *restrict a, int *restrict b, int *restrict c, int n)
{
for (int i=n; i<n; i++)

Did you really mean 'i=n' here, or should that be 'i=0' ?
As written, the loop body is never executed because the initial condition
'i==n' implies the termination condition 'i>=n'.

{
// The compiler has to read c in the second line because
// it may happen that either b points to a or c.
b = b + c;
a = a + b*c;
}

return;

This statement is harmless but redundant.

diegotorquemada · Mar 10, 2013

Did you really mean 'i=n' here, or should that be 'i=0' ?

Ups Ike, I didn't see that mistake. However correcting to i=0, the code still does not work.

As written, the loop body is never executed because the initial condition

'i==n' implies the termination condition 'i>=n'.

{

Click to expand...

// The compiler has to read c in the second line because

Click to expand...

// it may happen that either b points to a or c.

Click to expand...

b = b + c;

Click to expand...

a = a + b*c;

}

Click to expand...

return;

Click to expand...

This statement is harmless but redundant.

}

Click to expand...

Shao Miller · Mar 10, 2013

This statement is harmless but redundant.

It can also be handy as a habit if you ever wish to override 'return'
with a macro, so that something special can happen at each place it
appears... Same with a 'continue' as an entire loop body or just before
the end of a loop body's compound statement.

while (stuff())
continue;

while (stuff())
{
continue;
}

Joe Pfeiffer · Mar 10, 2013

void fun(int *restrict a, int *restrict b, int *restrict c, int n)
{
for (int i=n; i<n; i++)
{

I hadn't encountered the restrict keyword before, but my understanding
after a bit of reading is that in using it you are promising that a, b,
and c *won't* point to the same object.

// The compiler has to read c in the second line because
// it may happen that either b points to a or c.
b = b + c;
a = a + b*c;
}

return;
}

James Kuyper · Mar 10, 2013

void fun(int *restrict a, int *restrict b, int *restrict c, int n)
{
for (int i=n; i<n; i++)
{

Click to expand...

I hadn't encountered the restrict keyword before, but my understanding
after a bit of reading is that in using it you are promising that a, b,
and c *won't* point to the same object.

// The compiler has to read c in the second line because
// it may happen that either b points to a or c.
b = b + c;
a = a + b*c;
}

return;
}

Click to expand...

The rules that apply to restrict-qualified pointers are substantially
more complicated, and I could easily construct examples where your
description would be wrong. However, you've got the right basic idea.
The point is, that because 'restrict' makes that promise, it enables a
compiler to optimize the code on the assumption that the promise will
not be broken. This only matters if there is such an optimization
possible. In this case, there is. If c will never refer to the same
object as b, then it's value need only be retrieved once per pass
through the loop; without 'restrict', it would have to be retrieved twice.
Note, however, that 'restrict' doesn't mandate performance of such
optimizations, it only allows them.

Öö Tiib · Mar 10, 2013

I am trying to learn to use the reserved word restrict. However, it seems
that it is not working. The code with and without that keyword have the
same size and running time, even though I am inducing pointer aliasing.

The example does not contain function without restrict.

The function with restrict is equivalent to nop on most optimizing compilers
so the size is ZERO I presume?

There are no code that measures anything. There are no compiler
options in example.

Those few things are usually done wrongly when someone
"measures performance" and gets "seemingly surprising results".
So I bet it is between chair and keyboard what is not working.

88888 Dihedral · Mar 10, 2013

(e-mail address removed)æ–¼ 2013å¹´3æœˆ10æ—¥æ˜ŸæœŸæ—¥UTC+8ä¸Šåˆ7æ™‚11åˆ†30ç§’å¯«é“ï¼š

Hello all!

I am trying to learn to use the reserved word restrict. However, it seemsthat it is not working. The code with and without that keyword have the same size and running time, even though I am inducing pointer aliasing.

Can somebody please check my minimal example?

Thanks and kind regards!

Diego

/*

gcc -std=c99 -O3 -Wall -D USE_RESTRICT -o 08_restrict 08_restrict.c

ls -l

time -p ./08_restrict

*/

#include<stdio.h>

#include<stdlib.h>

#ifdef USE_RESTRICT

#else

#define restrict

#endif

const int N = 100000;

int *restrict x, *restrict y, *restrict z;

void fun(int *restrict a, int *restrict b, int *restrict c, int n);

int main()

{

x = malloc(N * sizeof(int));

y = malloc(N * sizeof(int));

z = malloc(N * sizeof(int));

if (x==NULL || y==NULL || z==NULL)

{

fprintf(stderr, "Error with malloc()!\n");

return EXIT_FAILURE;

}

// run this several times in order to have some average time

for (int i=0; i<10000; i++)

{

for (int j=0; j<N; j++) x[j] = y[j] = z[j] = j;

fun(x, y, z, N);

// fun(x, x, x, N);

}

free(x);

free(y);

free(z);

return EXIT_SUCCESS;

//====
If you are programming short scripts , then nothing does matter
in your way of EXITING the real time running conditions.

But if you are programming for a library to be used by others in this
way, then the story might be quite different in your career.

}

void fun(int *restrict a, int *restrict b, int *restrict c, int n)

{

for (int i=n; i<n; i++)

{

// The compiler has to read c in the second line because

// it may happen that either b points to a or c.

b = b + c;

a = a + b*c;

}

return;

}

Keith Thompson · Mar 11, 2013

I am trying to learn to use the reserved word restrict. However, it
seems that it is not working. The code with and without that keyword
have the same size and running time, even though I am inducing pointer
aliasing.

Can somebody please check my minimal example?

[snip]

The fact that you get the same code with and without restrict doesn't
mean much.

The use of the "restrict" keyword is a promise (from the programmer
to the compiler) that certain restrictions are not violated.
The compiler is permitted, but not required, to take advantage
of that promise to enable optimizations. An implementation that
ignores "restrict" entirely (other than checking its syntax)
could be conforming. A program that uses "restrict" but violates
the restrictions is in effect lying to the compiler, making the
program's behavior undefined.

The "restrict" keyword can make a program's behavior undefined when it
would be well defined without it. That's pretty much all it does. That
has the side effect of enabling optimizations.

Noob · Mar 11, 2013

diego said:
I am trying to learn to use the reserved word restrict. However, it
seems that it is not working. The code with and without that keyword
have the same size and running time, even though I am inducing
pointer aliasing.

As Robert said, it's best to look at the generated code.

$ cat restrict.c
#ifndef USE_RESTRICT
#define restrict
#endif

void fun(int *restrict a, int *restrict b, int *restrict c)
{
*a = *c;
*b = *c;
}
$ gcc -std=c99 -pedantic -Wall -Wextra -O3 -S restrict.c -o v1.s
$ gcc -std=c99 -pedantic -Wall -Wextra -O3 -S -DUSE_RESTRICT restrict.c -o v2.s

Without restrict:
movl 12(%esp), %eax ; eax <- c
movl 4(%esp), %edx ; edx <- a
movl (%eax), %ecx ; ecx <- *c
movl %ecx, (%edx) ; *a = *c
movl (%eax), %edx ; edx <- *c
movl 8(%esp), %eax ; eax <- b
movl %edx, (%eax) ; *b = *c
ret

NB: *c is loaded twice.

With restrict:
movl 12(%esp), %eax ; eax <- c
movl 4(%esp), %edx ; edx <- a
movl (%eax), %eax ; eax <- *c
movl %eax, (%edx) ; *a = *c
movl 8(%esp), %edx ; edx <- b
movl %eax, (%edx) ; *b = *c
ret

NB: *c is loaded only once.

Regards.

Edward A. Falk · Mar 11, 2013

That's the "problem" here. The compiler knows that "malloc" returns
non-overlapping memory areas (unless it returns 0), and because it has
the definition of "fun" on hand it can put two and two together and see
that there are no possible overlaps between the arrays, and thus
generate better code using that assumption.

Try putting the definition of "fun" in an external C file and compiling
the two files separately.

Since fun() is a global, the compiler can't assume that main() is the only
thing that ever calls it, so I don't think the compiler can make that
particular assumption/optimization.

At this point, I can only assume either that the compiler isn't making
an optimization w.r.t. c that it could be making, or that the
optimization isn't making enough of a difference time-wise to be
detectable to OP.

We really need to see the generated code at this point to be sure.

Edward A. Falk · Mar 11, 2013

"fun" is not a nop. The compiler may, however, figure out that the
entire code ("main" and "fun") does nothing.

D'oh! Of course. I'll bet the entire outer loop was optimized
out of existance. If the compiler knows that fun() has no
side effects, it could do that.

OP, try adding a function call at the end of main() that
references your inputs. That would force the compiler to
keep the outer loop.

James Kuyper · Mar 11, 2013

On 03/11/2013 12:20 PM, David Brown wrote:
....

The compiler may still make some assumptions and optimisations here.
First, if the compiler knows that the function cannot be accessed
externally, then it can be treated as "static". ...

In the context of function definitions, "static" means "cannot be
accessed externally (by name)" (a pointer to a function with internal
linkage can be passed to code in a different translation unit, and
dereferenced there, which is the reason for the "by name" exception). If
a function is not actually static, it can be accessed externally, so it
seems to me that your exception would never apply. Could you give an
example of a context where it would?

... I don't know the
details of Linux compilation here (most of my work is embedded systems),
but if there is no way for anything to access the "fun" function as a
sort of shared library, then since the compiler knows it is generating
an executable rather than a linkable object file, it could make that
assumption and then do the alias analysis.

If some aspects of code generation are delayed until link time, the
linker can know that a function with external linkage is only used in
contexts that enable that optimization. Otherwise, I don't see any way
for that to work.

A function which doesn't have any objects with static storage duration,
and which doesn't access any global variables can be in-lined, whether
or not it's declared inline, and such optimizations could be performed
on the in-lined version. But unless the function has internal linkage,
there must also be an external definition, for which such optimization
cannot be performed until all translation units that make up the program
have been processed and are being linked together.

James Kuyper · Mar 13, 2013

In this case, the compiler could have optimised the second load away,
even without restrict. Even if a and c are the same pointer, it makes
no difference whether the same value is reloaded or not. If you used
non-aligned pointer so writing to *a changes some bytes of *c, then
you have undefined behaviour.

a, b, and c all have the type int*, and therefore must all be correctly
aligned for an int. Where do unaligned pointers enter into this?

Diego's original code sets b to a different value than c, so if
b==c, it makes a big difference whether c is retrieved a second time
for it's second usage (as required without 'restrict') or only once (as
allowed by the use of 'restrict'). Noob's simplified version not only
reorders the assignments, so that it's now an issue whether a==c, rather
than b==c; he also simplified it by setting *a to the same value as *c.
As a result, it doesn't make any difference whether a==c, thereby
missing the whole point of Diego's code.

Noob · Mar 13, 2013

James said:
As a result, it doesn't make any difference whether a==c,
thereby missing the whole point of Diego's code.

I most certainly DID NOT "miss the whole point of Diego's code"
as you so undiplomatically put it.

I was just trying to illustrate the effect of the restrict
keyword, using as simple an example as possible. Obviously,
I over-simplified.

James Kuyper · Mar 13, 2013

I most certainly DID NOT "miss the whole point of Diego's code"
as you so undiplomatically put it.

I was just trying to illustrate the effect of the restrict
keyword, using as simple an example as possible. Obviously,
I over-simplified.

Illustrating the effect of the restrict keyword requires writing code
where it makes a difference whether or not 'restrict' is present. What I
so undiplomatically pointed out was the fact that you over-simplified
the code because you missed the importance of one feature of the
original code that was necessary in order to make 'restrict' relevant.

If you can suggest a more diplomatic way I could have used to describe
the problem, I'm willing to learn. However, it seems to me that any
annoyance I caused you was inherent in what I was saying; nothing I
could do to sugar-coat my description of the problem would change that.

Noob · Mar 13, 2013

James said:
Illustrating the effect of the restrict keyword requires writing code
where it makes a difference whether or not 'restrict' is present.

Please note that the presence/absence of 'restrict' in my
(admittedly over-simplified) example did make a difference,
as far as gcc-4.6.3 -O3 was concerned.

But thanks to you and Christian for pointing out that an
aggressively optimizing compiler may determine that after
*a = *c, *a and *c must have the same contents, therefore
it is not necessary to reload *c.

In hindsight, I should have commented on

void fun(int *restrict a, int *restrict b, int *restrict c)
{
*a = *c+1;
*b = *c+1;
}

diegotorquemada · Mar 13, 2013

Hello all!

I am trying to learn to use the reserved word restrict. However, it seems that it is not working. The code with and without that keyword have the same size and running time, even though I am inducing pointer aliasing.

Can somebody please check my minimal example?

Thanks and kind regards!

Diego

Probably my example was not the best because, from the discussion I deduce that the compiler was "clever" enough to find the pointer alignment I was inducing in the code.

Does somebody has any other example with the restrict keyword that could be helpful in order to see the power of that reserved word?

Thanks,

Diego

Shao Miller · Mar 13, 2013

Probably my example was not the best because, from the discussion I deduce that the compiler was "clever" enough to find the pointer alignment I was inducing in the code.

Does somebody has any other example with the restrict keyword that could be helpful in order to see the power of that reserved word?

Maybe have a read of this:

http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1520.htm

With particular attention to the examples.

Edward A. Falk · Mar 14, 2013

The compiler can make an inlined copy of fun () and optimize that.

Hmmm; very good point. (I've often wondered what the compiler
does with an "inline" function that isn't declared static.)

That brings back to the same conclusion: we need to see the
generated code.

Obviously, this is all an exercise in pedantry anyway; knowing
what one compiler did in one particular case doesn't really
teach us much about programming in C.

Could you fix my code please? (i get no output after inputing the number)	1	Oct 16, 2023
Fibonacci	0	May 13, 2023
Where is my mistake? Why is s equal to minus infinity at some loop iterations?	0	Oct 9, 2022
SENTINEL CONTROL LOOP WHEN DEALING WITH TWO ARRAYS	1	Oct 26, 2023
Need help in debugging tic tac toe (Beginner)	0	Jun 28, 2023
Need help in debugging tic tac toe (beginner)	2	Jun 28, 2023
Adding adressing of IPv6 to program	1	Feb 16, 2023
Programming math challenge gives wrong answer	2	Aug 6, 2023

Problem with "restrict"

diegotorquemada

Ike Naar

diegotorquemada

Shao Miller

Joe Pfeiffer

James Kuyper

Öö Tiib

88888 Dihedral

Keith Thompson

Noob

Edward A. Falk

Edward A. Falk

James Kuyper

James Kuyper

Noob

James Kuyper

Noob

diegotorquemada

Shao Miller

Edward A. Falk

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads