Problem with "restrict"

  • Thread starter diegotorquemada
  • Start date

D

diegotorquemada

Hello all!

I am trying to learn to use the reserved word restrict. However, it seems that it is not working. The code with and without that keyword have the same size and running time, even though I am inducing pointer aliasing.

Can somebody please check my minimal example?

Thanks and kind regards!

Diego

/*
gcc -std=c99 -O3 -Wall -D USE_RESTRICT -o 08_restrict 08_restrict.c
ls -l
time -p ./08_restrict
*/

#include<stdio.h>
#include<stdlib.h>

#ifdef USE_RESTRICT
#else
#define restrict
#endif

const int N = 100000;

int *restrict x, *restrict y, *restrict z;

void fun(int *restrict a, int *restrict b, int *restrict c, int n);

int main()
{
x = malloc(N * sizeof(int));
y = malloc(N * sizeof(int));
z = malloc(N * sizeof(int));
if (x==NULL || y==NULL || z==NULL)
{
fprintf(stderr, "Error with malloc()!\n");
return EXIT_FAILURE;
}

// run this several times in order to have some average time
for (int i=0; i<10000; i++)
{
for (int j=0; j<N; j++) x[j] = y[j] = z[j] = j;
fun(x, y, z, N);
// fun(x, x, x, N);
}

free(x);
free(y);
free(z);

return EXIT_SUCCESS;
}

void fun(int *restrict a, int *restrict b, int *restrict c, int n)
{
for (int i=n; i<n; i++)
{
// The compiler has to read c in the second line because
// it may happen that either b points to a or c.
b = b + c;
a = a + b*c;
}

return;
}
 
Ad

Advertisements

I

Ike Naar

void fun(int *restrict a, int *restrict b, int *restrict c, int n)
{
for (int i=n; i<n; i++)

Did you really mean 'i=n' here, or should that be 'i=0' ?
As written, the loop body is never executed because the initial condition
'i==n' implies the termination condition 'i>=n'.
{
// The compiler has to read c in the second line because
// it may happen that either b points to a or c.
b = b + c;
a = a + b*c;
}

return;


This statement is harmless but redundant.
 
D

diegotorquemada

Did you really mean 'i=n' here, or should that be 'i=0' ?
Ups Ike, I didn't see that mistake. However correcting to i=0, the code still does not work.
As written, the loop body is never executed because the initial condition

'i==n' implies the termination condition 'i>=n'.


// The compiler has to read c in the second line because
// it may happen that either b points to a or c.
b = b + c;

a = a + b*c;

return;




This statement is harmless but redundant.


 
S

Shao Miller

This statement is harmless but redundant.

It can also be handy as a habit if you ever wish to override 'return'
with a macro, so that something special can happen at each place it
appears... Same with a 'continue' as an entire loop body or just before
the end of a loop body's compound statement.

while (stuff())
continue;

while (stuff())
{
continue;
}
 
J

Joe Pfeiffer

void fun(int *restrict a, int *restrict b, int *restrict c, int n)
{
for (int i=n; i<n; i++)
{

I hadn't encountered the restrict keyword before, but my understanding
after a bit of reading is that in using it you are promising that a, b,
and c *won't* point to the same object.
// The compiler has to read c in the second line because
// it may happen that either b points to a or c.
b = b + c;
a = a + b*c;
}

return;
}
 
J

James Kuyper

void fun(int *restrict a, int *restrict b, int *restrict c, int n)
{
for (int i=n; i<n; i++)
{

I hadn't encountered the restrict keyword before, but my understanding
after a bit of reading is that in using it you are promising that a, b,
and c *won't* point to the same object.
// The compiler has to read c in the second line because
// it may happen that either b points to a or c.
b = b + c;
a = a + b*c;
}

return;
}


The rules that apply to restrict-qualified pointers are substantially
more complicated, and I could easily construct examples where your
description would be wrong. However, you've got the right basic idea.
The point is, that because 'restrict' makes that promise, it enables a
compiler to optimize the code on the assumption that the promise will
not be broken. This only matters if there is such an optimization
possible. In this case, there is. If c will never refer to the same
object as b, then it's value need only be retrieved once per pass
through the loop; without 'restrict', it would have to be retrieved twice.
Note, however, that 'restrict' doesn't mandate performance of such
optimizations, it only allows them.
 
Ad

Advertisements

Ö

Öö Tiib

I am trying to learn to use the reserved word restrict. However, it seems
that it is not working. The code with and without that keyword have the
same size and running time, even though I am inducing pointer aliasing.

The example does not contain function without restrict.

The function with restrict is equivalent to nop on most optimizing compilers
so the size is ZERO I presume?

There are no code that measures anything. There are no compiler
options in example.

Those few things are usually done wrongly when someone
"measures performance" and gets "seemingly surprising results".
So I bet it is between chair and keyboard what is not working.
 
8

88888 Dihedral

(e-mail address removed)æ–¼ 2013å¹´3月10日星期日UTC+8上åˆ7時11分30秒寫é“:
Hello all!



I am trying to learn to use the reserved word restrict. However, it seemsthat it is not working. The code with and without that keyword have the same size and running time, even though I am inducing pointer aliasing.



Can somebody please check my minimal example?



Thanks and kind regards!



Diego



/*

gcc -std=c99 -O3 -Wall -D USE_RESTRICT -o 08_restrict 08_restrict.c

ls -l

time -p ./08_restrict

*/



#include<stdio.h>

#include<stdlib.h>



#ifdef USE_RESTRICT

#else

#define restrict

#endif



const int N = 100000;



int *restrict x, *restrict y, *restrict z;



void fun(int *restrict a, int *restrict b, int *restrict c, int n);



int main()

{

x = malloc(N * sizeof(int));

y = malloc(N * sizeof(int));

z = malloc(N * sizeof(int));

if (x==NULL || y==NULL || z==NULL)

{

fprintf(stderr, "Error with malloc()!\n");

return EXIT_FAILURE;

}



// run this several times in order to have some average time

for (int i=0; i<10000; i++)

{

for (int j=0; j<N; j++) x[j] = y[j] = z[j] = j;

fun(x, y, z, N);

// fun(x, x, x, N);

}



free(x);

free(y);

free(z);



return EXIT_SUCCESS;

//====
If you are programming short scripts , then nothing does matter
in your way of EXITING the real time running conditions.


But if you are programming for a library to be used by others in this
way, then the story might be quite different in your career.


}



void fun(int *restrict a, int *restrict b, int *restrict c, int n)

{

for (int i=n; i<n; i++)

{

// The compiler has to read c in the second line because

// it may happen that either b points to a or c.

b = b + c;

a = a + b*c;

}



return;

}
 
K

Keith Thompson

I am trying to learn to use the reserved word restrict. However, it
seems that it is not working. The code with and without that keyword
have the same size and running time, even though I am inducing pointer
aliasing.

Can somebody please check my minimal example?
[snip]

The fact that you get the same code with and without restrict doesn't
mean much.

The use of the "restrict" keyword is a promise (from the programmer
to the compiler) that certain restrictions are not violated.
The compiler is permitted, but not required, to take advantage
of that promise to enable optimizations. An implementation that
ignores "restrict" entirely (other than checking its syntax)
could be conforming. A program that uses "restrict" but violates
the restrictions is in effect lying to the compiler, making the
program's behavior undefined.

The "restrict" keyword can make a program's behavior undefined when it
would be well defined without it. That's pretty much all it does. That
has the side effect of enabling optimizations.
 
N

Noob

diego said:
I am trying to learn to use the reserved word restrict. However, it
seems that it is not working. The code with and without that keyword
have the same size and running time, even though I am inducing
pointer aliasing.

As Robert said, it's best to look at the generated code.

$ cat restrict.c
#ifndef USE_RESTRICT
#define restrict
#endif

void fun(int *restrict a, int *restrict b, int *restrict c)
{
*a = *c;
*b = *c;
}
$ gcc -std=c99 -pedantic -Wall -Wextra -O3 -S restrict.c -o v1.s
$ gcc -std=c99 -pedantic -Wall -Wextra -O3 -S -DUSE_RESTRICT restrict.c -o v2.s

Without restrict:
movl 12(%esp), %eax ; eax <- c
movl 4(%esp), %edx ; edx <- a
movl (%eax), %ecx ; ecx <- *c
movl %ecx, (%edx) ; *a = *c
movl (%eax), %edx ; edx <- *c
movl 8(%esp), %eax ; eax <- b
movl %edx, (%eax) ; *b = *c
ret

NB: *c is loaded twice.

With restrict:
movl 12(%esp), %eax ; eax <- c
movl 4(%esp), %edx ; edx <- a
movl (%eax), %eax ; eax <- *c
movl %eax, (%edx) ; *a = *c
movl 8(%esp), %edx ; edx <- b
movl %eax, (%edx) ; *b = *c
ret

NB: *c is loaded only once.

Regards.
 
E

Edward A. Falk

That's the "problem" here. The compiler knows that "malloc" returns
non-overlapping memory areas (unless it returns 0), and because it has
the definition of "fun" on hand it can put two and two together and see
that there are no possible overlaps between the arrays, and thus
generate better code using that assumption.

Try putting the definition of "fun" in an external C file and compiling
the two files separately.

Since fun() is a global, the compiler can't assume that main() is the only
thing that ever calls it, so I don't think the compiler can make that
particular assumption/optimization.

At this point, I can only assume either that the compiler isn't making
an optimization w.r.t. c that it could be making, or that the
optimization isn't making enough of a difference time-wise to be
detectable to OP.

We really need to see the generated code at this point to be sure.
 
Ad

Advertisements

E

Edward A. Falk

"fun" is not a nop. The compiler may, however, figure out that the
entire code ("main" and "fun") does nothing.

D'oh! Of course. I'll bet the entire outer loop was optimized
out of existance. If the compiler knows that fun() has no
side effects, it could do that.

OP, try adding a function call at the end of main() that
references your inputs. That would force the compiler to
keep the outer loop.
 
J

James Kuyper

On 03/11/2013 12:20 PM, David Brown wrote:
....
The compiler may still make some assumptions and optimisations here.
First, if the compiler knows that the function cannot be accessed
externally, then it can be treated as "static". ...

In the context of function definitions, "static" means "cannot be
accessed externally (by name)" (a pointer to a function with internal
linkage can be passed to code in a different translation unit, and
dereferenced there, which is the reason for the "by name" exception). If
a function is not actually static, it can be accessed externally, so it
seems to me that your exception would never apply. Could you give an
example of a context where it would?
... I don't know the
details of Linux compilation here (most of my work is embedded systems),
but if there is no way for anything to access the "fun" function as a
sort of shared library, then since the compiler knows it is generating
an executable rather than a linkable object file, it could make that
assumption and then do the alias analysis.

If some aspects of code generation are delayed until link time, the
linker can know that a function with external linkage is only used in
contexts that enable that optimization. Otherwise, I don't see any way
for that to work.

A function which doesn't have any objects with static storage duration,
and which doesn't access any global variables can be in-lined, whether
or not it's declared inline, and such optimizations could be performed
on the in-lined version. But unless the function has internal linkage,
there must also be an external definition, for which such optimization
cannot be performed until all translation units that make up the program
have been processed and are being linked together.
 
J

James Kuyper

In this case, the compiler could have optimised the second load away,
even without restrict. Even if a and c are the same pointer, it makes
no difference whether the same value is reloaded or not. If you used
non-aligned pointer so writing to *a changes some bytes of *c, then
you have undefined behaviour.

a, b, and c all have the type int*, and therefore must all be correctly
aligned for an int. Where do unaligned pointers enter into this?

Diego's original code sets b to a different value than c, so if
b==c, it makes a big difference whether c is retrieved a second time
for it's second usage (as required without 'restrict') or only once (as
allowed by the use of 'restrict'). Noob's simplified version not only
reorders the assignments, so that it's now an issue whether a==c, rather
than b==c; he also simplified it by setting *a to the same value as *c.
As a result, it doesn't make any difference whether a==c, thereby
missing the whole point of Diego's code.
 
N

Noob

James said:
As a result, it doesn't make any difference whether a==c,
thereby missing the whole point of Diego's code.

I most certainly DID NOT "miss the whole point of Diego's code"
as you so undiplomatically put it.

I was just trying to illustrate the effect of the restrict
keyword, using as simple an example as possible. Obviously,
I over-simplified.
 
J

James Kuyper

I most certainly DID NOT "miss the whole point of Diego's code"
as you so undiplomatically put it.

I was just trying to illustrate the effect of the restrict
keyword, using as simple an example as possible. Obviously,
I over-simplified.

Illustrating the effect of the restrict keyword requires writing code
where it makes a difference whether or not 'restrict' is present. What I
so undiplomatically pointed out was the fact that you over-simplified
the code because you missed the importance of one feature of the
original code that was necessary in order to make 'restrict' relevant.

If you can suggest a more diplomatic way I could have used to describe
the problem, I'm willing to learn. However, it seems to me that any
annoyance I caused you was inherent in what I was saying; nothing I
could do to sugar-coat my description of the problem would change that.
 
Ad

Advertisements

N

Noob

James said:
Illustrating the effect of the restrict keyword requires writing code
where it makes a difference whether or not 'restrict' is present.

Please note that the presence/absence of 'restrict' in my
(admittedly over-simplified) example did make a difference,
as far as gcc-4.6.3 -O3 was concerned.

But thanks to you and Christian for pointing out that an
aggressively optimizing compiler may determine that after
*a = *c, *a and *c must have the same contents, therefore
it is not necessary to reload *c.

In hindsight, I should have commented on

void fun(int *restrict a, int *restrict b, int *restrict c)
{
*a = *c+1;
*b = *c+1;
}
 
D

diegotorquemada

Hello all!



I am trying to learn to use the reserved word restrict. However, it seems that it is not working. The code with and without that keyword have the same size and running time, even though I am inducing pointer aliasing.



Can somebody please check my minimal example?



Thanks and kind regards!



Diego

Probably my example was not the best because, from the discussion I deduce that the compiler was "clever" enough to find the pointer alignment I was inducing in the code.

Does somebody has any other example with the restrict keyword that could be helpful in order to see the power of that reserved word?

Thanks,

Diego
 
Ad

Advertisements

E

Edward A. Falk

The compiler can make an inlined copy of fun () and optimize that.

Hmmm; very good point. (I've often wondered what the compiler
does with an "inline" function that isn't declared static.)

That brings back to the same conclusion: we need to see the
generated code.

Obviously, this is all an exercise in pedantry anyway; knowing
what one compiler did in one particular case doesn't really
teach us much about programming in C.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top