Niu said:
Burton said:
Niu Xiao said:
I see 'restrict' used a lot in function declarations, such as
size_t fread(void* restrict ptr, size_t size, size_t nobj,
             FILE* restrict fp);
but what does the keyword 'restrict' mean? There is no definition
of it in K&R 2nd.
restrict keyword
C99 supports the restrict keyword, which allows for certain
optimizations involving pointers. For example:
void copy(int *restrict d, const int *restrict s, int n)
{
    while (n-- > 0)
        *d++ = *s++;
}
C++ does not recognize this keyword.
A simple work-around for code that is meant to be compiled as
either C or C++ is to use a macro for the restrict keyword:
#ifdef __cplusplus
#define restrict /* nothing */
#endif
(This feature is likely to be provided as an extension by many
C++ compilers. If it is, it is also likely to be allowed as a
reference modifier as well as a pointer modifier.)
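A slightly more flexible variant of that work-around, as a rough sketch: instead of defining the keyword away, map a macro to the extension spelling where one is known. GCC, Clang and MSVC accept __restrict in C++ mode. The names RESTRICT and copy_ints are made up for this example and do not come from the quoted page.

```c
#include <assert.h>
#include <stddef.h>

/* RESTRICT is a hypothetical macro name chosen for this sketch. */
#if defined(__cplusplus)
#  if defined(__GNUC__) || defined(__clang__) || defined(_MSC_VER)
#    define RESTRICT __restrict   /* common compiler extension */
#  else
#    define RESTRICT              /* unknown C++ compiler: expand to nothing */
#  endif
#elif defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
#  define RESTRICT restrict       /* C99 or later */
#else
#  define RESTRICT                /* pre-C99 C: expand to nothing */
#endif

/* Copies n ints from s to d; the caller promises the ranges do not overlap. */
void copy_ints(int *RESTRICT d, const int *RESTRICT s, size_t n)
{
    assert(d != NULL && s != NULL);
    while (n-- > 0)
        *d++ = *s++;
}
```

Using a separate macro name avoids redefining a word that is a keyword in C99, at the cost of writing RESTRICT instead of restrict in the declarations.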
[C99: §6.2.5, 6.4.1, 6.7.3, 6.7.3.1, 7, A.1.2, J.2]
[C++98: §2.11]
from
http://david.tribble.com/text/cdiffs.htm#C99-restrict
But what optimizations involving pointers?
If you know that two pointers do not point to the same object,
then you can leave out some kinds of sanity checks, can change
loops (e.g. turn a while (a != b) loop into a do-while (a != b)
loop), and can work directly without an intermediate copy.
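A small illustration of what the "no aliasing" promise buys, as a sketch only; the function names are invented for this example. In the first version the compiler has to assume that val might point at the same int as a, so it must reload *val after the first store. In the second it may load *val once and reuse it.

```c
#include <assert.h>

/* Without restrict: val may alias a or b, so *val has to be
   read again after the store to *a. */
void add_to_both(int *a, int *b, const int *val)
{
    *a += *val;
    *b += *val;
}

/* With restrict: the caller promises the three pointers refer to
   distinct objects, so *val may be loaded once and kept in a register. */
void add_to_both_restrict(int *restrict a, int *restrict b,
                          const int *restrict val)
{
    assert(a != b && a != val && b != val);  /* debug check of the promise */
    *a += *val;
    *b += *val;
}
```

Calling add_to_both(&p, &q, &p) is valid and the second addition sees the updated p; the same call to add_to_both_restrict() would break the promise and has undefined behaviour.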
Think of memcpy() and memmove(). If you had to implement
memmove() portably, you would have no way of checking whether the
source and destination pointers belong to the same object, and
you are not allowed to compare pointers into different objects
with < or >, so you cannot handle overlapping objects flexibly.
This means that you have to do something along the lines of
void *MemMove (void *pDest, void *pSrc, size_t size)
{
    if (pDest == pSrc) {
        return pDest;
    }
    void *pTemp = malloc(size);
    if (pTemp) {
        unsigned char * pMDest = pTemp;
        unsigned char * pMSrc = pSrc;
        for (size_t i = 0; i < size; i++) {
            *pMDest++ = *pMSrc++;
        }
        pMDest = pDest;
        pMSrc = pTemp;
        for (size_t i = 0; i < size; i++) {
            *pMDest++ = *pMSrc++;
        }
        free(pTemp);
        pTemp = pDest;
    }
    return pTemp;
}
which may be pretty wasteful of memory. For MemCopy(), you
can do
void *MemCopy (void * restrict pDest,
               void * restrict pSrc, size_t size)
{
    unsigned char * restrict pCDest = pDest;
    unsigned char * restrict pCSrc = pSrc;
    for (size_t i = 0; i < size; i++) {
        *pCDest++ = *pCSrc++;
    }
    return pDest;
}
That is quite a difference, I'd say.
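To see the same split with the real library functions, here is a minimal sketch; shift_right is a made-up helper name. The source and destination ranges overlap, so memmove() is the one to call. memcpy() takes restrict-qualified pointers in C99, and passing it overlapping ranges is undefined.

```c
#include <assert.h>
#include <string.h>

/* Shifts the first len bytes of buf right by `by` positions.
   The two ranges overlap whenever by < len, hence memmove().
   The caller must make sure buf has room for len + by bytes. */
void shift_right(char *buf, size_t len, size_t by)
{
    assert(buf != NULL);
    memmove(buf + by, buf, len);
}
```

With char buf[8] = "abcdef", the call shift_right(buf, 4, 2) leaves "ababcd" in the buffer.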
Now, imagine for a moment that we are an optimizing compiler and
want to optimize the following function:
1)
void *MemFoo (void * restrict pDest,
              void * restrict pSrc, size_t size)
{
    /* Never true: Can be thrown away */
    if (pDest == pSrc) {
        return pDest;
    }
    void *pTemp = malloc(size);
    if (pTemp) {
        unsigned char * pMDest = pTemp;
        unsigned char * pMSrc = pSrc;
        /* Loops can be merged if an auxiliary variable is introduced */
        for (size_t i = 0; i < size; i++) {
            *pMDest++ = *pMSrc++;
        }
        pMDest = pDest;
        pMSrc = pTemp;
        for (size_t i = 0; i < size; i++) {
            *pMDest++ = *pMSrc++;
        }
        free(pTemp);
        pTemp = pDest;
    }
    return pTemp;
}
2)
void *MemFoo (void * restrict pDest,
              void * restrict pSrc, size_t size)
{
    void *pTemp = malloc(size);
    if (pTemp) {
        unsigned char * pMDest = pTemp;
        unsigned char * restrict pMSrc = pSrc;
        unsigned char * restrict pAux = pDest;
        for (size_t i = 0; i < size; i++) {
            /* pMDest is only used to change the object pointed to by pTemp */
            *pMDest = *pMSrc++;
            *pAux++ = *pMDest++;
        }
        /* The object pointed to by pTemp is not used after the redefinition */
        free(pTemp);
        pTemp = pDest;
    }
    return pTemp;
}
3)
void *MemFoo (void * restrict pDest,
              void * restrict pSrc, size_t size)
{
    unsigned char *pTemp = malloc(size);
    if (pTemp) {
        unsigned char * restrict pMSrc = pSrc;
        unsigned char * restrict pAux = pDest;
        for (size_t i = 0; i < size; i++) {
            *pAux++ = *pMSrc++;
        }
        /* Code could be rescheduled to free resources sooner */
        free(pTemp);
        pTemp = pDest;
    }
    return pTemp;
}
4)
void *MemFoo (void * restrict pDest,
              void * restrict pSrc, size_t size)
{
    unsigned char *pTemp = malloc(size);
    if (pTemp) {
        free(pTemp);
        pTemp = pDest;
        unsigned char * restrict pMSrc = pSrc;
        unsigned char * restrict pAux = pDest;
        for (size_t i = 0; i < size; i++) {
            *pAux++ = *pMSrc++;
        }
    }
    return pTemp;
}
Not the optimum. If we imagine for a moment a compiler that
can do truly wondrous things, then we could eliminate the
calls to malloc() and free() as there is only a use of pTemp
in a Boolean context in between.
5->7)
void *MemFoo (void * restrict pDest,
              void * restrict pSrc, size_t size)
{
    unsigned char *pTemp;
    if (size <= __MAX_MALLOC) {
        pTemp = pDest;
        unsigned char * restrict pMSrc = pSrc;
        unsigned char * restrict pAux = pDest;
        for (size_t i = 0; i < size; i++) {
            *pAux++ = *pMSrc++;
        }
    }
    else {
        pTemp = NULL;
    }
    return pTemp;
}
where __MAX_MALLOC may be an implementation-dependent constant
defining the maximum number of bytes that can be allocated by
one call to malloc().
If __MAX_MALLOC does not exist, we arrive at MemCopy(); otherwise
we arrive at
void *MemFoo (void * restrict pDest,
              void * restrict pSrc, size_t size)
{
    if (size <= __MAX_MALLOC) {
        unsigned char * restrict pMSrc = pSrc;
        unsigned char * restrict pAux = pDest;
        for (size_t i = 0; i < size; i++) {
            *pAux++ = *pMSrc++;
        }
        return pDest;
    }
    return NULL;
}
Nice, huh?
Cheers
Michael