Is this a compiler bug? (Function accepts *ptr when **ptr is specified)

D

DSF

Hello!

I have some code that compiles with no errors/warnings that would
seem to me to have an error.

I have four variations on a function called ReadTextFile_XX. (XX
representing the four variations.) Each of them reads the file into a
large buffer and then parses it into a structure containing, among
other things, an array of strings. One for each line of the file.

The variations involve character size, 8-8, 8-16, 16-8, 16-16.
A pointer to a callback function to display progress is passed to
each of them. If it's NULL, the pointer is set to a stub that does
nothing, else it's left alone.

I noticed the four functions had quite a bit of redundancy (the
parts that reads the file into a large buffer were identical), so I
separated that code into a function called ReadTextFileCommon. The
entire code would be too large, so here are the highlights:

TEXTFILEDATAA is the structure.
int64 is a 64 bit int.
DWORD is an unsigned long (same as unsigned int, here).

For the callback routine.
typedef int (*RTFDPR)(int type, DWORD filecount, DWORD filesize);

The common function takes a pointer to the address of the pointer
for the callback routine because it takes care of the NULL to stub
conversion, if needed. The common function reads the file by size,
not characters. This is why filebuffer is a void pointer and
TEXTFILEDATAW can be cast to TEXTFILEDATAA because the common function
doesn't use the text-based parts of the structure.

bool ReadTextFileCommon(TEXTFILEDATAA *tfd, int64 *filesize, void
**filebuffer, void **progress);
{
RTFDPR DisplayProgress;
....
rtfdpr is the stub.
if(*progress == NULL)
*progress = rtfdpr;
DisplayProgress = *progress;
....
}

Read 8, output 16.
void ReadTextFile_AW(TEXTFILEDATAW *tfd, uint start, uint end, uint
codepage, void *progress)
{
....
Passes the address of progress to the common routine.
if(ReadTextFileCommon((TEXTFILEDATAA *)tfd, &filesize, (void
**)&filebuffer, &progress) == false)
return;
....
}

Read 16, output 16.
void ReadTextFile_WW(TEXTFILEDATAW *tfd, uint start, uint end, void
*progress)
....
I hadn't modified this code yet to take into account the pointer to
the address of the progress function pointer. Note no & before
progress in the call.
if(ReadTextFileCommon((TEXTFILEDATAA *)tfd, &filesize, (void
**)&filebuffer, progress) == false)
return;
....
}

The above compiles with no problems. I would expect an error when a
pointer to progress is passed instead of an address of pointer to
progress. Even though they are void pointers in the call, I assume
that void * and void** are not the same thing. Also, I don't see why
&filebuffer (which may be char * or wchar_t *) must be cast to void
**, while progress does not.
Is there something I'm missing here, or is the compiler missing the
error?

Thanks for any info.
DSF
"'Later' is the beginning of what's not to be."
D.S. Fiscus
 
B

Barry Schwarz

Hello!

I have some code that compiles with no errors/warnings that would
seem to me to have an error.

I have four variations on a function called ReadTextFile_XX. (XX
representing the four variations.) Each of them reads the file into a
large buffer and then parses it into a structure containing, among
other things, an array of strings. One for each line of the file.

The variations involve character size, 8-8, 8-16, 16-8, 16-16.
A pointer to a callback function to display progress is passed to
each of them. If it's NULL, the pointer is set to a stub that does
nothing, else it's left alone.

I noticed the four functions had quite a bit of redundancy (the
parts that reads the file into a large buffer were identical), so I
separated that code into a function called ReadTextFileCommon. The
entire code would be too large, so here are the highlights:

TEXTFILEDATAA is the structure.
int64 is a 64 bit int.
DWORD is an unsigned long (same as unsigned int, here).

For the callback routine.
typedef int (*RTFDPR)(int type, DWORD filecount, DWORD filesize);

The common function takes a pointer to the address of the pointer
for the callback routine because it takes care of the NULL to stub
conversion, if needed. The common function reads the file by size,
not characters. This is why filebuffer is a void pointer and
TEXTFILEDATAW can be cast to TEXTFILEDATAA because the common function
doesn't use the text-based parts of the structure.

bool ReadTextFileCommon(TEXTFILEDATAA *tfd, int64 *filesize, void
**filebuffer, void **progress);

Call the preceding @0.
{
RTFDPR DisplayProgress;
...
rtfdpr is the stub.
if(*progress == NULL)
*progress = rtfdpr;
DisplayProgress = *progress;
...
}

Read 8, output 16.
void ReadTextFile_AW(TEXTFILEDATAW *tfd, uint start, uint end, uint
codepage, void *progress)

Call the preceding @1.
{
...
Passes the address of progress to the common routine.
if(ReadTextFileCommon((TEXTFILEDATAA *)tfd, &filesize, (void
**)&filebuffer, &progress) == false)

Call the preceding @2.
return;
...
}

Read 16, output 16.
void ReadTextFile_WW(TEXTFILEDATAW *tfd, uint start, uint end, void
*progress)

Call the preceding @3.
...
I hadn't modified this code yet to take into account the pointer to
the address of the progress function pointer. Note no & before
progress in the call.
if(ReadTextFileCommon((TEXTFILEDATAA *)tfd, &filesize, (void
**)&filebuffer, progress) == false)

Call the preceding @4.
return;
...
}

The above compiles with no problems. I would expect an error when a
pointer to progress is passed instead of an address of pointer to
progress. Even though they are void pointers in the call, I assume
that void * and void** are not the same thing. Also, I don't see why
&filebuffer (which may be char * or wchar_t *) must be cast to void
**, while progress does not.
Is there something I'm missing here, or is the compiler missing the
error?

According to @0, the fourth argument to the common function needs to
have type void**.

According to @1, progress has type void*. In @2, the fourth argument
has type pointer to void* which is the same as void**. The argument
has the same type as expected.

According to @3, progress has type void*. In @4, the fourth argument
has type void*. But void*, being a generic pointer, has an implicit
conversion to (or from) any other object pointer type. Therefore, the
compiler can generate the appropriate code to convert the value from
type void* to void** and there is no error to be reported. Even
though there is no reason for a compiler diagnostic, if the value in
progress is not properly aligned for a void** value, the behavior
during execution is undefined.

If instead of void* and void**, you had used char* and char**, there
would be no implicit conversion in @4 from char* to char** and a
diagnostic would be required.
 
J

Jens Thoms Toerring

DSF said:
I have some code that compiles with no errors/warnings that would
seem to me to have an error.
I have four variations on a function called ReadTextFile_XX. (XX
representing the four variations.) Each of them reads the file into a
large buffer and then parses it into a structure containing, among
other things, an array of strings. One for each line of the file.
The variations involve character size, 8-8, 8-16, 16-8, 16-16.
A pointer to a callback function to display progress is passed to
each of them. If it's NULL, the pointer is set to a stub that does
nothing, else it's left alone.
I noticed the four functions had quite a bit of redundancy (the
parts that reads the file into a large buffer were identical), so I
separated that code into a function called ReadTextFileCommon. The
entire code would be too large, so here are the highlights:
TEXTFILEDATAA is the structure.
int64 is a 64 bit int.
DWORD is an unsigned long (same as unsigned int, here).
For the callback routine.
typedef int (*RTFDPR)(int type, DWORD filecount, DWORD filesize);
The common function takes a pointer to the address of the pointer
for the callback routine because it takes care of the NULL to stub
conversion, if needed.

Dealing with NULL is no sufficient reason for passing a pointer
to a function pointer - a function pointer can be NULL. You
only need to pass a pointer to the function pointer if you
need to know in the caller which function actually was used.
The common function reads the file by size,
not characters. This is why filebuffer is a void pointer and
TEXTFILEDATAW can be cast to TEXTFILEDATAA because the common function
doesn't use the text-based parts of the structure.
bool ReadTextFileCommon(TEXTFILEDATAA *tfd, int64 *filesize, void
**filebuffer, void **progress);

Even if you need to pass a pointer to the function pointer,
why not use the correct pointer type? I.e.

bool ReadTextFileCommon( TEXTFILEDATAA *tfd,
int64 *filesize,
void **filebuffer,
RTFDPR *progress );

That still allows you to do

RTFDPR callback = NULL; /* or a pointer to an existing function */
ReadTextFileCommon(tfd, &filesize, filepuffer, &callback);

but won;t keeo the compiler from complaining if what you pass
as the last argument hasn't the proper type, i.e. a pointer to
an appropriate typed function pointer.
{
RTFDPR DisplayProgress;
...
rtfdpr is the stub.
if(*progress == NULL)
*progress = rtfdpr;
DisplayProgress = *progress;
...
}

But if the caller doesn't need to know which callback routine
was actually used in ReadTextFileCommon() if none was requested
you'd better just use

bool ReadTextFileCommon( TEXTFILEDATAA *tfd,
int64 *filesize,
void **filebuffer,
RTFDPR progress );

(RTFDPR already *is* a pointer!) and call it like this (or
with the callback function to use instead of NULL):

ReadTextFileCommon(tfd, &filesize, filepuffer, NULL);

and in ReadTextFileCommon you can then simply do

if ( ! progress )
progress = rtfdpr;
Read 8, output 16.
void ReadTextFile_AW(TEXTFILEDATAW *tfd, uint start, uint end, uint
codepage, void *progress)
{
...
Passes the address of progress to the common routine.
if(ReadTextFileCommon((TEXTFILEDATAA *)tfd, &filesize, (void
**)&filebuffer, &progress) == false)
return;
...
}
Read 16, output 16.
void ReadTextFile_WW(TEXTFILEDATAW *tfd, uint start, uint end, void
*progress)
...
I hadn't modified this code yet to take into account the pointer to
the address of the progress function pointer. Note no & before
progress in the call.

Yup, that's not a good idea, especially when you do that with
void pointers as Barry Schwarz already pointed out;-)
if(ReadTextFileCommon((TEXTFILEDATAA *)tfd, &filesize, (void
**)&filebuffer, progress) == false)
return;
...
}

Mmmm, too many casts here for my liking. Since you don't show
what the structures TEXTFILEDATAW and TEXTFILEDATAA are and how
they're used in your ReadTextFileCommon() function it's
impossible to say if this can work. Unless the two types of
structures differ only in that TEXTFILEDATAW has some addi-
tional members at the end then there's a high chance for
things to go wrong. I'd think about making the TEXTFILEDATAA
data structure a part of the TEXTFILEDATAW structure (or make
up a structure of those parts they have in common - you can't
use anything else than that in ReadTextFileCommon() anyway) and
pass this common part to ReadTextFileCommon(). That avoids having
to cast and thereby keeping the compiler from helping you if you
try to do something fishy.

The way you pass 'filebuffer' also seems odd. Since you pass a
pointer to a pointer I would guess that you do that to allocate
memory for it within ReadTextFileCommon() (otherwise, why pass
a pointer to a pointer you can't dereference?). But for that in
the ReadTextFileCommon() there's not enough information about
the real type to do the allocation - you can't allocate voids
and you don't know what type it has in the caller...

I'm just can guess that in ReadTextFileCommon() you actually read
in the data from the file and allocate a sufficiently large
buffer for those data - otherwise why pass in a pointer to
a pointers to the destination buffer and a pointer for the file
size? But then it would make more sense to pass perhaps a char**
to the function, for which it allocates enough memory (as many
chars as filesize holds in the end, I presume) and do the casting
back in the caller, which actually knows how it wants to interpret
the "raw" data.
Regards, Jens
 
I

Ike Naar

I have some code that compiles with no errors/warnings that would
seem to me to have an error.

You're probably running your compiler in a mode that expects
a dialect of C, not ISO C. This is the default mode for e.g. gcc.
If you're using gcc, add the -pedantic option.
 
K

Kenny McCormack

You're probably running your compiler in a mode that expects
a dialect of C, not ISO C. This is the default mode for e.g. gcc.
If you're using gcc, add the -pedantic option.

See .sig

--
For instance, Standard C says that nearly all extensions to C are prohibited. How
silly! GCC implements many extensions, some of which were later adopted as part of
the standard. If you want these constructs to give an error message as
“required†by the standard, you must specify ‘--pedantic’, which was
implemented only so that we can say “GCC is a 100% implementation of the
standardâ€, not because there is any reason to actually use it.
 
D

DSF

Thanks to all that replied. Both for answering my question. (So
void* and void** are not considered different until dereferenced. I
assume this goes for any level of indirection for void? void***,
etc.) And for making me realize something. I have been writing quite
a bit of very generic, multi-purpose code lately. Hence the void
pointers where less generic pointers would serve as well.

As to the casting of the structures:

TEXTFILEDATAA and TEXTFILEDATAW are defined as thus:
typedef struct tag_TEXTFILEDATAA
{
int id;
DWORD status;
DWORD sysstatus;
HANDLE fh;
uint index;
uint maxlines;
uint longest;
bool unicode;
char **lines;
} TEXTFILEDATAA;
typedef struct tag_TEXTFILEDATAW
{
int id;
DWORD status;
DWORD sysstatus;
HANDLE fh;
uint index;
uint maxlines;
uint longest;
bool unicode;
wchar_t **lines;
} TEXTFILEDATAW;

They are used much the same way FILE is in standard C.
It is perfectly safe for me to cast the (W)ide version to the (A)NSI
version in calls to the common routine because the common routine
deals in raw file data (bytes) and does not touch the "lines" member.

int64 is a 64 bit int.
DWORD is an unsigned long (same as unsigned int, here).

For the callback routine.
typedef int (*RTFDPR)(int type, DWORD filecount, DWORD filesize);
("filecount" and "filesize" actually return different information
dependent upon "type". This would be a good place for more generic
names!)

I have the same four functions and the common one. My previous
example of one of the four is now:

Read 8, output 16.
void ReadTextFile_AW(TEXTFILEDATAW *tfd, uint start, uint end, uint
codepage, RTFDPR progress);

The common routine is now:
bool ReadTextFileCommon(TEXTFILEDATAA *tfd, int64 *filesize, char
**filebuffer, RTFDPR *progress);

The reason it's fed the address of "progress" is because it contains
the decision making code as to where "progress" is.

rtfdpr is the stub.
if(*progress == NULL)
*progress = rtfdpr;

This allows the common routine to return the proper address of
"progress" for the remainder of the four routines to use. The common
routine is called at the beginning of the four routines.

"progress" may be NULL if the programmer doesn't need/want to use
the callback. In which case it's directed to a do nothing stub. I
have determined that doing it this way instead of testing at each
call, e.g.
if(progress)
progress(a, b, c);
produces smaller and faster code. (YMMV)

DSF
"'Later' is the beginning of what's not to be."
D.S. Fiscus
 
B

Barry Schwarz

Thanks to all that replied. Both for answering my question. (So
void* and void** are not considered different until dereferenced. I

No, they are always considered different. But there is an implicit
conversion between the two, in either direction, which means that it
is legal to assign one to the other (which is conceptually what
happens when an argument is "passed by value" to a parameter for the
function call).

This is no different than passing an int to a function expecting a
long or conversely.
assume this goes for any level of indirection for void? void***,

No. There is no implicit conversion between void** and void***. void*
is the only pointer type implicitly convertible to/from any other
object pointer type.
etc.) And for making me realize something. I have been writing quite
a bit of very generic, multi-purpose code lately. Hence the void
pointers where less generic pointers would serve as well.

As to the casting of the structures:

TEXTFILEDATAA and TEXTFILEDATAW are defined as thus:
typedef struct tag_TEXTFILEDATAA
{
int id;
DWORD status;
DWORD sysstatus;
HANDLE fh;
uint index;
uint maxlines;
uint longest;
bool unicode;
char **lines;
} TEXTFILEDATAA;
typedef struct tag_TEXTFILEDATAW
{
int id;
DWORD status;
DWORD sysstatus;
HANDLE fh;
uint index;
uint maxlines;
uint longest;
bool unicode;
wchar_t **lines;
} TEXTFILEDATAW;

They are used much the same way FILE is in standard C.
It is perfectly safe for me to cast the (W)ide version to the (A)NSI
version in calls to the common routine because the common routine
deals in raw file data (bytes) and does not touch the "lines" member.

You are assuming that char** and wchar_t** are the same size. It may
be true on your current system but there is no reason for it to be
true anywhere else.
int64 is a 64 bit int.
DWORD is an unsigned long (same as unsigned int, here).

For the callback routine.
typedef int (*RTFDPR)(int type, DWORD filecount, DWORD filesize);
("filecount" and "filesize" actually return different information
dependent upon "type". This would be a good place for more generic
names!)

Since filecount and filesize are input only parameters, they do not
return any information to the calling program. Did you mean specify
instead of return?
 
J

James Kuyper

Thanks to all that replied. Both for answering my question. (So
void* and void** are not considered different until dereferenced.

No, void* and void** are incompatible types. Any pointer to an object
type can be converted to void* and back again, and void** is a pointer
to object type, but they are still different types.
As to the casting of the structures:

TEXTFILEDATAA and TEXTFILEDATAW are defined as thus:
typedef struct tag_TEXTFILEDATAA
{
int id;
DWORD status;
DWORD sysstatus;
HANDLE fh;
uint index;
uint maxlines;
uint longest;
bool unicode;
char **lines;
} TEXTFILEDATAA;
typedef struct tag_TEXTFILEDATAW
{
int id;
DWORD status;
DWORD sysstatus;
HANDLE fh;
uint index;
uint maxlines;
uint longest;
bool unicode;
wchar_t **lines;
} TEXTFILEDATAW;

They are used much the same way FILE is in standard C.
It is perfectly safe for me to cast the (W)ide version to the (A)NSI
version in calls to the common routine because the common routine
deals in raw file data (bytes) and does not touch the "lines" member.

There is a rule that covers this: if two structures share a common
initial sequence of fields, a pointer to one struct type can be used to
access an object that's actually of the other type, but only for the
purpose of accessing fields in the common initial sequence. However,
there's additional requirements that must apply in order for such code
to have defined behavior. The two structs must be members of the same
union type, that union type must currently be in scope, and the object
being accessed has to be an instance of that union type (6.5.2.3p6).

In real life, it is very often the case that you can get away with
ignoring the union restrictions - but the behavior is undefined as far
as the C standard is concerned. It's not just the fact that an
implementation is free to insert padding bytes in one of the two struct
types that are not present in the other - while legal, that's pretty
unlikely to happens.

The most serious issue is that such code violates the anti-aliasing
rules (6.5p7). Therefore, if the union requirements from 6.5.2.3p6 are
not met, such code has undefined behavior. If you write to one of the
common fields using an lvalue of one struct type, and then read the
value of the same field using an lvalue of the other struct type, the
compiler is not required to consider the possibility that those two
lvalues refer to the same location in memory. If the previous value of
that field is already stored in a register, it's allowed to use the
value from the register, rather than doing a fresh read from memory.
Such optimizations are legal because of 6.5p7, are quite common, and are
far more likely to occur than arbitrary differences in the location of
padding bytes.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,733
Messages
2,569,440
Members
44,832
Latest member
GlennSmall

Latest Threads

Top