Declaration of main()

BartC

Never thought I'd be asking about this, but it's giving me some trouble!

I want to use a declaration that looks like this:

typedef unsigned char* ichar;

int main(int nparams,ichar (*params)[]) {
int i;

for (i=0; i<nparams; ++i)
printf("%d: %s\n",i,(*params)[i]);

}

(Why? Because this will be the output of a code generator.)

This works perfectly well with four different C compilers. But Clang doesn't
like it: it insists the params type must be char** (and it can't be signed
nor unsigned either, just unspecified, however the latter only gives a
warning; as I have it above, it is an error).

Is there in fact something wrong with the way I'm doing it?

One way to get around it, seems to be to move the main function, which
appears to be special to Clang, outside of the non-C source language.
Another is to make a special case when compiling a function called 'main',
and bodge the output that way. But I don't particularly want to do this, and
it's just pandering to this very fussy compiler.
 
Eric Sosman

Never thought I'd be asking about this, but it's giving me some trouble!

I want to use a declaration that looks like this:

typedef unsigned char* ichar;

int main(int nparams,ichar (*params)[]) {

This is equivalent to

int main(int nparams, ichar *(*params))

... which is equivalent to

int main(int nparams, ichar **params)

... which is equivalent to

int main(int nparams, unsigned char* **params)

... which is equivalent to

int main(int nparams, unsigned char ***params)

... which is in no way equivalent to or even remotely like

int main(int argc, char **argv)
Is there in fact something wrong with the way I'm doing it?

Left as an exercise.
 
Kaz Kylheku

Never thought I'd be asking about this, but it's giving me some trouble!

I want to use a declaration that looks like this:

typedef unsigned char* ichar;

int main(int nparams,ichar (*params)[]) {

argv is a pointer to char*

what you have here is a pointer to (incomplete) array of char*

This is a needless complication.
int i;

for (i=0; i<nparams; ++i)
printf("%d: %s\n",i,(*params)[i]);

}

(Why? Because this will be the output of a code generator.)

This works perfectly well with four different C compilers. But Clang doesn't
like it: it insists the params type must be char** (and it can't be signed

Generate an additional local and a cast to initialize it:

int main(int nparams, char **params_in) {
ichar (*params)[] = (ichar (*)[]) params_in;

Really, does your code generator even have to take over
main?

You can have

int my_main(/*whatever you want*/)
{
}


int main(int argc, char **argv)
{
/* generate call to my_main */
}

This main can be part of a fixed module: your own "BartCRT.o"
that is linked in. :)
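
A minimal sketch of that wrapper idea, assuming the generated code keeps its pointer-to-array parameter. The names generated_main and as_param_array are hypothetical, and plain char * is used in place of the ichar typedef so the cast raises no signedness questions. The fixed module would then hold a conventional main() whose body is just `return generated_main(argc, as_param_array(argv));`.

```c
#include <string.h>

/* Stand-in for the code generator's output: it receives a pointer to an
 * array of unknown size and indexes it as (*params)[i]. Here it simply
 * returns the total length of all parameters. */
int generated_main(int nparams, char *(*params)[]) {
    int total = 0;
    for (int i = 0; i < nparams; ++i)
        total += (int)strlen((*params)[i]);
    return total;
}

/* Adapter for the fixed module: reinterpret the char** that main()
 * receives as a pointer to the whole argument array. This assumes the
 * two pointer types share a representation, which holds on mainstream
 * platforms but is an assumption of this sketch. */
char *(*as_param_array(char **argv))[] {
    return (char *(*)[])argv;
}
```

Keeping the cast in one adapter means the generated code never has to special-case a function called main.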
 
BartC

Eric Sosman said:
Never thought I'd be asking about this, but it's giving me some trouble!

I want to use a declaration that looks like this:

typedef unsigned char* ichar;

int main(int nparams,ichar (*params)[]) {

This is equivalent to

int main(int nparams, ichar *(*params))

... which is equivalent to

int main(int nparams, ichar **params)

... which is equivalent to

int main(int nparams, unsigned char* **params)

... which is equivalent to

int main(int nparams, unsigned char ***params)

... which is in no way equivalent to or even remotely like

int main(int argc, char **argv)

Forgetting the ichar part for a minute, CDECL tells me that:

char *(*params)[]

is 'pointer to array of pointer to char'. Which likely corresponds to the
actual structure of the argv data (and exactly matches what I expressed in
the source language). The difference from char** is that one of the pointers
points to the whole array, instead of the first element, which is taken care
of with the special indexing used.

And type-wise, no compiler has complained, not even Clang; it just doesn't
like it for main().
Left as an exercise.

Apparently the only thing wrong with it is that it is not char**.
 
BartC

Kaz Kylheku said:
int main(int nparams,ichar (*params)[]) {

argv is a pointer to char*

what you have here is a pointer to (incomplete) array of char*

I thought that was exactly what argv was.
Really, does your code generator even have to take over
main?

You can have

int my_main(/*whatever you want*/)
{
}


int main(int argc, char **argv)
{
/* generate call to my_main */
}

This main can be part of a fixed module: your own "BartCRT.o"
that is linked in. :)

Yes, but I'd have to write it in a more C-like language. I was trying to
minimise the need for that (for cases where there are complex C declarations
I don't know about and can't really replicate, such as FILE); I didn't
expect to need it for main()!
 
Eric Sosman

Eric Sosman said:
Never thought I'd be asking about this, but it's giving me some trouble!

I want to use a declaration that looks like this:

typedef unsigned char* ichar;

int main(int nparams,ichar (*params)[]) {

This is equivalent to

int main(int nparams, ichar *(*params))

... which is equivalent to

int main(int nparams, ichar **params)

... which is equivalent to

int main(int nparams, unsigned char* **params)

... which is equivalent to

int main(int nparams, unsigned char ***params)

... which is in no way equivalent to or even remotely like

int main(int argc, char **argv)

Forgetting the ichar part for a minute, CDECL tells me that:

char *(*params)[]

is 'pointer to array of pointer to char'. Which likely corresponds to
the actual structure of the argv data (and exactly matches what I
expressed in the source language).[...]

No, that is *not* what you expressed in the source language.
Have you forgotten that C functions do not and cannot take array
parameters? 6.7.6.3p7:

"A declaration of a parameter as ‘‘array of type’’ shall
be adjusted to ‘‘qualified pointer to type’’, [...]"

I stand by my claim that your code asks for three levels of
indirection, not two. (And I'd be interested to hear what
cdecl makes of the *entire* function declaration, not just
a snippet of text out of the middle. Context Matters.)
Apparently the only thing wrong with it is that it is not char**.

Yes, that's what's wrong with it. It's not `char**', it's
`unsigned char***' with three asterisks, not two, and that's
wrong. R-O-N-G, wrong.
 
BartC

Eric Sosman said:
Eric Sosman said:
On 3/29/2014 2:09 PM, BartC wrote:
typedef unsigned char* ichar;
int main(int nparams,ichar (*params)[]) {
... which is equivalent to

int main(int nparams, unsigned char ***params)
Forgetting the ichar part for a minute, CDECL tells me that:

char *(*params)[]

is 'pointer to array of pointer to char'. Which likely corresponds to
the actual structure of the argv data (and exactly matches what I
expressed in the source language).[...]

No, that is *not* what you expressed in the source language.

(I meant the source language I'm translating to C.

In that language, and again disregarding the ichar subtype, the type of
'params' is 'ref [] ref char'; an Algol-68-style left-to-right type-spec
which means 'pointer to array of pointer to char', exactly what CDECL told
me what 'char *(*params)[] meant. Two levels of pointer.)
Have you forgotten that C functions do not and cannot take array
parameters? 6.7.6.3p7:
"A declaration of a parameter as ‘‘array of type’’ shall
be adjusted to ‘‘qualified pointer to type’’, [...]"

I would guess that applies to the top-level parameter type. I'm fairly
certain you can pass pointer-to-array types in C (and I've done that quite a
lot).
I stand by my claim that your code asks for three levels of
indirection, not two.

That extra [] threw me too, but I don't think it counts in this context.

(And I'd be interested to hear what
cdecl makes of the *entire* function declaration, not just
a snippet of text out of the middle. Context Matters.)

The online cdecl I used is down at the minute. However, I understand that
top-level arrays as parameters are treated differently to those elsewhere.
But this wasn't a top-level one.
 
Eric Sosman

Eric Sosman said:
On 3/29/2014 2:09 PM, BartC wrote:
typedef unsigned char* ichar;
int main(int nparams,ichar (*params)[]) {
... which is equivalent to

int main(int nparams, unsigned char ***params)
Forgetting the ichar part for a minute, CDECL tells me that:

char *(*params)[]

is 'pointer to array of pointer to char'. Which likely corresponds to
the actual structure of the argv data (and exactly matches what I
expressed in the source language).[...]

No, that is *not* what you expressed in the source language.

(I meant the source language I'm translating to C.

Fine, but you're translating it to erroneous C.
Have you forgotten that C functions do not and cannot take array
parameters? 6.7.6.3p7:
"A declaration of a parameter as ‘‘array of type’’ shall
be adjusted to ‘‘qualified pointer to type’’, [...]"

I would guess that applies to the top-level parameter type. I'm fairly
certain you can pass pointer-to-array types in C (and I've done that
quite a lot).
I stand by my claim that your code asks for three levels of
indirection, not two.

That extra [] threw me too, but I don't think it counts in this context.

Okay, let's just try a little experiment. Put the
following into a source file:

#include <stdio.h>

/* BartC's main, renamed to protect the innocent */
int bartc(int nparams, char *(*params)[]) {
printf("Hello from bartc: %d params at %p\n",
nparams, (void*) params);
return 0;
}

/* A "statutory" main */
int main(int argc, char **argv) {
puts("Hello from main!");
/* Call the function BartC thinks is compatible with
* a statutory main, passing the actual main's own
* arguments. If BartC is right and his function is
* truly equivalent to a real main, this will work.
*/
return bartc(argc, argv);
}

Feed it to your favorite C compilers (using C11, C99, C90, C89,
or even K&R C, with or without TC's) and see what they have to
say about it.
 
BartC

Eric Sosman said:
I stand by my claim that your code asks for three levels of
indirection, not two.

That extra [] threw me too, but I don't think it counts in this context.

Okay, let's just try a little experiment. Put the
following into a source file:

int bartc(int nparams, char *(*params)[]) { ....
int main(int argc, char **argv) {
return bartc(argc, argv);
}

Feed it to your favorite C compilers (using C11, C99, C90, C89,
or even K&R C, with or without TC's) and see what they have to
say about it.

I don't even need to try and compile that (although I did). It's obvious
that the two access argv in different ways (one a pointer to pointer, the
other pointer to array of pointers). Both will work.

The question is, is argv in main() *only* defined as pointer to pointer?
Clang obviously thinks so.

Try this experiment of mine:

#include <stdio.h>

void print_cstyle(int n, char** array){
int i;
for (i=0; i<n; ++i)
printf("C: %d %s\n",i,array[i]);
}

void print_bstyle(int n, char*(*array)[]){
int i;
for (i=0; i<n; ++i)
printf("B: %d %s\n",i,(*array)[i]);
}

/* A "statutory" main */
int main(void) {
char *s[] = {"one","two","three"};
int n=sizeof(s)/sizeof(s[0]);

print_cstyle(n,s);
puts("");
print_bstyle(n,&s);

}

The array 's' is a little like the argv data.

I'm printing this using two functions: print_cstyle() which accesses it like
you say main() does. And print_bstyle(), which uses my pointer-to-array
style.

Both work. There are no casts involved. But is print_bstyle() valid, defined
C? If so, then I should be able to use pointer-to-array types for argv in
main.
 
Eric Sosman

Eric Sosman said:
I stand by my claim that your code asks for three levels of
indirection, not two.

That extra [] threw me too, but I don't think it counts in this context.

Okay, let's just try a little experiment. Put the
following into a source file:

int bartc(int nparams, char *(*params)[]) { ...
int main(int argc, char **argv) {
return bartc(argc, argv);
}

Feed it to your favorite C compilers (using C11, C99, C90, C89,
or even K&R C, with or without TC's) and see what they have to
say about it.

I don't even need to try and compile that (although I did). It's obvious
that the two access argv in different ways (one a pointer to pointer, the
other pointer to array of pointers). Both will work.

The question is, is argv in main() *only* defined as pointer to pointer?
Clang obviously thinks so.

Try this experiment of mine:

#include <stdio.h>

void print_cstyle(int n, char** array){
int i;
for (i=0; i<n; ++i)
printf("C: %d %s\n",i,array[i]);
}

void print_bstyle(int n, char*(*array)[]){
int i;
for (i=0; i<n; ++i)
printf("B: %d %s\n",i,(*array)[i]);
}

/* A "statutory" main */
int main(void) {
char *s[] = {"one","two","three"};
int n=sizeof(s)/sizeof(s[0]);

print_cstyle(n,s);
puts("");
print_bstyle(n,&s);

}

The array 's' is a little like the argv data.


Missing the final NULL, but okay.
I'm printing this using two functions: print_cstyle() which accesses it
like
you say main() does. And print_bstyle(), which uses my pointer-to-array
style.

Both work. There are no casts involved. But is print_bstyle() valid,
defined
C? If so, then I should be able to use pointer-to-array types for argv
in main.

I think it's valid, yes. But note that sneaky little `&'
in the call: You're not passing it what main() receives, you're
not passing it what the environment will pass to main(). If
you declare main() the same way you've declared print_bstyle(),
you have declared main() incorrectly.[*]

print_bstyle() is (on cursory inspection) a valid C function,
but plenty of valid C functions are not main()-like.

[*] Obligatory nitpick: An implementation may support other
forms of main() besides the two required by the Standard, like
`double main(const struct args_s *, unsigned long)', and perhaps
somewhere you'll find an implementation that supports your style
of main(). I've not seen one in not quite four decades of using C,
but maybe you'll get lucky.
 
Keith Thompson

BartC said:
Never thought I'd be asking about this, but it's giving me some trouble!

I want to use a declaration that looks like this:

typedef unsigned char* ichar;

int main(int nparams,ichar (*params)[]) {
int i;

for (i=0; i<nparams; ++i)
printf("%d: %s\n",i,(*params)[i]);

}

(Why? Because this will be the output of a code generator.)

This works perfectly well with four different C compilers. But Clang doesn't
like it: it insists the params type must be char** (and it can't be signed
nor unsigned either, just unspecified, however the latter only gives a
warning; as I have it above, it is an error).

Is there in fact something wrong with the way I'm doing it?


Yes, very much so.

If you haven't already done so, download

http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf

which is the latest freely available draft of the ISO C standard. Go to
section 5.1.2.2.1 (which applies to hosted implementations).

It specifies that a program's entry point is named "main", and that it
may be defined as:

int main(void) { /* ... */ }

or as

int main(int argc, char *argv[]) { /* ... */ }

or equivalent; or in some other implementation-defined manner.

The "or equivalent" permits you to use different names for "argc"
and "argv" (but *not* for "main"), or to use equivalent typedefs,
or to write "char **argv" rather than "char *argv[]" (these are
equivalent only as parameter declarations, not in other contexts).
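
To illustrate that parameter-only equivalence, here is a small sketch. The function name count_args is made up for the example; it is declared with one spelling and defined with the other, which compiles only because both spell the same parameter type.

```c
#include <stddef.h>

/* Declared with the array spelling... */
int count_args(char *argv[]);

/* ...and defined with the pointer spelling. The pair is accepted because
 * a parameter declared as "array of char*" is adjusted to "pointer to
 * char*", so both lines declare the same function type. */
int count_args(char **argv) {
    int n = 0;
    while (argv[n] != NULL)   /* relies on the null pointer that ends argv */
        ++n;
    return n;
}
```

Outside a parameter list the two spellings differ: `char *v[]` would declare an array object, `char **v` a pointer.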

The "or in some other implementation-defined manner" means that
a compiler *may* permit other definitions of main, as long as it
documents them, but need not do so. For example, I believe that
Microsoft's C compiler explicitly permits "void main(void)".

Furthermore, since these requirements are not stated as
"constraints", a compiler is not required to diagnose violations
of them. Even if a compiler does not document that it will accept a
different form, it may still do so, with or without a diagnostic.
And if it does, the program's behavior is not defined by the
standard.

All this is entirely consistent with the behavior you're seeing:
some compilers quietly accept different forms, others do not.

Your definition:
int main(int nparams,ichar (*params)[]) { /* ... */ }
(where ichar is a typedef for unsigned char*) is equivalent to:
int main(int nparams,unsigned char * (*params)[]) { /* ... */ }

"params" is of type "pointer to array of pointer to unsigned char".

The fact that this is the output of a code generator doesn't seem
to be relevant. If you happen to have a compiler that accepts your
definition, the runtime environment that invokes the program will
still almost certainly assume that main accepts either no arguments,
or two arguments of types int and char**. By treating the char**
argument as if it were of a completely different type, it's unlikely
that your program will do what you want it to do.

You need to find a different way to do whatever it is you're trying to
do.
One way to get around it, seems to be to move the main function, which
appears to be special to Clang, outside of the non-C source language.
Another is to make a special case when compiling a function called 'main',
and bodge the output that way. But I don't particularly want to do this, and
it's just pandering to this very fussy compiler.

The main function is special to *any* C compiler. The compiler isn't
being "very fussy", it's simply following the C language definition.
You should do the same.
 
Keith Thompson

BartC said:
Kaz Kylheku said:
int main(int nparams,ichar (*params)[]) {

argv is a pointer to char*

what you have here is a pointer to (incomplete) array of char*

I thought that was exactly what argv was.

You were mistaken. argv is of type char**. The particular value
passed into main happens to point to the first element of an array
of char*; that's a statement about the *value* of argv, not its type.
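
The value/type distinction can be sketched as follows. The helper names are hypothetical, and a fixed size of 3 is assumed so the array type is complete. Given `char *s[3]`, the expressions `s` and `&s` carry the same address but have different types, and each type reaches the elements by its own route.

```c
/* Returns 1 when both pointers designate the same address. */
int same_address(char **first, char *(*whole)[3]) {
    return (void *)first == (void *)whole;
}

/* Returns 1 when element i reached through either pointer is the
 * same object: first[i] indexes directly, (*whole)[i] dereferences
 * to the array first and then indexes. */
int same_element(char **first, char *(*whole)[3], int i) {
    return &first[i] == &(*whole)[i];
}
```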

[...]
Yes, but I'd have to write it in a more C-like language. I was trying to
minimise the need for that (for cases where there are complex C declarations
I don't know about and can't really replicate, such as FILE); I didn't
expect to need it for main()!

The C language defines what the parameters of main can be, with some
limited flexibility.

What's unusual about main (vs. other functions) isn't that its
definition is restricted to just a few forms, it's that its definition
is more *flexible* than other functions. It is the interface between
your program and the calling environment. If it were defined like other
functions, the environment would define a prototype for it, and you as a
programmer would have to write a definition that conforms to that
prototype. For historical reasons, there is no actual prototype, and
you are permitted a bit more flexibility.

You can write a function that takes parameters
(int nparams,ichar (*params)[]); you just can't call it "main".
 
BartC

Richard Damon said:
On 3/29/14, 2:09 PM, BartC wrote:
typedef unsigned char* ichar;

int main(int nparams,ichar (*params)[]) {
The problem is that the standard DEFINES the signature for main (in
5.1.2.2.1) as either:

int main(void) {} or
int main(int argc, char *argv[]) {}
Many compilers don't check this, but apparently Clang does.

OK, I will have to fix it then. (I have to keep in Clang's good books
because the other non-gcc compilers, although they don't care about how I
define main(), complain about a lot of other things!)
I will also add that I am not sure that it is allowed to create a
parameter of type pointer to an array (of x) of unspecified size, the
problem being that the type of parms doesn't know how to do many of the
operations common to pointers (like evaluate parms+1); *parms is really
an incomplete type.

That is a valid point. Although with a pointer to array like that, stepping
to the next array is not always meaningful.
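
A short sketch of the arithmetic point, with hypothetical function names and an assumed array size of 3. It contrasts the stride of a pointer to a complete array with that of a pointer to an element, and shows the one operation that remains available through a pointer to an incomplete array.

```c
#include <stddef.h>

/* With a complete array type, p + 1 skips one whole array of 3 pointers. */
size_t stride_whole_array(char *(*p)[3]) {
    return (size_t)((char *)(p + 1) - (char *)p);
}

/* With a pointer to the first element, p + 1 skips a single char*. */
size_t stride_element(char **p) {
    return (size_t)((char *)(p + 1) - (char *)p);
}

/* For char *(*p)[] the expression p + 1 is a constraint violation,
 * because the size of the incomplete array type is unknown. Indexing
 * through the pointer, (*p)[i], is still allowed. */
char *element_via_incomplete(char *(*p)[], int i) {
    return (*p)[i];
}
```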
 
BartC

BartC said:
Yes, but I'd have to write it in a more C-like language.

I'm talking nonsense!

Obviously this can also be written in my source language, and I can simply
use pointer-to-pointer for params like C expects ('ref ref char params' in
my parlance).

In fact the simple solution for my program as it is, was to switch to
pointer-to-pointer, and change the handful of accesses to params from
indexing to pointer offsets (this source language doesn't mix pointers and
arrays like C does). But in general I will use the wrapper.
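
In C terms, the pointer-to-pointer version with offsets might look like the sketch below. The name print_params is hypothetical and the actual source language is not shown; this is only the C shape such generated code could take.

```c
#include <stdio.h>

/* Prints each parameter using pointer offsets rather than array
 * indexing, and returns how many lines were printed. */
int print_params(int nparams, char **params) {
    int i;
    for (i = 0; i < nparams; ++i)
        printf("%d: %s\n", i, *(params + i));
    return i;
}
```

Since the parameter list is now (int, char **), the same body could sit directly in main() without a diagnostic.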
 
Keith Thompson

Eric Sosman said:
On 3/29/2014 4:00 PM, BartC wrote: [...]
Forgetting the ichar part for a minute, CDECL tells me that:

char *(*params)[]

is 'pointer to array of pointer to char'. Which likely corresponds to
the actual structure of the argv data (and exactly matches what I
expressed in the source language).[...]

No, that is *not* what you expressed in the source language.
Have you forgotten that C functions do not and cannot take array
parameters? 6.7.6.3p7:

"A declaration of a parameter as "array of type" shall
be adjusted to "qualified pointer to type", [...]"

That's not what "char *(*params)[]" means. It declares "params" as a
pointer to an array, which is perfectly valid as a parameter type (it's
a pointer type, not an array type).

The fact that the array type is incomplete is a bit troubling, but not
invalid as far as I can tell.
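
One way to check which parameters get adjusted is to ask the compiler, as in this sketch. It needs C11 for _Generic, and the enum and function names are invented for the example.

```c
/* Requires C11 for _Generic. */
enum kind { PTR_TO_PTR, PTR_TO_PTR_TO_PTR, PTR_TO_ARRAY, OTHER };

/* Classifies the type of an expression after parameter adjustment. */
#define KIND_OF(x) _Generic((x),          \
    char **:     PTR_TO_PTR,              \
    char ***:    PTR_TO_PTR_TO_PTR,       \
    char *(*)[]: PTR_TO_ARRAY,            \
    default:     OTHER)

/* Top-level array parameter: adjusted to pointer to element (char **). */
enum kind top_level_array(char *params[]) { return KIND_OF(params); }

/* Pointer to array: already a pointer type, so no adjustment occurs
 * and no third level of indirection appears. */
enum kind pointer_to_array(char *(*params)[]) { return KIND_OF(params); }
```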

[...]
 
Keith Thompson

BartC said:
The question is, is argv in main() *only* defined as pointer to pointer?
Clang obviously thinks so.

Yes. Clang is quite correct.

If you define main with a second parameter that's of a type incompatible
with char**, how are you going to persuade the runtime environment to
invoke it with data of the type it expects?
 
Kaz Kylheku

Yes. Clang is quite correct.

If you define main with a second parameter that's of a type incompatible
with char**, how are you going to persuade the runtime environment to
invoke it with data of the type it expects?

You don't have to; it will work fine. Unless the machine is completely bizarre,
a pointer to an array has a representation which is interchangeable with a
pointer to the first element of an array of the same type.

A formal parameter of type "pointer to incomplete array of T" should have
no trouble accepting a "pointer to T" actual argument.

If you have a machine where a T ** and a T *(*)[] have a different,
incompatible representation and are passed differently as function
parameters, I'd like to hear about it.

Type mismatches can also confound due to invalid aliasing, but in practice
that isn't a problem across module boundaries in this type of situation.

And anyway, there is no aliasing issue between an array and an array element.
Arrays cannot be assigned as a unit in C, and if they could be, then any
modification to an object of type T as an array element involved in
an array-level operation would have to be suspected as modifying some T that is
pointed-at by a T * pointer and vice versa.

So, basically this only fails because it's "artificially" rejected by a type
check, which has nothing to do with the run-time environment not being able to
handle it if the check is removed and code is generated anyway.
 
Kaz Kylheku

Eric Sosman said:
On 3/29/2014 4:00 PM, BartC wrote: [...]
Forgetting the ichar part for a minute, CDECL tells me that:

char *(*params)[]

is 'pointer to array of pointer to char'. Which likely corresponds to
the actual structure of the argv data (and exactly matches what I
expressed in the source language).[...]

No, that is *not* what you expressed in the source language.
Have you forgotten that C functions do not and cannot take array
parameters? 6.7.6.3p7:

"A declaration of a parameter as "array of type" shall
be adjusted to "qualified pointer to type", [...]"

That's not what "char *(*params)[]" means. It declares "params" as a
pointer to an array, which is perfectly valid as a parameter type (it's
a pointer type, not an array type).

The fact that the array type is incomplete is a bit troubling, but not
invalid as far as I can tell.

That it is incomplete is actually an important part of this hack, since it
allows params[] to represent "any" number of arguments without running into the
additional undefined behavior of overrunning the dimension of an array.
 
Kaz Kylheku

[...]
The question is, is argv in main() *only* defined as pointer to pointer?
Clang obviously thinks so.

Yes. Clang is quite correct.

If you define main with a second parameter that's of a type incompatible
with char**, how are you going to persuade the runtime environment to
invoke it with data of the type it expects?

You don't have to; it will work fine. Unless the machine is completely bizarre,
a pointer to an array has a representation which is interchangeable with a
pointer to the first element of an array of the same type.

A formal parameter of type "pointer to incomplete array of T" should have
no trouble accepting a "pointer to T" actual argument.

If you have a machine where a T ** and a T *(*)[] have a different,
incompatible representation and are passed differently as function
parameters, I'd like to hear about it.

Type mismatches can also confound due to invalid aliasing, but in practice
that isn't a problem across module boundaries in this type of situation.

And anyway, there is no aliasing issue between an array and an array element.
Arrays cannot be assigned as a unit in C, and if they could be, then any
modification to an object of type T as an array element involved in
an array-level operation would have to be suspected as modifying some T that is
pointed-at by a T * pointer and vice versa.

So, basically this only fails because it's "artificially" rejected by a type
check, which has nothing to do with the run-time environment not being able to
handle it if the check is removed and code is generated anyway.

The one case I can think of is a system using run time bounds checking
through fat pointers (probably just for debugging), where the pointer
not only points to the address but also includes (or points to)
information about the bounds of the memory pointed to.

Now suppose that the bounds information for different pointer types
is binary compatible, and normalized to bytes. The argv pointer, generated as a
char ** would carry a bounds field denoting the extent, in bytes, of the
argv vector (including null terminating entry). If this pointer is
reinterpreted as a pointer to an incomplete array, then the bounds info nicely
becomes reinterpreted as the actual size of that array.
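
A rough sketch of that byte-normalized scheme, purely hypothetical and not modelled on any real bounds-checking implementation. The struct and function names are invented; the point is only that one byte extent serves both views of the same memory.

```c
#include <stddef.h>

/* Hypothetical fat pointer: an address plus the extent, in bytes, of
 * the object it points into. */
struct fat_ptr {
    void  *base;
    size_t nbytes;
};

/* Checked read through the char** view. Stores element i in *out and
 * returns 1, or returns 0 when i lies outside the recorded extent. */
int get_as_pointer_to_pointer(struct fat_ptr p, size_t i, char **out) {
    char **view = p.base;
    if (i >= p.nbytes / sizeof *view)
        return 0;
    *out = view[i];
    return 1;
}

/* The same check through the pointer-to-incomplete-array view. The byte
 * extent needs no conversion; it simply acts as the size of the array. */
int get_as_pointer_to_array(struct fat_ptr p, size_t i, char **out) {
    char *(*view)[] = p.base;
    if (i >= p.nbytes / sizeof (*view)[0])
        return 0;
    *out = (*view)[i];
    return 1;
}
```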

A bounds-checking system based on units of element size rather than bytes
would have a problem dealing with pointers to incomplete types, where
the element size is not known.

On the other hand, a bounds checking system could have complicated, rich
meta-data which describes details about an object, such as all of the
dimensions of a de-facto multi-dimensional array (and has ways of dealing with
incomplete objects). That meta-data might depend on the structure of the
static type declaration for its correct interpretation.
 
BartC

Kaz Kylheku said:
You don't have to; it will work fine. Unless the machine is completely
bizarre,
a pointer to an array has a representation which is interchangeable with a
pointer to the first element of an array of the same type.
So, basically this only fails because it's "artificially" rejected by a
type
check, which has nothing to do with the run-time environment not being
able to
handle it if the check is removed and code is generated anyway.

This is exactly it.

What makes it more bizarre, is that the actual argv data almost certainly
exists as an array of char* pointers anyway, otherwise how would it be
possible to step or index the argv value?

Yet the language spec makes it impossible (when using Clang at least) to
actually define it as an array! Which is a method completely acceptable in
any other context.
 
