Substring assignment in Fortran, C, etc.

Guest

It's an ordinary character assignment.

Maybe it is not by chance that your post is #42 in my news folder.
According to The Hitchhiker's Guide to the Galaxy, *the* answer is 42 :)

I was just about to say the same, and to add that it is no different from
a=b where a and b are both declared as character*6.

Just out of curiosity, I had the compiler generate assembly code for
two simple programs (doing only a="123456" and a=b, plus the necessary
declarations). This is not something I usually do, and I do not know
assembly code, but the two look rather similar.

The first assignment (of a literal constant) generates

push $STRLITPACK_0 #2.7
push $main$s_$A #2.7
call memmove #2.7

while the second (a=b) generates

push $main$s_$B #2.7
push $main$s_$A #2.7
call memmove #2.7

They look remarkably similar (STRLITPACK looks like something initialized
later on to the sequence .byte 49 .byte 50 .byte 51 .byte 52 .byte 53
.byte 54 .byte 0)
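
For readers who think in C, both listings appear to boil down to one fixed-length block copy. A rough sketch of that idea follows; it is not taken from the compiler output, the helper name assign_char6 and the constant LEN are made up for illustration, and blank padding for operands of unequal length is deliberately left out:

```c
#include <assert.h>
#include <string.h>

enum { LEN = 6 };   /* assumption: matches the character*6 declarations above */

/* Hypothetical helper: copy exactly LEN characters from src to dst.
   No terminating '\0' is read or written, as with a Fortran
   fixed-length character variable. */
static void assign_char6(char *dst, const char *src)
{
    assert(dst != NULL && src != NULL);
    memmove(dst, src, LEN);   /* memmove is safe even if the operands overlap */
}
```

Calling assign_char6(a, "123456") and assign_char6(a, b) goes through exactly the same path, which is the point made above: only the source address differs.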
 
user1

robin said:
It's an ordinary character assignment.

Fwiw: It is not hard to look at the assembler code produced by gcc or
gfortran in the two cases. I really don't see much difference. In both
cases, it seems that a temporary string is stored and then copied by a
function call.

Here's the Fortran version, test1.f:

character*8 a
a="Hi There"
end

gfortran -S test1.f produces

.file "test1.f"
.section .rdata,"dr"
.align 4
_options.0.533:
.long 68
.long 127
.long 0
.long 0
.long 0
.long 1
.long 0
LC0:
.ascii "Hi There"
.text
.globl _MAIN__
.def _MAIN__; .scl 2; .type 32; .endef
_MAIN__:
pushl %ebp
movl %esp, %ebp
subl $24, %esp
subl $8, %esp
pushl $_options.0.533
pushl $7
call __gfortran_set_options
addl $16, %esp
subl $4, %esp
pushl $8
pushl $LC0
leal -8(%ebp), %eax
pushl %eax
call _memmove
addl $16, %esp
leave
ret
.def __gfortran_set_options; .scl 2; .type 32; .endef
.def _memmove; .scl 2; .type 32; .endef


And here's the C version, test2.c:

#include <stdio.h>
#include <string.h>

int main()
{
    char a[9];
    strcpy(a,"hi there");
    return 0;
}


gcc -S test2.c produces


.file "test2.c"
.def ___main; .scl 2; .type 32; .endef
.section .rdata,"dr"
LC0:
.ascii "hi there\0"
.text
.globl _main
.def _main; .scl 2; .type 32; .endef
_main:
leal 4(%esp), %ecx
andl $-16, %esp
pushl -4(%ecx)
pushl %ebp
movl %esp, %ebp
pushl %ecx
subl $20, %esp
call ___main
subl $4, %esp
pushl $9
pushl $LC0
leal -13(%ebp), %eax
pushl %eax
call _memcpy
addl $16, %esp
movl $0, %eax
movl -4(%ebp), %ecx
leave
leal -4(%ecx), %esp
ret
.def _memcpy; .scl 2; .type 32; .endef
 
Keith Thompson

glen herrmannsfeldt said:
< Right, but in C there's always the danger of confusing arrays and
< pointers. A parameter declared to be of an array type is really a
< pointer, and applying sizeof to it will just give you the size of the
< pointer.

Hopefully new C programmers learn this pretty fast, but yes.
[...]

In an ideal world they learn it pretty fast. Unfortunately ...

Section 6 of the comp.lang.c FAQ, <http://www.c-faq.com>, does a good
job of clearing up the confusion.

Don't bet on it. You do know that it isn't actually specified by
the C standard, don't you? I am not denigrating that FAQ, so much
as pointing out the term "clearing up the confusion" is misleading!

Firstly, the conversion rules between arrays and pointers and back
again allow for indefinite and infinite implicit recursion - the
consensus is that this is a constraint on the implementation to
ensure that they are equivalent, but it's not even hinted at in any
wording. It isn't visible in C as such, but becomes so as soon as
you extend it (e.g. by adding safe pointers, garbage collection,
OpenMP-style parallelism etc.)

Secondly, the wording of 6.3.2.1#3 is seriously ambiguous about
WHEN the conversion takes place, and there are certain reasonable
interpretations that provide visible differences. During the early
years of C89 implementations, several of them varied in how they
implemented array arguments. This was one of the many ignored NB
comments on either C89 or C99. Please ask for examples, if you want.


Glen can be excused for thinking that the array/pointer mess is not
much worse than the Fortran assumed size and array element to array
one, because you need to have been deeply into the C standard to
know just how bad it is in ISO C. K&R C was much cleaner.

Yoiks! I'm a bit embarrassed to admit that I didn't know any of this,
and looking at the standard I still don't see it.

I've set followups to comp.lang.c. Feel free to override that if you
have something Fortranish to say.

Here's C99 6.3.2.1p3:

Except when it is the operand of the sizeof operator or the unary
& operator, or is a string literal used to initialize an array, an
expression that has type ``array of type'' is converted to an
expression with type ``pointer to type'' that points to the
initial element of the array object and is not an lvalue. If the
array object has register storage class, the behavior is
undefined.

That seems relatively straightforward to me (it's the consequences
that make newbies pull their hair out). An expression of array type
is converted implicitly to a pointer in most contexts. There are no
conversions of a pointer back to an array (dereferencing is not
conversion).
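
A minimal C sketch of the paragraph quoted above may help (it is not from the thread, and the function name demo_array_decay is made up for illustration). It exercises the ordinary conversion and the three listed exceptions on a local array:

```c
/* Returns 1 if every check holds. */
static int demo_array_decay(void)
{
    int a[5] = {10, 20, 30, 40, 50};
    char s[] = "abc";      /* string literal used to initialize an array: no conversion */
    int *p = a;            /* ordinary context: a is converted to &a[0] */
    int (*pa)[5] = &a;     /* operand of unary &: result points to the whole array */

    return p == &a[0]
        && sizeof a == 5 * sizeof(int)   /* operand of sizeof: still an array */
        && sizeof *pa == sizeof a        /* *pa has type int[5] */
        && sizeof s == 4                 /* 'a', 'b', 'c' plus the terminating '\0' */
        && s[0] == 'a';
}
```

Note that all of this concerns expressions of array type; what happens to a parameter that is merely declared with array syntax is a separate rule.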

So what am I missing (and what have the rest of us been missing all
these years)? Can you provide concrete examples of your first and
second points?
 
nmm1

Yoiks! I'm a bit embarrassed to admit that I didn't know any of this,
and looking at the standard I still don't see it.

I've set followups to comp.lang.c. Feel free to override that if you
have something Fortranish to say.

I gave up on that some time ago, so I won't see the thread.

Here's C99 6.3.2.1p3:

Except when it is the operand of the sizeof operator or the unary
& operator, or is a string literal used to initialize an array, an
expression that has type ``array of type'' is converted to an
expression with type ``pointer to type'' that points to the
initial element of the array object and is not an lvalue. If the
array object has register storage class, the behavior is
undefined.

That seems relatively straightforward to me (it's the consequences
that make newbies pull their hair out). An expression of array type
is converted implicitly to a pointer in most contexts. There are no
conversions of a pointer back to an array (dereferencing is not
conversion).

Take a deeper look. The description of subscription is in terms of
arrays, which derives from the concepts in 6.3.2.1 Lvalues etc.
A pointer (value) is not an lvalue, and the corresponding lvalue is
the array to which it points (or, more usually, the first element
of that array). I forget the construction now where I could create
an infinite loop of syntax rules - as I said, it's not visible in
plain C.

So what am I missing (and what have the rest of us been missing all
these years)? Can you provide concrete examples of your first and
second points?

The point is WHEN the conversion is done, and that is most unclear.
It is clear that array syntax is allowed, and has extra syntactic
and semantic properties ('const' and 'restrict', added in C99).

Consider the following program fragment:

typedef int weeble[5];

void function (weeble arg) {
    weeble *ptr = &arg;
    printf("%ld %ld\n",(long)sizeof(weeble),(long)sizeof(arg));
}

Where is it stated that the conversion is done AFTER parsing and
BEFORE type matching?

To add chaos to the ambiguity, parsing, type matching and the
'evaluation' of sizeof are ALL done in the last sentence of
translation phase 7. There is nothing in the standard that
distinguishes them in that respect.

In the early days of C89, different vendors interpreted the above
code in all of the three obvious ways (and probably some unobvious
ones, but I never saw them). The conversion could perfectly well
occur after type matching, which would make the declaration of ptr
valid, or even just before evaluation, which would turn sizeof(arg)
into 5*sizeof(int), as anyone accustomed to saner languages would
expect.


Regards,
Nick Maclaren.
 
Keith Thompson

I gave up on that some time ago, so I won't see the thread.

I'm Cc'ing this response to you by e-mail.

Take a deeper look. The description of subscription is in terms of
arrays, which derives from the concepts in 6.3.2.1 Lvalues etc.
A pointer (value) is not an lvalue, and the corresponding lvalue is
the array to which it points (or, more usually, the first element
of that array). I forget the construction now where I could create
an infinite loop of syntax rules - as I said, it's not visible in
plain C.

Subscripting is defined in terms of unary "*" and "+". E1[E2] means
(*((E1)+(E2))). Pointer addition is defined in terms of an array; it
works only if the pointer operand (of type foo*) points to an element
of an array of foo.

I still don't see how this can create an "infinite loop of syntax
rules". The requirement for pointer addition to refer to an array is
a semantic rule, not a syntax rule.
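
To illustrate the E1[E2] definition, here is a small C sketch (the function name subscript_forms_agree is made up for illustration). Because subscripting is just pointer addition followed by indirection, and the addition is commutative, all of the spellings below name the same element:

```c
/* Returns 1 if the different spellings of a[i] all agree. */
static int subscript_forms_agree(void)
{
    int a[5] = {10, 20, 30, 40, 50};
    int i = 3;

    return a[i] == *(a + i)    /* the definition of E1[E2] */
        && a[i] == *(i + a)    /* pointer addition commutes */
        && a[i] == i[a]        /* so the operands of [] can be swapped */
        && &a[i] == a + i;     /* the address of the element is the sum itself */
}
```

In each case the array a is first converted to a pointer to its initial element; the pointer arithmetic is only defined because that pointer refers to an element of an array.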
So what am I missing (and what have the rest of us been missing all
these years)? Can you provide concrete examples of your first and
second points?

The point is WHEN the conversion is done, and that is most unclear.
It is clear that array syntax is allowed, and has extra syntactic
and semantic properties ('const' and 'restrict', added in C99).

Consider the following program fragment:

typedef int weeble[5];

void function (weeble arg) {
    weeble *ptr = &arg;
    printf("%ld %ld\n",(long)sizeof(weeble),(long)sizeof(arg));
}

Where is it stated that the conversion is done AFTER parsing and
BEFORE type matching?

The above violates a constraint. The typedef doesn't introduce a new
type, just a synonym for an existing type (C99 6.7.7p3). So the
declaration
    void function (weeble arg)
is equivalent to
    void function (int *arg)
(C99 6.7.5.3p7).

Thus &arg is of type int**, and ptr is of type weeble*, or int (*)[5].

The only possible hole in this is the question of whether the use
of the typedef name in the parameter declaration avoids the adjustment
specified in 6.7.5.3p7. But I don't think it can be reasonably argued
that it does; if it did, we could use a typedef to create a parameter
of array type, something that otherwise doesn't exist.

But the initialization isn't necessary for the following line.
sizeof(weeble) is 5*sizeof(int), and sizeof(arg) is sizeof(int *).

As long as you assume that the standard is consistent, I don't see any
ambiguity.
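
A short C sketch of this reading, for anyone who wants to try it on their own compiler (the function name param_size is made up for illustration; a conforming C99 compiler is assumed). If the parameter really is adjusted to int *, then &arg can initialize an int ** without a cast:

```c
#include <stddef.h>

typedef int weeble[5];

/* The parameter is declared with an array type but, under the
   adjustment in C99 6.7.5.3p7, has type int *. */
static size_t param_size(weeble arg)
{
    int **pp = &arg;     /* accepted without a cast only if arg has type int * */
    int *same = *pp;     /* reads back the pointer that was passed in */

    return same == arg ? sizeof same : 0;   /* size of an int *, not of five ints */
}
```

The size of the typedef'd type itself is unaffected: sizeof(weeble) is still 5*sizeof(int), since the adjustment applies to the parameter declaration and not to the type name.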
To add chaos to the ambiguity, parsing, type matching and the
'evaluation' of sizeof are ALL done in the last sentence of
translation phase 7. There is nothing in the standard that
distinguishes them in that respect.

In the early days of C89, different vendors interpreted the above
code in all of the three obvious ways (and probably some unobvious
ones, but I never saw them). The conversion could perfectly well
occur after type matching, which would make the declaration of ptr
valid, or even just before evaluation, which would turn sizeof(arg)
into 5*sizeof(int), as anyone accustomed to saner languages would
expect.

What about the equivalent without a typedef?

void function (int arg[5]) {
    int (*ptr)[5];
    printf("%ld %ld\n", (long)sizeof(int[5]), (long)sizeof(arg));
}

Do you see any ambiguity there?
 
lawrence.jones

The point is WHEN the conversion is done, and that is most unclear.
It is clear that array syntax is allowed, and has extra syntactic
and semantic properties ('const' and 'restrict', added in C99).

You're conflating the conversion of arrays and pointers (6.3.2.1p3) with
the adjustment of function parameter declarations (6.7.5.3p7).
 
Phil Carmody

glen herrmannsfeldt said:
< Right, but in C there's always the danger of confusing arrays and
< pointers. A parameter declared to be of an array type is really a
< pointer, and applying sizeof to it will just give you the size of the
< pointer.

Hopefully new C programmers learn this pretty fast, but yes.
[...]

In an ideal world they learn it pretty fast. Unfortunately ...

Section 6 of the comp.lang.c FAQ, <http://www.c-faq.com>, does a good
job of clearing up the confusion.

Don't bet on it. You do know that it isn't actually specified by
the C standard, don't you? I am not denigrating that FAQ, so much
as pointing out the term "clearing up the confusion" is misleading!

Firstly, the conversion rules between arrays and pointers and back
again allow for indefinite and infinite implicit recursion

I'm gonna need a C&V for that. I am aware of rules of replacing
what appears to be one of the above as if it were one of the
other, but nothing in the reverse direction, and certainly no
way of getting infinite recursion.

Phil
 
