Question regarding array assignment

BartC

Maybe different operators for pointer assignment and array assignment,
as Fortran has.

(What does Fortran use?)
Does seem like there might be times when array assignment, and array
expressions in general, might have been a nice addition to the
C language.

The usual 'fix' in C is to do something like:

#define ARRCPY(a,b) memcpy(a,b,sizeof(a))
.....
ARRCPY(b,a);

I always think such macro-based solutions, when any sorts of enhancements
are brought up, have hindered a more interesting development of the
language.

But then, this simple approach only works for arrays for which sizeof
returns the actual size. Making it work properly requires deeper changes
in how arrays are implemented; C likes to keep these details transparent and
simple.
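A minimal sketch of that limitation, assuming two helper names (size_seen_by_callee and copy_and_get_last) that are illustrative only and not from the thread. sizeof gives the full size only where the array type is still visible; once the array has been passed to a function, the callee holds a pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Macro from the post: correct only when `a` is a true array in scope. */
#define ARRCPY(a, b) memcpy((a), (b), sizeof(a))

/* Hypothetical helper: the parameter is a pointer, so sizeof reports
   the size of a pointer, not the size of the caller's array. */
size_t size_seen_by_callee(int *arr)
{
    return sizeof arr;
}

/* Hypothetical helper: copies src into dst with ARRCPY and returns
   the last element of the copy. */
int copy_and_get_last(void)
{
    int src[4] = {1, 2, 3, 4};
    int dst[4] = {0};

    assert(sizeof dst == sizeof src);
    ARRCPY(dst, src);   /* sizeof(dst) is 4 * sizeof(int) here */
    return dst[3];
}
```

Using ARRCPY on a pointer parameter would compile but copy only sizeof(int *) bytes, which is the failure mode the post alludes to.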
 
BartC

char a[14];
a = "Hello, world!";
Here also "a" decays to char* and "Hello, world!" also decays to char*.
Then why can we not assign "a" to another char*?

If p was a char* pointer, then you'd have to write:

*p = ...

to assign to its target (and you can only assign the type it points to - a
single char).

The same with your 'a' above; once it's decayed to a char* type, you need to
write:

*a = "Hello, World!";

but this will fail in a different way because the RHS is not a char type,
but a pointer. You can insert a cast here, but you will just end up
assigning the low-byte of a pointer to the first char of a; probably not
what you want!
 
glen herrmannsfeldt

(What does Fortran use?)

=> is the 'points to' operator, so

a=b

copies all the elements from b to a, where

a=b

points pointer a at b. An array pointer inherits the dimensions
(but must have the appropriate rank) of the target.
The usual 'fix' in C is to do something like:
#define ARRCPY(a,b) memcpy(a,b,sizeof(a))
....
ARRCPY(b,a);
I always think such macro-based solutions, when any sorts of
enhancements are brought up, have hindered a more interesting
development of the language.

That works for assignment, but not for whole array expressions.

a=b+c*d+sin(sqrt(e))

Most languages that do array expressions do everything element by
element, but matlab and octave do matrix multiply for the * operator,
and so have the .* operator for element by element multiply.
But then, this simple approach only works for arrays for which sizeof
returns the actual size. Making it work properly requires deeper
changes in how arrays are implemented; C likes to keep these details
transparent and simple.

Fortran assumed size dummy arguments have similar properties to C
pointers in this regard. Assumed shape (new in Fortran 90) passes the
dimension information along in subroutine and function calls.

Fortran will do structure assignment, but not other operations on
structures. PL/I allows for element by element structure expressions.

DCL 1 (A, B), 2 X FIXED BIN(31), Y COMPLEX FLOAT BIN(53), Z CHAR VAR(100);

A=100;
B=SQRT(A);

Will assign 100, to the three members of structure A, and, in
appropriate form, the square root in structure B.

There is no operator in C that does much like what PL/I does with
structure member Z above.

-- glen
 
BartC

glen herrmannsfeldt said:
=> is the 'points to' operator, so

a=b

copies all the elements from b to a, where

a=b

points pointer a at b. An array pointer inherits the dimensions
(but must have the appropriate rank) of the target.

You mean a=>b for the latter example? This would be the equivalent of a =
&b then.

This sounds like a 'proper' treatment of pointers, but if it's not possible
to point at individual elements, then it will lack the freedom and
flexibility of C pointers (for the sort of things that C is used for, those
are necessary).
That works for assignment, but not for whole array expressions.

a=b+c*d+sin(sqrt(e))

Most languages that do array expressions do everything element by
element, but matlab and octave do matrix multiply for the * operator,
and so have the .* operator for element by element multiply.

You're talking here now about going far beyond the basics! C at present has
hardly any facilities for manipulating entire arrays and maintaining their
dimensions: assignment as mentioned, comparison, pass-by-value, in fact any
sort of array passing where you can pick up the dimensions at the other end,
and perhaps slicing.

It doesn't even have a built-in way to get the length of an array! (It has
sizeof() which returns the size in bytes, but only for fixed arrays.)
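For fixed arrays, the usual workaround is to divide the array's size by the size of one element. A short sketch, assuming a macro name ARRAY_LEN and a helper sum_fixed_array that are illustrative only:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical macro name; valid only for true arrays, not pointers. */
#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Sums a fixed-size array, using ARRAY_LEN for the loop bound. */
int sum_fixed_array(void)
{
    int values[] = {10, 20, 30, 40, 50};
    int total = 0;

    assert(ARRAY_LEN(values) == 5);
    for (size_t i = 0; i < ARRAY_LEN(values); i++)
        total += values[i];
    return total;
}
```

As the post says, this only works where the array type is visible; applied to a pointer it silently yields the wrong count.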
DCL 1 (A, B), 2 X FIXED BIN(31), Y COMPLEX FLOAT BIN(53), Z CHAR VAR(100);

A=100;
B=SQRT(A);

Will assign 100, to the three members of structure A, and, in
appropriate form, the square root in structure B.

There is no operator in C that does much like what PL/I does with
structure member Z above.

Probably for good reason! It's difficult to see how useful this would be.
(Presumably the SQRT example applies SQRT to each element of A, even
strings, and stores in the corresponding
element in B.)
 
Keith Thompson

BartC said:
char a[14];
a = "Hello, world!";
Here also "a" decays to char* and "Hello, world!" also decays to char*.
Then why can we not assign "a" to another char*?

If p was a char* pointer, then you'd have to write:

*p = ...

to assign to its target (and you can only assign the type it points to - a
single char).

The same with your 'a' above; once it's decayed to a char* type, you need to
write:

*a = "Hello, World!";

*a is of type char, so you can write:

*a = 'H';

a doesn't point to a string; it points to a char object (which happens
to be the first character of a string). (It is referred to as a
"pointer to a string", but that's a special-case definition.)

Further, you can write:

*a = 'H';
*(a+1) = 'e';
*(a+2) = 'l';
/* ... */

Of course this:

strcpy(a, "Hello, World!");

will do the same thing much more conveniently.
but this will fail in a different way because the RHS is not a char type,
but a pointer. You can insert a cast here, but you will just end up
assigning the low-byte of a pointer to the first char of a; probably not
what you want!

I wouldn't even mention the possibility of a cast. In many cases, if a
given assignment is invalid, adding a cast doesn't solve anything; it
merely masks the error.
 
August Karlstrom

const, not constant. Despite the similarity of names, "const"
and "constant" are two different things.
[...]

Thanks Keith. I should have said "a pointer to a constant".

-- August
 
Keith Thompson

August Karlstrom said:
const, not constant. Despite the similarity of names, "const"
and "constant" are two different things.
[...]

Thanks Keith. I should have said "a pointer to a constant".

No, it's a pointer to a *const* (read-only) object.

42 and 's' are constants.

And strictly speaking, string literals are not even "const",
though attempting to modify them does have undefined behavior
(this inconsistency is for historical reasons). Which is why
it's a good idea to use "const" for pointers to string literals,
to enforce what the language doesn't.

(Quibble: you can't really have a pointer to a string literal,
which is a source code construct, but you can have a pointer to
the static array object that corresponds to a string literal.)
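A small sketch of the practice recommended here, assuming a helper name edit_copy_not_literal that is illustrative only. A const-qualified pointer refers to the literal's static array, while an array initialised from the same literal is a separate, writable copy.

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if editing a private copy left the literal's array untouched. */
int edit_copy_not_literal(void)
{
    const char *p = "Hello, world!";  /* points at the literal's static array */
    char copy[] = "Hello, world!";    /* separate, writable array of 14 chars */

    assert(sizeof copy == 14);
    copy[0] = 'J';                    /* fine: copy is an ordinary array */
    /* p[0] = 'J'; would be rejected by the compiler because of const */

    return strcmp(p, "Hello, world!") == 0
        && strcmp(copy, "Jello, world!") == 0;
}
```

Without the const, the write through p would compile and then have undefined behavior, which is the gap the qualifier closes.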
 
glen herrmannsfeldt

(snip, I wrote)
You mean a=>b for the latter example?

I did type the =>, but somehow lost it before posting. Anyway.
This would be the equivalent of a = &b then.

Well, most of the time it is like C's a=b. If b is either an array
or pointer, then it is just a=b.

Seems to me that array expressions would be a more useful addition to C
than complex arithmetic. Structure expressions, too.
This sounds like a 'proper' treatment of pointers, but if it's not
possible to point at individual elements, then it will lack the
freedom and flexibility of C pointers (for the sort of things that
C is used for, those are necessary).

Seems to me that you can add some new features without taking away
the old ones.
You're talking here now about going far beyond the basics!
C at present has hardly any facilities for manipulating entire arrays and
maintaining their dimensions: assignment as mentioned, comparison,
pass-by-value, in fact any sort of array passing where you can pick
up the dimensions at the other end, and perhaps slicing.

Even just array expressions for the case where the length is known
would be a useful addition. Well, probably the most common expression
involving arrays in languages that allow them is assigning a constant.

Java has Arrays.fill(), which is somewhat more convenient than writing
a loop every time you want to set all elements of an array to some value.
It doesn't even have a built-in way to get the length of an array! (It has
sizeof() which returns the size in bytes, but only for fixed arrays.)

But fixed size arrays and structures would be useful enough.
Probably for good reason! It's difficult to see how useful this would be.
(Presumably the SQRT example applies SQRT to each element of A, even
strings, and stores in the corresponding
element in B.)

Though the example isn't so useful, often enough it would be useful.
If a structure represents coordinates in an n-dimensional space, you
could add them and multiply them by constants.

More often, the simpler ones are more useful than the complicated ones.

But yes, PL/I does allow applying SQRT to all members of a structure,
including CHAR variables. When I was in high school I did:

DCL (A,B,C,X) CHAR(100);
A=' 1';
B=' 100';
C=' 2';
DO X=A TO B BY C;
PUT SKIP LIST(SQRT(X));
END;

A loop with start, end, increment, and loop variable all CHAR. Much
harder in C.

To get some actual C into this post, it comes out something like:

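#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <math.h>

int main(void) {
    char a[100], b[100], c[100], x[100];
    strcpy(a, "1");
    strcpy(b, "100");
    strcpy(c, "1");
    /* start, end, increment and loop variable are all strings;
       the loop test is a character compare, as in the PL/I version */
    for (strcpy(x, a); strcmp(x, b) <= 0; sprintf(x, "%3g", atof(x) + atof(c))) {
        printf("%g\n", sqrt(atof(x)));
    }
    printf("%s\n", x);
    return 0;
}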

but the PL/I version takes a lot less writing.

Among others, both PL/I and the above C do a character compare.
Also, both compute the square root in double precision.

-- glen
 
Keith Thompson

glen herrmannsfeldt said:
Seems to me that array expressions would be a more useful addition to C
than complex arithmetic. Structure expressions, too.

It's hard to think of a decent way to add first-class array expressions
(including assignment) to C without breaking existing code.

I suppose you could invent a new unary operator that takes an expression
of array type and yields its value (without the usual array-to-pointer
conversion). You could even overload unary "+" if you're fond of
terseness. Then perhaps you could have something like:

char s[] = "hello";
s = +"HELLO";

And you'd also want some new syntax to specify parameters that are
actually of array type. (No, I'm not going to suggest overloading
"static".)

I haven't thought this through (I'm pretty much making it up as
I type), but I don't see how it could gracefully handle arrays of
different sizes. Using pointers to manipulate arrays lets you write
one chunk of code that can handle arrays of any arbitrary size.
Doing that with first-class array expressions is likely to be
difficult. (Ada, for example, has first-class array expressions;
an assignment with arrays of different lengths raises an exception.)

As for structure expressions, they already exist. The name of an object
of struct type has been a valid expression at least since C89, and C99
added compound literals.
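A brief sketch of what already works with structs, assuming a type name vec3 and helper names that are illustrative only. Wrapping an array in a struct is also one existing way to get whole-array assignment and pass-by-value for a fixed size.

```c
#include <assert.h>

/* Hypothetical type: an array wrapped in a struct becomes assignable. */
struct vec3 { double v[3]; };

/* Structs are passed and returned by value, array member included. */
struct vec3 vec3_scale(struct vec3 a, double k)
{
    for (int i = 0; i < 3; i++)
        a.v[i] *= k;    /* modifies the local copy only */
    return a;
}

/* Demonstrates struct assignment and a C99 compound literal. */
double struct_assign_demo(void)
{
    struct vec3 a = {{1.0, 2.0, 3.0}};
    struct vec3 b;

    b = a;                                /* copies all three elements */
    assert(b.v[2] == a.v[2]);
    b = vec3_scale(b, 2.0);
    a = (struct vec3){{0.0, 0.0, 0.0}};   /* compound literal as a value */
    return b.v[0] + b.v[1] + b.v[2] + a.v[0];
}
```

This covers assignment only; operators such as + still have no meaning for struct operands.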

[...]
 
Seebs

So in the following expression
a = "Hello, world!";
"a" is a pointer type (char*) and "Hello World" has the type char[14].

Not really. a is an array of 14 characters. In *some contexts*, a reference
to an array is converted into a pointer to its first element. But that
doesn't make that pointer into an object which has storage. And
the string literal "Hello, world!" actually is a pointer; there is an
array-like thing somewhere, but we don't see it as an object.
gcc -ansi -pedantic -Wall test.c
test.c: In function ‘main’:
test.c:4:7: error: incompatible types when assigning to type ‘char[14]’ from type ‘char *’
It sounds like "a" is still char[14] and type of "Hello, world!" is char *
That's the reason the compiler is saying "when assigning *to* type ‘char[14]’", i.e. to "a", "from type ‘char *’": the type of "Hello, world!" is not compatible.

The error message is not very informative, and is in practice incorrect.
The actual text in C99 is:

Except when it is the operand of the sizeof operator or the
unary & operator, or is a string literal used to initialize
an array, an expression that has type "array of type" is
converted to an expression with type "pointer to type"
that points to the initial element of the array object and
is not an lvalue.

The real issue is that the pointer which "a" is converted to is not an lvalue,
and thus cannot appear on the left-hand side of an assignment. gcc's diagnostic
is a reasonable way to think about the problem, and does tell you why the
assignment doesn't work, but "incompatible types" isn't really the problem.

-s
 
Keith Thompson

Seebs said:
So in the following expression
a = "Hello, world!";
"a" is a pointer type (char*) and "Hello World" has the type char[14].

Not really. a is an array of 14 characters. In *some contexts*, a reference
to an array is converted into a pointer to its first element. But that
doesn't make that pointer into an object which has storage. And
the string literal "Hello, world!" actually is a pointer; there is an
array-like thing somewhere, but we don't see it as an object.

That's incorrect. The string literal -- well, the string literal itself
is a token in a C source file. But it's associated with an array object
with static storage duration that exists during program execution.

That's a real object. It doesn't have a name, but neither does an
object allocated via malloc(). You can apply sizeof to get its size, or
unary "&" to get its address (which is of type char(*)[LEN+1]).

A string literal is an expression of array type; it's no more "actually
a pointer" than any other array expression is. Like any such
expression, it's implicitly converted to a pointer in most, but not all,
contexts.
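The two exceptions mentioned above can be checked directly. A minimal sketch, assuming a helper name literal_array_size that is illustrative only:

```c
#include <assert.h>
#include <stddef.h>

/* Returns the size of the array object behind the literal. */
size_t literal_array_size(void)
{
    /* sizeof sees the array type char[14], not a pointer. */
    size_t n = sizeof "Hello, world!";

    /* Unary & yields char (*)[14]: a pointer to the whole array. */
    char (*whole)[14] = &"Hello, world!";

    assert(sizeof *whole == n);
    return n;
}
```

The result is 14: thirteen visible characters plus the terminating null character.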
 
BartC

Keith Thompson said:
It's hard to think of a decent way to add first-class array expressions
(including assignment) to C without breaking existing code.

You can add new-style arrays as a different, parallel type. However it is
easy to get too ambitious and end up needing complex memory management or
garbage collection, and/or implementing half of C++.

But there's quite a lot that can be done to enhance C and still keep it
low-level (which is my preference as I'd rather use a higher-level
language - probably implemented in C - for the more advanced stuff, and
where you can do these things properly).
 
jacob navia

Le 10/12/2013 00:20, BartC a écrit :
You can add new-style arrays as a different, parallel type. However it
is easy to get too ambitious and end up needing complex memory
management or garbage collection, and/or implementing half of C++.

But there's quite a lot that can be done to enhance C and still keep it
low-level (which is my preference as I'd rather use a higher-level
language - probably implemented in C - for the more advanced stuff, and
where you can do these things properly).

1) Operator overloading doesn't break existing code at all.
2) It is implemented in a C compiler since 2004-2005
3) It allows by overloading the operator [ ] to construct
arrays with properties as the user wishes. This is not "A WAY" of
doing array assignment, implementing read only arrays, copy on
write arrays, etc, it is a method that can be used to implement all
those solutions in a compatible way.

But there is no blinder person as the man that doesn't want to see.
 
BartC

jacob navia said:
Le 10/12/2013 00:20, BartC a écrit :
You can add new-style arrays as a different, parallel type. However it
is easy to get too ambitious and end up needing complex memory
management or garbage collection, and/or implementing half of C++.

But there's quite a lot that can be done to enhance C and still keep it
low-level (which is my preference as I'd rather use a higher-level
language - probably implemented in C - for the more advanced stuff, and
where you can do these things properly).

1) Operator overloading doesn't break existing code at all.
2) It is implemented in a C compiler since 2004-2005
3) It allows by overloading the operator [ ] to construct
arrays with properties as the user wishes. This is not "A WAY" of
doing array assignment, implementing read only arrays, copy on
write arrays, etc, it is a method that can be used to implement all
those solutions in a compatible way.

But there is no blinder person as the man that doesn't want to see.

OK, we have a slightly different perspective on these things: since the 1980s,
I've been using two tiers of languages (sometimes three!): a higher-level one
to do the bulk of application programming, and a lower-level C-like one to
implement everything else, including the other language. (And recently that
has actually been C-based.)

You, I know, prefer to use C for everything, so need all these advanced
features.

However operator overloading will affect the transparency for which C is
famous. And you won't be able to go too far without needing to use garbage
collection (to deal with intermediate results, slices and pointers to
intermediate results and so on). You will also have to start thinking about
whether that array assignment is going to be deep or shallow, because it's
got nested flex arrays, and structs which contain flexible arrays, and about
whether certain types can be mutable or immutable, because that affects
sharing, etc.

It can get complicated very quickly.
 
glen herrmannsfeldt

(snip on array assignment and array expressions)
1) Operator overloading doesn't break existing code at all.
2) It is implemented in a C compiler since 2004-2005
3) It allows by overloading the operator [ ] to construct
arrays with properties as the user wishes. This is not "A WAY" of
doing array assignment, implementing read only arrays, copy on
write arrays, etc, it is a method that can be used to implement all
those solutions in a compatible way.
But there is no blinder person as the man that doesn't want to see.
OK, we have a slightly different perspective on these things: since the 1980s,
I've been using two tiers of languages (sometimes three!): a higher-level one
to do the bulk of application programming, and a lower-level C-like one to
implement everything else, including the other language. (And recently that
has actually been C-based.)
You, I know, prefer to use C for everything, so need all these advanced
features.
However operator overloading will affect the transparency for which C is
famous. And you won't be able to go too far without needing to use garbage
collection (to deal with intermediate results, slices and pointers to
intermediate results and so on).

OK, but say you don't go that far.

I do remember PL/I compilers warning that temporary arrays were needed
to evaluate an expression. Nice of them to warn about it.

As far as I know, the most common array expression in languages that
have them is assigning a scalar to every element of an array.

Without thinking too hard about it, ALGOL has the := assignment
operator, so consider the possibility of using that as an array
assignment operator. Since the compiler has to know the extent of
the array to do it, only for arrays with known extent.

Maybe the next most common array expressions are elementwise addition
and multiplying by a scalar. Also nice and easy to do.

So, one possibility would be array assignment with :=.

a := 0;

and array addition:

a := b + c; (I think it doesn't need the :+ array addition operator.)

a := 3*d;

If you allow for elemental functions:

y := sqrt(x)

does elementwise sqrt, no temporary needed.
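For comparison, here is roughly what the proposed := forms would expand to in today's C. This is only a sketch: the extent macro EXTENT and the function names are assumptions for illustration, and the proposal itself is not part of any C standard.

```c
#include <assert.h>
#include <stddef.h>

#define EXTENT 4   /* extent known at compile time, as the proposal requires */

/* What `a := 0;` would mean: the scalar is stored in every element. */
void array_fill(double a[EXTENT], double value)
{
    for (size_t i = 0; i < EXTENT; i++)
        a[i] = value;
}

/* What `a := b + c;` would mean: element-by-element addition. */
void array_add(double a[EXTENT], const double b[EXTENT], const double c[EXTENT])
{
    assert(a != NULL && b != NULL && c != NULL);
    for (size_t i = 0; i < EXTENT; i++)
        a[i] = b[i] + c[i];
}

/* What `a := 3*d;` would mean: every element multiplied by a scalar. */
void array_scale(double a[EXTENT], double k, const double d[EXTENT])
{
    for (size_t i = 0; i < EXTENT; i++)
        a[i] = k * d[i];
}
```

Each form is a single loop with no temporary array, which is the property the post relies on.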
You will also have to start thinking about whether that array
assignment is going to be deep or shallow, because it's
got nested flex arrays, and structs which contain flexible arrays,
and about whether certain types can be mutable or immutable,
because that affects sharing, etc.

OK, so don't allow flex arrays.

There is one complication that comes up fairly early, though, with simple
expressions like:

a := a[10];

In PL/I, assignment is done elementwise, such that the new value of
a[10] is used as soon as it changes. (Convenient for loop expansion.)

Fortran requires that the old value be used for all. That is, the whole
right side is evaluated before any element changes. (If the compiler
verifies, under aliasing rules, that no such changes occur, it can
evaluate and assign element by element.)

The PL/I rule is probably more applicable to C. That is, you get the
same result as a for loop expansion, without writing the loops.
(Though slightly less convenient on vector processors.)
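The two rules can be compared in plain C. The sketch below uses a different, hypothetical expression, a := a / a[0], chosen because it makes the difference easy to see; the function names and the length LEN are likewise assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define LEN 4

/* PL/I-style rule: a plain loop, so a[0] is re-read after it has changed. */
void divide_by_first_in_place(int a[LEN])
{
    assert(a[0] != 0);
    for (size_t i = 0; i < LEN; i++)
        a[i] = a[i] / a[0];
}

/* Fortran-style rule: the right-hand side uses only the old values. */
void divide_by_first_old_values(int a[LEN])
{
    const int first = a[0];   /* snapshot taken before any element changes */

    assert(first != 0);
    for (size_t i = 0; i < LEN; i++)
        a[i] = a[i] / first;
}
```

Starting from {2, 4, 6, 8}, the first function leaves {1, 4, 6, 8}, because a[0] becomes 1 before the other elements are divided, while the second gives {1, 2, 3, 4}.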
It can get complicated very quickly.

Often enough, someone asks on comp.lang.fortran how to write some
complicated operation using array expressions and no DO loops.
Usually in the cases people ask about, the result is much more
complicated, and likely slower, than writing the loops.

But most of those complicated cases require some array reduction
intrinsic functions that C doesn't have.

-- glen
 
BartC

glen herrmannsfeldt said:
OK, but say you don't go that far.

I do remember PL/I compilers warning that temporary arrays were needed
to evaluate an expression. Nice of them to warn about it.

As far as I know, the most common array expression in languages that
have them is assigning a scalar to every element of an array.

The same element? C already has that! But via initialisation (or memset, a
cruder form).
So, one possibility would be array assignment with :=.

a := 0;

and array addition:

a := b + c; (I think it doesn't need the :+ array addition operator.)

a := 3*d;

If you allow for elemental functions:

y := sqrt(x)

does elementwise sqrt, no temporary needed.

I think you're thinking too mathematically. For these functions, you really
need a strong type system, and the operator overloading that Jacob was
talking about. A completely different language. Because A+B might make sense
to do element-wise when A,B represent certain kinds of data, but would be
completely nonsensical for anything else.
Maybe the next most common array expressions are elementwise addition
and multiplying by a scalar. Also nice and easy to do.

(In my dynamic language, doing (10,20,30)*2 results in (10,20,30,10,20,30)!
Far more useful.)

I have thought about what array facilities could be added to a C-like
*low-level* language, but the results will not sound very exciting compared
with your ideas. Value-types for arrays, yes maybe (but then I've had such
things for decades in my own languages, and they were very rarely used).

One thing I did come up with was a Slice type, which is simply a (pointer,
length) pair. That can be used to pass arrays around (so that they carry
their length with them), to represent run-time allocated arrays (but not
flex arrays that will change size, those need to carry extra info), and can
be used to point to a subsection of another array.

Not earth-shattering, but it is easy to comprehend, and on the back of it
you can have 'forall' statements to iterate along it, and be instantly able
to extract the length, and can also be used to work with non-zero-terminated
counted strings and string slices. I think it is necessary to think small!

(Of course it's possible to just create a (pointer, length) struct now, but
such things are never as good as being built-in to a language; you'd need to
have a different struct for each type of element for example, and there'd be
untidy syntax to access the array.)
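For reference, the hand-rolled struct version mentioned in the last paragraph might look like the sketch below. The type and function names are assumptions for illustration, and it shows the drawback noted above: this slice only works for int elements.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical slice of ints: a (pointer, length) pair. */
struct int_slice {
    int   *ptr;
    size_t len;
};

/* Builds a slice over elements [start, start + len) of an existing array. */
struct int_slice int_slice_of(int *base, size_t start, size_t len)
{
    struct int_slice s = { base + start, len };
    return s;
}

/* The callee recovers the length without a separate parameter. */
long int_slice_sum(struct int_slice s)
{
    long total = 0;

    assert(s.ptr != NULL || s.len == 0);
    for (size_t i = 0; i < s.len; i++)
        total += s.ptr[i];
    return total;
}
```

A slice of a subsection is just int_slice_of(data, 2, 3); nothing is copied, so the slice is only valid while the underlying array is.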
 
glen herrmannsfeldt

(snip on array assignment and array expressions, I wrote)
The same element? C already has that! But via initialisation
(or memset, a cruder form).

memset() works for char arrays, but if you want to fill every element
of some other type of array with the same value, other than zero,
it doesn't work.

double x[1000];
/* some statements here */

x[0]=1.23;
memcpy((unsigned char*)(x+1),(unsigned char*)x,999*sizeof(*x));

but memcpy() behavior is undefined for the overlap case.

(The favorite way to do this in S/360 and successor assembly is to
assign the first element, and then do a MVC (move characters) for the
rest. The overlap case is defined, and on some processors might be
a special case in microcode.)

memmove() is defined in the overlap case, to do the non-destructive copy.

x := 1.23;

much easier to write!
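One way to get the same effect with defined behavior is to copy in doubling chunks, so that source and destination never overlap. This is a swapped-in technique, not the overlapping copy described above, and the name fill_by_doubling is an assumption for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Fills x[0..count) with value using non-overlapping memcpy calls. */
void fill_by_doubling(double *x, size_t count, double value)
{
    size_t filled = 1;

    assert(x != NULL || count == 0);
    if (count == 0)
        return;

    x[0] = value;
    while (filled < count) {
        /* Copy the already-filled prefix, clipped to what remains. */
        size_t chunk = filled < count - filled ? filled : count - filled;

        /* Source [0, chunk) and destination [filled, filled + chunk)
           are disjoint because chunk <= filled. */
        memcpy(x + filled, x, chunk * sizeof *x);
        filled += chunk;
    }
}
```

A plain for loop does the same job and is usually clearer; the sketch only shows that memcpy can be used here without relying on overlap.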


-- glen
 
Ben Bacarisse

BartC said:
The same element? C already has that! But via initialisation (or memset, a
cruder form).

Did you mean "the same scalar"?

Anyway, I don't think C has what Glen is talking about. Initialisation
can only do that for zero, and memset is very restrictive unless the
array really is an array of characters.

<snip>
 
BartC

glen herrmannsfeldt said:
The same [value]? C already has that! But via initialisation
(or memset, a cruder form).

memset() works for char arrays, but if you want to fill every element
of some other type of array with the same value, other than zero,
it doesn't work.

But zero is the most common and the most useful. For anything else, it's not
hard to create a version of memset to fill with other types (since we're
going to be calling a function anyway). And if these arrays are created at
runtime by calling a function, that function can be given a fill value too.
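A possible shape for such a memset-like function, as a sketch only. The name fill_elems and its parameter order are assumptions; nothing like it exists in the standard library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical generic fill: copies one element-sized value into each of
   the count slots of dst. The value must not lie inside dst. */
void fill_elems(void *dst, size_t count, const void *value, size_t elem_size)
{
    unsigned char *p = dst;

    assert(dst != NULL && value != NULL && elem_size > 0);
    for (size_t i = 0; i < count; i++)
        memcpy(p + i * elem_size, value, elem_size);
}
```

A call would look like fill_elems(x, 1000, &v, sizeof v) with double v = 1.23. The cost is the loss of type checking: the compiler cannot verify that the value and the array elements have the same type.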
double x[1000];
/* some statements here */

x[0]=1.23;
memcpy((unsigned char*)(x+1),(unsigned char*)x,999*sizeof(*x));
x := 1.23;

much easier to write!

Unfortunately it looks exactly as though you are assigning the single value
1.23 to a scalar variable 'x'!

I was going to say you need to write it as x[]:=1.23, or x:={1.23},
something to indicate this is an array operation, but I noticed you're using
":=" instead of "=". Well I use := all the time anyway (even when writing C,
thanks to some pre-processing) so wouldn't have noticed. But := is probably
also the second most common assignment operator after "=".
 
glen herrmannsfeldt

(snip, I wrote)
But zero is the most common and the most useful. For anything else, it's not
hard to create a version of memset to fill with other types (since we're
going to be calling a function anyway). And if these arrays are created at
runtime by calling a function, that function can be given a fill value too.
double x[1000];
/* some statements here */
x[0]=1.23;
memcpy((unsigned char*)(x+1),(unsigned char*)x,999*sizeof(*x));
x := 1.23;
much easier to write!
Unfortunately it looks exactly as though you are assigning the
single value 1.23 to a scalar variable 'x'!
I was going to say you need to write it as x[]:=1.23, or x:={1.23},
something to indicate this is an array operation, but I noticed you're using
":=" instead of "=". Well I use := all the time anyway (even when writing C,
thanks to some pre-processing) so wouldn't have noticed. But := is probably
also the second most common assignment operator after "=".

I might like:

a[*]=1.23;

better. I was thinking about operators that C didn't already have,
such that it would obviously not have an old meaning. PL/I uses * for
array cross sections, so this would also allow:

a[*][3]=1.23;
a[3][*]=1.23;

While * already has at least two meanings in C, I don't think any of
them look like that.

Now you can also add:

a[*]=b[*];

and

a[*] *= 2;

-- glen
 