Assigning an array to another array using C's assignment operator

Myth__Buster

Hi,

Here is one technique I have thought of to allow the assignment of one
array to another in an indirect manner, as it's not allowed directly in
C. First of all, I would like to clarify that it's not something
designed to replace memcpy(). It's a technique to illustrate how
good C's rich set of operators is. So, have a look and tell me what
you think of it. The code has all the reasoning for the different
operators I have used. Hope they make things straightforward.

Cheers,
Raghavan

<code>
#include <stdio.h>

// Applicable for arrays which are allocated dynamically also!
//
// The struct assignment below might result in the same code as it would
// if you happened to use memcpy() and the like. But, the purpose here
// is different!
//
// Purpose here is - To show that "C" indirectly allows array to array
// assignment. :)
//
// NOTE: The struct built for this suffers from no padding issues
// whatsoever since it involves just an array which must be contiguous
// and hence the compiler is forced not to play with it even if it wants
// to for whatever reasons!
//
// This macro doesn't check for the size of the destination and hence
// the user of this macro should take care of it.
//
// Why the use of comma-expression seemingly dummy one? Well,
// it is needed to inform the compiler that we are not punning
// the types but sincerely dealing with the given addresses to
// just copy data of the given size.
//
// Why seemingly useless (void *) casting? Well, again to inform
// the compiler that we are punning types as said earlier and
// this cast is required for convincing strict-aliasing=1.
//
// And why (void) casting in that comma expression? Well,
// it is to inform the compiler that we understand and hence ignore
// the value of it manually, for having no effect.
//
// The previous attempt would fail to compile if the size-based macro
// is used more than once in the same scope. So, I have used __LINE__
// macro to build the unique data type in this attempt.
//
// Well, the user of this technique can build the required data type
// with unique name by himself/herself very easily. But, to ease his/her
// job a little, I am constructing the required unique data type with
// the help of the macro. And the uniqueness is based on the line number
// at which this macro gets placed. So, there will be a redefinition
// of a specific struct type if you happen to use this macro more than
// once in the same line. However, this limitation shouldn't be the
// reason not to use this technique which you can as well use directly
// by building the struct type by yourself in your code wherever you
// want.
//

#define AssignArraysLine(dest, src, line)            \
    (                                                \
        *(struct t##line                             \
          {                                          \
              char arr[ sizeof(src) ];               \
          } *) ((void)dest, (void *)dest)            \
        =                                            \
        *(struct t##line *)                          \
            ((void)src, (void *)src)                 \
    )

#define AssignArraysOfSizeLine(dest, src, size, line) \
    (                                                \
        *(struct t##line                             \
          {                                          \
              char arr[ size ];                      \
          } *) ((void)dest, (void *)dest)            \
        =                                            \
        *(struct t##line *)                          \
            ((void)src, (void *)src)                 \
    )

#define DummyMacro1(b, a, line) AssignArraysLine(b, a, line)
#define DummyMacro2(b, a, size, line) \
    AssignArraysOfSizeLine(b, a, size, line)

// Don't get misled by the term static here in the below macro.
// It's just to signify that it's meant for only arrays defined using
// C array subscript([]) operator, which includes variable length arrays.
//
// NOTE : Don't use the macro more than once in the same line of your
// source as __LINE__ will be same and hence you would get
// type-redefinition error.
//
// And macros are known for side-effects, so be wary of them or you can
// just hand-code the comprehensive typecasting the above macro does
// without any problem - you can just ignore __LINE__ macro as well if
// you code by hand since you will not deliberately redefine a struct
// more than once!
#define AssignStaticArrays(b, a) DummyMacro1(b, a, __LINE__)

// Universal macro - works for all types of arrays.
//
// NOTE : Don't use the macro more than once in the same line of your
// source as __LINE__ will be same and hence you would get
// type-redefinition error.
//
// And macros are known for side-effects, so be wary of them or you can
// just hand-code the comprehensive typecasting the above macro does
// without any problem - you can just ignore __LINE__ macro as well if
// you code by hand since you will not deliberately redefine a struct
// more than once!
#define AssignArraysOfSize(b, a, size) DummyMacro2(b, a, size, __LINE__)

int main(void)
{
    int a[ 10 ] = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
    int b[ sizeof(a) ];

    int i = 0;
    printf("Array a : ");
    while ( i < 10 )
    {
        printf("%d ", a[ i ]);
        i++;
    }
    printf("\n");

    AssignStaticArrays(b, a); // Once more in the same line -
                              // AssignStaticArrays(b, a); - Nope!

    i = 0;
    printf("Array b : ");
    while ( i < 10 )
    {
        printf("%d ", b[ i ]);
        i++;
    }
    printf("\n");

    int z[ sizeof(a) ];
    AssignStaticArrays(z, a); // This works as it's on a different
                              // line from the previous usage above!

    i = 0;
    printf("Array z : ");
    while ( i < 10 )
    {
        printf("%d ", z[ i ]);
        i++;
    }
    printf("\n");

    int c[ sizeof(a) ];
    AssignArraysOfSize(c, a, sizeof(a)); // Same rules apply to this
                                         // macro usage as well as above.
    i = 0;
    printf("Array c : ");

    while ( i < 10 )
    {
        printf("%d ", c[ i ]); // print c here, not b
        i++;
    }
    printf("\n");

    int d[ sizeof(c) ];
    AssignArraysOfSize(d, c, sizeof(c)); // Works as it's on a
                                         // different line.
    i = 0;
    printf("Array d : ");

    while ( i < 10 )
    {
        printf("%d ", d[ i ]);
        i++;
    }
    printf("\n");

    return 0;
}
</code>
 
glen herrmannsfeldt

Myth__Buster said:
Here is one technique I have thought of to allow the assignment of one
array to another in an indirect manner, as it's not allowed directly in
C. First of all, I would like to clarify that it's not something
designed to replace memcpy(). It's a technique to illustrate how
good C's rich set of operators is. So, have a look and tell me what
you think of it. The code has all the reasoning for the different
operators I have used. Hope they make things straightforward.

I didn't completely figure out your technique.

In PL/I since about 1964, Fortran since 1990, and many interpreted
languages, you can say:

A=B+C;

(no ; for the Fortran version)

and add two arrays, element by element.

Seems to me that you could almost do that in C, as there is no
current operation defined for + between two arrays (or pointers).

In most languages, maybe all, with array expressions, you can also

A=B+1;

to add 1 to each element.

PL/I also allows for structure expressions.

You can:

A=B+C;

where A, B, and C are structures with corresponding members,
and they will be added member by member. Seems to me that C
could do that without too much work. Unlike pointers, there is
no operation for + currently defined for structures.

(All the other operators work in PL/I, too.)

-- glen
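
For comparison, here is roughly what those two array expressions have to be spelled out as in C today. This is only a sketch: `add_arrays` and `add_scalar` are hypothetical helper names, not part of any library, and `int` elements are assumed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: the C spelling of A = B + C for int arrays.
 * Computes dst[i] = lhs[i] + rhs[i] for every i in [0, count). */
void add_arrays(int *dst, const int *lhs, const int *rhs, size_t count)
{
    assert(dst != NULL && lhs != NULL && rhs != NULL);
    for (size_t i = 0; i < count; i++) {
        dst[i] = lhs[i] + rhs[i];
    }
}

/* Hypothetical helper: the C spelling of A = B + 1 (scalar broadcast).
 * Computes dst[i] = src[i] + value for every i in [0, count). */
void add_scalar(int *dst, const int *src, int value, size_t count)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < count; i++) {
        dst[i] = src[i] + value;
    }
}
```

The caller has to pass the element count explicitly, because an array argument decays to a pointer and its length is lost.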
 
Myth__Buster

I didn't completely figure out your technique.

In PL/I since about 1964, Fortran since 1990, and many interpreted
languages, you can say:

    A=B+C;

(no ; for the Fortran version)

and add two arrays, element by element.

Seems to me that you could almost do that in C, as there is no
current operation defined for + between two arrays (or pointers).

In most languages, maybe all, with array expressions, you can also

    A=B+1;

to add 1 to each element.

PL/I also allows for structure expressions.

You can:

    A=B+C;

where A, B, and C are structures with corresponding members,
and they will be added member by member. Seems to me that C
could do that without too much work. Unlike pointers, there is
no operation for + currently defined for structures.

(All the other operators work in PL/I, too.)

-- glen

Well, thanks for sharing information about other languages.

Basically, my technique uses is based on a simple concept -
assigning one struct variable to another which C allows.
And I have used this concept upon arrays. This is what my
technique in the simplest sense.

Please go through the comments to get a detailed picture.

Cheers,
Raghavan
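
The underlying concept, struct assignment copying an embedded array, can be sketched without any casts if the array lives inside a struct from the start. Note this is a simplification and not the macro from the first post: `struct int10` and `copy_int10` are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical wrapper type: the array is a struct member from the
 * start, so no pointer casts are needed to assign it. */
struct int10 {
    int v[10];
};

/* Copies *src into *dst with plain struct assignment and returns dst.
 * All ten elements of the member array are copied by the '='. */
struct int10 *copy_int10(struct int10 *dst, const struct int10 *src)
{
    assert(dst != NULL && src != NULL);
    *dst = *src;
    return dst;
}
```

Because both objects really are of type `struct int10`, this form involves no reinterpretation of an existing array object through a different type.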
 
Myth__Buster

Well, thanks for sharing information about other languages.

Basically, my technique uses is based on a simple concept -
assigning one struct variable to another which C allows.
And I have used this concept upon arrays. This is what my
technique in the simplest sense.

Please go through the comments to get a detailed picture.

Cheers,
Raghavan

*my technique is based . . .
 
BartC

Seems to me that you could almost do that in C, as there is no
current operation defined for + between two arrays (or pointers).

No, because arrays aren't handled by value as they would need to be.

Besides, the "-" operator *is* defined between two arrays (although exactly
how the arrays are related is significant):

#define n 10
int A[n],B[n];
int C;

C=A-B;

printf("%d\n",C);

and it wouldn't really fit in well with an A+B that worked completely
differently.
 
Eric Sosman

Hi,

Here is one technique I have thought of to allow the assignment of one
array to another [...]

// NOTE: The struct built for this suffers from no padding issues
// whatsoever since it involves just an array which must be contiguous
// and hence the compiler is forced not to play with it even if it wants
// to for whatever reasons!

The array elements must be contiguous (that is, spaced
sizeof(Element) bytes apart and with no inter-element padding),
but it does not follow that wrapping an array in a struct
has "no padding issues." Specifically,

- In `struct S { Type t; }' (for array or non-array `Type'),
we can deduce that sizeof(struct S) >= sizeof(Type), but it is
still possible that sizeof(struct S) > sizeof(Type) -- that is,
the struct may have padding at the end, even though there is
no interstitial padding in Type. Thus, your macros may copy
more bytes than intended, clobbering whatever happens to follow
the destination.

- Similarly, we know that _Alignof(struct S) >= _Alignof(Type),
but equality need not hold. A struct may be more strictly aligned
than any of its elements. Thus, your macros may run afoul of
alignment issues (SIGBUS, anybody?) that a byte-by-byte copy would
not encounter. (If they do, it is more likely than not that the
padding issue will surface, as well.)

int a[ 10 ] = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
int b[ sizeof(a) ];

Do you realize that `b' is almost surely longer than `a'?
(Given your macros' vulnerability to overrun, this may not be
a Bad Thing ...)
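
The two inequalities above can be checked on a given implementation with a few lines of C11. This is a sketch only: `struct wrapped`, `trailing_padding` and `stricter_alignment` are hypothetical names, and the 10-byte `char` array is an arbitrary choice mirroring the shape of the struct the macros generate.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>

/* Hypothetical wrapper, shaped like the struct the macros generate. */
struct wrapped {
    char arr[10];
};

/* These two relations always hold, so they can be checked at
 * translation time. Equality is NOT guaranteed in either case. */
static_assert(sizeof(struct wrapped) >= sizeof(char[10]),
              "a struct is at least as large as its member");
static_assert(_Alignof(struct wrapped) >= _Alignof(char[10]),
              "a struct is at least as strictly aligned as its member");

/* Bytes of trailing padding on this implementation (often 0). */
size_t trailing_padding(void)
{
    return sizeof(struct wrapped) - sizeof(char[10]);
}

/* Nonzero if the struct is more strictly aligned than the bare array. */
int stricter_alignment(void)
{
    return _Alignof(struct wrapped) > _Alignof(char[10]);
}
```

If either function returns nonzero on some platform, a struct assignment through a cast pointer could copy extra bytes or hit a misaligned access there.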
 
Shao Miller

Hi,

Here is one technique I have thought of to allow the assignment of one
array to another in an indirect manner, as it's not allowed directly in
C. First of all, I would like to clarify that it's not something
designed to replace memcpy(). It's a technique to illustrate how
good C's rich set of operators is. So, have a look and tell me what
you think of it. The code has all the reasoning for the different
operators I have used. Hope they make things straightforward.

Cheers,
Raghavan

[...code...]

This code is fun and presents a learning opportunity. If I could
summarize your technique:

- You generate a structure type with a tag that's based on the line number

- The structure's only member is an array of 'char' that is large enough
to contain the bytes of the source data

- You cast a pointer to the source data as a pointer to this structure type

- You cast a pointer to the destination data as a pointer to this
structure type

- You use these two resulting pointers with indirection to assign from
the source structure to the destination structure

Some criticisms:

- Using __LINE__ for the tag derivation means that the macros cannot be
used more than once on the same line, as you've noted in the comments

- I'm not sure why you didn't use 'unsigned char' instead of 'char' as
the structure member's element-type. Object representation is defined
in terms of 'unsigned char'

- As Mr. Eric Sosman noted, it is not portable to cast any old
object-pointer to a pointer-to-structure type, due to alignment
concerns, unless the code accounts for these concerns. Your code doesn't

- In C99, there are VLAs, which are roughly arrays whose count of
elements is not known at translation-time. Because of this, your use of
'sizeof' in the determination of the structure member's element count is
non-portable; Standard C only allows structure members to be declared
with object types of known sizes (with the exception of Flexible Array
Members). In C11, VLAs are an optional feature

- You might be surprised to learn that at least one Microsoft C
implementation actually changes such assignments to 'memcpy' during
translation and so requires the program to be linked with a library that
provides 'memcpy'

- C >= C99 includes the 'inline' keyword. Function calls are sometimes
translated to object code that doesn't resemble the usual function calls
that the implementation would normally produce; it can be faster, for
instance. 'memcpy' is a common candidate for such "inlining", so using
your macros versus 'memcpy' has some disadvantages:

- - A function call to 'memcpy' has some type-safety, while your
macros do not

- - If the programmer uses your macros with erroneous code, the
implementation might produce a diagnostic message which is difficult for
the programmer to understand if it involves the expanded form of your macros

- - Your macros will evaluate 'src' and 'dest' multiple times, while
a call to 'memcpy' does not. Consider 'arr++' being 'src', for instance

- - Given that your macro expansions might actually translate to
precisely the same object code as 'memcpy', using the macros can waste
translation time

- Please try to avoid using the term "typecast" in discussion about C;
it's not a C term. Arnold Schwarzenegger is typecast as a tough guy:

http://lmgtfy.com/?q=arnold+schwarzenegger+typecast

- In C >= C99, your code breaks the rules of "effective type", because
you access the stored value of an [array] object as a structure object.
This results in undefined behaviour; a C implementation might actually
complain about your code! (See the Standard's 6.5p7 for detail.)

So to summarize: Your code might be fun, but is most likely (in my
opinion) to be a worse idea than using 'memcpy', especially since it
might translate to a 'memcpy' anyway. :)

Did you have some additional rationale about avoiding 'memcpy', such as
performance concerns, or something else?
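
For reference, a 'memcpy'-based alternative that addresses a few of the criticisms above (destination size check, no alignment or effective-type concerns) might look like the sketch below. The names `copy_bytes_checked` and `COPY_ARRAY` are hypothetical, and the macro is only meaningful for true arrays, not pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies src_size bytes from src to dst, but only
 * if they fit in dst_size bytes. Returns 0 on success, -1 otherwise.
 * The objects must not overlap (the usual memcpy requirement). */
int copy_bytes_checked(void *dst, size_t dst_size,
                       const void *src, size_t src_size)
{
    assert(dst != NULL && src != NULL);
    if (src_size > dst_size) {
        return -1;              /* would overrun the destination */
    }
    memcpy(dst, src, src_size);
    return 0;
}

/* Hypothetical convenience macro for true arrays only. For non-VLA
 * arrays the sizeof operands are not evaluated, so each argument is
 * evaluated once, as a function argument. */
#define COPY_ARRAY(dst, src) \
    copy_bytes_checked((dst), sizeof(dst), (src), sizeof(src))
```

With `int a[10]` and `int b[10]`, `COPY_ARRAY(b, a)` copies all of `a` and returns 0; with a smaller destination it returns -1 and copies nothing.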
 
Shao Miller

...
- You generate a structure type with a tag that's based on the line number
...
- Using __LINE__ for the tag derivation means that the macros cannot be
used more than once on the same line, as you've noted in the comments
...

Just for fun and for comp.lang.c criticism, if we have C99 or C11 with
VLAs, I think the following code demonstrates a way to "generate" a
structure type without worrying about the tag. In this code, I believe
that the second 'struct foo' definition doesn't interfere with the first
one:

/* Yields a void expression */
#define InPrototype(expr) \
    ((void) (void (*)(char[((void) (expr), 1)])) 0)

void test(void * dest, void * src) {
    /* 'bar' is declared here so that the 'bar.i' assignment below compiles */
    struct foo {
        int i;
    } bar;

    bar.i = (
        InPrototype (
            *(struct foo { double d; } *) dest = *(struct foo *) src
          ),
        42
      );
}

int main(void) {
    if (0)
        test(0, 0);
    return 0;
}

Inside the 'InPrototype' parentheses is an expression (but not a
statement) which defines a 'struct foo' type and works with that type,
even though there is already a different 'struct foo' type. VLAs are
required because the macro magic produces a function-pointer type-name
with a pointer parameter declared via an array declarator, and the
array-size in that declarator is not an integer constant expression.

I'm not 100% sure that the structure assignment will not be
optimized-out by some implementations, however... Maybe someone else knows?

(The 'bar.i' assignment is just to demonstrate that 'InPrototype' is
itself an expression and not a statement.)
 
Noob

BartC said:
Besides, the "-" operator *is* defined between two arrays
(although exactly how the arrays are related is significant)

The above statement needs a proper smack-down.

In the context of subtraction, an array object "decays" into
a pointer to the first element of the array.

cf. C89 3.2.2.1 Lvalues and function designators

Except [snip list of exceptions not relevant in this context],
an lvalue that has type ``array of type'' is converted to an
expression that has type ``pointer to type'' that points to
the initial member of the array object and is not an lvalue.

Additionally, if two pointers do not point into the same array,
then subtraction has undefined behavior.

cf. C89 3.3.6 Additive operators

For subtraction, one of the following shall hold:
* both operands have arithmetic type;
* both operands are pointers to qualified or unqualified versions
of compatible object types; or
* the left operand is a pointer to an object type and the right
operand has integral type.

If two pointers that do not point to members of the same
array object are subtracted, the behavior is undefined.

#define n 10
int A[n],B[n];
int C;
C=A-B;

The subtraction has undefined behavior.

References : http://flash-gordon.me.uk/ansi.c.txt
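
By contrast, subtracting two pointers into the same array is well defined, and the result has type ptrdiff_t. A small sketch (the helper name `element_distance` is made up for illustration):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of elements from 'first' to 'last'.
 * Both pointers must point into (or one past the end of) the SAME
 * array object; otherwise the subtraction has undefined behavior. */
ptrdiff_t element_distance(const int *first, const int *last)
{
    assert(first != NULL && last != NULL);
    return last - first;
}
```

For `int A[10]`, `element_distance(&A[2], &A[7])` is 5 and `element_distance(A, A + 10)` is 10. A result of this type is printed with `%td`, not `%d`.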
 
James Dow Allen

No, because arrays aren't handled by value as they would need to be.

Besides, the "-" operator *is* defined between two arrays (although exactly
how the arrays are related is significant):
...
and it wouldn't really fit in well with an A+B that worked completely
differently.

Hear, hear!

Let's have a show of hands. How many think it would be
a delightful increase of expressive power to define "+"
for arrays? How many agree the inconsistency between
"+" and "-" would make you want to barf?

I'll start. My vote is for ... Barf.

Javascript already does something barfy^H^H^H^Hwonderful.
In that language,
"69" - 2 // is equal to 67
"69" + 2 // is equal to "692"

"Powerful" languages are already available for those
who like such "powerful" features.
It seems malicious to try to inflict them on C.

James
 
glen herrmannsfeldt

(snip)
Javascript already does something barfy^H^H^H^Hwonderful.
In that language,
"69" - 2 // is equal to 67
"69" + 2 // is equal to "692"

PL/I, at least, does it consistently:

'69'-2 is 67,
'69'+2 is 71

You have to:

'69' || 2

if you want '692'.
(and you might actually get '69 2' because it blank pads
the conversion.)

"Powerful" languages are already available for those
who like such "powerful" features.
It seems malicious to try to inflict them on C.

Could add some completely new operators for array expressions.

-- glen
 
Anand Hariharan

No, because arrays aren't handled by value as they would need to be.

Correct.


Besides, the "-" operator *is* defined between two arrays (although exactly
how the arrays are related is significant):

IANAL, but that is incorrect. The operands of the minus operator are
the decayed pointer values, not the arrays.

#define n 10
int A[n],B[n];
int C;

C=A-B;

printf("%d\n",C);

Again, IANAL, but that is UB. (Your compiler might probably warn you
about C being the wrong type, that it should be ptrdiff_t, but that is
not a constraint violation.)

- Anand
 
Anand Hariharan

No, because arrays aren't handled by value as they would need to be.
Correct.

Besides, the "-" operator *is* defined between two arrays (although exactly
how the arrays are related is significant):

IANAL, but that is incorrect.  The operands of the minus operator are
the decayed pointer values, not the arrays.
#define n 10
int A[n],B[n];
int C;

printf("%d\n",C);

Again, IANAL, but that is UB.  (Your compiler might probably warn you
about C being the wrong type, that it should be ptrdiff_t, but that is
not a constraint violation.)

- Anand


Sorry, didn't notice that 'Noob' had already addressed this in
'<[email protected]>'.

- Anand
 
