=operator for structs

  • Thread starter Christian Christmann

Christian Christmann

Hi,

I was wondering how the =operator works for
struct.

When I for example define a struct as follows:

struct point {
int a;
char *c;
};

and create the first struct

struct point p1 = { 10, "Hallo" };

and then create another struct and assign it struct p1

struct point p2;
p2 = p1;

it seems that all elements are copied properly, i.e.
a new variable is created for 'a' but also, what is more interesting,
an independent string char* 'c' is generated since a
modification to p1.c does not affect p2.c.

Does this mean that structs can be assigned by '=' without any
problems even if they contain (pointer to) nested structs as
elements?

Thank you.
Chris
 

Peter Nilsson

Christian said:
Hi,

I was wondering how the =operator works for
struct.

It does a shallow copy.
When I for example define a struct as follows:

struct point {
int a;
char *c;
};

and create the first struct

struct point p1 = { 10, "Hallo" };

Note that "Hallo" is a string literal. Your p1.c will point to that
string, but modifying the string invokes undefined behaviour.
and then create another struct and assign it struct p1

struct point p2;
p2 = p1;

it seems that all elements are copied properly, i.e.
a new variable is created for 'a' but also, what is more interesting,
an independent string char* 'c' is generated since a
modification to p1.c does not affect p2.c.

This is an example of undefined behaviour in action.
Does this mean that structs can be assigned by '=' without any
problems

Generally yes. There isn't a problem with the assignment; there
is a problem with your subsequent use of the struct copy.
even if they contain (pointer to) nested structs as
elements?

Like I say, assignment only does a shallow copy.
 

Default User

Christian said:
Hi,

I was wondering how the =operator works for
struct.

When I for example define a struct as follows:

struct point {
int a;
char *c;
};

and create the first struct

struct point p1 = { 10, "Hallo" };

and then create another struct and assign it struct p1

struct point p2;
p2 = p1;

it seems that all elements are copied properly, i.e.
a new variable is created for 'a' but also, what is more interesting,
an independent string char* 'c' is generated since a
modification to p1.c does not affect p2.c.

What do you mean by "modification to p1.c"?

If you did something like this:

p1.c[0] = 'q';

Then you did a very bad thing, that's undefined behavior.


If you meant this:

p1.c = "a different string";


Then there's no problem.

Does this mean that structs can be assigned by '=' without any
problems even if they contain (pointer to) nested structs as
elements?

Unlikely. The pointers are copied exactly, each struct initially points
to the same item (assuming the pointers were set to a valid object's
address). However, you'll have to detail your problem more thoroughly.
You use terminology very loosely. Examples would help.



Brian
 

Keith Thompson

Christian Christmann said:
I was wondering how the =operator works for
struct.

By copying the values of all the members. It can either copy them one
at a time or, more likely, by doing the equivalent of a memcpy() on
the entire structure.
When I for example define a struct as follows:

struct point {
int a;
char *c;
};

and create the first struct

struct point p1 = { 10, "Hallo" };

and then create another struct and assign it struct p1

struct point p2;
p2 = p1;

it seems that all elements are copied properly, i.e.
a new variable is created for 'a' but also, what is more interesting,
an independent string char* 'c' is generated since a
modification to p1.c does not affect p2.c.

Does this mean that structs can be assigned by '=' without any
problems even if they contain (pointer to) nested structs as
elements?

No, it doesn't. If a structure contains pointers, copying it by
assignment to another structure object just copies the pointers; both
pointers will point to the same external object. To use the jargon,
struct assignment does a "shallow copy", not a "deep copy".

p1.c, a pointer, is part of the structure, and is copied by the
assignment. The string that p1.c points to is not part of the
structure, and is not copied by the assignment.

Given the code above, you can modify p2.c without affecting p1 (just
as you can modify p2.a without affecting p1), but you can't modify
what p2.c points to without affecting p1 (or rather, affecting what
p1.c points to).

And in this case, since p1.c and p2.c both point to a string literal,
you can't legally modify it at all (attempting to do so invokes
undefined behavior).

Here's a program that illustrates what happens. Note that I've
initialized p1.c to point to a (non-const) array object rather than to
a string literal, so modifying the string is allowed.

================================
#include <stdio.h>
int main(void)
{
    char hello[] = "hello";

    struct point {
        int a;
        char *c;
    };

    struct point p1 = { 10, hello };
    struct point p2;
    p2 = p1;

    printf("p1 = { %d, %p --> \"%s\" }\n", p1.a, (void*)p1.c, p1.c);
    printf("p2 = { %d, %p --> \"%s\" }\n", p2.a, (void*)p2.c, p2.c);

    printf("Modifying p2.c[0]\n");
    p2.c[0] = 'J';

    printf("p1 = { %d, %p --> \"%s\" }\n", p1.a, (void*)p1.c, p1.c);
    printf("p2 = { %d, %p --> \"%s\" }\n", p2.a, (void*)p2.c, p2.c);

    printf("Modifying p2.c\n");
    p2.c = "Good-bye";

    printf("p1 = { %d, %p --> \"%s\" }\n", p1.a, (void*)p1.c, p1.c);
    printf("p2 = { %d, %p --> \"%s\" }\n", p2.a, (void*)p2.c, p2.c);

    return 0;
}
================================

The output is:

p1 = { 10, 0x22eeb0 --> "hello" }
p2 = { 10, 0x22eeb0 --> "hello" }
Modifying p2.c[0]
p1 = { 10, 0x22eeb0 --> "Jello" }
p2 = { 10, 0x22eeb0 --> "Jello" }
Modifying p2.c
p1 = { 10, 0x22eeb0 --> "Jello" }
p2 = { 10, 0x40205d --> "Good-bye" }

Keep in mind that the printf with a "%p" format prints its argument (a
pointer), while printf with a "%s" format prints what its argument
points to (a string).
 

Simon Biber

Christian said:
struct point {
int a;
char *c;
};

and create the first struct

struct point p1 = { 10, "Hallo" };

and then create another struct and assign it struct p1

struct point p2;
p2 = p1;

The previous line is equivalent to:
p2.a = p1.a; /* copy integer value */
p2.c = p1.c; /* copy pointer value */
it seems that all elements are copied properly, i.e.
a new variable is created for 'a' but also, what is more interesting,
an independent string char* 'c' is generated since a
modification to p1.c does not affect p2.c.

p1.c and p2.c were always independent objects. A modification to p1.c
can never affect p2.c! Each of them holds a pointer value, and either
pointer value can be modified at any time.

However, no independent string is generated. Both p1.c and p2.c point to
the same string literal. The string literal, as always, is not
modifiable. If, however, it were a modifiable object, then you could see
that modifying it would result in the modifications being visible from
both p1.c and p2.c.
Does this mean that structs can be assigned by '=' without any
problems even if they contain (pointer to) nested structs as
elements?

If they contain nested structs as elements, then the nested structs will
be copied correctly by the '=' operator.

If they contain _pointers to_ nested structs as elements, then only the
pointer values will be copied. You will then have two pointers that
point to the same object. Modifying the underlying object will affect
access through either pointer.
 

Barry Schwarz

It does a shallow copy.


Note that "Hallo" is a string literal. Your p1.c will point to that
string, but modifying the string invokes undefined behaviour.


This is an example of undefined behaviour in action.

No it isn't. It is perfectly legal to modify p1.c. What would invoke
undefined behavior would be modifying what p1.c points to while it
still points to a string literal.
Generally yes. There isn't a problem with the assignment; there
is a problem with your subsequent use of the struct copy.

What problem are you referring to?
Like I say, assignment only does a shallow copy.


 

Christian Christmann

struct point p1 = { 10, "Hallo" };

and then create another struct and assign it struct p1

struct point p2;
p2 = p1;

What do you mean by "modification to p1.c"?

If you did something like this:

p1.c[0] = 'q';

Then you did a very bad thing, that's undefined behavior.

Thank you for all your helpful answers.
Why do I get an undefined behavior when modifying the string
p1.c points to? Isn't "Hello" a char array somewhere in the
memory that is referenced by p1.c and thus modification to single
char elements like p1.c[0] should be allowed?
 

Richard Bos

Christian Christmann said:
struct point p1 = { 10, "Hallo" };
What do you mean by "modification to p1.c"?

If you did something like this:

p1.c[0] = 'q';

Then you did a very bad thing, that's undefined behavior.

Thank you for all your helpful answers.
Why do I get an undefined behavior when modifying the string
p1.c points to? Isn't "Hello" a char array somewhere in the
memory that is referenced by p1.c and thus modification to single
char elements like p1.c[0] should be allowed?

No. p1.c is a pointer to char; whether writing through this pointer
invokes UB depends on what it points at. In this case, it points at
"Hallo", which is a string literal; and string literals are translated
into arrays of char in memory _which may be in unwritable memory_. For
example, an embedded system long on ROM and short on RAM could put all
literal strings in ROM.

Richard
 

Flash Gordon

Christian said:
struct point p1 = { 10, "Hallo" };

and then create another struct and assign it struct p1

struct point p2;
p2 = p1;
What do you mean by "modification to p1.c"?

If you did something like this:

p1.c[0] = 'q';

Then you did a very bad thing, that's undefined behavior.

Thank you for all your helpful answers.
Why do I get an undefined behavior when modifying the string
p1.c points to? Isn't "Hello" a char array somewhere in the
memory that is referenced by p1.c and thus modification to single
char elements like p1.c[0] should be allowed?

I'm assuming the definition of the struct was something like:
struct point {
int a;
char *c;
};

So, in other words, p1.c is a pointer to a string literal.

You are correct that "Hello" will be an array in memory somewhere, and
it will obviously have a '\0' after it. However, the standard explicitly
states that you are not allowed to modify string literals, so modifying
it is undefined behaviour.

The reason the C language has this restriction is to allow the compiler
to put the string literal in read only memory (e.g. keep it in ROM on an
embedded system, or just in a page marked as read only on a hosted
system) and/or combine string literals, including combining the strings
"Let me say Hello", "Hello" and "lo".

So, depending on what the compiler does, some possible results would be
modifying all string literals that end in "Hello", causing the OS to
raise some form of access violation signal or error, causing an attempt
to write to memory that is physically read only (probably resulting in
nothing happening) or anything else.
 
