=operator for structs

Discussion in 'C Programming' started by Christian Christmann, Mar 18, 2006.

  1. Hi,

    I was wondering how the =operator works for
    struct.

    When I for example define a struct as follows:

    struct point {
        int a;
        char *c;
    };

    and create the first struct

    struct point p1 = { 10, "Hallo" };

    and then create another struct and assign it struct p1

    struct point p2;
    p2 = p1;

    it seems that all elements are copied properly, i.e.
    a new variable is created for 'a' but also, what is more interesting,
    an independent string char* 'c' is generated since a
    modification to p1.c does not affect p2.c.

    Does this mean that structs can be assigned by '=' without any
    problems even if they contain (pointer to) nested structs as
    elements?

    Thank you.
    Chris
     
    Christian Christmann, Mar 18, 2006
    #1

  2. Christian Christmann wrote:
    > Hi,
    >
    > I was wondering how the =operator works for
    > struct.


    It does a shallow copy.

    > When I for example define a struct as follows:
    >
    > struct point {
    > int a;
    > char *c;
    > };
    >
    > and create the first struct
    >
    > struct point p1 = { 10, "Hallo" };


    Note that "Hallo" is a string literal. Your p1.c will point to a
    string, but modifying that string invokes undefined behaviour.

    > and then create another struct and assign it struct p1
    >
    > struct point p2;
    > p2 = p1;
    >
    > it seems that all elements are copied properly, i.e.
    > a new variable is created for 'a' but also, what is more interesting,
    > an independent string char* 'c' is generated since a
    > modification to p1.c does not affect p2.c.


    This is an example of undefined behaviour in action.

    > Does this mean that structs can be assigned by '=' without any
    > problems


    Generally yes. There isn't a problem with the assignment; there
    is a problem with your subsequent use of the struct copy.

    > even if they contain (pointer to) nested structs as
    > elements?


    Like I say, assignment only does a shallow copy.

    --
    Peter
     
    Peter Nilsson, Mar 18, 2006
    #2

  3. Christian Christmann wrote:

    > Hi,
    >
    > I was wondering how the =operator works for
    > struct.
    >
    > When I for example define a struct as follows:
    >
    > struct point {
    > int a;
    > char *c;
    > };
    >
    > and create the first struct
    >
    > struct point p1 = { 10, "Hallo" };
    >
    > and then create another struct and assign it struct p1
    >
    > struct point p2;
    > p2 = p1;
    >
    > it seems that all elements are copied properly, i.e.
    > a new variable is created for 'a' but also, what is more interesting,
    > an independent string char* 'c' is generated since a
    > modification to p1.c does not affect p2.c.


    What do you mean by "modification to p1.c"?

    If you did something like this:

    p1.c[0] = 'q';

    Then you did a very bad thing, that's undefined behavior.


    If you meant this:

    p1.c = "a different string";


    Then there's no problem.


    > Does this mean that structs can be assigned by '=' without any
    > problems even if they contain (pointer to) nested structs as
    > elements?


    Unlikely. The pointers are copied exactly, each struct initially points
    to the same item (assuming the pointers were set to a valid object's
    address). However, you'll have to detail your problem more thoroughly.
    You use terminology very loosely. Examples would help.



    Brian
     
    Default User, Mar 18, 2006
    #3
  4. Christian Christmann writes:
    > I was wondering how the =operator works for
    > struct.


    By copying the values of all the members. The compiler can either
    copy them one at a time or, more likely, do the equivalent of a
    memcpy() on the entire structure.
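
    As a rough sketch of that equivalence: the two helpers below copy a
    struct point once with '=' and once with memcpy(), and both copies
    end up with the same member values. The names copy_by_assignment,
    copy_by_memcpy and same_members are made up for this illustration
    and do not come from the thread. Members are compared one by one
    rather than with memcmp(), because padding bytes inside a struct
    have unspecified values.

```c
#include <assert.h>
#include <string.h>

struct point {
    int a;
    char *c;
};

/* Copy the struct with plain assignment. */
static struct point copy_by_assignment(const struct point *src)
{
    struct point dst;

    assert(src != NULL);
    dst = *src;                      /* copies a and the pointer c */
    return dst;
}

/* Copy the struct with memcpy() over the whole object. */
static struct point copy_by_memcpy(const struct point *src)
{
    struct point dst;

    assert(src != NULL);
    memcpy(&dst, src, sizeof dst);   /* also copies a and the pointer c */
    return dst;
}

/* Returns 1 if both structs hold the same member values, else 0. */
static int same_members(const struct point *x, const struct point *y)
{
    return x->a == y->a && x->c == y->c;
}
```

    Either way, only the pointer stored in c is copied; the characters
    it points to are not duplicated.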

    > When I for example define a struct as follows:
    >
    > struct point {
    > int a;
    > char *c;
    > };
    >
    > and create the first struct
    >
    > struct point p1 = { 10, "Hallo" };
    >
    > and then create another struct and assign it struct p1
    >
    > struct point p2;
    > p2 = p1;
    >
    > it seems that all elements are copied properly, i.e.
    > a new variable is created for 'a' but also, what is more interesting,
    > an independent string char* 'c' is generated since a
    > modification to p1.c does not affect p2.c.
    >
    > Does this mean that structs can be assigned by '=' without any
    > problems even if they contain (pointer to) nested structs as
    > elements?


    No, it doesn't. If a structure contains pointers, copying it by
    assignment to another structure object just copies the pointers; both
    pointers will point to the same external object. To use the jargon,
    struct assignment does a "shallow copy", not a "deep copy".

    p1.c, a pointer, is part of the structure, and is copied by the
    assignment. The string that p1.c points to is not part of the
    structure, and is not copied by the assignment.

    Given the code above, you can modify p2.c without affecting p1 (just
    as you can modify p2.a without affecting p1), but you can't modify
    what p2.c points to without affecting p1 (or rather, affecting what
    p1.c points to).

    And in this case, since p1.c and p2.c both point to a string literal,
    you can't legally modify it at all (attempting to do so invokes
    undefined behavior).

    Here's a program that illustrates what happens. Note that I've
    initialized p1.c to point to a (non-const) array object rather than to
    a string literal, so modifying the string is allowed.

    ================================
    #include <stdio.h>

    int main(void)
    {
        char hello[] = "hello";

        struct point {
            int a;
            char *c;
        };

        struct point p1 = { 10, hello };
        struct point p2;
        p2 = p1;

        printf("p1 = { %d, %p --> \"%s\" }\n", p1.a, (void*)p1.c, p1.c);
        printf("p2 = { %d, %p --> \"%s\" }\n", p2.a, (void*)p2.c, p2.c);

        printf("Modifying p2.c[0]\n");
        p2.c[0] = 'J';

        printf("p1 = { %d, %p --> \"%s\" }\n", p1.a, (void*)p1.c, p1.c);
        printf("p2 = { %d, %p --> \"%s\" }\n", p2.a, (void*)p2.c, p2.c);

        printf("Modifying p2.c\n");
        p2.c = "Good-bye";

        printf("p1 = { %d, %p --> \"%s\" }\n", p1.a, (void*)p1.c, p1.c);
        printf("p2 = { %d, %p --> \"%s\" }\n", p2.a, (void*)p2.c, p2.c);

        return 0;
    }
    ================================

    The output is:

    p1 = { 10, 0x22eeb0 --> "hello" }
    p2 = { 10, 0x22eeb0 --> "hello" }
    Modifying p2.c[0]
    p1 = { 10, 0x22eeb0 --> "Jello" }
    p2 = { 10, 0x22eeb0 --> "Jello" }
    Modifying p2.c
    p1 = { 10, 0x22eeb0 --> "Jello" }
    p2 = { 10, 0x40205d --> "Good-bye" }

    Keep in mind that printf with a "%p" format prints its argument (a
    pointer), while printf with a "%s" format prints what its argument
    points to (a string).
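
    If you actually want the "deep copy" behaviour, you have to write it
    yourself. A minimal sketch, assuming the string should live on the
    heap: the helper name point_deep_copy and its 0/-1 return convention
    are hypothetical, chosen only for this example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct point {
    int a;
    char *c;
};

/*
 * Hypothetical helper: copies *src into *dst and gives dst its own
 * heap-allocated copy of the string. Returns 0 on success, -1 if
 * malloc() fails. The caller must free(dst->c) when done with it.
 */
static int point_deep_copy(struct point *dst, const struct point *src)
{
    size_t len;
    char *copy;

    assert(dst != NULL && src != NULL && src->c != NULL);

    len = strlen(src->c) + 1;        /* include the terminating '\0' */
    copy = malloc(len);
    if (copy == NULL)
        return -1;
    memcpy(copy, src->c, len);

    *dst = *src;                     /* shallow copy of every member */
    dst->c = copy;                   /* replace the shared pointer */
    return 0;
}
```

    After a successful call, writing to dst->c[0] changes only the copy;
    the string that src->c points to is left alone.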

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
    We must do something. This is something. Therefore, we must do this.
     
    Keith Thompson, Mar 18, 2006
    #4
  5. Christian Christmann wrote:
    > struct point {
    > int a;
    > char *c;
    > };
    >
    > and create the first struct
    >
    > struct point p1 = { 10, "Hallo" };
    >
    > and then create another struct and assign it struct p1
    >
    > struct point p2;
    > p2 = p1;


    The previous line is equivalent to:
    p2.a = p1.a; /* copy integer value */
    p2.c = p1.c; /* copy pointer value */

    > it seems that all elements are copied properly, i.e.
    > a new variable is created for 'a' but also, what is more interesting,
    > an independent string char* 'c' is generated since a
    > modification to p1.c does not affect p2.c.


    p1.c and p2.c were always independent objects. A modification to p1.c
    can never affect p2.c! Each of them holds a pointer value, and either
    pointer value can be modified at any time.

    However, no independent string is generated. Both p1.c and p2.c point to
    the same string literal. The string literal, as always, is not
    modifiable. If, however, it were a modifiable object, then you could see
    that modifying it would result in the modifications being visible from
    both p1.c and p2.c.

    > Does this mean that structs can be assigned by '=' without any
    > problems even if they contain (pointer to) nested structs as
    > elements?


    If they contain nested structs as elements, then the nested structs will
    be copied correctly by the '=' operator.

    If they contain _pointers to_ nested structs as elements, then only the
    pointer values will be copied. You will then have two pointers that
    point to the same object. Modifying the underlying object will affect
    access through either pointer.
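
    A small sketch of the two cases side by side. The type and member
    names (inner, outer, by_value, by_pointer) and the helper copy_outer
    are invented for this example and do not appear in the original
    question.

```c
#include <assert.h>
#include <stddef.h>

struct inner {
    int value;
};

struct outer {
    struct inner by_value;      /* nested struct stored inside outer */
    struct inner *by_pointer;   /* pointer to a struct stored elsewhere */
};

/* Returns a copy of *src made with plain struct assignment. */
static struct outer copy_outer(const struct outer *src)
{
    struct outer dst;

    assert(src != NULL);
    dst = *src;   /* by_value is duplicated; by_pointer is only re-pointed */
    return dst;
}
```

    Changing by_value.value in the copy leaves the original untouched,
    while a write through the copy's by_pointer is visible through the
    original's by_pointer as well, because both hold the same address.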

    --
    Simon.
     
    Simon Biber, Mar 18, 2006
    #5
  6. On 17 Mar 2006 16:20:12 -0800, "Peter Nilsson" wrote:

    >Christian Christmann wrote:
    >> Hi,
    >>
    >> I was wondering how the =operator works for
    >> struct.

    >
    >It does a shallow copy.
    >
    >> When I for example define a struct as follows:
    >>
    >> struct point {
    >> int a;
    >> char *c;
    >> };
    >>
    >> and create the first struct
    >>
    >> struct point p1 = { 10, "Hallo" };

    >
    >Note that "Hallo" is a string literal. Your p1.c will point to a
    >string, but modifying that string invokes undefined behaviour.
    >
    >> and then create another struct and assign it struct p1
    >>
    >> struct point p2;
    >> p2 = p1;
    >>
    >> it seems that all elements are copied properly, i.e.
    >> a new variable is created for 'a' but also, what is more interesting,
    >> an independent string char* 'c' is generated since a
    >> modification to p1.c does not affect p2.c.

    >
    >This is an example of undefined behaviour in action.


    No it isn't. It is perfectly legal to modify p1.c. What would invoke
    undefined behavior would be modifying what p1.c points to while it
    still points to a string literal.

    >
    >> Does this mean that structs can be assigned by '=' without any
    >> problems

    >
    >Generally yes. There isn't a problem with the assignment; there
    >is a problem with your subsequent use of the struct copy.


    What problem are you referring to?

    >
    >> even if they contain (pointer to) nested structs as
    >> elements?

    >
    >Like I say, assignment only does a shallow copy.



     
    Barry Schwarz, Mar 18, 2006
    #6
  7. On Sat, 18 Mar 2006 00:24:29 +0000, Default User wrote:


    >>
    >> struct point p1 = { 10, "Hallo" };
    >>
    >> and then create another struct and assign it struct p1
    >>
    >> struct point p2;
    >> p2 = p1;
    >>

    >
    > What do you mean by "modification to p1.c"?
    >
    > If you did something like this:
    >
    > p1.c[0] = 'q';
    >
    > Then you did a very bad thing, that's undefined behavior.


    Thank you for all your helpful answers.
    Why do I get undefined behavior when modifying the string
    p1.c points to? Isn't "Hello" a char array somewhere in the
    memory that is referenced by p1.c and thus modification to single
    char elements like p1.c[0] should be allowed?
     
    Christian Christmann, Mar 21, 2006
    #7
  8. Christian Christmann wrote:

    > On Sat, 18 Mar 2006 00:24:29 +0000, Default User wrote:
    >
    > >> struct point p1 = { 10, "Hallo" };


    > > What do you mean by "modification to p1.c"?
    > >
    > > If you did something like this:
    > >
    > > p1.c[0] = 'q';
    > >
    > > Then you did a very bad thing, that's undefined behavior.

    >
    > Thank you for all your helpful answers.
    > Why do I get undefined behavior when modifying the string
    > p1.c points to? Isn't "Hello" a char array somewhere in the
    > memory that is referenced by p1.c and thus modification to single
    > char elements like p1.c[0] should be allowed?


    No. p1.c is a pointer to char; whether writing through this pointer
    invokes UB depends on what it points at. In this case, it points at
    "Hallo", which is a string literal; and string literals are translated
    into arrays of char in memory _which may be in unwritable memory_. For
    example, an embedded system long on ROM and short on RAM could put all
    literal strings in ROM.

    Richard
     
    Richard Bos, Mar 21, 2006
    #8
  9. Christian Christmann wrote:
    > On Sat, 18 Mar 2006 00:24:29 +0000, Default User wrote:
    >
    >
    >>> struct point p1 = { 10, "Hallo" };
    >>>
    >>> and then create another struct and assign it struct p1
    >>>
    >>> struct point p2;
    >>> p2 = p1;
    >>>

    >> What do you mean by "modification to p1.c"?
    >>
    >> If you did something like this:
    >>
    >> p1.c[0] = 'q';
    >>
    >> Then you did a very bad thing, that's undefined behavior.

    >
    > Thank you for all your helpful answers.
    > Why do I get undefined behavior when modifying the string
    > p1.c points to? Isn't "Hello" a char array somewhere in the
    > memory that is referenced by p1.c and thus modification to single
    > char elements like p1.c[0] should be allowed?


    I'm assuming the definition of the struct was something like:
    struct point {
        int a;
        char *c;
    };

    So, in other words, p1.c is a pointer to a string literal.

    You are correct that "Hello" will be an array in memory somewhere, and
    it will obviously have a '\0' after it. However, the standard explicitly
    states that you are not allowed to modify string literals, so modifying
    it is undefined behaviour.

    The reason the C language has this restriction is to allow the compiler
    to put the string literal in read only memory (e.g. keep it in ROM on an
    embedded system, or just in a page marked as read only on a hosted
    system) and/or combine string literals, including combining the strings
    "Let me say Hello", "Hello" and "lo".

    So, depending on what the compiler does, some possible results would be:
    modifying all string literals that end in "Hello"; causing the OS to
    raise some form of access violation signal or error; causing an attempt
    to write to memory that is physically read only (probably resulting in
    nothing happening); or anything else.
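
    If you need a string you can change, the usual way out is to copy
    the literal's characters into an array you own and modify that copy.
    A minimal sketch, assuming a caller-supplied buffer: the names
    literal and copy_and_set_first are hypothetical, and declaring the
    pointer as const char * is a suggestion of this example so that the
    compiler can diagnose an accidental write.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Pointing at a literal: a const-qualified pointee lets the compiler
   complain about any attempt to write through the pointer. */
static const char *literal = "Hello";

/*
 * Hypothetical helper: copies src into the caller's buffer and then
 * changes the first character of the copy. Returns 0 on success, or
 * -1 if src is empty or does not fit in bufsize bytes.
 */
static int copy_and_set_first(char *buf, size_t bufsize,
                              const char *src, char first)
{
    size_t len;

    assert(buf != NULL && src != NULL);

    len = strlen(src) + 1;           /* include the terminating '\0' */
    if (len == 1 || len > bufsize)
        return -1;
    memcpy(buf, src, len);
    buf[0] = first;                  /* modifies the copy, not the literal */
    return 0;
}
```

    An array initialised from a literal, such as char greeting[] =
    "Hello";, works the same way: the array holds its own copy of the
    characters, so writing to greeting[0] is allowed.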
    --
    Flash Gordon, living in interesting times.
    Web site - http://home.flash-gordon.me.uk/
    comp.lang.c posting guidelines and intro:
    http://clc-wiki.net/wiki/Intro_to_clc
     
    Flash Gordon, Mar 21, 2006
    #9
