memcmp for <

Discussion in 'C++' started by rajkumar@hotmail.com, Mar 23, 2005.

  1. Guest

    I have a struct like

    struct MyStruct
    {
    int a;
    int b;
    int c;
    bool d;
    bool e;
    };

    I want to insert such a struct in a map. I understand I can declare the
    < operator for such a struct for lexicographical compare like

    x.a < y.a || !(y.a < x.a) && ( x.b < y.b || !(y.b < x.b) && x.c < y.c
    .............

    or a simple one like this

    memcmp(&x,&y,sizeof(MyStruct)) < 0;

    This seems to work if I memset the whole sizeof(MyStruct) to zeroes
    in the constructor before I assign a, b, c, etc. This will take
    care of any padding that the compiler adds.

    My question is whether the second approach is portable? If so do I
    really need the memset ? Does the standard say anything about
    initializing the padding bits?

    Raj
     
    , Mar 23, 2005
    #1

  2. wrote:
    > I have a struct like
    >
    > struct MyStruct
    > {
    > int a;
    > int b;
    > int c;
    > bool d;
    > bool e;
    > };
    >
    > I want to insert such a struct in a map. I understand I can declare the
    > < operator for such a struct for lexicographical compare like
    >
    > x.a < y.a || !(y.a < x.a) && ( x.b < y.b || !(y.b < x.b) && x.c < y.c
    > ............
    >
    > or a simple one like this
    >
    > memcmp(&x,&y,sizeof(MyStruct)) < 0;
    >
    > This seems to work if I memset and fill the sizeof(MyStruct) with
    > zeroes in the constructor before I assign a b and c etc. This will take
    > care of any padding that the compiler adds.
    >
    > My question is whether the second approach is portable? If so do I
    > really need the memset ? Does the standard say anything about
    > initializing the padding bits?


    1) Yes. 2) Yes. 3) They are uninitialised unless your object has static
    storage duration. Beware, though that as soon as your MyStruct ceases
    being a POD (because you added a private section or a virtual function or
    something of the sort), use of memset and memcpy on it becomes undefined.
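
The POD restriction described above can be checked at compile time. This is a minimal sketch that assumes a C++11 or later compiler, which is newer than this thread; there the old POD notion is split into "trivially copyable" (the property memset and memcpy rely on) and "standard layout". The struct names `Plain`, `MixedAccess` and `WithVirtual` are hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <type_traits>

// All members public, no virtual functions: safe for memset/memcpy.
struct Plain { int a; int b; bool d; };

// Public and private data members mixed: layout is no longer "standard".
struct MixedAccess {
    int a;
    int get_b() const { return b; }
private:
    int b;
};

// A virtual function adds hidden state; raw byte copies are not allowed.
struct WithVirtual {
    int a;
    virtual ~WithVirtual() {}
};

static_assert(std::is_trivially_copyable<Plain>::value, "Plain can be memcpy'd");
static_assert(std::is_standard_layout<Plain>::value, "Plain has standard layout");
static_assert(!std::is_standard_layout<MixedAccess>::value, "mixed access breaks standard layout");
static_assert(!std::is_trivially_copyable<WithVirtual>::value, "virtual functions break trivial copy");
```

Putting a `static_assert` like these next to any memset or memcmp call makes the build fail as soon as someone adds a virtual function to the struct.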

    V
     
    Victor Bazarov, Mar 23, 2005
    #2

  3. schrieb:
    > I have a struct like
    >
    > struct MyStruct
    > {
    > int a;
    > int b;
    > int c;
    > bool d;
    > bool e;
    > };
    >
    > I want to insert such a struct in a map. I understand I can declare the
    > < operator for such a struct for lexicographical compare like
    >
    > x.a < y.a || !(y.a < x.a) && ( x.b < y.b || !(y.b < x.b) && x.c < y.c
    > ............
    >
    > or a simple one like this
    >
    > memcmp(&x,&y,sizeof(MyStruct)) < 0;
    >
    > This seems to work if I memset and fill the sizeof(MyStruct) with
    > zeroes in the constructor before I assign a b and c etc. This will take
    > care of any padding that the compiler adds.
    >
    > My question is whether the second approach is portable? If so do I
    > really need the memset ? Does the standard say anything about
    > initializing the padding bits?


    Sorry, can't tell you about the padding bits, but it's still not
    portable because of endianness issues:

    struct Foo
    {
    int a;
    };

    Foo f1 = { 1 };
    Foo f2 = { 256 };

    On big-endian machines a memcmp() compare will work correctly. On a
    little-endian machine with 32-bit ints, f1 will contain the byte
    sequence 0x01 0x00 0x00 0x00 (minus padding) and f2 will contain 0x00
    0x01 0x00 0x00. memcmp() will report f1 as greater than f2.
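
The byte sequences described above can be tried out directly. The sketch below writes the little-endian and big-endian images of 1 and 256 by hand, assuming 32-bit ints with 8-bit bytes, so the result does not depend on the machine it runs on. The array names and the helper `memcmp_less` are made up for this illustration.

```cpp
#include <cassert>
#include <cstring>

// Byte images of the 32-bit values 1 and 256 in little-endian layout.
static const unsigned char le_1[4]   = { 0x01, 0x00, 0x00, 0x00 };
static const unsigned char le_256[4] = { 0x00, 0x01, 0x00, 0x00 };

// The same two values in big-endian layout.
static const unsigned char be_1[4]   = { 0x00, 0x00, 0x00, 0x01 };
static const unsigned char be_256[4] = { 0x00, 0x00, 0x01, 0x00 };

// True if memcmp orders the first 4-byte image before the second.
bool memcmp_less(const unsigned char* x, const unsigned char* y)
{
    return std::memcmp(x, y, 4) < 0;
}

void demo_endianness()
{
    assert(memcmp_less(be_1, be_256));   // big-endian: matches 1 < 256
    assert(!memcmp_less(le_1, le_256));  // little-endian: 1 sorts after 256
}
```

The ordering on a little-endian machine is still consistent from one comparison to the next; it just does not agree with the numeric order of the members.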

    Cheers,
    Malte
     
    Malte Starostik, Mar 23, 2005
    #3
  4. Guest

    You mentioned something about private section. Could you elaborate how
    that would change things ?

    If the struct carried a vtable pointer or had non-POD members, could I
    just overload new and memset before I call the constructor?

    Raj
     
    , Mar 23, 2005
    #4
  5. <> wrote in message
    news:...
    >I have a struct like
    >
    > struct MyStruct
    > {
    > int a;
    > int b;
    > int c;
    > bool d;
    > bool e;
    > };
    >
    > I want to insert such a struct in a map. I understand I can declare the
    > < operator for such a struct for lexicographical compare like
    >
    > x.a < y.a || !(y.a < x.a) && ( x.b < y.b || !(y.b < x.b) && x.c < y.c


    Note that you can also use the (IMO better) following form:
    return (x.a!=y.a) ? x.a<y.a
    : (x.b!=y.b) ? x.b<y.b
    : (x.c!=y.c) ? x.c<y.c
    : (x.d!=y.d) ? x.d<y.d : x.e < y.e;
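
For completeness, here is a sketch of how that form might look as a full operator< with the struct used as a map key. The struct is the one from the original post; the helper `demo_map_size` and the mapped strings are hypothetical and only show that equal keys collapse into one entry.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

struct MyStruct
{
    int a;
    int b;
    int c;
    bool d;
    bool e;
};

// Member-by-member lexicographical comparison: the first member that
// differs decides the result.
bool operator<(const MyStruct& x, const MyStruct& y)
{
    return (x.a != y.a) ? x.a < y.a
         : (x.b != y.b) ? x.b < y.b
         : (x.c != y.c) ? x.c < y.c
         : (x.d != y.d) ? x.d < y.d
         :                x.e < y.e;
}

// Hypothetical helper: inserts three keys, two of them equal, and
// returns how many distinct entries the map ends up with.
std::size_t demo_map_size()
{
    std::map<MyStruct, std::string> m;
    MyStruct k1 = { 1, 2, 3, false, true };
    MyStruct k2 = { 1, 2, 4, false, true };
    MyStruct k3 = { 1, 2, 3, false, true };  // same value as k1
    m[k1] = "first";
    m[k2] = "second";
    m[k3] = "third";  // overwrites the entry stored under k1
    return m.size();
}
```

Because the comparison looks only at member values, padding bytes and byte order play no part in it.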

    > or a simple one like this
    >
    > memcmp(&x,&y,sizeof(MyStruct)) < 0;
    >
    > This seems to work if I memset and fill the sizeof(MyStruct) with
    > zeroes in the constructor before I assign a b and c etc. This will take
    > care of any padding that the compiler adds.


    But not of endianness and other binary representation issues.
    Really, I don't think that saving a few statements is worth the
    loss of portability. Plus the explicit form gives you much more
    flexibility. So why bother?


    --
    http://ivan.vecerina.com/contact/?subject=NG_POST <- email contact form
     
    Ivan Vecerina, Mar 23, 2005
    #5
  6. wrote:
    > You mentioned something about private section. Could you elaborate how
    > that would change things ?


    The layout of an object is only mandated within the same access specifier
    section. So, as soon as you introduce private or protected non-static
    data members, the struct is not a POD any more, and I am not really sure
    why that is, but the Standard makes a point of defining POD-struct that
    way.

    > If the struct carried a vtable pointer or had NON POD could i just
    > overload new and memset before i call the constructor ?


    I am not sure what you mean by "overload memset", but yes, essentially,
    your task would be to gain control over the "padding bytes" by, for
    example, eliminating them using compiler-specific means.

    Let me ask a rhetorical question, though. If you are prepared to give it
    overloaded 'new' and 'memset' (let's suppose it's possible somehow), why
    don't you just overload the operator < ?

    V
     
    Victor Bazarov, Mar 23, 2005
    #6
  7. Malte Starostik wrote:
    > [...]
    > Sorry, can't tell you about the padding bits, but it's still not
    > portable because of endianness issues:
    >
    > struct Foo
    > {
    > int a;
    > };
    >
    > Foo f1 = { 1 };
    > Foo f2 = { 256 };
    >
    > On big-endian machines a memcmp() compare will work correctly. On a
    > little-endian machine with 32-bit ints, f1 will contain the byte
    > sequence 0x01 0x00 0x00 0x00 (minus padding) and f2 will contain 0x00
    > 0x01 0x00 0x00. memcmp() will report f1 as greater than f2.


    But won't it report f1 consistently greater than f2? The purpose of
    using memcmp (as I understood it) was to forgo the real operator < and
    the memberwise comparison just to see if they were different.

    V
     
    Victor Bazarov, Mar 23, 2005
    #7
  8. "Victor Bazarov" <> wrote in message
    news:N_f0e.55648$01.us.to.verio.net...

    >> My question is whether the second approach is portable? If so do I
    >> really need the memset ? Does the standard say anything about
    >> initializing the padding bits?

    >
    > 1) Yes. 2) Yes. 3) They are uninitialised unless your object has static
    > storage duration.


    Beg pardon? Memcmp portable? I don't see why. As a simple example, I
    can't think of any place in the standard that requires all equal bool values
    to have the same representation. In other words, I don't see anything wrong
    with an implementation that stores a byte in a bool and considers zero to be
    false and any nonzero value to be true. Under such an implementation,
    memcmp might yield unequal for two values that should be considered equal.
     
    Andrew Koenig, Mar 23, 2005
    #8
  9. Guest

    I don't care about that, as I just want to keep them in a set. If A < B,
    I just want to make sure A < B all the time.

    Raj
     
    , Mar 23, 2005
    #9
  10. Guest

    > Let me ask a rhetorical question, though. If you are prepared to give it
    > overloaded 'new' and 'memset' (let's suppose it's possible somehow), why
    > don't you just overload the operator < ?


    It's some legacy code. The idea is that if you add a new member, it will
    work automatically. If you overload <, you will have to manually update
    it for the new member.

    Raj
     
    , Mar 23, 2005
    #10
  11. Andrew Koenig wrote:
    > "Victor Bazarov" <> wrote in message
    > news:N_f0e.55648$01.us.to.verio.net...
    >
    >
    >>>My question is whether the second approach is portable? If so do I
    >>>really need the memset ? Does the standard say anything about
    >>>initializing the padding bits?

    >>
    >>1) Yes. 2) Yes. 3) They are uninitialised unless your object has static
    >>storage duration.

    >
    >
    > Beg pardon? Memcmp portable? I don't see why. As a simple example, I
    > can't think of any place in the standard that requires all equal bool values
    > to have the same representation. In other words, I don't see anything wrong
    > with an implementation that stores a byte in a bool and considers zero to be
    > false and any nonzero value to be true. Under such an implementation,
    > memcmp might yield unequal for two values that should be considered equal.


    Beg pardon? How is the internal representations of 'true' or 'false'
    relevant in this case? Whether 'true' is 0 or 1, two 'true's will
    compare equal and so will two internal representations of 'false'. One
    can't really expect two different internal representations from two
    different architectures to compare equal, but who cares about that?
    The program runs on a virtual machine that cannot have two distinctly
    different representations for 'true' during the same run of the program,
    can it?

    V
     
    Victor Bazarov, Mar 23, 2005
    #11
  12. wrote:
    >> Let me ask a rhetorical question, though. If you are prepared to give it
    >> overloaded 'new' and 'memset' (let's suppose it's possible somehow), why
    >> don't you just overload the operator < ?
    >
    > Its some legacy code. The idea being if you add a new member it will
    > work automatically. If you overload <
    > you will have to manually update it for the new member


    Maintenance is maintenance. You gotta do it right or you shouldn't be
    doing it at all. Doing half-a-job is not really going to buy you much.

    V
     
    Victor Bazarov, Mar 23, 2005
    #12
  13. Rolf Magnus Guest

    Victor Bazarov wrote:

    >> Beg pardon? Memcmp portable? I don't see why. As a simple example, I
    >> can't think of any place in the standard that requires all equal bool
    >> values to have the same representation. In other words, I don't see
    >> anything wrong with an implementation that stores a byte in a bool and
    >> considers zero to be false and any nonzero value to be true. Under such
    >> an implementation, memcmp might yield unequal for two values that should
    >> be considered equal.

    >
    > Beg pardon? How is the internal representations of 'true' or 'false'
    > relevant in this case? Whether 'true' is 0 or 1, two 'true's will
    > compare equal and so will two internal representations of 'false'.


    Read Andrew's response again. His point was that this (e.g. true always
    comparing equal to true in memcmp) might not be the case.

    > One can't really expect two different internal representations from two
    > different architectures to compare equal, but who cares about that?
    > The program runs on a virtual machine that cannot have two distinctly
    > different representations for 'true' during the same run of the program,
    > can it?


    What makes you think it can't?
     
    Rolf Magnus, Mar 23, 2005
    #13
  14. "Victor Bazarov" <> wrote in message
    news:4Jg0e.55662$01.us.to.verio.net...

    > Beg pardon? How is the internal representations of 'true' or 'false'
    > relevant in this case? Whether 'true' is 0 or 1, two 'true's will
    > compare equal and so will two internal representations of 'false'.


    I don't think anything in the standard prohibits two values, both of which
    are "true", from having different internal representations. Please read
    again what I said in my previous post:

    > In other words, I don't see anything wrong with an implementation that
    > stores a byte in a bool
    > and considers zero to be false and any nonzero value to be true.


    On such an implementation, two variables might both have the same value but
    different representations. Of course the implementation would have to
    change representation appropriately if the value were to be treated as an
    integer, but I can see no particular difficulty in doing so.

    As a historical note, I am quite certain that C and C++ implementations have
    existed under which two pointers can compare equal but nevertheless have
    different representations. And I am entirely certain that on most modern
    computers, two floating-point values with different representations can
    compare equal--namely +0 and -0.
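
The +0 and -0 case is easy to reproduce. The sketch below assumes IEEE 754 doubles, where the two zeros differ only in the sign bit; the helper name `same_bytes` is made up for this example.

```cpp
#include <cassert>
#include <cstring>

// True if the two doubles have byte-for-byte identical object representations.
bool same_bytes(double x, double y)
{
    return std::memcmp(&x, &y, sizeof(double)) == 0;
}

void demo_signed_zero()
{
    double pos = 0.0;
    double neg = -0.0;
    assert(pos == neg);             // equal as values
    assert(!same_bytes(pos, neg));  // assumption: IEEE 754, sign bit differs
}
```

A struct with a floating-point member can therefore hold two objects that compare equal member by member but differ under memcmp.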
     
    Andrew Koenig, Mar 23, 2005
    #14
  15. Andrew Koenig wrote:
    >>In other words, I don't see anything wrong with an implementation that
    >>stores a byte in a bool
    >>and considers zero to be false and any nonzero value to be true.


    I wasn't paying attention apparently. Sorry.
    On a historical note, was there ever an implementation that did that?

    V
     
    Victor Bazarov, Mar 23, 2005
    #15
  16. "Victor Bazarov" <> wrote in message
    news:qyi0e.55698$01.us.to.verio.net...
    > Andrew Koenig wrote:
    >>>In other words, I don't see anything wrong with an implementation that
    >>>stores a byte in a bool
    >>>and considers zero to be false and any nonzero value to be true.


    > I wasn't paying attention apparently. Sorry.
    > On a historical note, was there ever an implementation that did that?


    Not to my knowledge for bool. But definitely for pointers and
    floating-point values.

    Then there's this issue:

    struct X { char a; int b; };

    void foo()
    {
    X x1 = { '?', 42 };
    X x2 = x1;
    // ...
    }

    If there's padding between X::a and X::b, I don't think that the
    implementation is obligated to copy that padding. In other words, I don't
    think there's any guarantee that memcmp will show x1 and x2 as being equal
    if executed at the comment.
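
Whether X has padding at all, and how much, can be measured with sizeof and offsetof. This is only a sketch: the two helper functions are hypothetical, and the numbers they return are implementation-specific (3 bytes after X::a is common where int is 4 bytes and 4-byte aligned, but nothing guarantees it).

```cpp
#include <cassert>
#include <cstddef>

struct X { char a; int b; };

// Number of bytes between the end of X::a and the start of X::b.
std::size_t padding_after_a()
{
    return offsetof(X, b) - sizeof(char);
}

// Number of bytes in X that belong to no member at all.
std::size_t total_padding()
{
    return sizeof(X) - sizeof(char) - sizeof(int);
}

void demo_padding()
{
    // Holds on every implementation: padding between the members is
    // part of the total padding.
    assert(total_padding() >= padding_after_a());
}
```

Any byte counted here is one that memcmp will read but that member-wise assignment and copying need not set.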
     
    Andrew Koenig, Mar 23, 2005
    #16
  17. Jack Klein Guest

    On 23 Mar 2005 07:23:13 -0800, wrote in
    comp.lang.c++:

    > I have a struct like
    >
    > struct MyStruct
    > {
    > int a;
    > int b;
    > int c;
    > bool d;
    > bool e;
    > };
    >
    > I want to insert such a struct in a map. I understand I can declare the
    > < operator for such a struct for lexicographical compare like
    >
    > x.a < y.a || !(y.a < x.a) && ( x.b < y.b || !(y.b < x.b) && x.c < y.c
    > ............
    >
    > or a simple one like this
    >
    > memcmp(&x,&y,sizeof(MyStruct)) < 0;
    >
    > This seems to work if I memset and fill the sizeof(MyStruct) with
    > zeroes in the constructor before I assign a b and c etc. This will take
    > care of any padding that the compiler adds.
    >
    > My question is whether the second approach is portable? If so do I
    > really need the memset ? Does the standard say anything about
    > initializing the padding bits?
    >
    > Raj


    Actually, it is extremely non-portable, and error-prone as well. As
    others have pointed out, endianness can be a killer. If int occupies
    four octet-sized bytes and is little-endian, as on Intel and others,
    consider x.a = 256 and y.a = 1. Then they begin with the byte sequences:

    x 0x00 0x01 0x00 0x00 ...
    y 0x01 0x00 0x00 0x00 ...

    So which one will memcmp() find greater?

    Also there are real widely used compilers where padding can certainly
    trip you up.

    Gnu ports for x86, for example, use the Intel 80 bit extended
    precision real format for long double, and sizeof(long double) is 12,
    so they always start aligned to a 4 byte address.

    You can assign two long doubles the same value, then using a union or
    pointer punning change the final two bytes of one of them. They will
    still compare as equal with ==, but not with memcmp().

    --
    Jack Klein
    Home: http://JK-Technology.Com
    FAQs for
    comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
    comp.lang.c++ http://www.parashift.com/c++-faq-lite/
    alt.comp.lang.learn.c-c++
    http://www.contrib.andrew.cmu.edu/~ajo/docs/FAQ-acllc.html
     
    Jack Klein, Mar 24, 2005
    #17