nameless struct / union

Discussion in 'C++' started by Bryan Parkoff, Dec 10, 2007.

  1. I hate accessing struct / union members with a dot between two names. How
    can I use one name instead of two, so that the source code reads more
    clearly? Three variables share the storage of one variable: when I change
    the 8-bit data, the 16-bit and 32-bit data change as well.
    For example:

    union

    {

    struct _Byte

    {

    U_BYTE AAL;

    U_BYTE AAH;

    } Byte;

    struct _Word

    {

    U_WORD AAW;

    } Word;

    struct _DWORD

    {

    U_DWORD AA;

    } DWord;

    };

    int main()

    {

    // I hate dot between 2 words.

    Byte.AAL = 0xFF;

    Byte.AAH = 0x20;

    Byte.AAL += 0x0A;

    Byte.AAH += 0x01;

    Word.AAW += 0xFF;

    DWord.AA += 0xFFFF;

    // It is easy reading variable inside struct / union.

    AAL = 0xFF;

    AAH = 0x20;

    AAL += 0x0A;

    AAH += 0x01;

    AAW += 0xFF;

    AA += 0xFFFF;


    --

    Bryan Parkoff
     
    Bryan Parkoff, Dec 10, 2007
    #1

  2. Bryan Parkoff wrote:
    > I hate using struct / union with dot between two words. How can I
    > use one word instead of two words because I want the source code look
    > reading clear. three variables are shared inside one variable. I
    > manipulate to change 8-bit data before it causes to change 16-bit
    > data and 32-bit data. For example.
    >
    > union
    >
    > {
    >
    > struct _Byte
    >
    > {
    >
    > U_BYTE AAL;
    >
    > U_BYTE AAH;
    >
    > } Byte;
    >
    > struct _Word
    >
    > {
    >
    > U_WORD AAW;
    >
    > } Word;
    >
    > struct _DWORD
    >
    > {
    >
    > U_DWORD AA;
    >
    > } DWord;
    >
    > };
    >
    > int main()
    >
    > {
    >
    > // I hate dot between 2 words.
    >
    > Byte.AAL = 0xFF;
    >
    > Byte.AAH = 0x20;
    >
    > Byte.AAL += 0x0A;
    >
    > Byte.AAH += 0x01;
    >
    > Word.AAW += 0xFF;
    >
    > DWord.AA += 0xFFFF;
    >
    > // It is easy reading variable inside struct / union.
    >
    > AAL = 0xFF;
    >
    > AAH = 0x20;
    >
    > AAL += 0x0A;
    >
    > AAH += 0x01;
    >
    > AAW += 0xFF;
    >
    > AA += 0xFFFF;


    Uh... I'm a bit rusty on unnamed unions. Does an unnamed union
    create a global instance?

    Also, your use of first assigning one part of the union and then
    using another part has undefined behaviour, IIRC.

    V
    --
    Please remove capital 'A's when replying by e-mail
    I do not respond to top-posted replies, please don't ask
     
    Victor Bazarov, Dec 10, 2007
    #2

  3. Bryan Parkoff

    Rolf Magnus Guest

    Victor Bazarov wrote:

    > Uh... I'm a bit rusty on unnamed unions. Does an unnamed union
    > create a global instance?


    I didn't even know it's allowed outside a struct. Is it actually?

    > Also, your use of first assigning one part of the union and then
    > using another part has undefined behaviour, IIRC.


    Yes. You must always only read the member you last wrote to, otherwise the
    behavior is undefined.
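
    One way to stay within that rule is to record which member was written
    last and refuse to read any other. The following is only a minimal sketch,
    assuming C++11 for <cstdint>; Value, Kind, set_word, set_dword and get_word
    are hypothetical names that do not appear anywhere in the thread.

    ```cpp
    #include <cstdint>

    // Hypothetical tagged union: 'kind' remembers which member was written last.
    struct Value {
        enum Kind { kWord, kDWord };
        Kind kind;
        union {                  // an anonymous union inside a struct is standard C++
            std::uint16_t word;
            std::uint32_t dword;
        };
    };

    inline void set_word(Value& v, std::uint16_t w)  { v.word = w;  v.kind = Value::kWord; }
    inline void set_dword(Value& v, std::uint32_t d) { v.dword = d; v.kind = Value::kDWord; }

    // Stores the word in 'out' and returns true only if 'word' is the active member.
    inline bool get_word(const Value& v, std::uint16_t& out) {
        if (v.kind != Value::kWord) return false;
        out = v.word;
        return true;
    }
    ```

    With this shape, a read of word after a write to dword is reported as a
    failure instead of silently running into undefined behavior.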
     
    Rolf Magnus, Dec 10, 2007
    #3
  4. Bryan Parkoff

    Craig Scott Guest

    On Dec 11, 5:49 am, "Bryan Parkoff" <> wrote:
    > I hate using struct / union with dot between two words. How can I use
    > one word instead of two words because I want the source code look reading
    > clear. three variables are shared inside one variable. I manipulate to
    > change 8-bit data before it causes to change 16-bit data and 32-bit data.
    > For example.
    >
    > union
    > {
    > struct _Byte
    > {
    > U_BYTE AAL;
    > U_BYTE AAH;
    > } Byte;
    >
    > struct _Word
    > {
    > U_WORD AAW;
    > } Word;
    >
    > struct _DWORD
    > {
    > U_DWORD AA;
    > } DWord;
    >
    > };


    Sorry, it's not answering your original post (others seem to be doing
    that already), but using a type name which starts with an underscore
    and is followed by an uppercase letter is not allowed by the C++
    standard (unless your code is part of the compiler implementation
    itself). A name starting with an underscore and NOT followed by an
    uppercase letter cannot be used in the global namespace, but
    presumably could be elsewhere (but I'd recommend against it to avoid
    confusion). See section 17.4.3.1.2 of the standard for details.
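
    As an illustration only, the structs from the original post could be
    renamed so that no identifier starts with an underscore. BytePair and
    WordValue are hypothetical names, and the typedefs for U_BYTE and U_WORD
    are assumptions, since the thread never shows how they are defined.

    ```cpp
    // Assumed definitions; the original post does not show them.
    typedef unsigned char  U_BYTE;
    typedef unsigned short U_WORD;

    // Instead of: struct _Byte { ... };
    struct BytePair {
        U_BYTE AAL;
        U_BYTE AAH;
    };

    // Instead of: struct _Word { ... };
    struct WordValue {
        U_WORD AAW;
    };
    ```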

    --
    Computational Modeling, CSIRO (CMIS)
    Melbourne, Australia
     
    Craig Scott, Dec 10, 2007
    #4
  5. >> Uh... I'm a bit rusty on unnamed unions. Does an unnamed union
    >> create a global instance?

    >
    > I didn't even know it's allowed outside a struct. Is it actually?
    >
    >> Also, your use of first assigning one part of the union and then
    >> using another part has undefined behaviour, IIRC.

    >
    > Yes. You must always only read the member you last wrote to, otherwise the
    > behavior is undefined.


    I want to follow up. I feel a nameless union/struct is necessary. I want
    three variables to share one big variable. Say you work with two bytes of
    data: the word data is modified automatically because the storage is
    shared. For example, you define Low_Byte and High_Byte and add 0xFF + 0x03.
    Low_Byte becomes 0x02. High_Byte is not modified when the bit that falls
    off Low_Byte becomes Carry; Carry can then be added to High_Byte. That is
    easier because I do not have to write Word &= 0x00FF. If I work with word
    data, I do not need Carry: the addition is 0x20FF + 0x0003, and Word
    becomes 0x2102.
    It makes my source code clearly readable. Here is an example below.
    Please let me know what you think.

    static union

    {

    U_BYTE B[4];

    U_WORD W[2];

    U_DWORD DW;

    };

    #define Low_Byte B[0]

    #define High_Byte B[1]

    #define Low_Byte2 B[2]

    #define High_Byte2 B[3]

    #define Low_Word W[0]

    #define High_Word W[1]

    #define DWord DW



    int main()

    {

    Low_Byte = 0xFF;

    High_Byte = 0x20;

    Low_Byte += 0x03;

    High_Byte += 0x01;

    Low_Word += 0x00FF;

    DWord += 0x0000FFFF;

    return 0;

    }
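
    For comparison, the carry arithmetic described in this post can also be
    written without a union, using masks and shifts on a single 16-bit value.
    This is a minimal sketch assuming 8-bit bytes and C++11 for <cstdint>;
    low_byte, high_byte, make_word and add_to_low_byte are hypothetical helper
    names.

    ```cpp
    #include <cstdint>

    inline std::uint8_t low_byte(std::uint16_t w)  { return static_cast<std::uint8_t>(w & 0xFFu); }
    inline std::uint8_t high_byte(std::uint16_t w) { return static_cast<std::uint8_t>(w >> 8); }

    inline std::uint16_t make_word(std::uint8_t high, std::uint8_t low) {
        return static_cast<std::uint16_t>((high << 8) | low);
    }

    // Adds n to the low byte only. The high byte is left alone and the
    // overflow is reported through 'carry'.
    inline std::uint16_t add_to_low_byte(std::uint16_t w, std::uint8_t n, bool& carry) {
        unsigned sum = low_byte(w) + n;   // wider type, so the carry stays visible
        carry = sum > 0xFFu;
        return make_word(high_byte(w), static_cast<std::uint8_t>(sum & 0xFFu));
    }
    ```

    Starting from make_word(0x20, 0xFF), add_to_low_byte with 0x03 yields
    0x2002 with carry set, while plain word addition of 0x0003 yields 0x2102,
    matching the numbers in the description above.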
     
    Bryan Parkoff, Dec 10, 2007
    #5
  6. Bryan Parkoff wrote:
    > [..] I feel nameless union/struct is necessary. I want three variables to
    > share one big variable. You want to work
    > two byte data. It causes word data to be modified automatically
    > because it is shared.


    That is your mistake. The language explicitly states that you only
    can read the same data you wrote. You cannot write byte and then
    read word. That's not what the unions are for. To accomplish that
    you need to 'static_cast' your word into an array of char and then
    change each char as you want.
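
    A minimal sketch of byte-wise access through a character pointer follows.
    A direct static_cast from a pointer to a 16-bit word to unsigned char*
    does not compile, so the sketch uses reinterpret_cast (a static_cast
    through void* would also work). set_byte and get_byte are hypothetical
    helper names, and <cstdint> assumes C++11.

    ```cpp
    #include <cstddef>
    #include <cstdint>

    // Overwrites one byte of the word's object representation.
    // The caller must keep index < sizeof(word).
    inline void set_byte(std::uint16_t& word, std::size_t index, unsigned char value) {
        unsigned char* bytes = reinterpret_cast<unsigned char*>(&word);
        bytes[index] = value;
    }

    // Reads one byte of the word's object representation.
    inline unsigned char get_byte(const std::uint16_t& word, std::size_t index) {
        const unsigned char* bytes = reinterpret_cast<const unsigned char*>(&word);
        return bytes[index];
    }
    ```

    Which byte lands at index 0 depends on the byte order of the target, so
    the resulting word value is not the same on every machine.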

    > [..]


    V
    --
    Please remove capital 'A's when replying by e-mail
    I do not respond to top-posted replies, please don't ask
     
    Victor Bazarov, Dec 11, 2007
    #6
  7. > Bryan Parkoff wrote:
    >> [..] I feel nameless union/struct is necessary. I want three variables
    >> to share one big variable. You want to work
    >> two byte data. It causes word data to be modified automatically
    >> because it is shared.

    >
    > That is your mistake. The language explicitly states that you only
    > can read the same data you wrote. You cannot write byte and then
    > read word. That's not what the unions are for. To accomplish that
    > you need to 'static_cast' your word into an array of char and then
    > change each char as you want.


    Please explain why you think that a union should not be used. I want
    two bytes to share one word. I can modify the low byte at one time and the
    high byte the next time; the word then gets its data from the 2 bytes. You
    can define either one array with two elements or two separate bytes. They
    are the same. Please reread my previous post so you can compare the union
    using #define without a struct and the union with a struct here below.

    union
    {
    struct _B
    {
    BYTE L;
    BYTE H;
    } B;
    WORD W;
    };

    You can store two bytes into one word using pointers like this below. It
    is identical to the union above. The problem is what happens after the C++
    compiler converts the C++ source code into x86 / non-x86 machine language:
    there are 1-2 extra instructions because the code needs to read the memory
    address first before accessing the variable through the pointer. The union
    does not have extra instructions. It needs only one instruction to access
    the variable instead of going through a pointer. The union is the best
    choice.
    Please explain why you claim that I made a mistake. Please show your
    example of the static_cast<> keyword. It is like putting a pointer in
    static_cast<>.

    WORD W = 0;
    BYTE *L = (BYTE*)&W;
    BYTE *H = (BYTE*)&W + 1;
    *L = 0xFF;
    *H = 0x20;

    Bryan Parkoff
     
    Bryan Parkoff, Dec 11, 2007
    #7
  8. Bryan Parkoff wrote:
    >> Bryan Parkoff wrote:
    >>> [..] I feel nameless union/struct is necessary. I want three
    >>> variables to share one big variable. You want to work
    >>> two byte data. It causes word data to be modified automatically
    >>> because it is shared.

    >>
    >> That is your mistake. The language explicitly states that you only
    >> can read the same data you wrote. You cannot write byte and then
    >> read word. That's not what the unions are for. To accomplish that
    >> you need to 'static_cast' your word into an array of char and then
    >> change each char as you want.

    >
    > Please explain why you think that union is not to be used.


    I don't have to. The language Standard forbids it. If I had to
    speculate it's because you either need to explicitly allow certain
    combinations (thus making a relatively long set of pairs that are
    OK to share the memory and let you read *not* what you wrote), or
    you disallow everything (like the Standard does) because there are
    combinations (like chars and a pointer, for instance) which are by
    *no* means OK. You cannot write a bunch of chars and then expect
    them to form a valid pointer, and even _reading_ (loading into
    an address register) an invalid pointer can cause hardware fault
    on some systems.

    > [..]


    V
    --
    Please remove capital 'A's when replying by e-mail
    I do not respond to top-posted replies, please don't ask
     
    Victor Bazarov, Dec 11, 2007
    #8
  9. > Bryan Parkoff wrote:
    >>> Bryan Parkoff wrote:
    >>>> [..] I feel nameless union/struct is necessary. I want three
    >>>> variables to share one big variable. You want to work
    >>>> two byte data. It causes word data to be modified automatically
    >>>> because it is shared.
    >>>
    >>> That is your mistake. The language explicitly states that you only
    >>> can read the same data you wrote. You cannot write byte and then
    >>> read word. That's not what the unions are for. To accomplish that
    >>> you need to 'static_cast' your word into an array of char and then
    >>> change each char as you want.

    >>
    >> Please explain why you think that union is not to be used.

    >
    > I don't have to. The language Standard forbids it. If I had to
    > speculate it's because you either need to explicitly allow certain
    > combinations (thus making a relatively long set of pairs that are
    > OK to share the memory and let you read *not* what you wrote), or
    > you disallow everything (like the Standard does) because there are
    > combinations (like chars and a pointer, for instance) which are by
    > *no* means OK. You cannot write a bunch of chars and then expect
    > them to form a valid pointer, and even _reading_ (loading into
    > an address register) an invalid pointer can cause hardware fault
    > on some systems.


    OK, I understand. It looks like it takes a non-standard C++ compiler to
    accept four byte variables linked into one dword variable using a union. I
    always try to work around non-standard C++ compiler behavior. I hope that
    my code will be compatible with all C++ compilers like Microsoft, GNU,
    Mac OS X, and others.
    static_cast<> is used only if I want to convert a small size to a big
    size, not for small and big sizes that are shared / linked. Hopefully, C++
    compilers will support this non-standard usage in the near future so that
    this code can be very portable.
    Thank you for your comment. Smile...

    Bryan Parkoff
     
    Bryan Parkoff, Dec 11, 2007
    #9
  10. Bryan Parkoff

    Fred Zwarts Guest

    "Bryan Parkoff" <> wrote in message news:475e085c$0$9912$...
    >> Bryan Parkoff wrote:
    >>> [..] I feel nameless union/struct is necessary. I want three variables
    >>> to share one big variable. You want to work
    >>> two byte data. It causes word data to be modified automatically
    >>> because it is shared.

    >>
    >> That is your mistake. The language explicitly states that you only
    >> can read the same data you wrote. You cannot write byte and then
    >> read word. That's not what the unions are for. To accomplish that
    >> you need to 'static_cast' your word into an array of char and then
    >> change each char as you want.

    >
    > Please explain why you think that union is not to be used. I want to
    > share two bytes into one word. I can modify one low byte at this time and
    > high byte next time. Then word gets data from 2 bytes. You can define one
    > array with two elements or two bytes. They are the same. Please reread my
    > previous thread post so you can compare union using #define without struct
    > and union with struct here below.
    >
    > union
    > {
    > struct _B
    > {
    > BYTE L;
    > BYTE H;
    > } B;
    > WORD W;
    > }
    >
    > You can store two bytes into one word using pointer like this below. It
    > is identical to union above. The problem is that after C++ Compiler
    > converted C++ source code into x86 / non x86 machine language. It has extra
    > 1-2 instructions because it needs to read memory address first before
    > accessing variable through pointer. Union does not have extra instructions.
    > It has only one instruction to acess variable instead of using pointer.
    > Union is the best choice.
    > Please explain why you claim that I made my mistake. Please show your
    > example of static_cast<> keyword. It is like to put pointer in
    > static_cast<>.
    >
    > WORD W = 0;
    > BYTE L = (BYTE*)&W;
    > BYTE H = (BYTE*)&W+1;
    > *L = 0xFF;
    > *H = 0x20;
    >
    > Bryan Parkoff


    The standard says that using a union in this way results in undefined behavior.
    I don't know the rationale behind this, but I think that normally for a POD it is not a problem.
    However, a union can be a complex thing and its members can be complex classes,
    e.g., members which may be accessed only through their member functions. Even the assignment
    operator may be overloaded, so that it is unclear what happens if you assign one member
    of a union and read another member of the union.
    Probably, this is the rationale behind the standard.
    The same problems will show up in your second strategy if WORD and BYTE are complex classes.
     
    Fred Zwarts, Dec 11, 2007
    #10
  11. Bryan Parkoff

    Rolf Magnus Guest

    Bryan Parkoff wrote:

    > OK, I understand.  It looks like non-standard C++ Compiler to accept
    > four byte variables to be linked into one dword variable using union.


    The compiler is right to accept it. The standard just says that the behavior
    is undefined, which means the compiler can make anything it wants out of
    it. Doing what you expected is just one possible outcome of many. You
    should always avoid undefined behavior because it's ... well, not defined
    what happens.

    > I always decide to allow overcoming non-standard C++ Compiler.  I hope
    > that it should be compatible to all C++ Coompiler like Microsoft, GNU, Mac
    > OSX, and others.


    If you want to be compatible with all compilers, you have to stick to the
    standard. That's what it's there for.

    > static_cast<> is used only if I want to convert small size to big size,
    > but not shared / linked small / big sizes.


    You can static_cast between pointer types. The standard guarantees that you
    can access any object as array of char that way. However, the internal
    layout of the object might again be different between compilers. One
    example from the real world is endianness. The order of the bytes in bigger
    size integers is not the same on all CPUs.
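
    The byte order of the machine the code runs on can be observed through
    exactly that kind of char access. This is a minimal sketch, assuming 8-bit
    bytes and C++11 for <cstdint>; is_little_endian is a hypothetical helper
    name.

    ```cpp
    #include <cstdint>

    // Returns true when the least significant byte is stored at the lowest address.
    inline bool is_little_endian() {
        const std::uint16_t probe = 0x0102;
        const unsigned char* bytes = reinterpret_cast<const unsigned char*>(&probe);
        return bytes[0] == 0x02;
    }
    ```

    Code that must produce the same result on every CPU is usually better off
    with shifts and masks on the integer value itself, which do not depend on
    how the bytes are laid out in memory.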
     
    Rolf Magnus, Dec 11, 2007
    #11
  12. Bryan Parkoff

    James Kanze Guest

    On Dec 11, 6:54 am, "Victor Bazarov" <> wrote:
    > Bryan Parkoff wrote:
    > >> Bryan Parkoff wrote:
    > >>> [..] I feel nameless union/struct is necessary. I want three
    > >>> variables to share one big variable. You want to work
    > >>> two byte data. It causes word data to be modified automatically
    > >>> because it is shared.


    > >> That is your mistake. The language explicitly states that
    > >> you only can read the same data you wrote. You cannot
    > >> write byte and then read word. That's not what the unions
    > >> are for. To accomplish that you need to 'static_cast' your
    > >> word into an array of char and then change each char as you
    > >> want.


    > > Please explain why you think that union is not to be used.


    > I don't have to. The language Standard forbids it. If I had
    > to speculate it's because you either need to explicitly allow
    > certain combinations (thus making a relatively long set of
    > pairs that are OK to share the memory and let you read *not*
    > what you wrote), or you disallow everything (like the Standard
    > does) because there are combinations (like chars and a
    > pointer, for instance) which are by *no* means OK.


    > You cannot write a bunch of chars and then expect them to form
    > a valid pointer, and even _reading_ (loading into an address
    > register) an invalid pointer can cause hardware fault on some
    > systems.


    That would be a valid objection for casting to char* and writing
    as well.

    I don't know the actual motivation for the restriction. What
    the OP is asking for is usually called type punning. In
    pre-standard (C standard) days, there were two widespread
    techniques of type punning: using a union (as he does), and
    casting the address to a different type. Most compilers
    supported both, but at least one didn't support either without
    special options.

    For whatever reasons, the C standards committee decided to only
    support the casting of the pointer, and then only if one of the
    types involved was a character type (char* or unsigned char*).
    In practice, compilers still differ with regards to what they
    support. (Note that regardless of what the compiler supports,
    it's likely that both will work if optimization is turned off.)
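
    A third option, not one of the two techniques discussed above, is to copy
    the bytes with std::memcpy, which has defined behavior for trivially
    copyable types such as the integers used here. The following is a minimal
    sketch assuming 8-bit bytes and C++11 (for <cstdint> and static_assert);
    TwoBytes, split_word and join_bytes are hypothetical names.

    ```cpp
    #include <cstdint>
    #include <cstring>

    static_assert(sizeof(std::uint16_t) == 2, "this sketch assumes a 2-byte word");

    struct TwoBytes {
        unsigned char b[2];   // b[0] is the byte at the lowest address
    };

    // Copies the object representation of the word into two bytes.
    inline TwoBytes split_word(std::uint16_t w) {
        TwoBytes out;
        std::memcpy(out.b, &w, sizeof w);
        return out;
    }

    // Copies two bytes back into a word; the inverse of split_word.
    inline std::uint16_t join_bytes(const TwoBytes& in) {
        std::uint16_t w;
        std::memcpy(&w, in.b, sizeof w);
        return w;
    }
    ```

    The order of the two bytes still depends on the byte order of the target;
    only the round trip through both functions is the same everywhere.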

    --
    James Kanze (GABI Software) email:
    Conseils en informatique orientée objet/
    Beratung in objektorientierter Datenverarbeitung
    9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
     
    James Kanze, Dec 11, 2007
    #12
  13. "Victor Bazarov" <> wrote in
    news:fjkpd9$g20$:

    > That is your mistake. The language explicitly states that you only
    > can read the same data you wrote. You cannot write byte and then
    > read word.



    Although the Standard does not define the behaviour of what happens
    when you write to one union member and then read from a different one,
    it certainly does not restrict the implementation from defining the
    behaviour.

    I've never come across a system where this union practice didn't do
    exactly what you want it to do. Never.

    I doubt you'll be sacrificing portability if you go ahead with this
    method. If you want your code to get the 100% portable stamp of approval
    tho, you might consider finding a different way of doing it.


    > That's not what the unions are for. To accomplish that
    > you need to 'static_cast' your word into an array of char and then
    > change each char as you want.



    If I wanted to do something like:

    union Foo {
    char unsigned bytes[sizeof(long)];
    long unsigned x;
    };

    Foo bar;

    bar.x = 27892;
    bar.bytes[0] = 5;

    , then I'd probably do something like:


    struct Foo {
    long unsigned x;

    char unsigned *const bytes;

    Foo() : bytes(reinterpret_cast<char unsigned*>(&x)) {}
    };


    Unfortunately tho this increases the size of Foo... plus you can no
    longer move the object in memory... and it's also not a POD... plus
    sizeof bytes won't give you what you want.

    The thing is tho, that if you're making assumptions about the amount of
    bytes in a certain integer type, and also about whether there's padding
    bits in those integer types, then portability's already been thrown out
    the window, so I'd say just go with the union method.
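
    For what it's worth, the stored pointer can be avoided by computing the
    byte view on demand, which addresses the size and relocation drawbacks
    listed above. This is only a sketch of that variation; the bytes() and
    byte_count() member names are hypothetical.

    ```cpp
    #include <cstddef>

    // Variant of Foo that stores no pointer: the byte view is derived from &x each time.
    struct Foo {
        unsigned long x;

        unsigned char*       bytes()       { return reinterpret_cast<unsigned char*>(&x); }
        const unsigned char* bytes() const { return reinterpret_cast<const unsigned char*>(&x); }

        // Number of bytes reachable through bytes().
        static std::size_t byte_count() { return sizeof(unsigned long); }
    };
    ```

    Usage would look like bar.bytes()[0] = 5; and the object can be copied or
    moved freely because nothing inside it points back at itself.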

    --
    Tomás Ó hÉilidhe
     
    Tomás Ó hÉilidhe, Dec 12, 2007
    #13
  14. "Victor Bazarov" <> wrote in news:fjl8m6$mhg$1
    @news.datemas.de:

    >> Please explain why you think that union is not to be used.

    >
    > I don't have to. The language Standard forbids it.



    I don't see anything about a constraint violation in the Standard, nor do I
    see any explicit forbidding of the implementation to define the behaviour
    of what happens when you use unions like this.

    --
    Tomás Ó hÉilidhe
     
    Tomás Ó hÉilidhe, Dec 12, 2007
    #14
  15. Bryan Parkoff

    red floyd Guest

    Tomás Ó hÉilidhe wrote:
    > "Victor Bazarov" <> wrote in
    > news:fjkpd9$g20$:
    >
    >> That is your mistake. The language explicitly states that you only
    >> can read the same data you wrote. You cannot write byte and then
    >> read word.

    >
    >
    > Although the Standard does not the define the behaviour of what happens
    > when you write to one union member and then read from a different one,
    > it certainly does not restrict the implementation from defining the
    > behaviour.
    >
    > I've never come across a system where this union practice didn't do
    > exactly what you want it to do. Never.
    >
    > I doubt you'll be sacrificing portability if you go ahead with this
    > method. If you want your code to get the 100% portable stamp of approval
    > tho, you might consider finding a different way of doing it.
    >


    I doubt the OP is even concerned about portability. Looks like he's
    trying to map the x86 register set.
     
    red floyd, Dec 12, 2007
    #15
  16. "red floyd" <> wrote in message
    news:DQI7j.5552$...
    > Tomás Ó hÉilidhe wrote:
    >> "Victor Bazarov" <> wrote in
    >> news:fjkpd9$g20$:
    >>> That is your mistake. The language explicitly states that you only
    >>> can read the same data you wrote. You cannot write byte and then
    >>> read word.

    >>
    >>
    >> Although the Standard does not the define the behaviour of what happens
    >> when you write to one union member and then read from a different one, it
    >> certainly does not restrict the implementation from defining the
    >> behaviour.
    >>
    >> I've never come across a system where this union practice didn't do
    >> exactly what you want it to do. Never.
    >>
    >> I doubt you'll be sacrificing portability if you go ahead with this
    >> method. If you want your code to get the 100% portable stamp of approval
    >> tho, you might consider finding a different way of doing it.
    >>

    >
    > I doubt the OP is even concerned about portability. Looks like he's
    > trying to map the x86 register set.


    I prefer to keep my source code conforming to ANSI C/C++ so that it is
    portable. Sometimes Microsoft Visual C++ 9.0 has extensions that a standard
    C/C++ compiler does not accept. For example, you use the comments "//" and
    "/* */" in file.c (not file.cpp). A standard C compiler does not accept
    "//". I have to live with standard C/C++ compilers because I want to port
    my source code from the Microsoft C/C++ compiler to GNU to Mac OS X to
    Linux to Unix, etc.
    What do you think? You try to map the x86 register set. Then how do you
    map the register set on a non-x86 machine? Maybe some machines do not have
    four sizes sharing one register, meaning the register set does not accept
    byte, word, and dword. It does accept qword. You are still able to use
    char, word, dword, and qword in C++ source code; the C/C++ compiler will
    then convert all sizes to the 64-bit size, using a mask such as "AND" to
    clear all the upper bits.
    What choice do I have? Should I use a union if I want to share four sizes
    in one 64-bit variable? Should I avoid the union and use the big size with
    "AND" and right shift? Try to compare the two examples below.

    union _Size
    {
    struct _B
    {
    unsigned char L;
    unsigned char H;
    } B;
    unsigned short W;
    } Size;

    // Example 1 -- I want to modify only word before I read two individual bytes.
    Size.B.L = 0xFF;
    Size.B.H = 0x20;
    Size.W = 0x0A; // 1 bit as carry fell off Size.B.L and add carry to Size.B.H
    unsigned char L = Size.B.L;
    unsigned char H = Size.B.H;

    // Example 2 -- I want to avoid "AND" and right shift.
    Size.B.L = 0xFF;
    Size.B.H = 0x20;
    Size.W = 0x0A; // 1 bit as carry fell off Size.B.L and add carry to Size.B.H
    unsigned char L = Size.W & 0xFF;
    unsigned char H = Size.W >> 8;

    Reading the two bytes directly instead of using "AND" and right shift may
    take only two machine-language instructions. Using "AND" and right shift
    may take more than two instructions. You have to decide which of example 1
    or 2 is best for you. You can tell the C/C++ compiler to optimize and test
    which is faster. I should port this to a non-x86 machine and test it.
    Please state your opinion. Should I use "AND" and shift? Should I use a
    union? Which is best practice when writing C++?

    Bryan Parkoff
     
    Bryan Parkoff, Dec 12, 2007
    #16
  17. Bryan Parkoff

    James Kanze Guest

    On Dec 12, 2:46 am, "Tomás Ó hÉilidhe" <> wrote:
    > "Victor Bazarov" <> wrote
    > innews:fjkpd9$g20$:
    > > That is your mistake. The language explicitly states that
    > > you only can read the same data you wrote. You cannot write
    > > byte and then read word.


    > Although the Standard does not the define the behavior of
    > what happens when you write to one union member and then read
    > from a different one, it certainly does not restrict the
    > implementation from defining the behavior.


    > I've never come across a system where this union practice
    > didn't do exactly what you want it to do. Never.


    Really. G++ does define it (I think), but it's about the only
    one that does. I've had real problems with it in the
    past---with the Microsoft C compiler.

    Note that depending on the optimization options, and what you
    are doing around it, it may also seem to work even though the
    compiler doesn't guarantee it. Seeming to work is one of the
    possible behaviors of undefined behavior. And it may stop
    working as a result of modifying some totally unrelated
    statement. (That was, in fact, the behavior I encountered with
    Microsoft C.)

    > I doubt you'll be sacrificing portability if you go ahead with
    > this method.


    You definitely will be. In practice, as well as in theory.

    --
    James Kanze (GABI Software) email:
    Conseils en informatique orientée objet/
    Beratung in objektorientierter Datenverarbeitung
    9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
     
    James Kanze, Dec 12, 2007
    #17
  18. Bryan Parkoff

    James Kanze Guest

    On Dec 12, 5:46 am, "Bryan Parkoff" <> wrote:

    > I prefer to make sure to standardize my source code to ANSI C/C++ and
    > accepts portability.


    Regretfully, conforming to the relevant ISO standard doesn't
    guarantee portability.

    > Sometimes, Microsoft Visual C++ 9.0 has extended standard that
    > standard C/C++ Compiler does not accept. For example, you use
    > comment "//" and "/* */" on file.c (not file.cpp). Standard C
    > Compiler does not accept "//".


    If they conform to the C standard, they do. "//" is just as
    valid in C as in C++.

    > What choice do I have? Should I use union if I want to share
    > four sizes into one 64 bit variable?


    Of course. But of course, portably, you can only access the
    last value written. Portably, it can't be made to work
    otherwise anyway---type punning, even when it works, is never
    portable.

    > Should I avoid union and use big size using "AND"
    > and right shift? Try to compare two examples below.


    > union _Size
    > {
    > struct _B
    > {
    > unsigned char L;
    > unsigned char H;
    > } B;
    > unsigned short W;
    > } Size;


    > // Example 1 == I want to modify only word before I read two individual
    > bytes.
    > Size.B.L = 0xFF;
    > Size.B.H = 0x20;
    > Size.W = 0x0A; // 1 bit as carry fell off Size.B.L and add carry to Size..B.H
    > unsigned char L = Size.B.L;
    > unsigned char H = Size.B.H;


    The problem is that even if the type punning worked, the values
    you get in L and H will vary. You're not even really guaranteed
    that one will be 0x0A, and the other 0x00 (although it's hard to
    imagine an implementation where this wouldn't be the case).

    (Also, I can't make any sense of your comment.)

    > // Example 2 -- I want to avoid "AND" and right shift.
    > Size.B.L = 0xFF;
    > Size.B.H = 0x20;
    > Size.W = 0x0A; // 1 bit as carry fell off Size.B.L and add carry to Size..B.H
    > unsigned char L = Size.W & 0xFF;
    > unsigned char H = Size.W >> 8;


    > Read two bytes directly instead of "AND" and right shift can only have
    > two instructions of machine language.


    I would expect just about any reasonable compiler to generate
    the same code for your two examples. And if the code is
    different, the second will probably be faster, since it will
    only require one memory read.

    > Using "AND" and right shift may have more than two
    > instructions.


    Or not. It depends on the compiler and the architecture.
    Masking with 0xFF and shifting right 8 bits are common enough
    idioms for accessing bytes that the compiler will recognize
    them, and generate the byte access instructions, if that is the
    fastest way to do it.
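
    The mask-and-shift idiom can also be wrapped so that call sites use a
    single name per part, which is roughly what the original question asked
    for. This is a minimal sketch, assuming C++11 for <cstdint>; Reg32 is a
    hypothetical class whose member names mirror the AA / AAW / AAH / AAL
    names from the first post.

    ```cpp
    #include <cstdint>

    // One 32-bit value with accessors for its low word and its two low bytes.
    class Reg32 {
    public:
        Reg32() : value_(0) {}

        std::uint32_t AA()  const { return value_; }
        std::uint16_t AAW() const { return static_cast<std::uint16_t>(value_ & 0xFFFFu); }
        std::uint8_t  AAH() const { return static_cast<std::uint8_t>((value_ >> 8) & 0xFFu); }
        std::uint8_t  AAL() const { return static_cast<std::uint8_t>(value_ & 0xFFu); }

        void setAA(std::uint32_t v)  { value_ = v; }
        void setAAW(std::uint16_t v) { value_ = (value_ & 0xFFFF0000u) | v; }
        void setAAH(std::uint8_t v)  { value_ = (value_ & 0xFFFF00FFu) | (static_cast<std::uint32_t>(v) << 8); }
        void setAAL(std::uint8_t v)  { value_ = (value_ & 0xFFFFFF00u) | v; }

    private:
        std::uint32_t value_;
    };
    ```

    Because everything is done on the integer value, the result does not
    depend on byte order, and no union member is read after a different one
    was written.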

    > You have to decide which example 1 or 2 is best
    > for you. You tell C/C++ Compiler to test optimization and see which is
    > faster. I should use this to be ported to non-x86 machine and test it.
    > Please state your opinion. Should I use "AND" and shift? Should I use
    > union? Which is best practice writing C++?


    Please state what you are trying to accomplish. Until we know
    that, we can't very well say what the best way to do it is.

    --
    James Kanze (GABI Software) email:
    Conseils en informatique orientée objet/
    Beratung in objektorientierter Datenverarbeitung
    9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
     
    James Kanze, Dec 12, 2007
    #18
  19. Bryan Parkoff

    Rolf Magnus Guest

    Bryan Parkoff wrote:

    >>> I doubt you'll be sacrificing portability if you go ahead with this
    >>> method. If you want your code to get the 100% portable stamp of approval
    >>> tho, you might consider finding a different way of doing it.
    >>>

    >>
    >> I doubt the OP is even concerned about portability. Looks like he's
    >> trying to map the x86 register set.


    You could separate portability into several parts: The hardware
    architecture, the operating system and the compiler. If we are talking
    about x86, the hardware is pretty much fixed, but there are still different
    compilers that may handle things differently.

    > I prefer to make sure to standardize my source code to ANSI C/C++ and
    > accepts portability.


    That's basically a good idea, but for those parts that are as low-level as
    CPU registers, you can't get, nor do you need, full portability.

    > Sometimes, Microsoft Visual C++ 9.0 has extended
    > standard that standard C/C++ Compiler does not accept. For example, you
    > use
    > comment "//" and "/* */" on file.c (not file.cpp). Standard C Compiler
    > does not accept "//".


    This has actually been part of standard C since C99, i.e. for about 8 years now.

    > How do you think? You try to map x86 register set. Then, how do you
    > map register set on non-x86 machine? Maybe some machine do not have four
    > sizes to be shared on one register meaning register set does not accept
    > byte, word, and dword.


    x86 is the only architecture I have heard of that does this. There are many
    differences between CPU architectures concerning registers. Many other
    architectures have a lot more registers, but some of those might have
    special meaning. Other architectures don't even have registers. And there
    are architectures which can dynamically switch between several modes, with
    register sets behaving differently depending on the mode. Hardware
    registers are as unportable as it gets.

    > It does accept qword. Then, you are able to use char, word, dword, and
    > qword on C++ source code. Then, C/C++ Compiler will convert all sizes to
    > 64-bit size using mask to clear all upper bits like "AND".
    > What choice do I have? Should I use union if I want to share four
    > sizes into one 64 bit variable? Should I avoid union and use big size
    > using "AND" and right shift? Try to compare two examples below.


    If you want to use 64 bit values, you already have to sacrifice portability.
    In standard C++, there is no portable type that is guaranteed to be 64 bits
    wide. Most compilers seem to offer such a type, but under different
    compiler-specific names.
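
    For comparison, a later standard than the one discussed here (C++11) added `<cstdint>`, which provides `std::uint64_t` on platforms that have such a type. A minimal sketch, assuming a C++11 compiler; the helper names `low8`, `low16` and `low32` are invented for illustration:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers: view the low 8, 16 and 32 bits of a 64-bit value
// using masks only, with no union involved.
inline std::uint8_t low8(std::uint64_t v)
{
    return static_cast<std::uint8_t>(v & 0xFFu);
}

inline std::uint16_t low16(std::uint64_t v)
{
    return static_cast<std::uint16_t>(v & 0xFFFFu);
}

inline std::uint32_t low32(std::uint64_t v)
{
    return static_cast<std::uint32_t>(v & 0xFFFFFFFFu);
}

// Small usage check on a fixed 64-bit pattern.
inline void check_low_parts()
{
    const std::uint64_t v = 0x1122334455667788ULL;
    assert(low8(v) == 0x88u);
    assert(low16(v) == 0x7788u);
    assert(low32(v) == 0x55667788u);
}
```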

    > union _Size
    > {
    > struct _B
    > {
    > unsigned char L;
    > unsigned char H;
    > } B;
    > unsigned short W;
    > } Size;


    Don't use names starting with an underscore followed by an uppercase letter.
    Those are reserved for the compiler/standard library.
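
    A sketch of the same union with non-reserved names might look like this. `Size16`, `Bytes`, `low`, `high` and `word` are placeholder names chosen for the example, not names used anywhere in the thread.

```cpp
#include <cassert>

// Same shape as the quoted union, but without identifiers that begin with
// an underscore followed by an uppercase letter.
union Size16
{
    struct Bytes
    {
        unsigned char low;
        unsigned char high;
    } bytes;
    unsigned short word;
};

inline void check_size16()
{
    Size16 s;
    s.bytes.low = 0xFF;
    s.bytes.high = 0x20;
    // Only the member that was written last is read back here.
    assert(s.bytes.low == 0xFF);
    assert(s.bytes.high == 0x20);
}
```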

    > // Example 2 -- I want to avoid "AND" and right shift.
    > Size.B.L = 0xFF;
    > Size.B.H = 0x20;
    > Size.W = 0x0A; // 1 bit as carry fell off Size.B.L and add carry to Size.B.H
    > unsigned char L = Size.W & 0xFF;
    > unsigned char H = Size.W >> 8;
    >
    > Read two bytes directly instead of "AND" and right shift can only have
    > two instructions of machine language. Using "AND" and right shift may
    > have more than two instructions.


    It may or may not, depending on the optimization capabilities of your
    compiler. Don't speculate about what the compiler might produce from your
    code. If you want to know, look at the assembler output. It might surprise
    you. Also, there is still the cast option. Unions are not the language
    element that is meant to be used for this kind of thing.
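
    One possible reading of "the cast option" is to inspect the object's bytes through an `unsigned char` pointer. The sketch below is only an assumption about what is meant; `byte_at` is a made-up helper, and the code assumes `unsigned short` occupies exactly two 8-bit bytes.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: read byte i of a 16-bit value through an
// unsigned char pointer. Which byte comes first depends on the machine's
// byte order, so the result differs between architectures.
inline unsigned char byte_at(const unsigned short& w, std::size_t i)
{
    return reinterpret_cast<const unsigned char*>(&w)[i];
}

inline void check_byte_at()
{
    const unsigned short w = 0x20FF;
    const unsigned char b0 = byte_at(w, 0);
    const unsigned char b1 = byte_at(w, 1);
    // Little-endian machines give FF 20, big-endian machines give 20 FF.
    assert((b0 == 0xFF && b1 == 0x20) || (b0 == 0x20 && b1 == 0xFF));
}
```

    Unlike mask and shift, this exposes the memory layout, so code that relies on a particular byte ending up at index 0 is tied to one byte order.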
     
    Rolf Magnus, Dec 12, 2007
    #19
  20. Bryan Parkoff

    Pete Becker Guest

    On 2007-12-11 22:46:36 -0500, red floyd said:

    >
    > I doubt the OP is even concerned about portability. Looks like he's
    > trying to map the x86 register set.


    Even so, portability may be a legitimate concern. There is more than
    one compiler that targets the x86.

    --
    Pete
    Roundhouse Consulting, Ltd. (www.versatilecoding.com) Author of "The
    Standard C++ Library Extensions: a Tutorial and Reference"
    (www.petebecker.com/tr1book)
     
    Pete Becker, Dec 12, 2007
    #20