Macro for setting MSB - Intended to work on both Little and Big-endian machines

Discussion in 'C Programming' started by Myth__Buster, Mar 26, 2013.

  1. Myth__Buster

    Myth__Buster Guest

    Hi All,

    Here is my attempt at setting the MSB of an integer depending on whether the underlying machine is little- or big-endian. Any comments/suggestions/views are appreciated.

    Here I have assumed that, even though I don't store 1ULL in a variable in my program (the LL, for long long, forces 1 to be held in a multi-byte memory resource, say a register), it will be accessed as a multi-byte value. Hence the 1 will be stored in the LSB of the most-significant byte of that multi-byte resource (register) and not in the LSB of its least-significant byte. Please let me know if this is correct.

    <Code>

    #include <stdio.h>
    #include <limits.h>

    #define LSET_MSB(x) ((x) = (x) | 1ULL << (sizeof(x) * CHAR_BIT - 1))

    #define BSET_MSB(x) ((x) = (x) | 1ULL << (CHAR_BIT - 1))

    #define LIITE_ENDIAN (1ULL & 1)

    int main(void)
    {
        unsigned long long int x = 1;

        printf("x : %llu\n", x);
        printf("x : %#llx\n", x);

        if ( LIITE_ENDIAN )
        {
            printf("Little\n");
            LSET_MSB(x);
        }
        else
        {
            printf("Big\n");
            BSET_MSB(x);
        }

        printf("x : %llu\n", x);
        printf("x : %#llx\n", x);

        return 0;
    }

    </Code>

    <OutputOnMyMachine>
    x : 1
    x : 0x1
    Little
    x : 9223372036854775809
    x : 0x8000000000000001
    </OutputOnMyMachine>

    Cheers,
    Raghavan
     
    Myth__Buster, Mar 26, 2013
    #1

  2. Myth__Buster

    Myth__Buster Guest

    Here I have assumed that, even though I don't store 1ULL in a variable in my program (the LL, for long long, forces 1 to be held in a multi-byte memory resource, say a register), it will be accessed as a multi-byte value. Hence the 1 will be stored in the LSB of the most-significant byte of that multi-byte resource (register) and not in the LSB of its least-significant byte. Please let me know if this is correct.

    ----

    In the above, I am referring to a big-endian machine and not little-endian.
     
    Myth__Buster, Mar 26, 2013
    #2

  3. Myth__Buster

    Myth__Buster Guest

    By the way, I know that bitwise operations in C are independent of the underlying machine's endian-ness, and that's why I have attempted to figure out what that endian-ness is.

    Now, please note that the paragraph dealing with 1LL (or rather 1ULL) in the corrected post is written with big-endian in mind, but I forgot to mention it.

    Okay, here is why I thought (1ULL & 1) in C would mean something different from the usual and-ing of two 1's: 1ULL, because of its type LL (long long), is guaranteed to be large enough to need more than a byte of storage. So, for a given big-endian machine where long long is 8 bytes wide, I expected 1ULL to be stored (in some memory resource, say a register, even for temporary or intermediate usages) with the least-significant bit (LS-bit) of its most-significant byte (MS-byte) in memory set, given that the LS-byte will be stored at a higher address on such a machine, unlike on a little-endian machine.

    However, now I realize that C abstracts the way in which 1ULL would be stored in memory, and 1ULL will just mean the number 1 regardless of the underlying endian-ness.
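To illustrate the point about values versus storage, here is a minimal sketch of setting the top bit using only a shift on the value. It assumes an unsigned type with no padding bits, and the names set_top_bit and SET_MSB are hypothetical (they are not from the original post). No endianness test appears because the shift operates on the value, not on its bytes in memory.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper (not from the thread): returns v with bit
   (width_bits - 1) set. Shifts work on values, so the byte order
   used in memory never enters into it. */
static unsigned long long set_top_bit(unsigned long long v, unsigned width_bits)
{
    assert(width_bits >= 1 && width_bits <= sizeof v * CHAR_BIT);
    return v | (1ULL << (width_bits - 1));
}

/* Sets the most significant bit of an unsigned lvalue x.
   Assumes the type of x has no padding bits; evaluates x twice. */
#define SET_MSB(x) ((x) = set_top_bit((x), (unsigned)(sizeof(x) * CHAR_BIT)))
```

Applied to an unsigned long long holding 1 on a machine with a 64-bit long long, this should give 0x8000000000000001, the same value the original program printed, whatever the byte order.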

    The above confusion came up in my mind as I was comparing the most obvious way of checking out a machine's endian-ness as under with some other supposed way.

    // Except that I have an explicit variable(actually, its memory location) to
    // hold and represent the value 1, there is no difference from the numeral-
    // literal 1 which would anyway be held in a temporary register for such
    // operations.

    int x = 1;
    if ( *(char *)&x == 1 )
    {
        printf("Little-endian\n");
    }

    // How about this? I think this and above should be same, isn't it?

    int x = 1;
    if ( x & 1 )
    {
        printf("Little-endian\n");
    }

    ---

    However, I see that (x & 1) on big-endian clearly abstracts the way 1 is laid out in memory and just takes the LS-bit of MS-byte and and-s with numerical 1 as if 1 was at the LS-bit of MS-byte like on little-endian machine.


    In essence, I just realized that when you deal with memory directly through pointers to variables in C, you can get close to the endian-ness of the underlying hardware, but not when you deal with the variables just by their value.

    In the end, it's all about direct vs indirect access! Good.
     
    Myth__Buster, Mar 26, 2013
    #3
  4. Myth__Buster

    Myth__Buster Guest

    *was at the LS-bit of LS-byte like on little-endian machine . . .
     
    Myth__Buster, Mar 26, 2013
    #4
  5. Myth__Buster

    Myth__Buster Guest

    > . . . that might be
    >
    > ((byte*)&x)[0], ((byte*)&x)[3], or ((byte*)&x)[1]
    >


    Yeah, agreed and I have just realized what I am doing exactly which I have written in my latest post above.

    But, how can it be ((byte*)&x)[1]? This is possible if the machine is neither little- nor big-endian but some mixed or different one altogether, right? And why not ((byte*)&x)[2]? Isn't there a machine which would give this in this context?
     
    Myth__Buster, Mar 26, 2013
    #5
  6. James Kuyper

    James Kuyper Guest

    Re: Macro for setting MSB - Intended to work on both Little and Big-endian machines

    On 03/26/2013 07:11 AM, Myth__Buster wrote:
    > By the way, I know the bitwise operations in C are independent of
    > underlying machine's endian-ness and that's why I have attempted to
    > figure out what's that endian-ness.
    >
    > Now, please note that the paragraph dealing with 1LL rather 1ULL in
    > the corrected post is written with big-endian in mind but I forgot to
    > mention it.
    >
    > Okay, here is why I thought (1ULL & 1) in C would mean different from
    > the usual or-ring two 1's: 1ULL, because of its type LL(long long)
    > being guaranteed to be large enough to demand more than a byte to get
    > stored. So, for a given big-endian machine where long long is 8 bytes
    > wide, 1ULL will be stored(in some memory resource say register, even
    > for temporary or intermediate usages) as the
    > least-significant-bit(LS-bit) of its most-significant-byte(MS-byte)


    Endianness is detectable in portable C code only when a value is stored
    in an object. That's because the only way to determine the endianness is
    to access the individual bytes of the object, which requires use of a
    union or type-punning. There's nothing you can portably do to determine
    the endianness of a register, because you can't take the address of a
    register, and you can't force a compiler to put a union object in a
    register (the 'register' keyword is just a suggestion, which the
    compiler is free to ignore).
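The two access routes mentioned above (a union, or a pointer to the object's bytes) can be sketched as follows. This is only an illustration; the function names are hypothetical, and a result of 1 means only that the least significant byte is stored first, which is not by itself proof of a little-endian layout.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns the byte at the lowest address of an
   object. Any object may be inspected through unsigned char *. */
static unsigned char lowest_address_byte(const void *obj, size_t size)
{
    assert(obj != NULL && size > 0);
    return *(const unsigned char *)obj;
}

/* Pointer route: 1 if the least significant byte of an unsigned int
   is stored at the lowest address. */
static int lsb_stored_first(void)
{
    unsigned int x = 1;
    return lowest_address_byte(&x, sizeof x) == 1;
}

/* Union route: the same question, asked through a union member. */
static int lsb_stored_first_union(void)
{
    union {
        unsigned int  value;
        unsigned char bytes[sizeof(unsigned int)];
    } u;

    u.value = 1;
    return u.bytes[0] == 1;
}
```

Both functions need an object with an address; neither can be written against the bare constant 1.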

    I don't understand why you think the least significant bit would be
    residing in the most significant byte. Whatever the reason is for that
    expectation, it's incorrect.

    > in memory being set given that LS-byte will be stored at an higher
    > address on such a machine unlike on a little-endian machine.
    >
    > However, now I realize that C abstracts the way in which 1ULL would
    > be stored in memory and 1ULL will just mean number 1 regardless of
    > the regardless of the underlying endian-ness.
    >
    > The above confusion came up in my mind as I was comparing the most
    > obvious way of checking out a machine's endian-ness as under with
    > some other supposed way.
    >
    > // Except that I have an explicit variable(actually, its memory
    > // location) to hold and represent the value 1, there is no
    > // difference from the numeral-literal 1 which would anyway be held
    > // in a temporary register for such operations.


    Actually, there's one huge and highly relevant difference. You can use
    the expression &x when you have a variable, whereas &1 is a syntax
    error. Without an address that can be converted to char*, there's no way
    to test endianness.

    >
    > int x = 1;
    > if ( *(char *)&x == 1 )
    > {
    > printf("Little-endian\n");
    > }


    <pedantic>
    Keep in mind that if sizeof(int)==4, which is quite common nowadays,
    there are 4! = 24 different possible byte orders (most of which are
    exceedingly uncommon). Only one of those orderings is called
    little-endian, and only one is called big-endian - the others are
    generically called middle-endian. At least two of those other orders
    (2143 and 3412) have actually been used. I once ran into a web page that
    identified 11 of those 24 orders as having been used in specific
    contexts, which it identified - but I didn't think to bookmark it, and
    I've never been able to find it again.

    Your test will identify 6 of those possible orders as little-endian,
    only one of which actually is. Strictly speaking, you can identify an
    ordering as little endian only by checking the first sizeof(int)-1 bytes.
    </pedantic>
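A rough sketch of what checking every byte, rather than only the first, might look like for a 4-byte integer. It assumes 8-bit bytes and that uint32_t is available; the enum and function names are hypothetical. The third pattern is the layout commonly attributed to 32-bit values on the PDP-11.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

_Static_assert(sizeof(uint32_t) == 4, "sketch assumes 8-bit bytes");

enum byte_order { ORDER_LITTLE, ORDER_BIG, ORDER_PDP, ORDER_OTHER };

/* bytes[i] is the byte found at offset i after storing 0x04030201,
   so each byte's value is its significance rank (1 = least
   significant). All four bytes are compared, not just the first. */
static enum byte_order classify_bytes(const unsigned char bytes[4])
{
    static const unsigned char little[4] = {1, 2, 3, 4};
    static const unsigned char big[4]    = {4, 3, 2, 1};
    static const unsigned char pdp[4]    = {3, 4, 1, 2};

    assert(bytes != NULL);
    if (memcmp(bytes, little, 4) == 0) return ORDER_LITTLE;
    if (memcmp(bytes, big, 4) == 0)    return ORDER_BIG;
    if (memcmp(bytes, pdp, 4) == 0)    return ORDER_PDP;
    return ORDER_OTHER;
}

/* Classifies the native in-memory layout of a uint32_t. */
static enum byte_order native_order_u32(void)
{
    uint32_t probe = UINT32_C(0x04030201);
    unsigned char bytes[4];

    memcpy(bytes, &probe, sizeof bytes);
    return classify_bytes(bytes);
}
```

A layout such as {1, 3, 2, 4} starts with the least significant byte, so the single-byte test would call it little-endian, while this sketch reports ORDER_OTHER.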

    <more pedantic>
    The standard doesn't require that the least significant bit of an 'int'
    be in the same location as the least significant bit when the byte that
    contains it is interpreted as unsigned char. It is extremely unlikely
    that this issue will ever come up.
    </more pedantic>

    <even more pedantic>
    The standard doesn't even require that the CHAR_BIT least significant
    bits are all stored in the same byte. In principle, an 8-bit byte could
    contain bits 0,4,8,12,16,20,24, and 28. The next byte could contain bits
    1,5,9,13,17,21,25, and 29, etc. There's a total of 64! different
    possible bit-orderings. However, I think you can fairly assume that any
    system for which such things are true was designed by aliens. :)
    </even more pedantic>

    > // How about this? I think this and above should be same, isn't it?
    >
    > int x = 1;
    > if ( x & 1 )


    Given x==1, the expression x&1 is guaranteed to be true regardless of
    endianness (even in the most pedantic case I discussed above), which
    makes it a very bad way to test for endianness.

    > {
    > printf("Little-endian\n");
    > }
    >
    > ---
    >
    > However, I see that (x & 1) on big-endian clearly abstracts the way 1
    > is laid out in memory and just takes the LS-bit of MS-byte and and-s
    > with numerical 1 as if 1 was at the LS-bit of MS-byte like on
    > little-endian machine.


    If 'int' is big-endian, the least significant bit will be in the last
    byte, which is the least significant byte when using a big-endian
    representation. If 'int' is little-endian, that bit will be in the first
    byte, which is the least significant byte when using a little-endian
    representation. But either way, it will reside in the least-significant
    byte, not the most significant one.

    > In essence, I just realized when you deal with memory directly with
    > pointers to variables in C, you can get close to the endian-ness of
    > the underlying hardware but not when you deal with the variables just
    > by their value.

    --
    James Kuyper
     
    James Kuyper, Mar 26, 2013
    #6
  7. Myth__Buster

    Guest

    On Tuesday, 26 March 2013 11:11:21 UTC, Myth__Buster wrote:
    > int x = 1;
    > if ( *(char *)&x == 1 )
    > {
    > printf("Little-endian\n");
    > }


    For portability, you will also need

    assert(sizeof x > 1);

    There are many architectures out there (some TI DSPs spring to mind) where char and int share the same size. When that is the case, your test will not detect big-endianness.
     
    , Mar 26, 2013
    #7
  8. Re: Macro for setting MSB - Intended to work on both Little and Big-endian machines

    Myth__Buster <> writes:
    > *was at the LS-bit of LS-byte like on little-endian machine . . .


    Without any quoted context, we can't tell what this refers to. Not all
    newsreaders provide an easy way to view the parent article; mine does,
    but it doesn't seem to provide an easy way to get back, so I don't use
    it much. Please quote enough of the parent article for your followup to
    make sense.

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    Working, but not speaking, for JetHead Development, Inc.
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
     
    Keith Thompson, Mar 26, 2013
    #8
  9. Re: Macro for setting MSB - Intended to work on both Little and Big-endian machines

    China Blue White <> writes:
    > In article <>,
    > Myth__Buster <> wrote:
    >
    >> > . . . that might be
    >> >
    >> > ((byte*)&x)[0], ((byte*)&x)[3], or ((byte*)&x)[1]
    >> >

    >>
    >> Yeah, agreed and I have just realized what I am doing exactly which I
    >> have written in my latest post above.
    >>
    >> But, how can it be ((byte*)&x)[1]? This is possible if the machine is
    >> neither little nor big-endian but some mixed or different one
    >> altogether, right? And why not ((byte*)&x)[2], isn't there a machine
    >> which would give this in this context.

    >
    > It's PDP-11/VAX endianess where the byte order is <1><msb><lsb><2>.


    The PDP-11 used a middle-endian representation, but I don't believe the
    VAX ever did.

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    Working, but not speaking, for JetHead Development, Inc.
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
     
    Keith Thompson, Mar 26, 2013
    #9
  10. Myth__Buster

    Myth__Buster Guest

    On Tuesday, March 26, 2013 8:29:00 AM UTC-4, wrote:
    > On Tuesday, 26 March 2013 11:11:21 UTC, Myth__Buster wrote:
    > > int x = 1;
    > > if ( *(char *)&x == 1 )
    > > {
    > > printf("Little-endian\n");
    > > }
    >
    > For portability, you will also need
    >
    > assert(sizeof x > 1);
    >
    > There are many architectures out there (some TI DSPs spring to mind) where char and int share the same size. When that is the case, your test will not detect big-endianness.


    Well, regardless of the size of char in terms of bits, and hence the value of CHAR_BIT, sizeof(char) is guaranteed to be 1. But CHAR_BIT can be big enough to represent the largest integer data type in C, long long; i.e., sizeof(long long) == 1 is possible. If that is the case, there would be no portable way in C of checking the endian-ness of such a machine. In fact, with respect to integers, endian-ness has no role on such a machine, as even the largest integer data type has no more than one byte whose order in memory could vary.
     
    Myth__Buster, Mar 26, 2013
    #10
  11. Myth__Buster

    Myth__Buster Guest

    On Tuesday, March 26, 2013 8:11:40 AM UTC-4, James Kuyper wrote:
    > On 03/26/2013 07:11 AM, Myth__Buster wrote:
    > > By the way, I know the bitwise operations in C are independent of
    > > underlying machine's endian-ness and that's why I have attempted to
    > > figure out what's that endian-ness.
    > >
    > > Now, please note that the paragraph dealing with 1LL rather 1ULL in
    > > the corrected post is written with big-endian in mind but I forgot to
    > > mention it.
    > >
    > > Okay, here is why I thought (1ULL & 1) in C would mean different from
    > > the usual or-ring two 1's: 1ULL, because of its type LL(long long)
    > > being guaranteed to be large enough to demand more than a byte to get
    > > stored. So, for a given big-endian machine where long long is 8 bytes
    > > wide, 1ULL will be stored(in some memory resource say register, even
    > > for temporary or intermediate usages) as the
    > > least-significant-bit(LS-bit) of its most-significant-byte(MS-byte)
    >
    > Endianness is detectable in portable C code only when a value is stored
    > in an object. That's because the only way to determine the endianness is
    > to access the individual bytes of the object, which requires use of a
    > union or type-punning.


    Yes, I realized after writing the opening post of this thread that we really need to deal with memory identified by its address to figure out the byte-ordering. I commented on this in my follow-up post above.

    > There's nothing you can portably do to determine
    > the endianness of a register, because you can't take the address of a
    > register, and you can't force a compiler to put a union object in a
    > register (the 'register' keyword is just a suggestion, which the
    > compiler is free to ignore).


    Yes, I am aware of the fact that registers are not addressable and that the 'register' keyword is not a command but a request. However, I thought that compile-time constants such as 1ULL would also be held in a temporary memory location in big-endian representation before being moved to a register for any operation using that numerical constant.

    > I don't understand why you think the least significant bit would be
    > residing in the most significant byte. Whatever the reason is for that
    > expectation, it's incorrect.


    Well, I should have said, 'most-significant-address' instead of most-significant-byte. Sorry for the confusion.

    > > in memory being set given that LS-byte will be stored at an higher
    > > address on such a machine unlike on a little-endian machine.
    > >
    > > However, now I realize that C abstracts the way in which 1ULL would
    > > be stored in memory and 1ULL will just mean number 1 regardless of
    > > the underlying endian-ness.
    > >
    > > The above confusion came up in my mind as I was comparing the most
    > > obvious way of checking out a machine's endian-ness as under with
    > > some other supposed way.
    > >
    > > // Except that I have an explicit variable(actually, its memory
    > > // location) to hold and represent the value 1, there is no
    > > // difference from the numeral-literal 1 which would anyway be held
    > > // in a temporary register for such operations.
    >
    > Actually, there's one huge and highly relevant difference. You can use
    > the expression &x when you have a variable, whereas &1 is a syntax
    > error.


    Yes, I know that &1 doesn't make any sense, as the '&' operator needs an lvalue (location value) to operate on, which the compile-time numerical constant 1 is not associated with. But I didn't mention it here since I didn't think it was really necessary. In fact, I can go on mentioning such differences: 1++, 1--, --1, ++1, &(1), 1 = 2, 1 += 2, and so on! :)

    > > int x = 1;
    > > if ( *(char *)&x == 1 )
    > > {
    > > printf("Little-endian\n");
    > > }
    >
    > <pedantic>
    > Keep in mind that if sizeof(int)==4, which is quite common nowadays,
    > there are 4! = 24 different possible byte orders (most of which are
    > exceedingly uncommon). Only one of those orderings is called
    > little-endian, and only one is called big-endian - the others are
    > generically called middle-endian. At least two of those other orders
    > (2143 and 3412) have actually been used. I once ran into a web page that
    > identified 11 of those 24 orders as having been used in specific
    > contexts, which it identified - but I didn't think to bookmark it, and
    > I've never been able to find it again.


    So, you mean we have to iterate over the bytes and figure it out one byte at a time. Right?

    > Your test will identify 6 of those possible orders as little-endian,
    > only one of which actually is. Strictly speaking, you can identify an
    > ordering as little endian only by checking the first sizeof(int)-1 bytes.
    > </pedantic>


    Yup. I hope you are referring to the long long type here, of size 8 bytes, wherein you are not considering the first and last bytes for middle/mixed endian-ness checking.

    > <more pedantic>
    > The standard doesn't require that the least significant bit of an 'int'
    > be in the same location as the least significant bit when the byte that
    > contains it is interpreted as unsigned char. It is extremely unlikely
    > that this issue will ever come up.
    > </more pedantic>


    Yeah, this is what allows us to check the endian-ness of a machine using a variable's address to know at what byte in it is the number 1 stored if that variable's value is 1. And if that issue comes up, then that would break many programs I guess even the simple ones:

    int x = 1;
    // This will be incorrectly true even with a big-endian machine on
    // which sizeof(int) > sizeof(char).
    (*(unsigned char *)&x & 1) == 1;

    > <even more pedantic>
    > The standard doesn't even require that the CHAR_BIT least significant
    > bits are all stored in the same byte. In principle, an 8-bit byte could
    > contain bits 0,4,8,12,16,20,24, and 28. The next byte could contain bits
    > 1,5,9,13,17,21,25, and 29, etc. There's a total of 64! different
    > possible bit-orderings. However, I think you can fairly assume that any
    > system for which such things are true was designed by aliens. :)
    > </even more pedantic>


    And that would be a pain big time for the compiler designer! :)

    > > // How about this? I think this and above should be same, isn't it?
    > >
    > > int x = 1;
    > > if ( x & 1 )
    >
    > Given x==1, the expression x&1 is guaranteed to be true regardless of
    > endianness (even in the most pedantic case I discussed above), which
    > makes it a very bad way to test for endianness.


    Yes, I have realized after posting this thread.

    > > {
    > > printf("Little-endian\n");
    > > }
    > >
    > > ---
    > >
    > > However, I see that (x & 1) on big-endian clearly abstracts the way 1
    > > is laid out in memory and just takes the LS-bit of MS-byte and and-s
    > > with numerical 1 as if 1 was at the LS-bit of MS-byte like on
    > > little-endian machine.
    >
    > If 'int' is bigendian, the least significant bit will be in the last
    > byte, which is the least significant byte when using a bigendian
    > representation. If 'int' is little-endian, that bit will be in the first
    > byte, which is the least significant byte when using little-endian
    > notation. But either way, it will reside in the least-significant byte,
    > not the most significant one.
    >
    > --
    > James Kuyper

    Yes, I know that. As mentioned above, I intended to say that least-significant-byte will be stored in the most-significant-address in case of a big-endian machine.


    - Raghavan
     
    Myth__Buster, Mar 26, 2013
    #11
  12. Myth__Buster

    Myth__Buster Guest

    On Tuesday, March 26, 2013 11:36:34 AM UTC-4, Keith Thompson wrote:
    > Myth__Buster <> writes:
    > > *was at the LS-bit of LS-byte like on little-endian machine . . .
    >
    > Without any quoted context, we can't tell what this refers to. Not all
    > newsreaders provide an easy way to view the parent article; mine does,
    > but it doesn't seem to provide an easy way to get back, so I don't use
    > it much. Please quote enough of the parent article for your followup to
    > make sense.
    >
    > --
    > Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    > Working, but not speaking, for JetHead Development, Inc.
    > "We must do something. This is something. Therefore, we must do this."
    > -- Antony Jay and Jonathan Lynn, "Yes Minister"

    Sorry, since the posts were right next to each other, I thought of not pasting the entire paragraph. The paragraph to which it applies is

    "However, I see that (x & 1) on big-endian clearly abstracts the way 1 is laid out in memory and just takes the LS-bit of MS-byte and and-s with numerical 1 as if 1 was at the LS-bit of MS-byte like on little-endian machine."

    But, here MS-byte shall be read as MS-address.

    Thanks.
     
    Myth__Buster, Mar 26, 2013
    #12
  13. James Kuyper

    James Kuyper Guest

    Re: Macro for setting MSB - Intended to work on both Little and Big-endian machines

    On 03/26/2013 02:00 PM, Myth__Buster wrote:
    > On Tuesday, March 26, 2013 8:11:40 AM UTC-4, James Kuyper wrote:

    ....
    >> There's nothing you can portably do to determine
    >> the endianness of a register, because you can't take the address of a
    >> register, and you can't force a compiler to put a union object in a
    >> register (the 'register' keyword is just a suggestion, which the
    >> compiler is free to ignore).

    >
    > Yes, I am aware of the fact that registers are not addressable and 'register' keyword is not a command but request. However, I thought the compile-time constants such as 1ULL would also be held in a temporary memory location in big-endian representation before being moved to a register for any operation using that numerical-constant.


    They might be - but there's no portable C code that can be used to
    determine whether such values are stored in big-endian or little-endian
    format. The flip side of this is that there is correspondingly no reason
    why you should ever care - if there were a reason to care, the behavior
    associated with that reason would provide a mechanism for checking the
    endianness.

    >> I don't understand why you think the least significant bit would be
    >> residing in the most significant byte. Whatever the reason is for that
    >> expectation, it's incorrect.

    >
    > Well, I should have said, 'most-significant-address' instead of most-significant-byte. Sorry for the confusion.


    No, the term "most-significant" simply isn't meaningful when applied to
    an address. Perhaps you meant the "last address"?

    ....
    >> <pedantic>
    >> Keep in mind that if sizeof(int)==4, which is quite common nowadays,
    >> there are 4! = 24 different possible byte orders (most of which are
    >> exceedingly uncommon). Only one of those orderings is called
    >> little-endian, and only one is called big-endian - the others are
    >> generically called middle-endian. At least two of those other orders
    >> (2143 and 3412) have actually been used. I once ran into a web page that
    >> identified 11 of those 24 orders as having been used in specific
    >> contexts, which it identified - but I didn't think to bookmark it, and
    >> I've never been able to find it again.

    >
    > So, you mean we have to iterate over bytes and figure out from one byte at a time. Right?


    Correct.

    >> Your test will identify 6 of those possible orders as little-endian,
    >> only one of which actually is. Strictly speaking, you can identify an
    >> ordering as little endian only by checking the first sizeof(int)-1 bytes.
    >> </pedantic>

    >
    > Yup. I hope you are referring to long long type here of size 8 bytes wherein you are not considering first and last bytes for middle/mixed endian-ness checking.


    No, I was very explicitly referring to int values with 4 bytes, which
    are quite common nowadays. In principle, the number of possible byte
    orderings for an 8 byte long long would be 8! = 40320 (the number
    actually used is far smaller, of course). Applied to such an integer,
    your test would incorrectly identify 7!-1 = 5039 of those orderings as
    little-endian.
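The counts quoted above can be reproduced with a short sketch. The function names are hypothetical; the reasoning is that a test looking only at the first byte accepts every ordering whose first byte is the least significant one, which leaves (n-1)! arrangements of the remaining bytes, exactly one of which is true little-endian.

```c
#include <assert.h>

/* n! for small n; 20! is the largest factorial that fits in 64 bits. */
static unsigned long long factorial(unsigned n)
{
    unsigned long long result = 1;

    assert(n <= 20);
    for (unsigned i = 2; i <= n; ++i)
        result *= i;
    return result;
}

/* Number of n-byte orderings that a first-byte-only test would
   wrongly report as little-endian: (n-1)! candidates, minus the
   one ordering that really is little-endian. */
static unsigned long long false_little_endian_count(unsigned n)
{
    assert(n >= 1);
    return factorial(n - 1) - 1;
}
```

For n = 4 this gives 24 orderings in total, of which 5 are misreported; for n = 8 it gives 40320 and 5039, matching the figures in the post.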

    >> <more pedantic>
    >> The standard doesn't require that the least significant bit of an 'int'
    >> be in the same location as the least significant bit when the byte that
    >> contains it is interpreted as unsigned char. It is extremely unlikely
    >> that this issue will ever come up.
    >> </more pedantic>

    >
    > Yeah, this is what allows us to check the endian-ness of a machine using a variable's address to know at what byte in it is the number 1 stored if that variable's value is 1. And if that issue comes up, then that would break many programs I guess even the simple ones:


    Many of my programs would be completely unaffected - they're written to
    avoid making unportable assumptions about byte-ordering or bit-ordering.

    To be fair, a large part of the nominal portability of my code is due to
    the fact that the HDF library <http://www.hdfgroup.org/> is responsible
    for hiding many such portability issues from my code. We're required by
    our client to use that library, and could not port our code to a
    platform which does not have a working installation of HDF. A working
    HDF library would have to include routines for converting data from the
    format of HDF files to the native format on that machine, and
    vice-versa, so that work would have already been done for me. Assuming
    that had already been done, my own code would require little if any
    additional modifications.
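As an illustration of code that avoids assumptions about native byte order, here is a minimal sketch that converts a 32-bit value to and from a fixed big-endian external format using only shifts and masks. This is not the HDF API; the function names are hypothetical, and the external format is assumed to consist of 8-bit octets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Writes v into out[0..3], most significant octet first. Only the
   value of v is used, so the native byte order does not matter. */
static void store_u32_be(unsigned char out[4], uint32_t v)
{
    assert(out != NULL);
    out[0] = (unsigned char)((v >> 24) & 0xFFu);
    out[1] = (unsigned char)((v >> 16) & 0xFFu);
    out[2] = (unsigned char)((v >> 8) & 0xFFu);
    out[3] = (unsigned char)(v & 0xFFu);
}

/* Rebuilds the value from four octets stored most significant first. */
static uint32_t load_u32_be(const unsigned char in[4])
{
    assert(in != NULL);
    return ((uint32_t)in[0] << 24) |
           ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8)  |
            (uint32_t)in[3];
}
```

Because neither function looks at how uint32_t is laid out in memory, the same source should produce the same external bytes on little-endian, big-endian, and middle-endian hosts.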
     
    James Kuyper, Mar 26, 2013
    #13
  14. Re: Macro for setting MSB - Intended to work on both Little and Big-endian machines

    Myth__Buster <> writes:
    [snip]
    > Sorry I thought since the posts were next-to-next, I thought of not
    > pasting the entire paragraph. The paragraph to which it applies is
    >
    > "However, I see that (x & 1) on big-endian clearly abstracts the way 1
    > is laid out in memory and just takes the LS-bit of MS-byte and and-s
    > with numerical 1 as if 1 was at the LS-bit of MS-byte like on
    > little-endian machine."
    >
    > But, here MS-byte shall be read as MS-address.


    Take a look at
    https://gist.github.com/Keith-S-Thompson/5248300
    to see what your followup looks like in my newsreader. This shows how
    badly Google Groups messes up Usenet posts. Note the double-spacing of
    quoted text -- and the quadruple-spacing of quoted quoted text.

    Something I didn't mention before: Please don't quote signatures (the
    stuff following the "-- " at the bottom of most posts).

    Please either fix up your posts before clicking "Send", or use a real
    newsreader.

    Thanks.

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    Working, but not speaking, for JetHead Development, Inc.
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
     
    Keith Thompson, Mar 26, 2013
    #14
  15. Re: Macro for setting MSB - Intended to work on both Little and Big-endian machines

    James Kuyper <> wrote:
    > On 03/26/2013 07:11 AM, Myth__Buster wrote:
    >> By the way, I know the bitwise operations in C are independent of
    >> underlying machine's endian-ness and that's why I have attempted to
    >> figure out what's that endian-ness.


    (snip)
    > Endianness is detectable in portable C code only when a value is stored
    > in an object. That's because the only way to determine the endianness is
    > to access the individual bytes of the object, which requires use of a
    > union or type-punning.
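
As a rough illustration of the point quoted above, here is a minimal sketch that stores a known value in an object and then inspects its bytes through unsigned char. The names `byte_order` and `detect_byte_order` are made up for this example, and it assumes `uint32_t` exists and occupies four 8-bit bytes; anything else is reported as "other".

```c
#include <stdint.h>
#include <string.h>

/* Illustrative names only; not from the thread. */
enum byte_order { ORDER_LITTLE, ORDER_BIG, ORDER_OTHER };

/* Store a known pattern in an object, copy out its bytes, and see
   which byte landed at the lowest address. */
enum byte_order detect_byte_order(void)
{
    const uint32_t probe = 0x01020304u;
    unsigned char bytes[4] = {0, 0, 0, 0};

    if (sizeof probe != sizeof bytes)
        return ORDER_OTHER;          /* assumption of 8-bit bytes failed */

    memcpy(bytes, &probe, sizeof probe);

    if (bytes[0] == 0x04 && bytes[1] == 0x03 &&
        bytes[2] == 0x02 && bytes[3] == 0x01)
        return ORDER_LITTLE;         /* least significant byte first */

    if (bytes[0] == 0x01 && bytes[1] == 0x02 &&
        bytes[2] == 0x03 && bytes[3] == 0x04)
        return ORDER_BIG;            /* most significant byte first */

    return ORDER_OTHER;              /* one of the middle-endian orders */
}
```

Note that the check only works because `probe` is an object with an address; an expression such as `1ULL & 1` never exposes the storage layout.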


    (snip)

    > <pedantic>
    > Keep in mind that if sizeof(int)==4, which is quite common nowadays,
    > there are 4! = 24 different possible byte orders (most of which are
    > exceedingly uncommon). Only one of those orderings is called
    > little-endian, and only one is called big-endian - the others are
    > generically called middle-endian. At least two of those other orders
    > (2143 and 3412) have actually been used. I once ran into a web page that
    > identified 11 of those 24 orders as having been used in specific
    > contexts, which it identified - but I didn't think to bookmark it, and
    > I've never been able to find it again.


    I suppose on a bit addressable machine there are even more
    possibilities.

    Still, the most common example of middle-endian is VAX floating point.
    In storage, they are little endian 16 bit words in big endian order.
    If you look at them in little endian byte order, they look like big
    endian words in little endian order. This shows up, for example, when
    initializing a floating point variable with a hexadecimal constant in
    VAX Fortran.

    For another interesting bit order, consider the bits in the FAT entries
    in the FAT12 file system. (It is different for odd and even entries.)
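
To make the odd/even difference concrete, here is a small sketch of how a 12-bit entry is usually pulled out of a packed FAT12 table: two entries share three bytes, so an even entry takes the low 12 bits of a little-endian 16-bit read and an odd entry takes the high 12 bits. The function name `fat12_entry` and the parameter `fat` are made up for this example, and the caller is assumed to guarantee that the two bytes read lie inside the table.

```c
#include <stddef.h>
#include <stdint.h>

/* Return 12-bit entry number n from a packed FAT12 table.
   Illustrative helper; no bounds checking is done here. */
uint16_t fat12_entry(const uint8_t *fat, size_t n)
{
    size_t offset = n + n / 2;   /* each entry occupies 1.5 bytes */

    /* Read the two bytes that contain the entry, little-endian. */
    uint16_t pair = (uint16_t)(fat[offset] | (fat[offset + 1] << 8));

    if (n & 1)
        return (uint16_t)(pair >> 4);      /* odd entry: upper 12 bits */
    return (uint16_t)(pair & 0x0FFF);      /* even entry: lower 12 bits */
}
```

For instance, the three bytes 0x23 0x61 0x45 hold the entries 0x123 (entry 0) and 0x456 (entry 1).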

    -- glen
     
    glen herrmannsfeldt, Mar 26, 2013
    #15
  16. Re: Macro for setting MSB - Intended to work on both Little and Big-endian machines

    Myth__Buster <> wrote:

    (snip)
    > Well, regardless of the sizeof(char) in terms of bits and hence the
    > value CHAR_BIT, sizeof(char) is guaranteed to be 1. But, CHAR_BIT
    > can be big enough to represent the largest integer data type in
    > C: long long i.e., sizeof(long long) == 1 is possible.
    > So, if that is the case, in C there would be no portable way of
    > checking the endian-ness of such a kind of machine. In fact, with
    > respect to integers, endian-ness has no role in such a machine as
    > there are no more than one byte even in the largest integer data
    > type to play with the byte-order of that integer in the respective
    > memory layout.


    Well, a system could have sizeof(long long)==1, but sizeof(double)
    longer, so that endianness did matter in floating point. (Assuming
    a known floating point representation.)

    -- glen
     
    glen herrmannsfeldt, Mar 26, 2013
    #16
  17. Re: Macro for setting MSB - Intended to work on both Little and Big-endian machines

    Keith Thompson <> wrote:

    (snip, someone wrote)
    >> It's PDP-11/VAX endianness where the byte order is <1><msb><lsb><2>.


    > The PDP-11 used a middle-endian representation, but I don't believe the
    > VAX ever did.


    The VAX floating point format is middle endian. It was adapted from a
    floating point system that was used with some models of the PDP-11.

    Little endian 16 bit words are stored in big-endian order. This is
    visible when initializing floating point variables with hex constants
    in VAX Fortran. That is, the constant is treated as a little endian
    integer when mapped into memory.
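
The byte ordering described here (little endian 16 bit words, most significant word first) can be sketched as a decoder for a 32-bit quantity. This only illustrates the ordering; it makes no claim about how the sign, exponent, and fraction fields of a VAX float are arranged. The function name `load_middle_endian32` is made up for the example.

```c
#include <stdint.h>

/* Rebuild a 32-bit value from four bytes laid out as two
   little-endian 16-bit words, with the high word stored first. */
uint32_t load_middle_endian32(const unsigned char b[4])
{
    uint32_t high = (uint32_t)b[0] | ((uint32_t)b[1] << 8);  /* first word  */
    uint32_t low  = (uint32_t)b[2] | ((uint32_t)b[3] << 8);  /* second word */

    return (high << 16) | low;
}
```

With this layout the value 0x0A0B0C0D sits in memory as the bytes 0B 0A 0D 0C, which is neither the little-endian (0D 0C 0B 0A) nor the big-endian (0A 0B 0C 0D) sequence.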

    All binary integer representations on VAX are little endian. Packed
    decimal (BCD) integers are big endian, similar to the IBM 360
    representation.

    -- glen
     
    glen herrmannsfeldt, Mar 26, 2013
    #17