convert a char[4] (binary) to an unsigned long

Discussion in 'C++' started by Vincent, Aug 5, 2005.

  1. Vincent

    Vincent Guest

    Hi all,

    I want to convert a char[4] (binary) to an unsigned long. How can I do
    this?

    Thanks,
    Vincent
    Vincent, Aug 5, 2005
    #1

  2. Vincent wrote:
    > Hi all,
    >
    > I want to convert a char[4] (binary) to an unsigned long. How can I do
    > this?
    >
    > Thanks,
    > Vincent
    >


    assert(sizeof(long) == 4);
    char b[4] = {0x01,0x02,0x03,0x04};
    unsigned long a = 0;
    a |= (b[0] << 24);
    a |= (b[1] << 16);
    a |= (b[2] << 8);
    a |= b[3];

    But if you have MSB in b[3] then you should reverse the order.
    Beware of big endian and little endian.

    Tobias
    --
    IMPORTANT: The contents of this email and attachments are confidential
    and may be subject to legal privilege and/or protected by copyright.
    Copying or communicating any part of it to others is prohibited and may
    be unlawful.
    Tobias Blomkvist, Aug 5, 2005
    #2

  3. Vincent

    John Ratliff Guest

    Vincent wrote:
    > Hi all,
    >
    > I want to convert a char[4] (binary) to an unsigned long. How can I do
    > this?
    >
    > Thanks,
    > Vincent
    >


    I don't think this is possible without knowing the endian-ness of the
    machine. Maybe someone will correct me.

    If the char[4] came from the machine, then you could probably do a
    reinterpret_cast, but I'm almost positive it won't be portable.

    // assume this has your binary unsigned long
    extern char ul_bin[4];
    unsigned long ul = *reinterpret_cast<unsigned long *>(ul_bin);

    --John Ratliff
    John Ratliff, Aug 5, 2005
    #3
  4. Vincent

    Vincent Guest

    The program will have to work on MS Windows 2000. The char[4] is a set
    of characters, read from a file.

    I hope this will help you answering my question.
    Vincent, Aug 5, 2005
    #4
  5. Vincent

    Hans Guest

    Vincent wrote:

    > Hi all,
    >
    > I want to convert a char[4] (binary) to an unsigned long. How can I do
    > this?
    >
    > Thanks,
    > Vincent


    Use memcpy:

    unsigned long ChararrToLong(const char * const src)
    {
    unsigned long dest;
    memcpy(&dest, src, sizeof(dest));
    return dest;
    }


    This may be what you want or not. If you depend on the chars being put
    in a specific order into the unsigned long, you might want to do some
    byte-swapping while copying.
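
    A minimal sketch of the byte-swapping idea above, assuming the buffer
    holds exactly four bytes and a 32-bit result is wanted. The name
    bytesToU32 and the swapBytes flag are illustrative, not from the
    thread. Copying into a std::uint32_t rather than an unsigned long
    avoids reading past the end of a char[4] on platforms where unsigned
    long is eight bytes wide.

```cpp
#include <algorithm>  // std::reverse
#include <cassert>
#include <cstdint>    // std::uint32_t
#include <cstring>    // std::memcpy

// Hypothetical helper: copy four raw bytes into a 32-bit integer.
// With swapBytes == false the bytes are taken in host order; with
// swapBytes == true their order is reversed before the copy.
std::uint32_t bytesToU32(const char* src, bool swapBytes)
{
    assert(src != nullptr);
    unsigned char tmp[4];
    std::memcpy(tmp, src, sizeof tmp);
    if (swapBytes)
        std::reverse(tmp, tmp + 4);
    std::uint32_t dest;
    std::memcpy(&dest, tmp, sizeof dest);
    return dest;
}
```

    Whether swapBytes should be set depends on the byte order of the file
    compared with that of the machine reading it.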
    Hans, Aug 5, 2005
    #5
  6. Vincent wrote:
    > The program will have to work on MS Windows 2000. The char[4] is a set
    > of characters, read from a file.
    >
    > I hope this will help you answering my question.
    >


    Find out which format the longs are stored in, that is, which
    endianness.

    long l = 0x04030201;

    can be stored as

    Big endian: bytes 04 03 02 01 (most significant byte first)
    Little endian: bytes 01 02 03 04 (least significant byte first)

    Or any random order you desire in your file, but if you don't
    know the byte order, how will you be able to read them back correctly?
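
    To see which layout a given machine uses, one option is to dump the
    bytes of a known value in memory order. This is only a sketch; the
    helper names hostBytes and printHostBytes are made up for
    illustration, and it assumes 8-bit bytes.

```cpp
#include <cassert>
#include <cstdint>  // std::uint32_t
#include <cstdio>   // std::printf
#include <cstring>  // std::memcpy

// Hypothetical helper: copy the object representation of a 32-bit value
// into out[0..3], lowest address first.
void hostBytes(std::uint32_t value, unsigned char out[4])
{
    assert(out != nullptr);
    std::memcpy(out, &value, 4);
}

// Print the four bytes in memory order, e.g. to compare with a hex dump
// of the file.
void printHostBytes(std::uint32_t value)
{
    unsigned char b[4];
    hostBytes(value, b);
    std::printf("%02X %02X %02X %02X\n",
                static_cast<unsigned>(b[0]), static_cast<unsigned>(b[1]),
                static_cast<unsigned>(b[2]), static_cast<unsigned>(b[3]));
}
```

    Calling printHostBytes(0x04030201) on a little-endian machine should
    list the bytes starting with 01, and on a big-endian machine starting
    with 04.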

    http://en.wikipedia.org/wiki/Endianess

    Tobias
    Tobias Blomkvist, Aug 5, 2005
    #6
  7. Vincent

    John Ratliff Guest

    Vincent wrote:
    > The program will have to work on MS Windows 2000. The char[4] is a set
    > of characters, read from a file.
    >
    > I hope this will help you answering my question.
    >


    It will only work if the char[4] read on Windows was written by a
    machine with the same endianness as Windows 2000 (little endian on
    x86), in that machine's native byte order.

    In other words, if you wrote an unsigned long created on a machine to a
    file, and then wanted to read that unsigned long from a char[4] byte
    array on the same machine, the reinterpret_cast would work. If this file
    is created on some other machine, you can only know if it would work if
    you know the endian-ness of the machine which created the file.

    A question, though: why are you reading an unsigned long into a
    char[4] array anyway? Why not read it directly into an unsigned long?
    Or, how does the unsigned long get written in the first place? Maybe
    you should consider writing it as a string and parsing the string
    back using strtoul() instead.
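
    A rough sketch of that text round trip, assuming decimal notation.
    The helper names toText and fromText are hypothetical; only
    std::snprintf and std::strtoul are standard library calls.

```cpp
#include <cassert>
#include <cstddef>  // std::size_t
#include <cstdio>   // std::snprintf
#include <cstdlib>  // std::strtoul
#include <string>

// Hypothetical helper: format the value as decimal text.
std::string toText(unsigned long value)
{
    char buf[32];
    const int n = std::snprintf(buf, sizeof buf, "%lu", value);
    assert(n > 0 && static_cast<std::size_t>(n) < sizeof buf);
    return std::string(buf);
}

// Hypothetical helper: parse decimal text back into an unsigned long.
// Sets ok to false when no digits were consumed.
unsigned long fromText(const std::string& text, bool& ok)
{
    char* end = nullptr;
    const unsigned long value = std::strtoul(text.c_str(), &end, 10);
    ok = (end != text.c_str());
    return value;
}
```

    Text sidesteps the byte-order question entirely, at the cost of a
    variable-length field in the file.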

    --John Ratliff
    John Ratliff, Aug 5, 2005
    #7
  8. Vincent

    Jack Klein Guest

    On Fri, 05 Aug 2005 12:25:38 +0200, Tobias Blomkvist wrote in
    comp.lang.c++:

    > Vincent wrote:
    > > Hi all,
    > >
    > > I want to convert a char[4] (binary) to an unsigned long. How can I do
    > > this?
    > >
    > > Thanks,
    > > Vincent
    > >

    >
    > assert(sizeof(long) == 4);


    This doesn't actually solve the problem. And what happens if
    sizeof(long) is 8, which it is on some 64 bit platforms?

    > char b[4] = {0x01,0x02,0x03,0x04};
    > unsigned long a = 0;
    > a |= (b[0] << 24);


    The problem here is that b[0] is promoted to either int or unsigned
    int before it is shifted. There are still a large number of platforms
    where long has 32 bits but int has only 16. Shifting by 24 on such a
    platform is undefined behavior, and will almost certainly give the
    wrong results.

    Perhaps you think that the extra step of initializing 'a' to 0 and
    using |= forces b[0] to be promoted to unsigned long. It most
    certainly does not. b[0] is promoted to either int or unsigned int,
    the shift is performed and assuming there is no undefined behavior or
    the program continues regardless, the resulting unsigned int value is
    only then promoted to unsigned long.

    Should be:

    unsigned long a = ((unsigned long)b[0] << 24);

    > a |= (b[1] << 16);


    Same cast here.

    > a |= (b[2] << 8);
    > a |= b[3];


    The last two do not need the cast. Except maybe platforms where
    unsigned char and int both have 16 bits, and the value in the unsigned
    char is greater than 255. And yes, there are platforms like this that
    actually have C++ compilers.
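
    A sketch of the whole conversion with the casts described above,
    assuming 8-bit bytes and b[0] as the most significant byte. The name
    fromBigEndian is illustrative. The extra step through unsigned char
    is an addition here: it keeps a negative char from being
    sign-extended during promotion.

```cpp
#include <cassert>

// Hypothetical helper: combine four bytes, most significant first.
// Each byte goes through unsigned char (so a negative char cannot
// sign-extend) and then unsigned long (so the shift happens in a type
// that is at least 32 bits wide, even where int has only 16 bits).
unsigned long fromBigEndian(const char* b)
{
    assert(b != nullptr);
    unsigned long a = 0;
    a |= static_cast<unsigned long>(static_cast<unsigned char>(b[0])) << 24;
    a |= static_cast<unsigned long>(static_cast<unsigned char>(b[1])) << 16;
    a |= static_cast<unsigned long>(static_cast<unsigned char>(b[2])) << 8;
    a |= static_cast<unsigned long>(static_cast<unsigned char>(b[3]));
    return a;
}
```

    Because the result is built from values rather than copied from
    memory, it does not depend on the byte order of the host.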

    > But if you have MSB in b[3] then you should reverse the order.
    > Beware of big endian and little endian.
    >
    > Tobias


    --
    Jack Klein
    Home: http://JK-Technology.Com
    FAQs for
    comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
    comp.lang.c++ http://www.parashift.com/c++-faq-lite/
    alt.comp.lang.learn.c-c++
    http://www.contrib.andrew.cmu.edu/~ajo/docs/FAQ-acllc.html
    Jack Klein, Aug 6, 2005
    #8
  9. John Ratliff wrote:

    > I don't think this is possible without knowing the endian-ness of the
    > machine. Maybe someone will correct me.


    int x = 1;

    endianness = * (char *) & x ? LITTLE_ENDIAN : BIG_ENDIAN;

    Some compilers (GCC for sure) can optimize away code using this expression.

    e.g.

    int x = 1;
    if ( * (char *) & x )
    {
    ... little endian code ...
    ... optimized away when compiled for a little endian machine ...
    } else
    {
    ... big endian code ...
    ... optimized away when compiled for a big endian machine ...
    }
    Gianni Mariani, Aug 6, 2005
    #9
  10. Gianni Mariani, Aug 6, 2005
    #10
  11. Jack Klein wrote:
    >>>

    >>
    >>assert(sizeof(long) == 4);

    >
    >
    > This doesn't actually solve the problem. And what happens if
    > sizeof(long) is 8, which it is on some 64 bit platforms?


    It fails.

    >
    >>char b[4] = {0x01,0x02,0x03,0x04};
    >>unsigned long a = 0;
    >>a |= (b[0] << 24);

    >
    >
    > The problem here is that b[0] is promoted to either int or unsigned
    > int before it is shifted. There are still a large number of platforms
    > where long has 32 bits but int has only 16. Shifting by 24 on such a
    > platform is undefined behavior, and will almost certainly give the
    > wrong results.


    True. An
    assert(sizeof(int) == 4);
    would secure the code.

    >
    > The last two do not need the cast. Except maybe platforms where
    > unsigned char and int both have 16 bits, and the value in the unsigned
    > char is greater than 255. And yes, there are platforms like this that
    > actually have C++ compilers.


    On the other hand, writing code like this, you must be aware. Why do
    you think I used assert?

    Tobias
    Tobias Blomkvist, Aug 6, 2005
    #11
  12. Vincent

    Fraser Ross Guest

    "Gianni Mariani"
    > int x = 1;
    >
    > endianness = * (char *) & x ? LITTLE_ENDIAN : BIG_ENDIAN;


    Can someone explain how this expression works? std::reverse is useful for
    changing endianness.

    Fraser.
    Fraser Ross, Aug 8, 2005
    #12
  13. Vincent

    Fraser Ross Guest

    "Fraser Ross"
    > "Gianni Mariani"
    > > int x = 1;
    > >
    > > endianness = * (char *) & x ? LITTLE_ENDIAN : BIG_ENDIAN;

    >
    > Can someone explain how this expression works? std::reverse is useful for
    > changing endianness.
    >
    > Fraser.
    >
    >


    I see it now. A static_cast would be more understandable. For a moment I
    thought there was a use of a bit-wise operator.

    Fraser.
    Fraser Ross, Aug 8, 2005
    #13
  14. Vincent

    Fraser Ross Guest

    "Fraser Ross"
    > I see it now. A static_cast would be more understandable.


    No, reinterpret_cast is required.

    Fraser.
    Fraser Ross, Aug 8, 2005
    #14
  15. Vincent

    ThosRTanner Guest

    &x points to a number of bytes which contain (on a big endian machine,
    LSB is at highest byte address) 0, 0, ... , 1, and (on a little endian
    machine, LSB is at lowest byte address) 1, 0, ... 0

    Interpreting the pointer as a char * and getting the byte pointed to
    will return the contents of the lowest addressed byte of the word,
    which will be 0 for big endian machines and 1 for little endian
    machines.

    Optimising out the code is presumably a result of gcc recognising that
    particular pattern - it would be rather dangerous if you were cross
    compiling!
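
    The byte inspection described above can be written as a small
    function. This is a sketch only: the name hostIsLittleEndian is
    invented for illustration, and it uses memcpy instead of a pointer
    cast to read the first byte. The static_assert rejects platforms
    where unsigned int is a single byte, since the check cannot tell
    anything there.

```cpp
#include <cassert>
#include <cstring>  // std::memcpy

// Hypothetical helper: true when the lowest-addressed byte of an
// unsigned int holds its least significant bits.
bool hostIsLittleEndian()
{
    // The check is meaningless if unsigned int occupies a single byte.
    static_assert(sizeof(unsigned int) > 1,
                  "unsigned int must span several bytes");
    const unsigned int x = 1;
    unsigned char first;
    std::memcpy(&first, &x, 1);  // read the lowest-addressed byte
    return first == 1;
}
```

    The result describes the machine the code runs on, which need not be
    the machine that wrote the data being read.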
    ThosRTanner, Aug 8, 2005
    #15
  16. Vincent

    Vincent Guest

    Thanks for this suggestion. It works! Somewhere else in my script, I
    have to convert an unsigned long to a char[4]. I tried to use memcpy to
    create a LongtoChararr function, but I failed. I'm not very familiar
    with memcpy. Can you help me again?


    Hans wrote:
    > Vincent wrote:
    >
    > > Hi all,
    > >
    > > I want to convert a char[4] (binary) to an unsigned long. How can I do
    > > this?
    > >
    > > Thanks,
    > > Vincent

    >
    > Use memcpy:
    >
    > unsigned long ChararrToLong(const char * const src)
    > {
    > unsigned long dest;
    > memcpy(&dest, src, sizeof(dest));
    > return dest;
    > }
    >
    >
    > This may be what you want or not. If you depend on the chars being put
    > in a specific order into the unsigned long, you might want to do some
    > byte-swapping while copying.
    Vincent, Aug 8, 2005
    #16
  17. Vincent

    Old Wolf Guest

    Tobias Blomkvist wrote:
    > Jack Klein wrote:
    >>>
    >>>assert(sizeof(long) == 4);

    >>
    >> This doesn't actually solve the problem. And what happens if
    >> sizeof(long) is 8, which it is on some 64 bit platforms?

    >
    > It fails.
    >
    >>
    >>>char b[4] = {0x01,0x02,0x03,0x04};
    >>>unsigned long a = 0;
    >>>a |= (b[0] << 24);

    >>
    >>
    >> The problem here is that b[0] is promoted to either int or unsigned
    >> int before it is shifted. There are still a large number of platforms
    >> where long has 32 bits but int has only 16. Shifting by 24 on such a
    >> platform is undefined behavior, and will almost certainly give the
    >> wrong results.

    >
    > True. An
    > assert(sizeof(int) == 4);
    > would secure the code.


    Actually it wouldn't, e.g. (with an 8-bit signed char):
    char b[4] = { 0x01, 0x02, 0x03, 0x99 };

    Then 0x01020300 | 0x99 will become 0x01020300 | 0xFFFFFF99
    which is not the desired result. You have to make the
    chars unsigned before you apply bit operations to them.
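
    The effect can be reproduced with two small functions. This is a
    sketch assuming an 8-bit, two's complement signed char; the names
    orRaw and orUnsigned are made up for the example.

```cpp
#include <cassert>

// OR a byte into the low bits without converting it first. If c is
// negative, it is sign-extended on conversion and the high bits of the
// result are overwritten with ones.
unsigned long orRaw(unsigned long high, signed char c)
{
    return high | static_cast<unsigned long>(c);
}

// Same operation, but the byte is converted to unsigned char first,
// so only its low eight bits take part.
unsigned long orUnsigned(unsigned long high, signed char c)
{
    return high | static_cast<unsigned char>(c);
}
```

    With high = 0x01020300 and a byte whose bit pattern is 0x99 (the
    value -103), orRaw sets every bit above the low byte, while
    orUnsigned gives 0x01020399.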
    Old Wolf, Aug 8, 2005
    #17
  18. Vincent

    John Ratliff Guest

    Vincent wrote:
    > Thanks for this suggestion. It works! Somewhere else in my script, I
    > have to convert an unsigned long to a char[4]. I tried to use memcpy to
    > create a LongtoChararr function, but I failed. I'm not very familiar
    > with memcpy. Can you help me again?
    >


    If byte order is not essential, you can do reinterpret_cast again.

    unsigned long ul = 0xFEDCBA98;
    char *ptr = reinterpret_cast<char *>(&ul);

    Depending upon endianness, you will end up with one of these:
    ptr[] = {0xFE, 0xDC, 0xBA, 0x98}; // big endian machine
    ptr[] = {0x98, 0xBA, 0xDC, 0xFE}; // little endian machine

    Note if an unsigned long is not 4 bytes on the platform you're using,
    you will end up with a different sized array.

    If you really want to use memcpy,

    unsigned long ul = 0xFEDCBA98;
    char ptr[sizeof(unsigned long)];

    memcpy(ptr, &ul, sizeof(unsigned long));
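
    If the byte order does matter, one possible approach is to extract
    the bytes with shifts instead of copying memory. The sketch below
    assumes 8-bit bytes and writes the low 32 bits most significant byte
    first; the name toBigEndian is illustrative only.

```cpp
#include <cassert>
#include <cstring>  // std::memcpy

// Hypothetical helper: store the low 32 bits of value into out[0..3],
// most significant byte first, independent of the host byte order.
void toBigEndian(unsigned long value, char* out)
{
    assert(out != nullptr);
    const unsigned char tmp[4] = {
        static_cast<unsigned char>((value >> 24) & 0xFF),
        static_cast<unsigned char>((value >> 16) & 0xFF),
        static_cast<unsigned char>((value >> 8) & 0xFF),
        static_cast<unsigned char>(value & 0xFF)
    };
    std::memcpy(out, tmp, sizeof tmp);  // copy the bytes into the char buffer
}
```

    For 0xFEDCBA98 this fills the array with FE, DC, BA, 98 on any host.
    Reversing the index order would give the little-endian layout.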

    --John Ratliff
    John Ratliff, Aug 9, 2005
    #18
  19. ThosRTanner writes:
    >&x points to a number of bytes which contain (on a big endian machine,
    >LSB is at highest byte address) 0, 0, ... , 1, and (on a little endian
    >machine, LSB is at lowest byte address) 1, 0, ... 0
    >
    >Interpreting the pointer as a char * and getting the byte pointed to
    >will return the contents of the lowest addressed byte of the word,
    >which will be 0 for big endian machines and 1 for little endian
    >machines.
    >
    >Optimising out the code is presumably a result of gcc recognising that
    >particular pattern - it would be rather dangerous if you were cross
    >compiling!


    It's rather dangerous anyway if your target platform has
    sizeof(int)==sizeof(char).

    --
    Richard Herring
    Richard Herring, Aug 15, 2005
    #19
  20. Vincent

    Earl Purple Guest

    Vincent wrote:
    > Hi all,
    >
    > I want to convert a char[4] (binary) to an unsigned long. How can I do
    > this?


    My suggestion is to:
    - always use big-endian regardless of platform
    - use something like this:

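    const size_t CHAR_BITS = 8; // or whatever it is on your system

    unsigned long makeLong( const char* data )
    {
        unsigned long result = 0;
        for ( int i=0; i<4; ++i )
        {
            result <<= CHAR_BITS;
            // go through unsigned char so a negative char cannot sign-extend
            result |= static_cast<unsigned char>( data[i] );
        }
        return result;
    }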

    That will work whenever sizeof(long) >= 4 or there are sufficient
    trailing 0 bytes that overflow does not occur.
    Earl Purple, Aug 15, 2005
    #20