Converting from unsigned char to int

Discussion in 'C++' started by Joseph Suprenant, Aug 18, 2003.

  1. I have an array of unsigned chars and I would like them converted to
    an array of ints. What is the best way to do this? Using RedHat 7.3
    on an Intel Pentium 4 machine. Having trouble here, hope someone can
    help
    Thanks
    Joseph Suprenant, Aug 18, 2003
    #1

  2. "Joseph Suprenant" <> wrote in message
    news:...
    > I have an array of unsigned chars and I would like them converted to
    > an array of ints. What is the best way to do this? Using RedHat 7.3
    > on an Intel Pentium 4 machine. Having trouble here, hope someone can
    > help
    > Thanks


    Have you considered a for loop?

    unsigned char a[10];
    int b[10];
    // copy element by element; each unsigned char is widened to int
    for (int i = 0; i < 10; ++i)
        b[i] = a[i];

    john
    John Harrison, Aug 18, 2003
    #2
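
    The loop above can also be written with the standard library. The following is only a minimal sketch: the helper name widen_to_ints is hypothetical and not from the thread, and it assumes the result may live in a std::vector rather than a fixed-size array.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the thread): widen each unsigned char
// to an int, element by element.
std::vector<int> widen_to_ints(const unsigned char* src, std::size_t count)
{
    assert(src != nullptr || count == 0);
    // The iterator-range constructor converts every element individually,
    // so each int holds the numeric value of exactly one byte.
    return std::vector<int>(src, src + count);
}
```

    Called as widen_to_ints(a, 10) on the array a from the post above, this should give the same values as the for loop, because every unsigned char value fits in an int on ordinary platforms.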

  3. J. Campbell wrote:

    >
    > I'm no expert, so...listen to John Harrison. However, if you want the
    > bit pattern of the first 4-bytes of your char array to represent your
    > first int, then the next 4-bytes to represent the next int, the
    > following will work.
    >
    > .
    > .
    > .
    > char* inBYTES;
    > .
    > //define your char array
    > .
    > unsigned long* inWORDS = (unsigned long*)inBYTES;


    This is extremely dangerous and likely to fail. You have no guarantee
    that a char* points to an address that is properly aligned for an
    unsigned long. Also, this C-style cast is a bad idea. reinterpret_cast
    would be slightly better.

    There's also no guarantee that unsigned long is 4 bytes wide.

    If this is, in fact, what the OP wants to do, the correct method would
    look something like this:

    char *bytes;
    // ...
    long val = 0;
    for (int i=0; i<sizeof(long); ++i)
    {
        val = (val << CHAR_BIT) | bytes[i];
    }

    Of course, how this is done depends on exactly how the value is
    represented in the array pointed to by 'bytes'. For example, the order
    of the bytes, how many bytes are used to represent the value, etc.

    -Kevin
    --
    My email address is valid, but changes periodically.
    To contact me please use the address from a recent posting.
    Kevin Goodsell, Aug 18, 2003
    #3
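
    As the post notes, the byte order and the width of the stored value have to be decided up front. The sketch below makes both explicit. Everything in it is an assumption for illustration: the helper names read_be32 and read_le32 are hypothetical, the value is taken to be exactly 4 bytes of 8 bits each, and std::uint32_t from <cstdint> (C++11) is assumed to be available. It reads through unsigned char on purpose, since a plain char may be signed and a negative byte would sign-extend when OR-ed into the result.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: assemble 4 bytes, most significant byte first
// (big-endian). Assumes each element holds a value in 0..255.
std::uint32_t read_be32(const unsigned char* bytes)
{
    assert(bytes != nullptr);
    std::uint32_t val = 0;
    for (std::size_t i = 0; i < 4; ++i) {
        // unsigned char promotes to a non-negative int, so no sign extension.
        val = (val << 8) | bytes[i];
    }
    return val;
}

// Same idea, least significant byte first (little-endian).
std::uint32_t read_le32(const unsigned char* bytes)
{
    assert(bytes != nullptr);
    std::uint32_t val = 0;
    for (std::size_t i = 0; i < 4; ++i) {
        // Widen before shifting so the shift happens in 32-bit unsigned arithmetic.
        val |= static_cast<std::uint32_t>(bytes[i]) << (8 * i);
    }
    return val;
}
```

    For the bytes 0x12 0x34 0x56 0x78, read_be32 yields 0x12345678 and read_le32 yields 0x78563412, regardless of the byte order of the machine running the code.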
  4. J. Campbell wrote:

    > Kevin Goodsell <> wrote in message news:<3f4148c4@shknews01>...
    >
    >>J. Campbell wrote:
    >>
    >>
    >>>I'm no expert, so...listen to John Harrison. However, if you want the
    >>>bit pattern of the first 4-bytes of your char array to represent your
    >>>first int, then the next 4-bytes to represent the next int, the
    >>>following will work.
    >>>
    >>>.
    >>>.
    >>>.
    >>>char* inBYTES;
    >>>.
    >>>//define your char array
    >>>.
    >>>unsigned long* inWORDS = (unsigned long*)inBYTES;

    >>
    >>This is extremely dangerous and likely to fail. You have no guarantee
    >>that a char* points to an address that is properly aligned for an
    >>unsigned long. Also, this C-style cast is a bad idea. reinterpret_cast
    >>would be slightly better.
    >>
    >>There's also no guarantee that unsigned long is 4 bytes wide.
    >>
    >>If this is, in fact, what the OP wants to do, the correct method would
    >>look something like this:
    >>
    >>char *bytes;
    >>// ...
    >>long val = 0;
    >>for (int i=0; i<sizeof(long); ++i)
    >>{
    >> val = (val << CHAR_BIT) | bytes[i];
    >>}
    >>
    >>Of course, how this is done depends on exactly how the value is
    >>represented in the array pointed to by 'bytes'. For example, the order
    >>of the bytes, how many bytes are used to represent the value, etc.
    >>
    >>-Kevin

    >
    >
    >
    > Kevin, thanks for the reply. This post is a bit off topic to the
    > original thread, but perhaps you can help me think about programming
    > in c++ terms.
    >
    > I'm a physical scientist, not a computer programmer. I've been
    > writing programs for years to solve very narrowly-defined problems.
    > My old language was a compiled BASIC, QB 4.5. An example of the types
    > of things I use programming for, I once modified (hardware) an
    > instrument to take a different sort of measurement than it was
    > originally intended to take. The file that held the measurements
    > contained the data I was after, but it was buried in a bunch of binary
    > data that I didn't need, and the instrument-driving software wouldn't
    > allow me to extract the data I needed. I could get the data by hand
    > using a hex-editor, but it was extremely tedious and I had hundreds
    > of files to scour. To solve this problem, I made a program to read
    > the file, scan the text header for the proper offsets, then go to these
    > offsets, get the binary data, convert it to ASCII and output
    > delimited data to a file that I could then use in plotting-type
    > software. I recently decided to learn C++ in order to avoid the slow
    > (on my old compiler) conversions like you outlined above where you
    > take each byte and shift it to the proper place in the integer then
    > add to the underlying value. As such, I thought "C++...groovy, I can
    > load the data, then create a pointer to whatever position I choose,
    > select the type of data I want, create a second pointer at that
    > location, then grab the data without performing *any* conversions."
    >
    > I realize that this is non-portable, and that it's poor practice to
    > have 2 pointers to the same memory space, and I'm not trying to argue
    > that my method has any merit. I'm so used to thinking in terms of
    > "how to most efficiently get the results on the platform at hand" that
    > I feel like I'm missing the point of C++. Perhaps C++ is the wrong
    > language for me, since it was designed for large projects that need to
    > be maintained over time. Anyway, C++ is so much faster, that I can
    > probably discard the notion that I need to look for efficiency
    > shortcuts.

    =-+-=-+-=-+-=-+-=-+-=-+-=-+-=-+-=-+-=-+-=-+-=-+-=-+-=-+-=-+-=-+-=-+-=
    One of the benefits of C and C++ is that you can cast a pointer
    (or any type) into almost any other type.
    For example, read the data into a large unsigned char array.
    Using a cast, you could convert the value at offset 4 into an integer:
    int value = (int)(*ptr_to_buffer);
    One issue to watch out for is byte ordering for multibyte quantities,
    otherwise known as endianness.
    =-+-=-+-=-+-=-+-=-+-=-+-=-+-=-+-=-+-=-+-=-+-=-+-=-+-=-+-=-+-=-+-=-+-=


    >
    > My question is this. If you were writing code for your own use, that
    > would be unlikely to be used by anyone else, would you bother with
    > properly converting your bytes to ints in a platform independent
    > manner, or would you take the shortcut of simply loading the file,
    > then accessing it in its native format?

    No, I don't worry about the conversions or sizes for most private
    utilities that I write.

    The only time I worry about the size of an integer variable is
    when I'm dealing with the I/O of data or writing/reading to/from
    hardware devices.

    I have better things to do with my time than worrying about the
    size of an integral or floating point variable. One of those
    items is finishing the program on time, robust and of good
    quality. If time or space is an issue (after the program works)
    then I'll go back with a profiler and adjust the program as
    necessary.

    In your case, get your data parsing working correctly first.
    Use simple techniques. After you have it working and need to
    speed it up, you can use techniques such as buffering large
    amounts of data.

    >
    > Anyway, sorry for the rambling post, and thanks for the input.

    No problem. Better entertainment than those people complaining
    about being redirected.

    --
    Thomas Matthews

    C++ newsgroup welcome message:
    http://www.slack.net/~shiva/welcome.txt
    C++ Faq: http://www.parashift.com/c++-faq-lite
    C Faq: http://www.eskimo.com/~scs/c-faq/top.html
    alt.comp.lang.learn.c-c++ faq:
    http://www.raos.demon.uk/acllc-c++/faq.html
    Other sites:
    http://www.josuttis.com -- C++ STL Library book
    Thomas Matthews, Aug 19, 2003
    #4
  5. J. Campbell wrote:

    <snip>

    > I recently decided to learn C++ in order to avoid the slow
    > (on my old compiler) conversions like you outlined above where you
    > take each byte and shift it to the proper place in the integer then
    > add to the underlying value.


    It really shouldn't be very slow. Besides that, the "search for
    efficiency" that many programmers feel is a core component of
    programming is often misguided. There are many reasons for this. First,
    speed simply isn't that important in most cases - at least, not compared
    to other factors such as correctness, portability, maintainability, and
    getting the program done on time. Attempts to speed up code are often at
    odds with these other goals. Second, most code is executed infrequently
    enough that optimizing it down to nothing would not significantly
    improve overall program performance - to be worthwhile, optimizations
    have to be carefully targeted at the parts of the program that really
    need it. Third, the efficiency bottlenecks in a program tend to be
    things like I/O accesses, not the actual code itself. Fourth,
    algorithm-level optimizations almost always give much, much more
    dramatic improvements than micro-tuning code, so worrying about things
    like a few shift operations is rather foolish.

    Don't get me wrong - I don't like slow programs. But it's much more
    important to address the overall design than to worry about the
    efficiency of any particular section of code, particularly because you
    don't know from the start which sections are going to be taking up the
    program's execution time. By addressing design first, you can ensure
    correctness, get the program running, and pave the way for optimizations
    later on if the program is deemed too slow - if it's fast enough, you've
    saved yourself the effort. Besides that, good design tends to lead to
    reasonably efficient code in the first place.

    Sorry about the barely-topical (for the thread) rant. It's one of those
    things I'm always going off on.

    > As such, I thought "C++...groovy, I can
    > load the data, then create a pointer to whatever position I choose,
    > select the type of data I want, create a second pointer at that
    > location, then grab the data without performing *any* conversions."


    I used to think that also. There are a number of problems with it,
    however. First and foremost, different types may require different
    memory alignment. A long might need to be on a 4-byte boundary, for
    example, so if I try to access a char array as a long, and that char
    array is not on a 4-byte boundary, the program's behavior is undefined.
    This particular error results in a bus error (causing a crash) on some
    systems. On Intel-based systems I believe that improper alignment simply
    causes your program to take a performance hit.

    But that's just the first problem. You also have to worry about whether
    the data in the array is the right format for the type you want to
    interpret it as, whether there are padding bytes, and things like that. Even if
    all that checks out, the same data on a different platform won't work
    the same way. Byte order can be different, data type sizes can vary, etc.

    A few final notes about alignment: void * and char * are both capable of
    representing any other object pointer type, and chars don't have
    alignment requirements, so you can always access anything as an array of
    chars without alignment problems (though unless you use unsigned chars
    there is also a possible problem with invalid representations - unsigned
    char is the only thing that is required to have none of those, and no
    padding bits). Also, memory returned from malloc() is required to be
    properly aligned for any type, so in theory it can be used as "common
    ground" for any types, but this doesn't seem to be useful very often.
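
    One commonly used way to sidestep the alignment problem described above is to copy the bytes into a properly aligned object instead of casting the pointer. The sketch below is an illustration only: the helper name load_native_int is hypothetical, and it assumes the bytes in the buffer are already in the machine's native int format (same size and byte order), so it is no more portable across platforms than the cast would be.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: copy sizeof(int) bytes starting at `offset`
// into an int. The caller must make sure that many bytes are available.
int load_native_int(const unsigned char* buffer, std::size_t offset)
{
    assert(buffer != nullptr);
    int value = 0;
    // memcpy copies byte by byte, so `buffer + offset` does not need to be
    // aligned for int; `value` itself is aligned because it is a real int.
    std::memcpy(&value, buffer + offset, sizeof value);
    return value;
}
```

    Something like load_native_int(buffer, 4) would read the int stored at offset 4 of a file image held in an unsigned char array. If the file came from a machine with a different byte order or int size, the bytes still have to be assembled explicitly.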

    >
    > I realize that this is non-portable, and that it's poor practice to
    > have 2 pointers to the same memory space,


    I don't know about 2 pointers to the same place being bad practice. I
    can see how it could lead to problems in some cases, but I think such
    problems are more a result of other things, such as not sufficiently
    limiting the scope of objects, or poor memory management. Having multiple
    pointers to the same object is harmless by itself.

    > and I'm not trying to argue
    > that my method has any merit. I'm so used to thinking in terms of
    > "how to most efficiently get the results on the platform at hand" that
    > I feel like I'm missing the point of C++. Perhaps C++ is the wrong
    > language for me, since it was designed for large projects that need to
    > be maintained over time. Anyway, C++ is so much faster, that I can
    > probably discard the notion that I need to look for efficiency
    > shortcuts.


    In many cases, that is true. Shortcuts for efficiency are often
    counter-productive anyway.

    I think you'll have greater success with the language if you learn to
    use the language itself, rather than learning "C++ for <insert system
    name here>". Code relying on particular properties of a given system,
    aside from being non-portable, tends to be more brittle as well.

    >
    > My question is this. If you were writing code for your own use, that
    > would be unlikely to be used by anyone else, would you bother with
    > properly converting your bytes to ints in a platform independent
    > manner, or would you take the shortcut of simply loading the file,
    > then accessing it in its native format?


    It depends on how useful I expect the program to be. If it's a
    quick-and-dirty program that will soon be discarded, it's likely that
    I'd use the simplest method that I could think of, which may be
    something like what you describe. For anything else, I'd do it the Right
    Way, even if I'm only doing it for practice. I consider any and all
    programming to be an opportunity to learn, helping me to become a better
    programmer.

    -Kevin
    --
    My email address is valid, but changes periodically.
    To contact me please use the address from a recent posting.
    Kevin Goodsell, Aug 19, 2003
    #5
