Question about cast

Discussion in 'C++' started by junw2000@gmail.com, Jan 25, 2007.

  1. Guest

    Hi,

    For the code below:

    char *c = "0113";
    unsigned short *p;
    p = (unsigned short*)c; //LINE1
    std::cout<<"*p: "<<*p<<'\n';
    std::cout<<"*c: "<<c<<'\n';

    The output is:
    *p: 12592
    *c: 0113

    Why?
    How does LINE1 work? For the string "0113", there is an implicit '\0' at
    the end. How does LINE1 handle it?

    Thanks

    Jack
    , Jan 25, 2007
    #1

  2. Ian Collins Guest

    wrote:
    > Hi,
    >
    > For the code below:
    >
    > char *c = "0113";


    should be const char*.

    > unsigned short *p;
    > p = (unsigned short*)c; //LINE1
    > std::cout<<"*p: "<<*p<<'\n';
    > std::cout<<"*c: "<<c<<'\n';
    >
    > The output is:
    > *p: 12592
    > *c: 0113
    >
    > Why?


    What else would you expect?

    > How does LINE1 work? For the string "0113", there is an implicit '\0' at
    > the end. How does LINE1 handle it?
    >

    It assigns the value of c to p. Assuming sizeof unsigned short to be 2,
    *p is the first two bytes of the string literal pointed to by c.

    Convert 12592 to hex and check the ASCII values for '0' and '1'
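
    The hint above can be checked with a little arithmetic. This is only a
    sketch: it assumes an ASCII character set and a 2-byte unsigned short
    whose lower-addressed byte is the least significant one (little endian).
    The helper name combine_little_endian is made up for illustration.

```cpp
// Hypothetical helper: the 16-bit value a little-endian machine reads when
// the byte `first` sits at the lower address and `second` right after it.
constexpr unsigned combine_little_endian(unsigned char first,
                                         unsigned char second) {
    return static_cast<unsigned>(first) |
           (static_cast<unsigned>(second) << 8);
}

// 12592 in hex is 0x3130: high byte 0x31, low byte 0x30.
static_assert(12592 == 0x3130, "12592 is 0x3130");

// In ASCII, '0' is 0x30 and '1' is 0x31, so the bytes of "01" read as a
// little-endian 16-bit value give 0x3130.
static_assert(combine_little_endian(0x30, 0x31) == 12592,
              "bytes 0x30 0x31 combine to 12592");
```

    On a big-endian machine the same two bytes would combine to 0x3031
    instead.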

    --
    Ian Collins.
    Ian Collins, Jan 25, 2007
    #2

  3. * :
    > Hi,
    >
    > For the code below:
    >
    > char *c = "0113";
    > unsigned short *p;
    > p = (unsigned short*)c; //LINE1
    > std::cout<<"*p: "<<*p<<'\n';
    > std::cout<<"*c: "<<c<<'\n';
    >
    > The output is:
    > *p: 12592
    > *c: 0113
    >
    > Why?


    Why not? What did you expect?


    > How does LINE1 work?


    It uses a C-style cast, which is interpreted as a reinterpret_cast.

    Look up reinterpret_cast.

    Then remember in the future to not use C-style casts, and remember that
    while you're still a novice every occurrence of reinterpret_cast in
    your code means you have a bug.


    > For the string "0113", there is an implicit '\0' at
    > the end. How does LINE1 handle it?


    It doesn't.
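
    To illustrate "it doesn't": the cast only converts the pointer value and
    never looks at the characters. The sketch below uses a char array that is
    deliberately aligned for unsigned short (an assumption added here so the
    conversion itself is well defined) and never reads through the converted
    pointer. The function names are made up for illustration.

```cpp
#include <cassert>
#include <cstring>

// True if converting char* to unsigned short* and back yields the same
// address, i.e. the cast carried an address and nothing else.
bool cast_keeps_address() {
    alignas(unsigned short) char buf[] = "0113";
    unsigned short* p = reinterpret_cast<unsigned short*>(buf);
    return reinterpret_cast<char*>(p) == buf;
}

// True if the terminator is still the fifth byte after the cast.
bool terminator_is_intact() {
    alignas(unsigned short) char buf[] = "0113";
    unsigned short* p = reinterpret_cast<unsigned short*>(buf);
    assert(reinterpret_cast<char*>(p) == buf);  // same address as before
    return buf[4] == '\0' && std::strlen(buf) == 4;
}
```

    Both functions are expected to return true; the '\0' simply stays where
    it was.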

    --
    A: Because it messes up the order in which people normally read text.
    Q: Why is it such a bad thing?
    A: Top-posting.
    Q: What is the most annoying thing on usenet and in e-mail?
    Alf P. Steinbach, Jan 25, 2007
    #3
  4. Guest

    On 25 Jan, 08:45, wrote:
    > Hi,
    >
    > For the code below:
    >
    > char *c = "0113";
    > unsigned short *p;
    > p = (unsigned short*)c; //LINE1
    > std::cout<<"*p: "<<*p<<'\n';
    > std::cout<<"*c: "<<c<<'\n';
    >
    > The output is:
    > *p: 12592
    > *c: 0113
    >
    > Why?


    Because casts like this have undefined behaviour. There is no why,
    undefined behaviour means anything is allowed to happen.

    > How does LINE1 work? For the string "0113", there is an implicit '\0' at
    > the end. How does LINE1 handle it?


    Since LINE1 is undefined behaviour any further questions are
    meaningless.
    , Jan 25, 2007
    #4
  5. On Jan 25, 3:45 pm, wrote:
    > For the code below:
    >
    > char *c = "0113";
    > unsigned short *p;
    > p = (unsigned short*)c; //LINE1
    > std::cout<<"*p: "<<*p<<'\n';
    > std::cout<<"*c: "<<c<<'\n';
    >
    > The output is:
    > *p: 12592
    > *c: 0113
    >
    > Why?


    It looks like the program is trying to show both the memory location
    where the string is located (p) and the string itself. It shows that if
    you send a char * (should be const char *) to an IO stream it will
    print as a character string. However, if you send a pointer to any
    other type to an IO stream it will display the memory location.
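
    The stream behaviour described here can be sketched as follows. The two
    helper names are made up for illustration, and the exact text printed for
    an address is implementation-defined.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Streams a const char*: operator<< treats it as a null-terminated string.
std::string stream_as_text(const char* s) {
    assert(s != nullptr);  // streaming a null char* is not allowed
    std::ostringstream out;
    out << s;
    return out.str();
}

// Streams the same pointer as const void*: operator<< prints the address.
std::string stream_as_address(const char* s) {
    assert(s != nullptr);
    std::ostringstream out;
    out << static_cast<const void*>(s);
    return out.str();
}
```

    stream_as_text("0113") gives "0113", while stream_as_address("0113")
    gives some address text. Note that the original program streams *p, an
    unsigned short value, so that line prints a number (12592) rather than an
    address.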

    The cast in //LINE1 is a nasty hack and you should never do anything
    like it. I don't know if the cast itself is UB, but I'm pretty sure
    that dereferencing p would be (if I had to guess I would say that the
    code was written on a platform where sizeof( unsigned short ) ==
    sizeof( char ) == 1). If p were a const void * then it would be
    correct, but still not a good thing to do unless forced (normally to
    interact with C style code).

    > How does LINE1 work? For the string "0113", there is an implicit '\0' at
    > the end. How does LINE1 handle it?


    It doesn't. c is really a pointer to the memory location that the
    string is stored at. The line is simply a way of getting that memory
    location pointer into something that can be displayed by the IO stream
    as a pointer rather than a string.

    The first three lines are reasonable C, but unreasonable C++.


    K
    Kirit Sælensminde, Jan 26, 2007
    #5
  6. Guest

    On Jan 25, 12:52 am, Ian Collins <> wrote:
    > wrote:
    > > Hi,

    >
    > > For the code below:

    >
    > > char *c = "0113";
    > should be const char*.

    >
    > > unsigned short *p;
    > > p = (unsigned short*)c; //LINE1
    > > std::cout<<"*p: "<<*p<<'\n';
    > > std::cout<<"*c: "<<c<<'\n';

    >
    > > The output is:
    > > *p: 12592
    > > *c: 0113

    >
    > > Why?
    > What else would you expect?


    Maybe I should do this:
    p = static_cast<unsigned short*>c;
    Is it right?
    I need to compute a checksum of a string. The function looks like this:
    checksum(unsigned short *p, int count).
    So I have to convert char* to unsigned short*. Is there a better way to
    do it?

    >
    > > How does LINE1 work? For the string "0113", there is an implicit '\0' at
    > > the end. How does LINE1 handle it?
    > It assigns the value of c to p. Assuming sizeof unsigned short to be 2,
    > *p is the first two bytes of the string literal pointed to by c.
    >
    > Convert 12592 to hex and check the ASCII values for '0' and '1'


    The binary of 12592 is 11000100110000. The binary of '0' is 110000. The
    binary of '1' is 110001.
    After the cast, why does it become '10' rather than '01'?

    Thanks.

    Jack
    , Jan 26, 2007
    #6
  7. Ian Collins Guest

    wrote:
    >
    > On Jan 25, 12:52 am, Ian Collins <> wrote:
    >
    > Maybe I should do this:
    > p = static_cast<unsigned short*>c;


    p = reinterpret_cast<unsigned short*>(c);

    >
    >>>How does LINE1 work? For the string "0113", there is an implicit '\0' at
    >>>the end. How does LINE1 handle it?
    >>It assigns the value of c to p. Assuming sizeof unsigned short to be 2,
    >>*p is the first two bytes of the string literal pointed to by c.
    >>
    >>Convert 12592 to hex and check the ASCII values for '0' and '1'

    >
    >
    > The binary of 12592 is 11000100110000. The binary of '0' is 110000. The
    > binary of '1' is 110001.
    > After the cast, why does it become '10' rather than '01'?
    >

    Google for little endian.
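
    For readers who want to see the byte order directly, here is a minimal
    sketch. It assumes ASCII and uses std::memcpy instead of the pointer cast
    from the original post; the function names are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Copies the first two characters of `text` into a 16-bit integer.
// memcpy sidesteps the alignment and aliasing problems of a pointer cast.
std::uint16_t first_two_bytes(const char* text) {
    assert(text != nullptr && std::strlen(text) >= 1);  // 2 readable bytes
    std::uint16_t value = 0;
    std::memcpy(&value, text, sizeof value);
    return value;
}

// True on a little-endian machine: the byte at the lowest address is the
// least significant one.
bool is_little_endian() {
    const std::uint16_t probe = 1;
    unsigned char first_byte = 0;
    std::memcpy(&first_byte, &probe, 1);
    return first_byte == 1;
}
```

    With ASCII, first_two_bytes("0113") is 0x3130 (12592) on a little-endian
    machine because '0' (0x30) lands in the low byte, and 0x3031 (12337) on a
    big-endian one.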


    --
    Ian Collins.
    Ian Collins, Jan 26, 2007
    #7
  8. Kai-Uwe Bux Guest

    wrote:

    >
    >
    > On Jan 25, 12:52 am, Ian Collins <> wrote:
    >> wrote:
    >> > Hi,

    >>
    >> > For the code below:

    >>
    >> > char *c = "0113";
    >> should be const char*.

    >>
    >> > unsigned short *p;
    >> > p = (unsigned short*)c; //LINE1
    >> > std::cout<<"*p: "<<*p<<'\n';
    >> > std::cout<<"*c: "<<c<<'\n';

    >>
    >> > The output is:
    >> > *p: 12592
    >> > *c: 0113

    >>
    >> > Why?
    >> What else would you expect?

    >
    > Maybe I should do this:
    > p = static_cast<unsigned short*>c;
    > Is it right?
    > I need to compute a checksum of a string. The function looks like this:
    > checksum(unsigned short *p, int count).
    > So I have to convert char* to unsigned short*. Is there a better way to
    > do it?


    There may be no way to solve the underlying problem by casting pointer types
    around. E.g., what happens if the char* points to a place not suitably
    aligned for short? What happens if the string contains a number of
    characters that is not a multiple of sizeof(short)? Besides, very likely
    you have undefined behavior anyway, depending on what checksum() does
    internally.
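
    One way around both concerns is a checksum that takes bytes and builds
    each 16-bit word itself. The sketch below is hypothetical: the thread
    never says what the real checksum() computes, so the rule used here
    (plain sum modulo 65536, first byte of each pair as the high byte, an odd
    trailing byte padded with zero) is an assumption, as is the name
    checksum16.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical byte-based checksum: no pointer cast, no alignment
// requirement, and a defined result for odd lengths.
std::uint16_t checksum16(const char* data, std::size_t count) {
    assert(data != nullptr || count == 0);
    std::uint32_t sum = 0;
    std::size_t i = 0;
    for (; i + 1 < count; i += 2) {
        const std::uint32_t high = static_cast<unsigned char>(data[i]);
        const std::uint32_t low = static_cast<unsigned char>(data[i + 1]);
        sum = (sum + ((high << 8) | low)) & 0xFFFFu;  // wrap at 16 bits
    }
    if (i < count) {
        // Odd length: treat the missing last byte as zero.
        const std::uint32_t high = static_cast<unsigned char>(data[i]);
        sum = (sum + (high << 8)) & 0xFFFFu;
    }
    return static_cast<std::uint16_t>(sum);
}
```

    Because the byte order is fixed in the code, the result does not depend
    on the endianness of the machine. With ASCII, checksum16("0113", 4) is
    0x3031 + 0x3133 = 0x6164.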


    Best

    Kai-Uwe Bux
    Kai-Uwe Bux, Jan 26, 2007
    #8
  9. Guest

    There is a standard way to do this, though it involves a bit of
    implementation-defined behavior (such as endianness). There are two
    errors in your code: first, the reinterpret_cast is not valid because
    unsigned short could have stricter alignment requirements than char,
    and second, you attempt to access a string literal as an unsigned short.
    The latter is wrong for two reasons: one, because the alignment
    requirements of short could be stricter than those of char, and two,
    because the standard disallows accessing objects as different types, so
    the compiler could optimize it away.

    char *c = "0113" is valid, because old C code used that idiom a lot,
    so it was included for backwards compatibility. Its use is deprecated
    though.

    The second problem can be avoided by copying the array into an
    unsigned short. The first can be avoided by first casting to void *,
    then char * or unsigned char *. You could also use std::memcpy or
    std::memmove, since the standard appears to make special consideration
    for them (it uses them in examples). This sort of copying is only
    allowed for POD types.

    Technically the standard only allows for copying of this sort from one
    object to another of the same type because types are allowed to have
    padding bits and trap representations, but as long as (type(1) <<
    type(sizeof(type)) * type(CHAR_BIT)) - 1 is equal to
    std::numeric_limits<type>::max(), for unsigned types at least, the
    copying will be valid. In practice, I doubt you'll find too many
    implementations that go into these peculiarities.
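
    The "no padding bits" condition can also be written without a shift. The
    sketch below swaps the shift-and-compare test for
    std::numeric_limits<T>::digits, which counts the value bits of an
    unsigned type; this is a substitute technique, not the check from the
    post, and the name has_no_padding_bits is made up for illustration.

```cpp
#include <climits>
#include <limits>

// True when every bit of T's object representation is a value bit, i.e. T
// has no padding bits. Intended for unsigned integer types only.
template <typename T>
constexpr bool has_no_padding_bits() {
    return std::numeric_limits<T>::digits ==
           static_cast<int>(sizeof(T) * CHAR_BIT);
}
```

    On typical desktop platforms has_no_padding_bits<unsigned short>() is
    expected to be true, and it can be checked at compile time with
    static_assert.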

    Here's a valid implementation (assuming some valid min() function):

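    // Needs <climits>, <cstddef>, <iostream> and <limits>.
    // Bail out if unsigned short has padding bits.
    if ((1UL << (sizeof(unsigned short) * CHAR_BIT)) - 1 !=
        std::numeric_limits<unsigned short>::max()) return;
    unsigned short s = 0;
    const char *c = "0113";
    // Copy the leading bytes of the literal (5 bytes with the '\0') into s.
    unsigned char *dst = static_cast<unsigned char *>(static_cast<void *>(&s));
    for (std::size_t i = 0; i < min(sizeof(s), 5); ++i)
        dst[i] = static_cast<unsigned char>(c[i]);
    std::cout<<"s: "<<s<<'\n';
    std::cout<<"c: "<<c<<'\n';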

    The actual value of s is implementation-defined due to several factors
    including the size of s, the representation of unsigned shorts and the
    values that '0', '1' and '3' map to. The vast majority of platforms will
    represent an unsigned short as a pure binary integer of two
    bytes, with the bit order the same as a char. The only thing that will
    differ normally is endianness, that is, whether the '0' or the '1' will
    make up the first byte.
    , Jan 27, 2007
    #9