char changes to int, very weird

Discussion in 'C Programming' started by pembed2012, Jun 8, 2012.

  1. pembed2012

    pembed2012 Guest

    Dear all,

    I need your help.

    Here is the program:

    char a = 0x91;

    printf("%x",a);

    result: ff ff ff 91

    Now I am confused by the result. I expected 91, but the value seems
    to have been converted to something else.

    Why?

    BTW, compiler: gcc, machine: x86 Intel.
    Thanks
    pembed2012, Jun 8, 2012
    #1
  2. pembed2012

    Stefan Ram Guest

    pembed2012 writes:
    >printf("%x",a);


    Try »"%hhx\n"«.
    Stefan Ram, Jun 8, 2012
    #2
  3. On 8 juin, 21:46, pembed2012

    > char a = 0x91;
    > printf("%x",a);
    > result: ff ff ff 91
    > why?


    'char' means 'signed char' so the
    value 0x91 is negative = -111 decimal

    printf( "%x", (unsigned char)a );
    result: 91
    Jean-Christophe, Jun 8, 2012
    #3
  4. pembed2012

    James Kuyper Guest

    On 06/08/2012 03:46 PM, pembed2012 wrote:
    > Dear all,
    >
    > I need you help.
    >
    > here the program:
    >
    > char a = 0x91;
    >
    > printf("%x",a);
    >
    > result: ff ff ff 91
    >
    > now, i was confused with the result. I think it is 91. but it seems
    > convert to something.


    char can be either signed or unsigned. If it were unsigned, a would have a
    value of 91. If it were signed, 0x91 would likely be greater than
    INT_MAX, in which case the conversion to char will either produce an
    implementation-defined result, or raise an implementation-defined
    signal. Unless you happen to have set up a signal handler for the
    appropriate signal, I think we can rule out that possibility in this case.

    As a result, the value of 'a' could be any char value, but the most
    likely case is that it's 0x91, interpreted as an 8-bit 2's complement
    signed value, which would be a negative number.

    Whether or not 'a' is signed, it's probably the case that CHAR_MAX <
    INT_MAX on your machine, in which case its value gets promoted to an
    'int' in the call to printf(). That's where all of the extra 'ff's
    probably come from. The "%x" specifier expects an unsigned int argument;
    as a result, the behavior of your call is undefined. However, on many
    systems it's likely to print out the same value you would get from
    printing (unsigned)(int)a with the same format specifier. That would
    appear to be ffffff91 on
    James Kuyper, Jun 8, 2012
    #4
  5. pembed2012

    James Kuyper Guest

    On 06/08/2012 04:16 PM, Jean-Christophe wrote:
    > On 8 juin, 21:46, pembed2012
    >
    >> char a = 0x91;
    >> printf("%x",a);
    >> result: ff ff ff 91
    >> why?

    >
    > 'char' means 'signed char' so the


    No, that's the way it works for short, int, long, and long long: signed
    short and short are the same exact type. However, for the character
    types, the rules are different. "char" is its own unique type, distinct
    from both unsigned char and signed char. It's required to have exactly
    the same representation as either "signed char" or "unsigned char"
    (6.2.5p15). If CHAR_MIN is 0, then char is unsigned, and if CHAR_MIN is
    negative, char is signed.
    James Kuyper, Jun 8, 2012
    #5
  6. pembed2012

    Joe Pfeiffer Guest

    pembed2012 writes:

    > Dear all,
    >
    > I need you help.
    >
    > here the program:
    >
    > char a = 0x91;
    >
    > printf("%x",a);
    >
    > result: ff ff ff 91
    >
    > now, i was confused with the result. I think it is 91. but it seems
    > convert to something.


    Others have given good answers in terms of the defined language
    semantics. The nuts-and-bolts of what you're seeing is that in the
    conversion from char to int, the char is being interpreted as a signed
    char, so the most significant bit is a 1. That's being sign-extended
    (hence the ff ff ff).

    Are you sure it's ff ff ff 91, and not ffffff91?
    Joe Pfeiffer, Jun 8, 2012
    #6
  7. pembed2012

    James Kuyper Guest

    On 06/08/2012 06:47 PM, Keith Thompson wrote:
    > James Kuyper <> writes:

    ....
    >> char can be either signed or unsigned. If were unsigned, a would have a
    >> value of 91.

    >
    > You mean 0x91.
    >
    >> If it were signed, 0x91 would likely to be greater than
    >> INT_MAX,

    >
    > You mean CHAR_MAX.


    You're right, of course.
    Sigh.
    --
    James Kuyper
    James Kuyper, Jun 9, 2012
    #7
  8. pembed2012

    pembed2012 Guest

    On Fri, 08 Jun 2012 16:16:46 -0400, James Kuyper wrote:

    > On 06/08/2012 03:46 PM, pembed2012 wrote:
    >> Dear all,
    >>
    >> I need you help.
    >>
    >> here the program:
    >>
    >> char a = 0x91;
    >>
    >> printf("%x",a);
    >>
    >> result: ff ff ff 91
    >>
    >> now, i was confused with the result. I think it is 91. but it seems
    >> convert to something.

    >
    > char can be either signed or unsigned. If were unsigned, a would have a
    > value of 91. If it were signed, 0x91 would likely to be greater than
    > INT_MAX, in which case it the conversion to char will either produce an
    > implementation-defined result, or raise an implementation-defined
    > signal. Unless you happen to have set up a signal handler for the
    > appropriate signal, I think we can rule out that possibility in this
    > case.
    >
    > As a result, the value of 'a' could be any char value, but the most
    > likely case is that it's 0x91, interpreted as an 8-bit 2's complement
    > signed value, which would be a negative number.
    >
    > Whether or not 'a' is signed, it's probably the case that CHAR_MAX <
    > INT_MAX on your machine, in which case it's value gets promoted to an
    > 'int' in the call to printf(). That's where all of the extra 'ff's
    > probably come from. The "%x" specifier expects a unsigned int argument;
    > as a result, the behavior of your call is undefined. However, on many
    > systems it's likely to print out the same value you would get from
    > printing (unsigned)(int)a with the same format specifier. That would
    > appear to be ffffff91 on


    I don't understand any of this. What signal handler?!

    If it is an overflow, why should the value be a bit pattern?
    Is it a rule or something else?
    pembed2012, Jun 9, 2012
    #8
  9. pembed2012

    BartC Guest

    "pembed2012" <> wrote in message
    news:jqtkr1$6nc$...
    > Dear all,
    >
    > I need you help.
    >
    > here the program:
    >
    > char a = 0x91;
    >
    > printf("%x",a);
    >
    > result: ff ff ff 91
    >
    > now, i was confused with the result. I think it is 91. but it seems
    > convert to something.


    Your char type probably only stores values from -128 to 127 (or -0x80 to
    +0x7F). Your 0x91 value is 145, outside the range. So funny things happen
    which are difficult to understand.

    Just use 'unsigned char' instead, which likely has a range of 0 to 255, or
    0x00 to 0xFF.

    --
    Bartc
    BartC, Jun 9, 2012
    #9
  10. pembed2012 writes:

    > On Fri, 08 Jun 2012 16:16:46 -0400, James Kuyper wrote:
    >
    >> On 06/08/2012 03:46 PM, pembed2012 wrote:
    >>> Dear all,
    >>>
    >>> I need you help.
    >>>
    >>> here the program:
    >>>
    >>> char a = 0x91;
    >>>
    >>> printf("%x",a);
    >>>
    >>> result: ff ff ff 91
    >>>
    >>> now, i was confused with the result. I think it is 91. but it seems
    >>> convert to something.

    >>
    >> char can be either signed or unsigned. If were unsigned, a would have a
    >> value of 91. If it were signed, 0x91 would likely to be greater than
    >> INT_MAX, in which case it the conversion to char will either produce an
    >> implementation-defined result, or raise an implementation-defined
    >> signal. Unless you happen to have set up a signal handler for the
    >> appropriate signal, I think we can rule out that possibility in this
    >> case.
    >>
    >> As a result, the value of 'a' could be any char value, but the most
    >> likely case is that it's 0x91, interpreted as an 8-bit 2's complement
    >> signed value, which would be a negative number.
    >>
    >> Whether or not 'a' is signed, it's probably the case that CHAR_MAX <
    >> INT_MAX on your machine, in which case it's value gets promoted to an
    >> 'int' in the call to printf(). That's where all of the extra 'ff's
    >> probably come from. The "%x" specifier expects a unsigned int argument;
    >> as a result, the behavior of your call is undefined. However, on many
    >> systems it's likely to print out the same value you would get from
    >> printing (unsigned)(int)a with the same format specifier. That would
    >> appear to be ffffff91 on

    >
    > i don't understand any of this, what signal handler!!


    The C standard says that the conversion of a value that is out of range
    to a signed integer type is "implementation defined" or it may "raise an
    implementation defined signal". On your system, plain char is signed so
    it is a signed integer type, and the value, 0x91, is out of range for a
    single-byte char so the above rule applies.

    "Implementation defined" has a precise meaning in the C standard. It
    means that the documentation for the implementation (often called rather
    loosely "the compiler") must say what happens. Few implementations
    choose to raise a signal when converting an out-of-range signed integer
    value. Yours is one that does not. Most C implementations define the
    conversion as simply copying as many of the low-order bits as are needed
    from the source to the target.

    > if it is overflow why the value should be bit pattern?
    > is it a rule or something else?


    It's not, technically, an overflow. Overflows can occur as the result
    of arithmetic, but an out-of-range conversion is not really an
    overflow -- it's just a conversion.

    Your program does several rather odd things. All of them mean that the
    results say more about what the compiler and the machine are doing
    rather than what the C language says about your program. Here is the
    highly system-specific description of what is happening:

    First, the int value 145 is converted to a char type value. On your
    system, char objects can hold values from -128 to 127, so 145 is
    out of range. This conversion is done by simply taking the bottom 8 bits
    of

    00000000000000000000000010010001

    (that 145 as a 32-bit int) and stuffing them into the 8 bits of the char
    called 'a'.

    The program then needs the value of 'a', and it needs it converted to an
    int, because the arguments to variadic functions like printf have what
    are called "the integer promotions" applied to them. The char 'a'
    contains the bits:

    10010001

    and your system uses 2's complement representation for signed values.
    That means that 10010001 represents the value -111 (to see exactly why,
    look up "signed number representation" on, say, Wikipedia). The
    conversion of -111 to an int is well-defined (-111 is always in range
    for the type int): you just get -111! Of course, with 32-bit ints it
    looks like this:

    11111111111111111111111110010001

    (again, you may need to consult the Web to see exactly why).

    Now your program does something even odder. The printf format specifier
    %x expects an unsigned int rather than a plain int. This is,
    technically, undefined by the C language standard. If you give a value
    of the wrong type to printf, anything could happen, but in fact, it is
    likely that printf will just plough on and pretend that the

    11111111111111111111111110010001

    it sees is an unsigned int, and it will go ahead and print it in hex as
    requested:

    ffffff91

    If anyone was in any doubt, this might help explain why C is not a
    particularly good vehicle for teaching programming. Simple, short
    programs can involve one in long irrelevant explanations, or require the
    rather dismissive "you'll understand later after computer architecture
    101".

    --
    Ben.
    Ben Bacarisse, Jun 9, 2012
    #10
  11. pembed2012

    James Kuyper Guest

    On 06/09/2012 02:54 AM, pembed2012 wrote:
    > On Fri, 08 Jun 2012 16:16:46 -0400, James Kuyper wrote:
    >
    >> On 06/08/2012 03:46 PM, pembed2012 wrote:
    >>> Dear all,
    >>>
    >>> I need you help.
    >>>
    >>> here the program:
    >>>
    >>> char a = 0x91;
    >>>
    >>> printf("%x",a);
    >>>
    >>> result: ff ff ff 91
    >>>
    >>> now, i was confused with the result. I think it is 91. but it seems
    >>> convert to something.

    >>
    >> char can be either signed or unsigned. If were unsigned, a would have a
    >> value of 91. If it were signed, 0x91 would likely to be greater than
    >> INT_MAX, in which case it the conversion to char will either produce an
    >> implementation-defined result, or raise an implementation-defined
    >> signal. Unless you happen to have set up a signal handler for the
    >> appropriate signal, I think we can rule out that possibility in this
    >> case.
    >>
    >> As a result, the value of 'a' could be any char value, but the most
    >> likely case is that it's 0x91, interpreted as an 8-bit 2's complement
    >> signed value, which would be a negative number.
    >>
    >> Whether or not 'a' is signed, it's probably the case that CHAR_MAX <
    >> INT_MAX on your machine, in which case it's value gets promoted to an
    >> 'int' in the call to printf(). That's where all of the extra 'ff's
    >> probably come from. The "%x" specifier expects a unsigned int argument;
    >> as a result, the behavior of your call is undefined. However, on many
    >> systems it's likely to print out the same value you would get from
    >> printing (unsigned)(int)a with the same format specifier. That would
    >> appear to be ffffff91 on

    >
    > i don't understand any of this, what signal handler!!


    I'm sorry - that means that I wrote my answer at too advanced a level.
    But that doesn't give me much information about how I need to fix my
    explanation to make it clearer for you. Could you, perhaps, list all of
    the things you don't understand, or at the very least, the first
    significant point you don't understand?

    When I mentioned a signal handler, I did so because when you try to
    convert a value to a signed type that is outside the range of values
    representable in that type, one thing the compiler is allowed to do is
    raise a signal. If it did so, you'd have to check your compiler's
    documentation to find out which signal it raised - the standard doesn't
    specify it.

    However, it's pretty clear that your compiler didn't do this. Very few
    do. Most of them make the other choice - the conversion yields a result,
    just as it would in the normal case. On a typical system where char is
    an 8-bit 2's complement type, the most likely result would be
    0x91-0x100, or -111, the value that (if I did the calculation right)
    would be represented by a char object with a bit pattern of 0x91.
    However, the standard does not guarantee what value you would get; it
    could be any number from CHAR_MIN to CHAR_MAX. If you care about which
    value gets stored in 'a', then you shouldn't write code like this - if
    you don't care what value gets stored in 'a', why even bother defining it?

    How do I know that your compiler didn't raise a signal? There are two main
    possibilities: you installed a signal handler for the correct signal, or
    you didn't. If you installed the handler, you should have been
    complaining about why the signal handler got called. If you didn't
    install a signal handler, your program would have aborted before ever
    printing out the ffffff91.

    > if it is overflow why the value should be bit pattern?
    > is it a rule or something else?


    The value is NOT a bit pattern, any more than you are your name. The bit
    pattern represents the value, just like your name represents you. Having
    said that, I don't really understand what you're referring to. Nothing I
    wrote in my previous message mentioned bit patterns.

    When there is overflow, there doesn't have to be a value; the compiler
    could instead raise a signal. However, if the conversion does not raise
    a signal, it yields a value, which can be any number from CHAR_MIN to
    CHAR_MAX. When that value gets stored in 'a', a bit pattern is set up
    that represents the appropriate value.

    --
    James Kuyper
    James Kuyper, Jun 9, 2012
    #11
  12. On 08.06.2012 22:24, James Kuyper wrote:

    >> 'char' means 'signed char' so the

    >
    > No, that's the way it works for short, int, long, and long long: signed
    > short and short are the same exact type. However, for the character
    > types, the rules are different. "char" is its own unique type, distinct
    > from both unsigned char and signed char. It's required to have exactly
    > the same representation as either "signed char" or "unsigned char"
    > (6.2.5p15). If CHAR_MIN is 0, then char is unsigned, and if CHAR_MIN is
    > negative, char is signed.


    I'm also curious here: What was the rationale behind choosing "char" to
    behave that way?

    Best regards,
    Joe

    --
    >> Where EXACTLY did you predict the quake again?

    > At least not publicly!

    Ah, the latest and to this day most brilliant stroke of our great
    cosmologists: the secret prediction.
    - Karl Kaos on Rüdiger Thomas in dsa <hidbv3$om2$>
    Johannes Bauer, Jun 10, 2012
    #12
  13. Johannes Bauer writes:

    > On 08.06.2012 22:24, James Kuyper wrote:
    >
    >>> 'char' means 'signed char' so the

    >>
    >> No, that's the way it works for short, int, long, and long long: signed
    >> short and short are the same exact type. However, for the character
    >> types, the rules are different. "char" is its own unique type, distinct
    >> from both unsigned char and signed char. It's required to have exactly
    >> the same representation as either "signed char" or "unsigned char"
    >> (6.2.5p15). If CHAR_MIN is 0, then char is unsigned, and if CHAR_MIN is
    >> negative, char is signed.

    >
    > I'm also curious here: What was the rationale behind chosing "char" to
    > behave that way?


    History -- in particular the machines of the time.

    C's integer promotion reflects the idea of general-purpose registers:
    narrow types promote to int because operations are done on "natural
    width" registers, not on memory objects. If C had insisted that char be
    signed, then machines without a sign-extending byte load operation would
    be at a disadvantage, and similarly for those that always sign-extend
    their loads had C insisted that char be unsigned.

    The C reference manual in the original K&R describes a number of
    implementations (PDP-11, Honeywell 6000, IBM 370 and Interdata 8/32) and
    says: "Whether or not sign-extension occurs for characters is machine
    dependent ... Of the machines treated by this manual, only the PDP-11
    sign-extends".

    The ANSI standard formalised existing practice with the phrasing
    described above.

    --
    Ben.
    Ben Bacarisse, Jun 10, 2012
    #13