[half OT] About the not-in-common range of signed and unsigned char

Discussion in 'C++' started by Francesco S. Carta, Jul 14, 2010.

  1. Hi there,
    when I create some less-than-trivial console program that involves some
    kind of pseudo-graphic interface I resort to using the glyphs that lie
    in the range [-128, -1] - the simple "char" type is signed in my
    implementation.

    You know, all those single/double borders, corners, crosses,
    pseudo-shadow (dithered) boxes and so on.

    Since those characters mess up the encoding of my files, I cannot put
    them straight into the source code as char-literals, I have to hard-code
    their numeric values.

    I noticed that, at least on my implementation, it doesn't make any
    difference if I assign a negative value to an unsigned char - the
    expected glyph shows up correctly - hence I think I wouldn't have to
    worry if the same code is run on an implementation where char is unsigned.
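
    To give an idea, a minimal sketch of what I observe (0xC9 / -55 is just
    an assumed example value here, the double-line top-left corner in code
    page 437):

        #include <iostream>

        int main()
        {
            // -55 stored in a plain (signed, on my implementation) char and
            // in an unsigned char: the unsigned char wraps to 201 (0xC9),
            // but the bit pattern sent to the stream is the same in both
            // cases, so the same glyph shows up.
            char          c = static_cast<char>(-55);
            unsigned char u = static_cast<unsigned char>(-55);
            std::cout << c << ' ' << static_cast<char>(u) << '\n';
        }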

    My questions:

    - what assumptions (if any) can I make about the presence of those
    out-of-common-range characters and their (correct) correspondence with
    the codes I use to hard-code?

    - assuming it is possible to, how can I ensure that my program displays
    the correct "graphics" regardless of the platform / implementation it is
    compiled onto?

    Note: resorting to an external library that "does the stuff for me" is
    not an option here, I'm asking in order to learn, not just to solve an
    issue.

    Thank you for your attention.

    --
    FSC - http://userscripts.org/scripts/show/59948
    http://fscode.altervista.org - http://sardinias.com
     
    Francesco S. Carta, Jul 14, 2010
    #1

  2. Re: [half OT] About the not-in-common range of signed and unsigned char

    Victor Bazarov <>, on 13/07/2010 19:13:13, wrote:

    > On 7/13/2010 7:01 PM, Francesco S. Carta wrote:
    >> Hi there,
    >> when I create some less-than-trivial console program that involves some
    >> kind of pseudo-graphic interface I resort to using the glyphs that lie
    >> in the range [-128, -1] - the simple "char" type is signed in my
    >> implementation.
    >>
    >> You know, all those single/double borders, corners, crosses,
    >> pseudo-shadow (dithered) boxes and so on.
    >>
    >> Since those characters mess up the encoding of my files, I cannot put
    >> them straight into the source code as char-literals, I have to hard-code
    >> their numeric values.
    >>
    >> I noticed that, at least on my implementation, it doesn't make any
    >> difference if I assign a negative value to an unsigned char - the
    >> expected glyph shows up correctly - hence I think I wouldn't have to
    >> worry if the same code is run on an implementation where char is
    >> unsigned.
    >>
    >> My questions:
    >>
    >> - what assumptions (if any) can I make about the presence of those
    >> out-of-common-range characters and their (correct) correspondence with
    >> the codes I use to hard-code?

    >
    > You need to ask this in the newsgroup for your OS and/or your terminal
    > because those things are hardware- and platform-specific. Those
    > characters are not part of the basic character set, C++ knows nothing
    > about them.
    >
    >> - assuming it is possible to, how can I ensure that my program displays
    >> the correct "graphics" regardless of the platform / implementation it is
    >> compiled onto?

    >
    > There is no way.
    >
    >> Note: resorting to an external library that "does the stuff for me" is
    >> not an option here, I'm asking in order to learn, not just to solve an
    >> issue.

    >
    > <shrug> Whatever.


    I'm sorry if my post disturbed you: I explicitly marked it as "[half
    OT]" and I posted it here for a reason, which should be evident.

    Nonetheless, thank you for your reply, Victor - that's just what I was
    looking for: the confirmation that I cannot portably resort to those
    graphics, so that I'll avoid struggling for something that isn't
    achievable - this is "learning", for me.

    --
    FSC - http://userscripts.org/scripts/show/59948
    http://fscode.altervista.org - http://sardinias.com
     
    Francesco S. Carta, Jul 14, 2010
    #2

  3. Re: [half OT] About the not-in-common range of signed and unsigned char

    Victor Bazarov <>, on 13/07/2010 19:48:07, wrote:

    > On 7/13/2010 7:22 PM, Francesco S. Carta wrote:
    >> Victor Bazarov <>, on 13/07/2010 19:13:13, wrote:
    >>
    >>> On 7/13/2010 7:01 PM, Francesco S. Carta wrote:
    >>>> Hi there,
    >>>> when I create some less-than-trivial console program that involves some
    >>>> kind of pseudo-graphic interface I resort to using the glyphs that lie
    >>>> in the range [-128, -1] - the simple "char" type is signed in my
    >>>> implementation.
    >>>>
    >>>> You know, all those single/double borders, corners, crosses,
    >>>> pseudo-shadow (dithered) boxes and so on.
    >>>>
    >>>> Since those characters mess up the encoding of my files, I cannot put
    >>>> them straight into the source code as char-literals, I have to
    >>>> hard-code
    >>>> their numeric values.
    >>>>
    >>>> I noticed that, at least on my implementation, it doesn't make any
    >>>> difference if I assign a negative value to an unsigned char - the
    >>>> expected glyph shows up correctly - hence I think I wouldn't have to
    >>>> worry if the same code is run on an implementation where char is
    >>>> unsigned.
    >>>>
    >>>> My questions:
    >>>>
    >>>> - what assumptions (if any) can I make about the presence of those
    >>>> out-of-common-range characters and their (correct) correspondence with
    >>>> the codes I use to hard-code?
    >>>
    >>> You need to ask this in the newsgroup for your OS and/or your terminal
    >>> because those things are hardware- and platform-specific. Those
    >>> characters are not part of the basic character set, C++ knows nothing
    >>> about them.
    >>>
    >>>> - assuming it is possible to, how can I ensure that my program displays
    >>>> the correct "graphics" regardless of the platform / implementation
    >>>> it is
    >>>> compiled onto?
    >>>
    >>> There is no way.
    >>>
    >>>> Note: resorting to an external library that "does the stuff for me" is
    >>>> not an option here, I'm asking in order to learn, not just to solve an
    >>>> issue.
    >>>
    >>> <shrug> Whatever.

    >>
    >> I'm sorry if my post disturbed you: I explicitly marked it as "[half
    >> OT]" and I posted it here for a reason, which should be evident.

    >
    > It didn't disturb me. I am sorry you thought I did (why did you think
    > that?).


    Your last line above ("<shrug> Whatever.") made me think that the whole
    post disturbed or at least annoyed you. I'm glad to discover that I
    misinterpreted your post :)

    > And the only reason evident to me is that you asked a valid
    > question on C++. What other reason would one need?


    That was a "combined" reply, relative to my misinterpretation of your
    post /and/ to the fact that you pointed me to another group. The reason
    for posting it here is exactly the one you noted: it's about C++ - even
    though it was likely to be a platform-specific issue - "half OT", as I
    said ;-)

    >> Nonetheless, thank you for your reply, Victor - that's just what I was
    >> looking for: the confirmation that I cannot portably resort to those
    >> graphics, so that I'll avoid struggling for something that isn't
    >> achievable - this is "learning", for me.

    >
    > Well, you seemed to post when you already knew the answer (although I
    > can still be mistaken). You either need to use somebody else's library
    > (which will represent an abstraction layer for you, and behind the
    > scenes its code is platform-specific, regardless what language it is
    > implemented in) or implement that functionality yourself, essentially
    > reproducing the same library.


    Technically no, I didn't "know" the answer, I just suspected it, hence I
    asked for confirmation (although I didn't express my question as such).

    Although it is true that I could have just relied on my understanding of
    the standard, I was also hoping to get a "real life" reply along the lines
    of "on Windows and Linux you're pretty much safe assuming those
    characters [are|aren't] available and [do|don't] have the same values,
    I've tried [this] and [that], and [that other] gave me problems, YMMV,
    do some tests".

    [ besides: the threads here happen to see people dropping in with
    not-strictly-related comments which are precious, at times, because they
    lead me to investigate new things - posting stuff like this is (also)
    another chance to see those kinds of "lateral" follow-ups ]

    Thank you for your clarification and for the further details.

    --
    FSC - http://userscripts.org/scripts/show/59948
    http://fscode.altervista.org - http://sardinias.com
     
    Francesco S. Carta, Jul 14, 2010
    #3
  4. Jonathan Lee (Guest)

    Re: About the not-in-common range of signed and unsigned char

    On Jul 13, 7:01 pm, "Francesco S. Carta" <> wrote:
    > - what assumptions (if any) can I make about the presence of those
    > out-of-common-range characters and their (correct) correspondence with
    > the codes I use to hard-code?


    signed to unsigned conversion is well-defined in [conv.integral]. If
    you're storing these numbers in (signed) chars as negatives, they'll
    predictably be changed to unsigned char. You should be okay so long
    as CHAR_BIT is appropriate.

    For example, suppose you have signed char c = -41, and are going to
    cast this to char. If char is signed, no problem. If char is unsigned
    then the result is (1 << CHAR_BIT) - 41. Suppose CHAR_BIT is 8, then the
    result is 215. If CHAR_BIT is 9, you'll get 471. The former will
    probably be the same character as -41 in whatever extended ASCII you're
    using. The latter, probably not. So I guess you could have an #if
    to watch this.

    Of course, there are different versions of extended ASCII, and even
    non-ASCII sets, so -41 isn't really guaranteed to be anything in
    particular. But you can know the result of converting to unsigned,
    whereas converting an out-of-range value back to a signed type is only
    implementation-defined. I guess that's my point.
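
    A quick sketch of that conversion, for the record (nothing special about
    -41, it's just the example value above):

        #include <climits>   // for CHAR_BIT
        #include <iostream>

        int main()
        {
            signed char c = -41;

            // [conv.integral]: converting a negative value to an unsigned
            // type adds 2^CHAR_BIT until the result is in range, i.e. this
            // yields (1 << CHAR_BIT) - 41.
            unsigned char u = static_cast<unsigned char>(c);
            std::cout << static_cast<int>(u) << '\n';  // 215 when CHAR_BIT is 8

            #if CHAR_BIT != 8
            #error "the glyph codes assume 8-bit chars"  // the #if idea above
            #endif
        }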

    > - assuming it is possible to, how can I ensure that my program displays
    > the correct "graphics" regardless of the platform / implementation it is
    > compiled onto?


    If those characters were guaranteed to be present in _some_ order,
    it might be conceivable. But they're not. How could you display a
    "filled in square" on a platform that doesn't have such a character?

    --Jonathan
     
    Jonathan Lee, Jul 14, 2010
    #4
  5. Re: About the not-in-common range of signed and unsigned char

    Jonathan Lee <>, on 13/07/2010 18:33:22, wrote:

    > On Jul 13, 7:01 pm, "Francesco S. Carta"<> wrote:
    >> - what assumptions (if any) can I make about the presence of those
    >> out-of-common-range characters and their (correct) correspondence with
    >> the codes I use to hard-code?

    >
    > signed to unsigned conversion is well-defined in [conv.integral]. If
    > you're storing these numbers in (signed) chars as negatives, they'll
    > predictably be changed to unsigned char. You should be okay so long
    > as CHAR_BIT is appropriate.
    >
    > For example, suppose you have signed char c = -41, and are going to
    > cast this to char. If char is signed, no problem. If char is unsigned
    > then the result is (1 << CHAR_BIT) - 41. Suppose CHAR_BIT is 8, then
    > the result is 215. If CHAR_BIT is 9, you'll get 471. The former will
    > probably be the same character as -41 in whatever extended ASCII
    > you're using. The latter, probably not. So I guess you could have an
    > #if to watch this.
    >
    > Of course, there are different versions of extended ASCII, and even
    > non-ASCII sets, so -41 isn't really guaranteed to be anything in
    > particular. But you can know the result of converting to unsigned,
    > whereas converting an out-of-range value back to a signed type is only
    > implementation-defined. I guess that's my point.


    I didn't consider that CHAR_BIT problem at all, thank you for pointing
    it out, Jonathan.

    I think I'd work around this by checking if the normal char is signed or
    not, and filling the appropriate table with the appropriate values - so
    that I'll avoid signed/unsigned conversions completely.
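
    Something along these lines, I suppose - here with a single entry, and
    with 0xC9 / -55 (the double-line corner of code page 437) as an assumed
    example value:

        #include <iostream>
        #include <limits>

        int main()
        {
            // Pick the literal that already matches plain char's signedness,
            // so no signed/unsigned conversion takes place at all.
            const char corner = std::numeric_limits<char>::is_signed
                                    ? static_cast<char>(-55)
                                    : static_cast<char>(0xC9);
            std::cout << corner << '\n';
        }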

    >> - assuming it is possible to, how can I ensure that my program displays
    >> the correct "graphics" regardless of the platform / implementation it is
    >> compiled onto?

    >
    > If those characters were guaranteed to be present in _some_ order,
    > it might be conceivable. But they're not. How could you display a
    > "filled in square" on a platform that doesn't have such a character?


    I think I've discovered my true point: I'm interested in a subset of:

    http://en.wikipedia.org/wiki/Code_page_437

    which, as it seems, "is still the primary font in the core of any EGA
    and VGA compatible graphic card".

    If I decide to spend some effort in making some portable program that
    uses them, I'd have to find a way to activate that code page or
    something comparable as explained in:

    http://en.wikipedia.org/wiki/Box_drawing_characters

    and resort to acceptable replacements (such as \, /, |, - and +) in case
    none of the above is available.

    In this way the program could be considered "portable" enough - at least
    for me ;-)
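
    Roughly what I have in mind, assuming some platform-specific check (not
    shown here) tells me whether the fancier glyphs are actually available:

        #include <iostream>

        // Two parallel tables: code page 437 values (single-line top-left
        // corner, horizontal bar, top-right corner) and plain ASCII
        // stand-ins.
        const char fancy[] = { static_cast<char>(0xDA),
                               static_cast<char>(0xC4),
                               static_cast<char>(0xBF) };
        const char plain[] = { '+', '-', '+' };

        int main()
        {
            // A platform-specific check (assumed, not shown) would set this.
            const bool fancy_available = false;
            const char* g = fancy_available ? fancy : plain;
            std::cout << g[0] << g[1] << g[2] << '\n';   // "+-+" as fallback
        }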

    Thanks a lot for your attention.

    --
    FSC - http://userscripts.org/scripts/show/59948
    http://fscode.altervista.org - http://sardinias.com
     
    Francesco S. Carta, Jul 14, 2010
    #5
  6. James Kanze (Guest)

    Re: About the not-in-common range of signed and unsigned char

    On Jul 14, 10:27 am, "Francesco S. Carta" <> wrote:
    > Jonathan Lee <>, on 13/07/2010 18:33:22, wrote:
    > > On Jul 13, 7:01 pm, "Francesco S. Carta"<> wrote:
    > >> - what assumptions (if any) can I make about the presence
    > >> of those out-of-common-range characters and their (correct)
    > >> correspondence with the codes I use to hard-code?


    > > signed to unsigned conversion is well-defined in [conv.integral]. If
    > > you're storing these numbers in (signed) chars as negatives, they'll
    > > predictably be changed to unsigned char. You should be okay so long
    > > as CHAR_BIT is appropriate.


    He needs a CHAR_BIT which is at least 8, which is guaranteed.

    In practice, I'd use the positive (actually defined) values, and
    not some negative mapping, even if char is signed.

    > > For example, suppose you have signed char c = -41, and are
    > > going to cast this to char. If char is signed, no problem.
    > > If char is unsigned then the result is (1 << CHAR_BIT) - 41.
    > > Suppose CHAR_BIT is 8, then the result is 215. If CHAR_BIT
    > > is 9, you'll get 471. The former will probably be the same
    > > character as -41 in whatever extended ASCII you're using. The
    > > latter, probably not. So I guess you could have an #if to
    > > watch this.


    I'd use 0xD7, rather than -41. Formally, the conversion of this
    value to char, if char is signed, is implementation-defined,
    but practically, doing anything but preserving the bit pattern
    would break so much code that it isn't going to happen.
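
    In other words, something like this (what glyph 0xD7 maps to still
    depends entirely on the code page in use):

        #include <iostream>

        int main()
        {
            // The positive, code-page-defined value; the cast preserves the
            // bit pattern on any implementation you're likely to meet, even
            // though the result is formally implementation-defined when
            // char is signed.
            char c = static_cast<char>(0xD7);
            std::cout << c << '\n';
        }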

    > > Of course, there are different versions of extended ASCII,
    > > and even non-ASCII so -41 isn't really guaranteed to be
    > > anything in particular. But you can know the result of
    > > converting to unsigned. Whereas conversion from unsigned to
    > > signed is not defined. I guess that's my point.


    Formally, of course, there's no such thing as "extended ASCII" :-).
    There are just other code sets, which happen to correspond exactly to
    ASCII for the range 0-127.

    > I didn't consider that CHAR_BIT problem at all, thank you for pointing
    > it out Jonathan.


    > I think I'd work around this by checking if the normal char is signed or
    > not, and filling the appropriate table with the appropriate values - so
    > that I'll avoid signed/unsigned conversions completely.


    > >> - assuming it is possible to, how can I ensure that my program displays
    > >> the correct "graphics" regardless of the platform / implementation it is
    > >> compiled onto?


    > > If those characters were guaranteed to be present in _some_
    > > order, it might be conceivable. But they're not. How could
    > > you display "filled in square" on a platform that doesn't
    > > have such a character?


    > I think I've discovered my true point: I'm interested in a subset of:


    > http://en.wikipedia.org/wiki/Code_page_437


    > which, as it seems, "is still the primary font in the core of any EGA
    > and VGA compatible graphic card".


    I don't think so, but I've not actually programmed anything at
    that low a level for many, many years.

    Not that it matters, since you probably can't access the graphic
    card directly.

    > If I decide to spend some effort in making some portable program that
    > uses them, I'd have to find a way to activate that code page or
    > something comparable as explained in:


    > http://en.wikipedia.org/wiki/Box_drawing_characters


    > and resort to acceptable replacements (such as \, /, |, - and +) in case
    > none of the above is available.


    Most machines don't have "code pages"; they're an MS-DOS
    invention. Most modern systems *do* support Unicode, however
    (under Windows, it's code page 65001 if you're using UTF-8
    encoding). You might have more luck with portability if you
    used Unicode characters in the range 0x2500-0x257F.
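
    For instance, a minimal sketch, assuming the terminal has already been
    put into a UTF-8 mode (code page 65001 on Windows, a UTF-8 locale
    elsewhere):

        #include <iostream>

        int main()
        {
            // U+250C, U+2500, U+2510, U+2514, U+2518: single-line corners
            // and a horizontal bar, written out as UTF-8 byte sequences.
            std::cout << "\xE2\x94\x8C\xE2\x94\x80\xE2\x94\x90\n";  // top row
            std::cout << "\xE2\x94\x94\xE2\x94\x80\xE2\x94\x98\n";  // bottom row
        }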

    > In this way the program could be considered "portable" enough - at least
    > for me ;-)


    It's only portable to Windows.

    --
    James Kanze
     
    James Kanze, Jul 14, 2010
    #6
  7. Re: About the not-in-common range of signed and unsigned char

    James Kanze <>, on 14/07/2010 07:22:01, wrote:

    > On Jul 14, 10:27 am, "Francesco S. Carta"<> wrote:


    <snip>

    >> I think I've discovered my true point: I'm interested in a subset of:
    >>
    >> http://en.wikipedia.org/wiki/Code_page_437


    <snip>

    > Most machines don't have "code pages"; they're an MS-DOS
    > invention. Most modern systems *do* support Unicode, however
    > (under Windows, it's code page 65001 if you're using UTF-8
    > encoding). You might have more luck with portability if you
    > used Unicode characters in the range 0x2500-0x257F.


    Heck, that's one of those (in)famous eggs of Columbus... thanks for the
    further details, James. I will resort to using Unicode characters, that's
    a way better bet.

    --
    FSC - http://userscripts.org/scripts/show/59948
    http://fscode.altervista.org - http://sardinias.com
     
    Francesco S. Carta, Jul 14, 2010
    #7
