Function returning a char


candide

Like the functions declared in <ctype.h>, many standard functions return an int
value. However, no standard function returns a char, although that would be
technically possible. I suppose the explanation is that every char returned by
such a function is converted to int before being returned. Is this explanation
correct?
 

Barry Schwarz

No. You can define your own functions to return char with no problem.

The classification functions in ctype.h (the ones named is...) are
boolean in nature. The return value indicates whether the premise is
true. For example, isdigit tells you whether the argument passed is
one of the ten decimal digits. Since the operators that indicate the
truth or falsity of a premise (the relational operators, the equality
operators, the logical AND and OR operators) evaluate to an int with
the value 1 or 0, the language designers decided to have the ctype.h
functions return the same type, with zero meaning false and any
nonzero value meaning true. This eliminates the need for any
conversions in long, complex boolean expressions. (The same
convention was followed for other boolean functions such as feof
and ferror, and a similar one for functions like memcmp and
strcmp.)
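A minimal sketch of the two points above, assuming nothing beyond the standard library. The helper names first_digit and count_digits are hypothetical and do not come from the thread. The first shows that a user-defined function may return char; the second uses isdigit in a boolean context, testing for nonzero rather than comparing with 1.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: returns the first decimal digit found in s,
   or '\0' if there is none. A char return type is perfectly legal. */
char first_digit(const char *s)
{
    for (; *s != '\0'; s++) {
        /* isdigit expects an unsigned char value (or EOF), hence the cast. */
        if (isdigit((unsigned char)*s))
            return *s;
    }
    return '\0';
}

/* Hypothetical helper: counts the decimal digits in s. */
size_t count_digits(const char *s)
{
    size_t n = 0;

    for (; *s != '\0'; s++) {
        /* The result is an int used as a truth value: test it,
           do not compare it with 1. */
        if (isdigit((unsigned char)*s))
            n++;
    }
    return n;
}
```

For instance, first_digit("ab7c9") yields '7' and count_digits("a1b22") yields 3.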

The mapping functions (the ones named to...) must handle both values
that represent characters in the execution set and the special value
defined as EOF. They must be capable of returning any of these values.
Since EOF need not be representable as a char, these functions must
return an integer type of higher rank. I guess they could have used
short, but they chose int (possibly because it is intended as the
"natural" type for the hardware).
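A small sketch of why the wider type matters, assuming the default "C" locale. The helper names lower_or_eof and copy_lower are hypothetical. The variable that receives the result of fgetc is an int so that EOF stays distinct from every character value, and tolower passes EOF through unchanged.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: lowercases a value as returned by getchar or
   fgetc. EOF is not an uppercase letter, so tolower returns it as is. */
int lower_or_eof(int c)
{
    return tolower(c);
}

/* Hypothetical helper: copies in to out in lower case and returns the
   number of characters copied. */
long copy_lower(FILE *in, FILE *out)
{
    int c;      /* int, not char: must hold every character value and EOF */
    long n = 0;

    while ((c = lower_or_eof(fgetc(in))) != EOF) {
        if (fputc(c, out) == EOF)
            break;      /* write error: stop copying */
        n++;
    }
    return n;
}
```

A typical call would be copy_lower(stdin, stdout). Had c been declared as char, a valid input byte could compare equal to EOF, or EOF could never be detected, depending on whether char is signed.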
 

candide

Barry Schwarz wrote:
The classification functions in ctype.h (the ones named is...) are
boolean in nature.


OK, I was referring to int functions such as tolower() from the standard header
<ctype.h> or putchar() or getchar() from the standard header <stdio.h>. For
instance, quoting the standard, the latter returns "the next _character_ from
the input stream (...)" and the former "writes the _character_ (...)"
(emphasis mine).
The mapping functions (the ones named to...) must handle both values
that represent characters in the execution set and the special value
defined as EOF. They must be capable of returning any of these values.
Since EOF need not be representable as a char, these functions must
return an integer of higher rank.

OK, quite interesting, thanks. But, again, I was told by a skilled programmer
with over 20 years of experience in C programming that a value of type char is
first converted to int before being returned. So that's not true?
 

Eric Sosman

candide said:
OK, quite interesting, thanks. But, again, I was told by a skilled programmer
with over 20 years of experience in C programming that a value of type char is
first converted to int before being returned. So that's not true?

Sort of true, but mostly just garbled.

In any expression, a char operand is subject to the "integer
promotions," which convert char values to int values (or more
rarely to unsigned int values). So even in

    char foo() {
        char ch = 'X';
        return ch;
    }

... the expression `ch' in the return statement is promoted to
(unsigned?) int. It is immediately demoted to char again to become
the value of the function, but since the function call is itself
part of an expression, its value is promoted once again.

This sounds like a lot of running around, but the various
promotions and demotions turn out to be no-ops on most machines,
or at worst a "widening load" or "narrowing store" between
registers and memory.
 

candide

Eric Sosman wrote:
In any expression, a char operand is subject to the "integer
promotions," which convert char values to int values (or more
rarely to unsigned int values).

In "_any_ expression", did you say? I don't think so. Did you notice the
footnote in the Standard that explains the circumstances in which the integer
promotions take place:

"(48) The integer promotions are applied only: as part of the usual arithmetic
conversions, to certain argument expressions, to the operands of the unary +, -,
and ~ operators, and to both operands of the shift operators, as specified by
their respective subclauses."

So even in

    char foo() {
        char ch = 'X';
        return ch;
    }

... the expression `ch' in the return statement is promoted to
(unsigned?) int. It is immediately demoted to char again to become
the value of the function, but since the function call is itself
part of an expression, its value is promoted once again.

cf. the footnote quoted above.
 

Barry Schwarz

candide wrote:
OK, I was referring to int functions such as tolower() from the standard header
<ctype.h> or putchar() or getchar() from the standard header <stdio.h>.

tolower must return the original value of the argument if there is no
corresponding lowercase value: 'A' -> 'a' but '1' -> '1'. Since one
of the allowed argument values is EOF, which may not be representable
in a char, both the argument and the return value must have higher
rank.

For instance, quoting the standard, the latter returns "the next _character_ from
the input stream (...)" and the former "writes the _character_ (...)"
(emphasis mine).

getchar must return the character obtained _or EOF under the
appropriate conditions_ (emphasis mine). Since EOF may not be
representable as a char, the return value must have higher rank. I
expect putchar takes an int argument mostly because 'A' is an int, not
a char, and the statement
    putchar('A');
should not involve a conversion.

OK, quite interesting, thanks. But, again, I was told by a skilled programmer
with over 20 years of experience in C programming that a value of type char is
first converted to int before being returned. So that's not true?

It is converted (if necessary) to whatever type the function will
return. Ask him what happens if the function returns a short.

If your compiler generates code to perform an unnecessary conversion,
that is a quality-of-implementation issue. If foo is defined as
returning a char, the only requirement for
    char x = foo();
is that x end up with the correct value of the char that foo returned.
How it got there is something the implementer decides.
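To make the "converted to whatever type the function will return" rule concrete, here is a hedged sketch. The helper names low_byte and widen are hypothetical. In each function the returned expression is converted to the declared return type, as if by assignment, and int plays no special role.

```c
#include <limits.h>

/* Hypothetical helper: v is converted from unsigned int to
   unsigned char, the declared return type. Conversion to an unsigned
   type is well defined: the value is reduced modulo UCHAR_MAX + 1. */
unsigned char low_byte(unsigned int v)
{
    return v;
}

/* Hypothetical helper returning short: c is converted from char to
   short, the declared return type, not to int. */
short widen(char c)
{
    return c;
}
```

On an implementation with 8-bit bytes, low_byte(0x1234u) yields 0x34. The type of the call expression widen('A') is short, which sizeof can confirm.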
 

Tim Rentsch

Eric Sosman said:
Sort of true, but mostly just garbled.

In any expression, a char operand is subject to the "integer
promotions," which convert char values to int values (or more
rarely to unsigned int values). So even in

    char foo() {
        char ch = 'X';
        return ch;
    }

... the expression `ch' in the return statement is promoted to
(unsigned?) int. It is immediately demoted to char again to become
the value of the function, but since the function call is itself
part of an expression, its value is promoted once again.

Integer promotions happen in lots of places, but this return
statement isn't one of them.

6.3.1.1 p2(48):

48) The integer promotions are applied only: as part of the usual
arithmetic conversions, to certain argument expressions, to the
operands of the unary +, -, and ~ operators, and to both operands
of the shift operators, as specified by their respective
subclauses.

(Actually, integer promotions also apply to switch() control
expressions, and possibly to the control expressions of if, while,
and for statements. The "certain argument expressions" are
arguments to a function that doesn't have a prototype, and
variadic arguments to functions that do.)

The usual arithmetic conversions apply (for operands that have
arithmetic type) for these operators:

* % /
+ - (binary)
< <= > >=
== !=
&
^
|
?: (2nd & 3rd operands)

The integer promotions are /not/ applied:

to the right-hand side of an assignment
to arguments that correspond to parameters specified by a prototype
to the expression in a return statement
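The lists above can be checked with sizeof, which reports the size of an expression's type without evaluating it. This is only a sketch: the size_of_* helper names are hypothetical, and the check is informative only on implementations where sizeof(int) is greater than 1, which is the common case.

```c
#include <stddef.h>

/* The function under discussion, written with a prototype. */
char foo(void)
{
    char ch = 'X';
    return ch;      /* converted to the return type; no integer promotion */
}

/* The call expression itself has type char. */
size_t size_of_call(void)
{
    return sizeof(foo());
}

/* The operand of unary + undergoes the integer promotions. */
size_t size_of_unary_plus(void)
{
    return sizeof(+foo());
}

/* Binary + applies the usual arithmetic conversions, which begin
   with the integer promotions. */
size_t size_of_sum(void)
{
    char a = 1, b = 2;
    return sizeof(a + b);
}

/* Both operands of a shift are promoted. */
size_t size_of_shift(void)
{
    char a = 1;
    return sizeof(a << 1);
}
```

size_of_call() yields sizeof(char), which is 1 by definition, while the other three yield sizeof(int).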
 

WANG Cong

Barry Schwarz wrote:

The mapping functions (the ones named to...) must handle both values
that represent characters in the execution set and the special value
defined as EOF. They must be capable of returning any of these values.
Since EOF need not be representable as a char, these functions must
return an integer of higher rank. I guess they could have used short
but they chose int (possibly because it is intended as the "natural"
type for the hardware).

Hmm, that may be a good reason, but it only explains why these
functions should use a type wider than char. Why not short, then?

So, _I think_ the main reason is efficiency, since int is better
suited to registers on most architectures.

But maybe there are also historical reasons: _I heard_ that in the
beginning C only had the int type (which is why 'unsigned' means
'unsigned int', not 'unsigned char' or something else).

Regards.
 

Nate Eldredge

WANG Cong said:
Hmmm, this can be a good reason, but, it is only a reason that these
functions should use a type wider than char, then why not short?

So, _I think_ the main reason is for efficiency, since int is more
suitable for registers on most arch.

That seems very likely.

But maybe there are also some historical reasons, since _I heard_
in the beginning, C only had int type (this is why 'unsigned' equals
to 'unsigned int', not 'unsigned char' or something else).

I don't know about that; I would guess it probably had char. But C
definitely has a flavor of being intended for assembly language
programmers, who would usually use the regular machine word for a
variable if possible. In C, int takes the place of the machine word, so
it makes sense that programmers and library authors would use it as the
"default" type. The language also reflects this by having int
implicitly assumed in several contexts.
 

Eric Sosman

candide said:
In "_any_ expression", did you say? I don't think so. Did you notice the
footnote in the Standard that explains the circumstances in which the integer
promotions take place:

"(48) The integer promotions are applied only: as part of the usual arithmetic
conversions, to certain argument expressions, to the operands of the unary +, -,
and ~ operators, and to both operands of the shift operators, as specified by
their respective subclauses."

Well, obviously I didn't notice that passage. Thank you
for pointing it out -- and I retract most of what I wrote (the
taste of foot in mouth is not very appealing, but I'm starting
to get accustomed to it ...)
 

Barry Schwarz

On Sun, 15 Feb 2009 15:59:59 +0800, WANG Cong

snip
But maybe there are also some historical reasons, since _I heard_
in the beginning, C only had int type (this is why 'unsigned' equals
to 'unsigned int', not 'unsigned char' or something else).

I can't say for certain but I'm almost positive that C started with a
char type in addition to int so it could handle strings.

In any event, early versions of C (up to at least C89) had implied int
for declarations that didn't specify a type. So
    int getchar(void);
was exactly the same declaration as
    getchar(void);

From there, it seems a very small leap to conclude that
    unsigned int x;
is the same as
    unsigned x;

I think this is the more likely explanation.

Even though implied int has been eliminated, the C99 standard still
allows it to be omitted from signed and unsigned declarations and from
short, long, and long long as well as the signed and unsigned modified
versions of these declarations. I have no doubt that this was done to
preserve backward compatibility.
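The equivalence of the short and long spellings can be checked by the compiler itself. In the sketch below (the variable names are hypothetical), each object is declared once with the full type name and once with int omitted. The file compiles only because both spellings name the same type.

```c
/* unsigned is another spelling of unsigned int. */
extern unsigned int counter;
unsigned counter = 0;

/* long is another spelling of long int. */
extern long int total;
long total = 0;

/* unsigned short is another spelling of unsigned short int. */
extern unsigned short int flags;
unsigned short flags = 0;

/* signed is another spelling of signed int, that is, int. */
extern signed int delta;
signed delta = 0;
```

Changing any one of the definitions to a genuinely different type, for example unsigned char counter, should make the compiler report conflicting types.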
 

Keith Thompson

Barry Schwarz said:
On Sun, 15 Feb 2009 15:59:59 +0800, WANG Cong

snip


I can't say for certain but I'm almost positive that C started with a
char type in addition to int so it could handle strings.

Yes, as well as float and probably double.

In any event, early versions of C (up to at least C89) had implied int
for declarations that didn't specify. So
int getchar(void);
was exactly the same declaration as
getchar(void);

From there, it seems a very small leap to conclude that
unsigned int x;
is the same as
unsigned x;

I think this is the more likely explanation.
[...]

C's ancestor languages, BCPL and B, had no real types other than a
single machine word (16 bits on the PDP-11). Here's a sample function
from Ken Thompson's "Users' Reference to B", via Wikipedia:

/* The following function will print a non-negative number, n, to
the base b, where 2<=b<=10, This routine uses the fact that
in the ASCII character set, the digits 0 to 9 have sequential
code values. */

printn(n,b) {
    extrn putchar;
    auto a;

    if(a=n/b) /* assignment, not test for equality */
        printn(a, b); /* recursive */
    putchar(n%b + '0');
}

Note that the declaration of putchar doesn't even specify that it's a
function.

Early C retained the ability to declare parameters and variables
without specifying their types; since C's int corresponds to B's
word, the implicit int rule made sense.

Pre-ANSI C added the ability to specify types other than int, and the
requirement to distinguish between variables and functions. ANSI C,
of course, added function prototypes, but didn't require them, and
implicit int wasn't dropped until C99.

I can get the above to compile as C either by changing
    extrn putchar;
to
    extern putchar();
or just by deleting that line.
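For comparison, here is one way the same routine might look in modern C, where implicit int is no longer available and every type is spelled out. This is a sketch, not part of the original post: the unsigned parameter types and the return value (the number of digits written) are assumptions added for illustration.

```c
#include <stdio.h>

/* Prints the non-negative number n in base b, where 2 <= b <= 10.
   Returns the number of digits written; the return value is an
   addition for this sketch and is not in the B original. */
int printn(unsigned n, unsigned b)
{
    int written = 0;
    unsigned a = n / b;

    if (a != 0)
        written = printn(a, b);     /* recursive: higher-order digits first */
    putchar((int)(n % b) + '0');    /* digits '0' to '9' are consecutive in C */
    return written + 1;
}
```

Calling printn(1234u, 10u) writes 1234 to standard output and returns 4.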
 

WANG Cong

Thanks for the information; it is really useful to me!
 
