printing % with printf(), use of \ (escape) character

teachtiro

Hi,

'C' says \ is the escape character to be used when characters are
to be interpreted in an uncommon sense, e.g. \t usage in printf(),
but for printing % through printf(), I have read [p. 154 of 'The C
Programming Language', Kernighan and Ritchie] that %% should be used.
Wouldn't it have been better (from a design perspective) if the same
escape character had been used in this case too?

Forgive me for posting without verifying things with any standard
compiler; I don't have the means for now.

tiro.
 
Lew Pitcher

That's both correct, and wrong. You've confused two different things, C
character "escape sequences" and printf() format sequences.

C recognizes that certain characters cannot be easily entered into a program as
data (like the "backspace" character, the "newline" character, or even the "tab"
character). For those characters, C recognizes a simple two-character escape
sequence consisting of a '\' followed by a second character from a limited set.
Since the '\' character is part of this escape sequence, there is a special
escape sequence just to reproduce the '\' character.

So, you get '\t' for a tab character, '\n' for a newline character, '\\' for the
backslash character, and so on, for about 10 or so characters. These values can
be used wherever a character data item is entered; for instance
char TabChar = '\t'; /* for a single tab character */
char MSDOS_Path[] = "C:\\MSDOS\\"; /* for the backslash character */

You can even printf() with these characters...
printf("\n\n\tThis line is two down, and one tabstop in \n");

Now, printf() is a function that, as its first argument, takes a string that
tells it what format the rest of the arguments are. This string also provides
some constants that separate the values represented by the other arguments when
printf() reproduces its data to stdout.

The printf() function uses the '%' character in the format string to delimit
where it is to insert the next argument, and to provide a format and data type
for that argument. Of course, this means that, _in the format string_, if you
want a percent character to be reproduced verbatim, you'll have to escape it
according to the rules that printf() uses; "%%" _as a format string_ causes
printf() to reproduce a single percent sign instead of looking for the next
argument.

But this isn't the only way to get printf() to reproduce the percent sign. You
can give printf() a character or string argument consisting of an _unescaped_
percent sign as any of the arguments _other than the format string_.

So, here's an example...

printf("Three percents:\n1 %%\n2 %c\n3 %s\n",'%',"%");
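A minimal sketch of the same example that writes into a buffer, so the result can be compared against the expected text. It assumes a hosted C99 implementation (for snprintf()); the helper name three_percents is hypothetical and not part of the original post.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: formats the "three percents" example into buf
   instead of stdout. Returns snprintf's result, i.e. the number of
   characters the full output needs, not counting the '\0'. */
int three_percents(char *buf, size_t size)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size,
                    "Three percents:\n1 %%\n2 %c\n3 %s\n",
                    '%',    /* consumed by %c; not a format, so no escaping */
                    "%");   /* consumed by %s; likewise copied as data */
}
```

With a 64-byte buffer, all three numbered lines should end in a single percent sign: the first comes from "%%" in the format, the other two from ordinary arguments.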





--
Lew Pitcher
IT Specialist, Enterprise Data Systems,
Enterprise Technology Solutions, TD Bank Financial Group

(Opinions expressed are my own, not my employers')
 
Eric Sosman

The two conventions "escape" from two different
processing mechanisms operating at two different levels.

The \ escape tells the C compiler to do something
unusual, like generating a single tab character when it
sees a \ followed by a t. The compiler does what it does,
and when all's done your program has a string containing
a tab character; both the \ and the t are gone. They have
gone, in a sense, to the same place the " quotes went: they
were source-code constructs that produced some effect in
the finished program and then vanished, leaving only the
program itself behind.
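A small sketch of that "vanishing", assuming nothing beyond the standard library; the function names are hypothetical and chosen only for illustration. It inspects what the compiler actually stored for a literal containing \t.

```c
#include <assert.h>
#include <string.h>

/* Number of characters the compiler stored for the literal "a\tb". */
size_t tab_literal_length(void)
{
    const char s[] = "a\tb";
    assert(sizeof s == 4);    /* 'a', tab, 'b', plus the terminating '\0' */
    assert(s[1] == '\t');     /* one tab character, not '\' and 't' */
    return strlen(s);         /* 3 */
}

/* Returns 1 if no backslash character survives in the compiled "\t". */
int backslash_is_gone(void)
{
    return strchr("\t", '\\') == NULL;
}

/* The source sequence \\ is the way to get one real backslash. */
size_t double_backslash_length(void)
{
    return strlen("\\");      /* 1 */
}
```

By the time any of these functions runs, the escape sequences have already been replaced; the running program only ever sees the resulting characters.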

So now you've got a program with some strings in it,
and the strings don't contain any backslash characters
(unless you created them with the source sequence \\, of
course). Now you hand one of these strings to printf() as
a format, and printf() starts copying the format string's
characters to the output. (Note that there's nothing special
about the tab character; printf() just copies it the same
way it copies H e l l o.) However, you need a way to tell
printf() to stop copying and do something else, like output
the characters that represent an integer value. So printf()
is sensitive to the % character, which means "stop copying
and do something different, as specified by the next few
characters (which also aren't copied)."

Of course, this means you can't output a % character by
just copying it from the format; as soon as printf() sees %
it switches into "do something special" mode. So to make it
possible to output a percent sign, printf() recognizes the
sequence %% as "do something special, to wit, output a %
character."
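As a hedged illustration of the contrast with \ escapes: the compiler stores both percent signs of "%%" untouched, and only printf() (here snprintf(), assuming C99) collapses them at run time. The names progress_format, format_progress and progress_format_length are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* The format exactly as the compiler stores it: both percent signs of
   "%%" are still there, because % means nothing to the compiler. */
static const char progress_format[] = "%d%% done\n";

/* Hypothetical helper: formats e.g. 42 as "42% done\n" into buf.
   Returns snprintf's result (characters needed, excluding '\0'). */
int format_progress(char *buf, size_t size, int value)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, progress_format, value);
}

/* Length of the stored format: % d % % space d o n e newline = 10. */
size_t progress_format_length(void)
{
    return strlen(progress_format);
}
```

The stored format is 10 characters long, while the output for the value 42 is 9 characters: the two-character %d became "42", and the two-character %% became a single %.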

printf() could have been designed to use something other
than the % character as its "special stuff" marker; in fact,
it could perfectly well have used the backslash character.
But remember that it takes extra effort to get the compiler
to generate a backslash character in a string: backslash means
"special stuff" to the compiler, so you need to write two of
them to generate one as "payload." Your printf() formats
would wind up looking like

printf ("Answer = \\d\n", 42);

That might not look too awful, but consider: this different
printf() would also need a way to output an actual backslash
character, just as the real printf() needs a special mechanism
to output a percent sign. Well, how about doubling them up?

printf ("Answer between backslashes = \\\\\\d\\\\\n", 42);

Ugh, and double ugh. (Sadly, many "regular expression" packages
run into exactly this problem: they use \ as their own "special"
indicator, and they double it to mean "just an ordinary backslash
after all," and then it gets doubled again to push it through the
C compiler ... Trying to read regular expression strings embedded
in C source code can make your eyes go twisty.)
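To make the hypothetical concrete, here is a toy formatter that uses an actual backslash character as its marker. It is only a sketch under stated assumptions: bs_format is an invented name, not a real library function, and it supports just \d (insert one int as decimal) and \\ (one literal backslash).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical toy formatter, NOT a real library function: it treats a
   backslash character in fmt the way printf() treats '%'.
   Supported: backslash + d = insert value as decimal,
              backslash + backslash = one literal backslash.
   Writes at most size-1 characters plus '\0' into out; returns out. */
char *bs_format(char *out, size_t size, const char *fmt, int value)
{
    size_t used = 0;

    assert(out != NULL && fmt != NULL && size > 0);
    while (*fmt != '\0' && used + 1 < size) {
        if (*fmt != '\\') {               /* ordinary character: copy it */
            out[used++] = *fmt++;
        } else if (fmt[1] == 'd') {       /* marker + d: convert the int */
            size_t room = size - used;
            int n = snprintf(out + used, room, "%d", value);
            if (n < 0)
                break;
            used += ((size_t)n < room) ? (size_t)n : room - 1;
            fmt += 2;
        } else if (fmt[1] == '\\') {      /* doubled marker: one backslash */
            out[used++] = '\\';
            fmt += 2;
        } else {                          /* unknown or trailing marker */
            break;
        }
    }
    out[used] = '\0';
    return out;
}
```

Calling it from C source shows the doubling problem directly: every backslash the formatter is meant to see has to be written twice in the literal, so the "between backslashes" format from the post needs six backslashes before the d and four after it.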

Summary: Different escape mechanisms to talk to different
entities that operate at different times.
 
teachtiro

Hi Eric,
Thanks for the response,
i am sorry for a late reply,
i am a newbie in the C world.
--------------------------------
Note that there's nothing special
about the tab character; printf() just copies it the same
way it copies H e l l o.) However, you need a way to tell
printf() to stop copying and do something else, like output
the characters that represent an integer value. So printf()
is sensitive to the % character, which means "stop copying
and do something different, as specified by the next few
characters (which also aren't copied)."
-----------------------------------------------
Can't that be implemented by using a different character with \? For
example, printf could have been told to treat \ as a special character
and to take \d as the sign for representing the corresponding variable
as a decimal, i.e. instead of writing \\d for the purpose, we could use
\d or some \k (if d is already in use). The fact that printf() and the
C compiler work at different levels should make it more feasible.


tiro.


 
Eric Sosman

printf() could have been defined to use \ instead of %.
Any character other than '\0' could have been used: & or #
or Q or ' ' or : or anything you can imagine.

However, if \ had been chosen there would have been a
problem. Let's suppose \ were the chosen character; here
are some output formats you might try to use:

printf ("\d\n", 1);
printf ("\x\n", 2u);
printf ("\07d\n", 3);
printf ("\f\n", 4.0);

None of these would work as you want them to. The first
would cause the compiler to complain about an unknown escape
sequence; '\d' is not a defined character escape. The second
would also provoke a complaint, because when the compiler sees
\x it expects to find some hexadecimal digits next, as in
"\x1b" -- the \x without its hex digits is an undefined escape
sequence.

The final two would compile, but the results would disappoint
you. The third attempts to print a decimal number in a seven-digit
field with leading zeroes, but what it actually does is print the
three characters '\07', 'd', and '\n' without converting any
numbers at all -- on an ASCII system, this rings the bell, prints
a d, and starts a new line. The fourth is trying to convert a
floating-point number, but what it actually prints is a form feed
followed by a new line; nothing gets converted.

All these problems are because \ in a string literal changes
the meaning of the next few characters. To prevent that from
happening and to get an actual \ into the string so printf()
could find it, you would need to write

printf ("\\d\n", 1);
printf ("\\x\n", 2u);
printf ("\\07d\n", 3);
printf ("\\f\n", 4.0);

Here's the fact you seem to be overlooking: printf() doesn't
see the format string until after the compiler has processed it.
The compiler applies the same rules to the printf() format string
as it does to any other string literal: it processes \ escapes,
it splices adjacent literals together, and so on, doing exactly
the same thing for printf()'s format as for every other string
literal in the source code. printf() is not a special case.
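A short sketch that checks what the compiler stores for the literals that do compile, before printf() or any other function gets to look at them. The function names are hypothetical; only strlen() and strcmp() from the standard library are used.

```c
#include <assert.h>
#include <string.h>

/* "\07d\n": the octal escape \07 is ONE character (value 7, BEL on
   ASCII systems), followed by 'd' and a newline. */
size_t octal_escape_length(void)
{
    const char s[] = "\07d\n";
    assert(s[0] == 7 && s[1] == 'd' && s[2] == '\n');
    return strlen(s);              /* 3 */
}

/* "\f\n": a form feed and a newline, nothing for printf() to convert. */
size_t form_feed_length(void)
{
    return strlen("\f\n");         /* 2 */
}

/* "\\07d\n": backslash, '0', '7', 'd', newline. Only this version
   delivers a real backslash to whoever receives the string. */
size_t escaped_backslash_length(void)
{
    return strlen("\\07d\n");      /* 5 */
}

/* Adjacent literals are spliced by the compiler; the receiving
   function cannot tell the difference. Returns 1 if they match. */
int splicing_is_invisible(void)
{
    return strcmp("%d" "\n", "%d\n") == 0;
}
```

None of these checks involves printf() at all, which is the point: the escapes are resolved for every string literal, wherever it ends up being used.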
 
Arthur J. O'Dwyer

Hi Eric,
Thanks for the response,
i am sorry for a late reply,
i am a newbie in the C world.

You're also a newbie in the Usenet world. For future reference:
(1) Trim away any part of the last message that you're not responding
to. (2) You should attribute quotes properly, like this:

Eric said:
Note that there's nothing special
about the tab character; printf() just copies it the same
way it copies H e l l o.) However, you need a way to tell
printf() to stop copying and do something else, like output
the characters that represent an integer value. So printf()
is sensitive to the % character, which means "stop copying
and do something different, as specified by the next few
characters (which also aren't copied)."

[C]an't that be implemented by using a different character with \, for
e.g., they could have said to printf to consider \ as a special
character and said, take \d as a sign of representing corresponding
variable as a decimal

No. What is "\d"? It doesn't exist. There is no escape code \d;
that's just a syntax error. (Try it!) The escape codes in C are
\n, \t, \a, \b, \v, \\, \", \', and probably a few more I didn't mention.
So the C code

char *a = "\d";

contains a syntax error. For the same reason, the C code

printf("\d");

contains the same syntax error. Now, we could write

printf("\\d");

That would print a backslash (keyed by the escape code \\) followed
by the letter d.

The printf function happens to associate special actions with the
percent sign, %. So for example we can write

char *a = "%s\n"; /* percent sign, s, escape-code-for-newline */
printf(a, a);

That tells the compiler to pass the string 'a' (a string of length 3
containing the characters '%', 's', and '\n', with a null byte at the
end of the string) to 'printf' as both the first and the second argument.
'printf' sees the percent sign followed by an 's', and prints the value
of the second argument:

%s (newline)

Then it sees the newline and echoes it to the screen.

(newline)

So the output is "%s" followed by two newlines.
Read that through slowly and work it out on paper. Does it make
more sense now?
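For anyone who would rather check than work it out on paper, here is a minimal sketch of the same call made through snprintf() (C99 assumed) so the output lands in a buffer. The helper name format_with_itself is hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: the same call as printf(a, a), but writing into
   buf. Returns snprintf's result (characters needed, excluding '\0'). */
int format_with_itself(char *buf, size_t size)
{
    const char *a = "%s\n";    /* '%', 's', newline: length 3 */

    assert(buf != NULL && size > 0);
    assert(strlen(a) == 3);
    /* First a is the format; second a is the data consumed by %s. */
    return snprintf(buf, size, a, a);
}
```

The buffer ends up holding four characters: '%', 's', the newline that was copied in as data, and the newline that the format itself supplies.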

HTH,
-Arthur
 
teachtiro

Hi,
i got it through the below explanation

Eric Sosman wrote:
"Here's the fact you seem to be overlooking: printf() doesn't
see the format string until after the compiler has processed it.
The compiler applies the same rules to the printf() format string
as it does to any other string literal: it processes \ escapes,
it splices adjacent literals together, and so on, doing exactly
the same thing for printf()'s format as for every other string
literal in the source code. printf() is not a special case."


thank u
 
teachtiro

i got it through the below explanation

Eric:
"Here's the fact you seem to be overlooking: printf() doesn't
see the format string until after the compiler has processed it. "

thank u
 
