Going through the code (K&R 1-19.c)

nabis

First of all, let me apologize for being so "green".
You must be pretty fed up with newbie questions, but
still, I felt this was the only place I could get a decent answer.

Not having been able to solve exercise 1-19 (page 31) from the K&R book,
I decided it was time to cheat, so I took a peek at:
http://users.powernet.co.uk/eton/kandr2/krx119.html

Still, I could not understand certain things. I will post the
code with my comments, written as I tried to figure it out. Grep for "why".



#include <stdio.h>
#define MAX_LINE 1000

void discard_nl(char s[]);
int reverse(char s[]);
int getline(char s[], int lim);

/* inserts a "end-of-string" instead of \n; why do we need \0? */
void discard_nl(char s[])
{
    int i;
    for (i = 0; s[i] != '\0'; i++)
    {
        if (s[i] == '\n')
            s[i] = '\0';
        /* replaces \n with \0 */
    }
}

/* the reverse function itself */
int reverse(char s[])
{
    char ch;
    int i, j;

    for (j = 0; s[j] != '\0'; j++) /* going through array \ */
    {                              /* indexes */
        ;
    }
    --j; /* what is it? why is the array one element less? */

    for (i = 0; i < j; i++)
    {
        ch = s[i];   /* storing each value in ch */
        s[i] = s[j]; /* first value equals last value */
        s[j] = ch;   /* last value equals first (ch) value */
        --j;         /* shrinking the number of i < j to be
                      * checked */
    }
    return 0;
}

/* getline: all chars up to \n; \n and \0 inserted afterwards, why? */
int getline(char s[], int lim)
{
    int c, i;

    for (i = 0; i < lim - 1 && (c = getchar()) != EOF && c != '\n'; ++i)
        s[i] = c;
    if (c == '\n')
        s[i++] = c; /* or: s[i] = '\n'; ++i; */
    s[i] = '\0';    /* inserting "end-of-string" \0 */
    return i;
}

int main()
{
    char line[MAX_LINE];
    while (getline(line, MAX_LINE) > 0) /* changed it to MAX_LINE \ */
    {                                   /* instead of "sizeof line" \ */
        discard_nl(line);               /* by me */
        reverse(line);
        printf("%s\n", line);
    }
    return 0;
}

Any added comments are welcome. I won't be disappointed if someone skips
the post. Thank you. -nabis
 
Emmanuel Delahaye

nabis wrote on 31/07/04 :
Still I would not understand certain things, I will place the
code, with my comments as me trying to figure it out. Grep for "why".

#include <stdio.h>
#define MAX_LINE 1000

void discard_nl(char s[]);
int reverse(char s[]);
int getline(char s[], int lim);

/* inserts a "end-of-string" instead of \n; why do we need \0? */

'\0' is the character-constant spelling of 0. You can write 0 instead if you
feel more comfortable with it.

(Note, though I'm quite sure you are aware of it: 0 is the end-of-string
marker.)
void discard_nl(char s[])
{
    int i;
    for (i = 0; s[i] != '\0'; i++)
    {
        if (s[i] == '\n')
            s[i] = '\0';
        /* replaces \n with \0 */
    }
}

/* the reverse function itself */
int reverse(char s[])
{
char ch;
int i, j;

for(j = 0; s[j] != '\0'; j++) /* going through array \ */
{ /* indexes */
;
}
--j; /* what is it? why is the array one element less? */


The best approach is to draw the string and its indexes on paper and see what
happens step by step at each turn of the loop. This kind of work is part
of the debugging process a programmer is supposed to master.

    for (i = 0; i < j; i++)
    {
        ch = s[i];   /* storing each value in ch */
        s[i] = s[j]; /* first value equals last value */
        s[j] = ch;   /* last value equals first (ch) value */
        --j;         /* shrinking the number of i < j to be
                      * checked */
    }
    return 0;
}

/* getline: all chars up to \n; \n and \0 inserted afterwards, why? */

Because of the purpose of the function: 'Get a line'. The end-of-line
marker belongs to the line (in normal conditions), and a decent
C-string must always be terminated by a 0.

Note that if there is not enough room, the '\n' is not present, hence
this condition (a truncated line) can be detected by the caller. This
behaviour is similar to that of fgets().
int getline(char s[], int lim)
{
    int c, i;

    for (i = 0; i < lim - 1 && (c = getchar()) != EOF && c != '\n'; ++i)
        s[i] = c;
    if (c == '\n')
        s[i++] = c; /* or: s[i] = '\n'; ++i; */
    s[i] = '\0';    /* inserting "end-of-string" \0 */
    return i;
}

int main()
{

char line[MAX_LINE];
while(getline(line, MAX_LINE) > 0) /* changed it to MAX_LINE \ */
{ /* instead of "sizeof line" \ */
discard_nl(line); /* by me */


Why? It's not wrong, but it is unnecessary. 'sizeof line' was the right way
(as long as 'line' is an array of char).
 
nabis

Thank you for your answers.
I tried to see what happens to a string "hello" as it passes through the
functions in the order they appear in main(). It clarified some things.
My question is, was it necessary to insert a '\n' in getline() and to remove
this same '\n' in discard_nl()? Also, do we need to carry '\0' through all
the functions; can't we insert it in the last reverse() function?

-nabis
-----------------------------------------------------------------------------
(Just my notes)
1)getline:

s[0] = 'h'; s[1] = 'e'; s[2] = 'l'; s[3] = 'l'; s[4] = 'o' (exits the loop at
s[5] == '\n')
s[5] = '\n' ; s[6] = '\0' (newline and end-of-string inserted)

In the for loop, by the time the condition
"i < lim -1 && (c = getchar()) != EOF && c != '\n'"
has been tested and found false, the last ++i has already taken place, so
we can assign '\n' to s[5]? Probably, yes:
for (i = 0; i < 10; ++i)
    printf("%d ", i);
printf("\n%d\n", i);
output:
0 1 2 3 4 5 6 7 8 9
10


2)discard_nl:

results in
s[0] = 'h'; s[1] = 'e'; s[2] = 'l'; s[3] = 'l'; s[4] = 'o';
s[5] = '\0'; s[6] = '\0';
(there are *two* '\0' in s[] now, am I wrong?)


3)reverse:

j = 4 (after the first for loop and --j, which is right, we want to
manipulate s[0] through s[4])
second for loop:
i = 0 ; ch = s[0] = 'h';
s[0] = s[4] = 'o';
s[4] = 'h';
j = 3;
i = 1; ch = s[1] = 'e';
s[1] = s[3] = 'l';
s[3] = 'e';
j = 2;
i = 2; j = 2;
i < j is false, exiting the loop (the middle 'l' in s[2] stays where it is).

our s[] = "olleh\0\0" at this moment?
01234 5 6 (indexes)
 
Shug Boabie

nabis said:
for(j = 0; s[j] != '\0'; j++)

here we (in order) increment j and then check the j'th element of s[] to see if it is NULL.
--j; /* what is it? why is the array one element less? */

ok... when the for loop exits, j is the index of the first NULL element in the array s[]. you want to decrease j to the last non-NULL value, as we will see...
for (i = 0; i < j; i++)
{
    ch = s[i];
    s[i] = s[j];


here you are assigning s[i] the value of s[j]. you want this particular s[j] to be the last non-NULL character. or else you will assign s[0]=NULL and everything thereafter will be off-by-one.
 
Barry Schwarz

First of all let me apologize for being so "green".
You must have been pretty fed up with newbish questions, but,
still I felt, that's the only place I could get a decent answer.

Haven't been able to solve the 1-19 exercise (page 31) from K&R book,
I decided, it's time to cheat, so I took peek at:
http://users.powernet.co.uk/eton/kandr2/krx119.html

Still I would not understand certain things, I will place the
code, with my comments as me trying to figure it out. Grep for "why".



#include <stdio.h>
#define MAX_LINE 1000

void discard_nl(char s[]);
int reverse(char s[]);
int getline(char s[], int lim);

/* inserts a "end-of-string" instead of \n; why do we need \0? */

Not all arrays of char are strings. What requirements must an array
of char have in order to be called a "string" in C?
void discard_nl(char s[])
{
    int i;
    for (i = 0; s[i] != '\0'; i++)
    {
        if (s[i] == '\n')
            s[i] = '\0';
        /* replaces \n with \0 */
    }
}

/* the reverse function itself */
int reverse(char s[])
{
char ch;
int i, j;

for(j = 0; s[j] != '\0'; j++) /* going through array \ */
{ /* indexes */
;
}
--j; /* what is it? why is the array one element less? */


On a piece of paper, write down a small string (including the terminal
'\0'). Manually perform each step of the function's code without the
"questionable" statement above. Repeat this exercise with the
statement. Notice the difference between this result and the first.
Why is the second correct and the first not?
    for (i = 0; i < j; i++)
    {
        ch = s[i];   /* storing each value in ch */
        s[i] = s[j]; /* first value equals last value */
        s[j] = ch;   /* last value equals first (ch) value */
        --j;         /* shrinking the number of i < j to be
                      * checked */
    }
    return 0;
}
/* getline: all chars up to \n; \n and \0 inserted afterwards, why? */


Why the '\n'? This is very similar to what the fgets function (which
is part of the standard library) will do. That function has not been
discussed yet but when it is it will not be so strange.

Why the '\0'? See first remark above.
int getline(char s[], int lim)
{
    int c, i;

    for (i = 0; i < lim - 1 && (c = getchar()) != EOF && c != '\n'; ++i)
        s[i] = c;
    if (c == '\n')
        s[i++] = c; /* or: s[i] = '\n'; ++i; */
    s[i] = '\0';    /* inserting "end-of-string" \0 */
    return i;
}

int main()
{

char line[MAX_LINE];
while(getline(line, MAX_LINE) > 0) /* changed it to MAX_LINE \ */
{ /* instead of "sizeof line" \ */
discard_nl(line); /* by me */
reverse(line);
printf("%s\n", line);
}
return 0;
}

Any added comments are welcome. I won't be disappointed if someone skips
the post. Thank you. -nabis




<<Remove the del for email>>
 
Keith Thompson

Shug Boabie said:
nabis said:
for(j = 0; s[j] != '\0'; j++)

here we (in order) increment j and then check the j'th element of
s[] to see if it is NULL.

NULL is a macro that expands to a null pointer constant; it's not a
null character.

The best way to refer to a null character is '\0'.
 
Chris Torek

My question is, was it necessary to insert a '\n' in getline() and to remove
this same '\n' in discard_nl() ...

Clearly not; but if you already have working code that does "a
little bit too much" you can re-use it by "un-doing" the extra
work.

The getline() function in K&R is preparing you for using the
Standard C fgets() function, which also keeps the '\n' that you
will often wind up discarding afterward.
also do we need to carry '\0' through all
the functions, can't we insert it in the last reverse() function?

Again, "clearly not" -- but the '\0' in a C string has a reason
for its existence, which also relates to:
s[0] = 'h'; s[1] = 'e'; s[2] = 'l'; s[3] = 'l'; s[4] = 'o';
s[5] = '\0'; s[6] = '\0';
(there are *two* '\0' in s[] now, am I wrong?)

You are correct.

C's strings are not really a *type* but rather a *data structure*:
any sequence of characters that ends with '\0' can be considered a
"string". Functions that deal with "C strings" simply look for the
first (and perhaps only, or perhaps one of many) '\0' bytes -- once
they see it they stop. A function that is documented to take "a
string" may (depending on whoever wrote it) keep going and going,
way past the end of an array of "char", looking for the '\0' that
tells it to stop, if you pass it an array that does not contain a
terminating '\0'.

Extra bytes in any array past the first '\0' get ignored, so you
can shorten a string that is stored in some "array N of char" array
by writing a '\0' before its current '\0' end-marker. This is
often convenient (as is the case in discard_nl()).

Hence, if you wanted not to carry the '\0' through all the various
functions, you would have to stop using C strings -- which only
stop when they hit that '\0'. Any code that depends on that '\0'
(such as a call to strlen(), or an inline expansion of strlen that
"manually" looks for '\0') would have to change.
 
Barry Schwarz

Thank you for your answers.
I tried to see what happens to a string "hello" as it passes through the
functions in order they appear in main(). It clarified some things.
My question is, was it necessary to insert a '\n' in getline() and to remove

Probably not but the '\n' is considered part of the line so this
preserves some measure of consistency.
this same '\n' in discard_nl(), also do we need to carry '\0' through all
the functions, can't we insert it in the last reverse() function?

How would you know where to insert it?
-nabis
-----------------------------------------------------------------------------
(Just my notes)
1)getline:

s[0] = 'h'; s[1] = 'e'; s[2] = 'l'; s[3] = 'l'; s[4] = 'o' (exits the loop at
s[5] == '\n')
s[5] = '\n' ; s[6] = '\0' (newline and end-of-string inserted)

In the for loop, by the time the condition
"i < lim -1 && (c = getchar()) != EOF && c != '\n'"
has been tested and found false, the last ++i has already taken place, so
we can assign '\n' to s[5]? Probably, yes:
for (i = 0; i < 10; ++i)
    printf("%d ", i);
printf("\n%d\n", i);
output:
0 1 2 3 4 5 6 7 8 9
10


2)discard_nl:

results in
s[0] = 'h'; s[1] = 'e'; s[2] = 'l'; s[3] = 'l'; s[4] = 'o';
s[5] = '\0'; s[6] = '\0';
(there are *two* '\0' in s[] now, am I wrong?)

No. But the string is terminated by the first one. While the second
'\0' is in the array, it is not part of the string.
3)reverse:

j = 4 (after the first for loop and --j, which is right, we want to
manipulate s[0] through s[4])
second for loop:
i = 0 ; ch = s[0] = 'h';
s[0] = s[4] = 'o';
s[4] = 'h';
j = 3;
i = 1; ch = s[1] = 'e';
s[1] = s[3] = 'l';
s[3] = 'e';
j = 2;
i = 2; j = 2;
i < j is false, exiting the loop (the middle 'l' in s[2] stays where it is).

our s[] = "olleh\0\0" at this moment?
01234 5 6 (indexes)

Yup


<<Remove the del for email>>
 
