Giving the histogram a shot...

ext_u

Ok, I thought I would try to take the program one thing at a time. (If
you remember my last post, I am trying to make a histogram with data on
the size of each word.)
Anyway, first I obviously need to determine what a word actually is.
I wrote this program on my own without looking at the book or any
other resource once.

#include <stdio.h>
main()
{
    int c;
    int nword, nother;

    nword = nother = 0;

    while ((c = getchar()) != EOF)
    {
        if (c == ' ' ¦¦ '\t' ¦¦ '\n')
            ++nother;
        else
            ++nword;
    }
    printf("Words = %d\nOther = %d", nword, nother);
}

I am basically just trying to tell the computer that anything that is
not a blank space, a tab, or a newline is a word...
BUT every time I run the program it adds only to the nother variable.
I can't figure out why, no matter what I type. I thought I had
written this program well and I even sketched it out on paper
beforehand, hehe.

I know you guys will probably find a horribly noobish mistake, but
please remember I started learning C all of like 48 hours ago.
 
Mike Wahler

ext_u said:
Ok I thought I would try to take the program one thing at a time. (If
you remember my last post I am trying to make a histogram with data on
the size of each word)
Anways first .. I obviously need to determine what a word actually is.
I wrote this program on my own without looking at the book or any
other resource once.

#include <stdio.h>
main()

int main()

Encouragement:

Very good. Many novices make the mistake of
defining a 'char' for getchar() to store data
in. It must be 'int' as you have it, so it
can store EOF which is not guaranteed to fit
in a char.

int nword, nother;

nword = nother = 0;

Informative:

Rather than defining and then assigning after the fact,
you can give your variable initial values at definition time:

int nword = 0;
int nother = 0;

I recommend defining only one variable per line.
The reasons for this will become evident as you progress
(essentially makes the code easier to read and maintain,
and prevents possibly 'silly' mistakes, especially when
you start to work with pointers).

while ((c = getchar()) != EOF)
{
if (c == ' ' ¦¦ '\t' ¦¦ '\n')
++nother;
else
++nword;
}
printf("Words = %d\nOther = %d", nword, nother);
}

I am basically just trying to tell the computer anything that is not a
blank space, a tab , or a newline is a word...

More encouragement:

Well, I'm sure you realize that's not the ultimate goal,
but I'm glad to see you simplify things so you can get
*something* working.
BUT everytime I run the program it adds to only the nother variable.
I can't figure out why, no matter what I type. I thought I had
written this program well and I even sketched it out on paper
beforehand , hehe.

That is a very good idea, although in this case, it doesn't
help. :-( Your problem is a misunderstanding of operator
syntax.
I know you guys will probably find a horribly noobish mistake , but
please remember I started learning C all of like 48 hours ago.

Yes, you did make a (very common) novice mistake. Don't feel bad,
many others have done this. The problem is with your statement:

if (c == ' ' ¦¦ '\t' ¦¦ '\n')
++nother;

The comparison operator (==) takes exactly two operands.
You supplied 'c' as the 'left-hand' operand, and the
expression, ( ' ' || '\t' || '\n' ) as the 'right-hand'
operand. The 'logical or' operator (||) returns true
if either of its operands yields a nonzero (true) value.
None of ' ', '\t', or '\n' have a value of zero, so 'or-ing'
any or all of them together will always yield nonzero (true).

To express 'if c is equal to any of ' ', '\t', or '\n', you
need to express three distinct comparisons, 'or-d' together.
Write:

if (c == ' ' || c == '\t' || c == '\n')
++nother;
else
++nword;

You made a very good try. Great work.

-Mike
 
Mike Wahler

After looking at Kevin's reply, and seeing how mine
appears, it seems you might be using the wrong characters
to express logical 'or'. On a U.S. PC keyboard, it's the
shifted 'backslash' key. If you have some other keyboard,
I don't know which it is. Better check that out.

[snip]
Yes, you did make a (very common) novice mistake. Don't feel bad,
many others have done this. The problem is with your statement:

if (c == ' ' ¦¦ '\t' ¦¦ '\n')
++nother;

I copy/pasted the above from your post.
The comparison operator (==) takes exactly two operands.
You supplied 'c' as the 'left-hand' operand, and the
expression, ( ' ' || '\t' || '\n' ) as the 'right-hand'
operand. The 'logical or' operator (||) returns true
if either of its operands yields a nonzero (true) value.
None of ' ', '\t', or '\n' have a value of zero, so 'or-ing'
any or all of them together will always yield nonzero (true).

To express 'if c is equal to any of ' ', '\t', or '\n', you
need to express three distinct comparisons, 'or-d' together.
Write:

if (c == ' ' || c == '\t' || c == '\n')
++nother;
else
++nword;

Note how the 'or' operator appears different here.

-Mike
 
CBFalconer

ext_u said:
Ok I thought I would try to take the program one thing at a time. (If
you remember my last post I am trying to make a histogram with data on
the size of each word)

Don't keep starting new threads about the same thing. By posting
a reply in the original thread, and snipping what isn't germane,
you don't have to remind people about it. And it also makes it
easier for them to look back, if needed.
Anways first .. I obviously need to determine what a word actually is.
I wrote this program on my own without looking at the book or any
other resource once.
Good.


#include <stdio.h>
main()

get in the habit of writing "int main(void)" or
"int main(int argc; char *argv)"
{
int c;
int nword, nother;

nword = nother = 0;

while ((c = getchar()) != EOF)
{
if (c == ' ' ¦¦ '\t' ¦¦ '\n')

This should be the logical or of three logical statements. As it
is it won't do what you want. Try:

if ((c == ' ') || (c == '\t') || (c == '\n'))

think about it, and you will see the difference. Some may say the
parentheses are redundant, but it makes the statement perfectly
clear.

BTW, better to use an indentation of 3 or 4 spaces, 8 is too
much. So don't use tabs (if you are using them).
++nother;
else
++nword;
}
printf("Words = %d\nOther = %d", nword, nother);
}

I am basically just trying to tell the computer anything that is not a
blank space, a tab , or a newline is a word...

You are saying that every char is a word, unless it is ...
BUT everytime I run the program it adds to only the nother variable.
I can't figure out why, no matter what I type. I thought I had
written this program well and I even sketched it out on paper
beforehand , hehe.

I know you guys will probably find a horribly noobish mistake , but
please remember I started learning C all of like 48 hours ago.

You are doing fine.
 
Morris Dovey

ext_u said:
I know you guys will probably find a horribly noobish mistake , but
please remember I started learning C all of like 48 hours ago.

<snip>

We're interested in four things:

[1] Determining if a character is part of a word
[2] Finding the first character of each word.
[3] Finding the first character after a word
[4] Counting the characters in a word

I took a try at the problem; and wrote main first; and just
assumed that I could write a word_char() function - so all I
needed to worry about were [2], [3], and [4]. I added EOF to your
list of getchar() input values that could not appear in a word
and then added the word_char() function to satisfy [1].

#include <stdio.h>

int word_char(int c)
{  return (c != ' ') && (c != '\t') && (c != '\n') && (c != EOF);
}

int main(void)
{  int c, chars=0, count[64], i, in_word=0;

   for (i=0; i<64; i++) count[i] = 0;

   do
   {  c = getchar();
      if (!in_word && word_char(c))
      {  in_word = 1;
         chars = 1;
      }
      else if (in_word)
      {  if (word_char(c)) ++chars;
         else
         {  ++count[chars];
            in_word = 0;
         }
      }
   } while (c != EOF);

   for (i=1; i<64; i++)
   {  if (count[i])
         printf("There were %d words with %d letters\n",
                count[i], i);
   }
   return 0;
}

The program makes the assumption that there won't be any words
with more than 64 characters, and may behave /very/ badly if a
longer word is encountered; but I wanted to write a simple "quick
and dirty" example. I made the assumption that you've already
encountered the do {} while () loop construction.
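
One possible way to take the sting out of the long-word case is sketched
below. It is only an illustration: the helper name tally and the MAX_LEN
constant are invented here, not part of the program above. With count[64]
the largest safe index is 63, so anything longer is lumped into the last
bucket instead of being written past the end of the array.

```c
#define MAX_LEN 63   /* assumed limit: the last valid index of count[64] */

/* Hypothetical helper: record one finished word of length chars.
   Words longer than MAX_LEN are counted in the last bucket, and a
   length of zero (no word at all) is ignored. */
void tally(int count[MAX_LEN + 1], int chars)
{
    if (chars > MAX_LEN)
        chars = MAX_LEN;
    if (chars > 0)
        ++count[chars];
}
```

The call tally(count, chars) would then stand in for ++count[chars] in the
loop.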

HTH
 
Kevin Easton

Mike Wahler said:
Yes, you did make a (very common) novice mistake. Don't feel bad,
many others have done this. The problem is with your statement:

if (c == ' ' ?? '\t' ?? '\n')
++nother;

The comparison operator (==) takes exactly two operands.
You supplied 'c' as the 'left-hand' operand, and the
expression, ( ' ' || '\t' || '\n' ) as the 'right-hand'
operand.

At risk of over-complicating the thread, == has higher precedence than
|| (for the OP, the table on page 53 of K&R2 is worth bookmarking...),
so the operands of the == operator in this case are c and ' '. In fact,
you rely on this precedence later in your message:

[...]
Write:

if (c == ' ' || c == '\t' || c == '\n')
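
For anyone who wants to watch the precedence rule at work, here is a
small sketch. The function names are made up for illustration and do not
come from the original program. The first function uses the condition as
the OP meant to type it (with real || characters), the second uses the
corrected one.

```c
#include <stdio.h>

/* Hypothetical helper: the condition as originally intended by the OP.
   == binds tighter than ||, so this parses as
   (c == ' ') || ('\t') || ('\n'), and '\t' is nonzero. */
int buggy_is_separator(int c)
{
    return c == ' ' || '\t' || '\n';   /* always yields 1 */
}

/* Hypothetical helper: three separate comparisons, or-ed together. */
int fixed_is_separator(int c)
{
    return c == ' ' || c == '\t' || c == '\n';
}

/* Print both results for one character so they can be compared. */
void show(int c)
{
    printf("%3d: buggy=%d fixed=%d\n", c,
           buggy_is_separator(c), fixed_is_separator(c));
}
```

Calling show('a') reports buggy=1 and fixed=0, which matches the symptom
that only nother was ever incremented.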

- Kevin.
 
CBFalconer

CBFalconer said:
get in the habit of writing "int main(void)" or
"int main(int argc; char *argv)"

Make that last "char **argv". Someone e-mailed me but didn't
bother to put the correction up here. And no, the compiler won't
diagnose it.
 
Randy Howard

Make that last "char **argv". Someone e-mailed me but didn't
bother to put the correction up here. And no, the compiler won't
diagnose it.

You sure you want that semicolon up there instead of a comma? :)
 
Chris Dollin

ext_u said:
Ok I thought I would try to take the program one thing at a time. (If
you remember my last post I am trying to make a histogram with data on
the size of each word)
if (c == ' ' ¦¦ '\t' ¦¦ '\n')

This doesn't mean what you thought it meant. It means

if c == ' '
or '\t' != 0
or '\n' != 0

and, since neither the tab character nor the newline character are
equal to 0, the condition is always true. You want

if (c == ' ' || c == '\t' || c == '\n') ...
or
#include <ctype.h>

if (isspace( c )) ...

if you're prepared to accept form feed, carriage return, and vertical
tab as separators as well.
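
A minimal sketch of the isspace route, applied to a string rather than to
getchar() input. The function name count_separators is an assumption made
for this example. The cast matters when the characters come from a char
array: isspace expects EOF or a value representable as unsigned char, and
a plain char may be negative.

```c
#include <ctype.h>
#include <stddef.h>

/* Count the white-space characters in a NUL-terminated string. */
size_t count_separators(const char *s)
{
    size_t n = 0;

    for (; *s != '\0'; ++s) {
        /* Cast first: passing a negative char to isspace() is undefined. */
        if (isspace((unsigned char)*s))
            ++n;
    }
    return n;
}
```

With a value that comes straight from getchar() no cast is needed, since
getchar() already returns either EOF or an unsigned char value.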
 
CBFalconer

Randy said:

You sure you want that semicolon up there instead of a comma? :)

Woops. Yes, that one would get diagnosed. :)

I found something from ext_u on my spam trap. If he has trouble
with DJGPP the place to go is comp.os.msdos.djgpp.
 
Malcolm

ext_u said:
Ok I thought I would try to take the program one thing at a time.
What you want to do is break down the program into discrete functions.

eg to count the number of words, why not write a function

int countwords(char *line);

That takes a line of text as a parameter (NUL-terminated) and counts the
number of words it contains?
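
One way such a function might look, as a sketch only. It takes a
const char * rather than the char * in the prototype above, which is an
assumption on my part (the function never modifies the line), and it
treats only space, tab, and newline as separators, as in the OP's program.

```c
/* Count the words in a NUL-terminated line. A word is a run of
   characters other than space, tab, and newline. */
int countwords(const char *line)
{
    int words = 0;
    int in_word = 0;   /* 1 while the scan is inside a word */

    for (; *line != '\0'; ++line) {
        if (*line == ' ' || *line == '\t' || *line == '\n') {
            in_word = 0;
        } else if (!in_word) {
            in_word = 1;   /* first character of a new word */
            ++words;
        }
    }
    return words;
}
```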
 
Kevin D. Quitt

There is nothing wrong with the approach you've taken (scanning along a
character at a time), and others have covered the details of your
problems. You might also consider another avenue: the use of the standard
library string functions:

7.21.5.3 The strcspn function
Synopsis
1 #include <string.h>
size_t strcspn(const char *s1, const char *s2);

Description
2 The strcspn function computes the length of the maximum initial
segment of the string pointed to by s1 which consists entirely of
characters not from the string pointed to by s2.

Returns
3 The strcspn function returns the length of the segment.

By combining calls to strcspn to determine the length of a word (and
bypass it), and strspn to determine the length of white space (and bypass
it), you can very quickly step through the data in a clean manner that
makes reading the code straightforward.
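
A rough sketch of that stepping pattern follows. The function name, the
separator set " \t\n", and the choice to lump over-long words into the
last bucket are all assumptions made for the example.

```c
#include <string.h>

/* Walk a NUL-terminated string and add each word's length to counts[].
   counts must have max_len + 1 elements; longer words are counted in
   the last one. Returns the number of words seen. */
int tally_word_lengths(const char *s, int counts[], size_t max_len)
{
    static const char seps[] = " \t\n";
    int words = 0;

    s += strspn(s, seps);               /* skip leading white space */
    while (*s != '\0') {
        size_t len = strcspn(s, seps);  /* length of this word */

        ++counts[len > max_len ? max_len : len];
        ++words;
        s += len;                       /* step over the word */
        s += strspn(s, seps);           /* step over the gap after it */
    }
    return words;
}
```

This works on a string already in memory, so the input would have to be
read into a buffer first (for example one line at a time with fgets).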
 
Richard Heathfield

ext_u said:
Ok I thought I would try to take the program one thing at a time.

Excellent!

I see that you have already had lots of technical help on this, so I'd like
to take a slightly different approach.
(If
you remember my last post I am trying to make a histogram with data on
the size of each word)
Anways first .. I obviously need to determine what a word actually is.
I wrote this program on my own without looking at the book or any
other resource once.

Don't be afraid to consult the book even when doing the exercises, unless
you are very confident that you know precisely what to do.

I know you guys will probably find a horribly noobish mistake , but
please remember I started learning C all of like 48 hours ago.

You might be taking it a bit quickly. "The C Programming Language" is not a
"Dummies" book. It is "information-dense". That is to say, there's a huge
amount of information on every page. It might be a good idea to go a little
more slowly, and be sure that you are absorbing each new concept as it
unfolds.
 
Nick Austin

If he has problems with the char '|' for any reason, or simply for
clarity, he can #include <iso646.h> and then use "or", "and" for
those confusing (to the newbie at least) tokens. His line would
then read:

if ((c == ' ') or (c == '\t') or (c == '\n'))

And the not so readable:

if ((c == ' ') ??! (c == '\t') ??! (c == '\n'))
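
A note on that last line: a single ??! is the trigraph for one '|'
character, so it spells the bitwise operator rather than the logical one.
The sketch below shows the practical difference using the <iso646.h>
spellings, where or expands to || and bitor expands to |. The names
is_tab, calls, sep_logical, and sep_bitwise are invented for the
illustration.

```c
#include <iso646.h>   /* defines or, and, not, bitor, ... as macros */

int calls;            /* counts how often is_tab() has run */

/* Hypothetical helper with a visible side effect. */
int is_tab(int c)
{
    ++calls;
    return c == '\t';
}

/* 'or' expands to ||, so is_tab() is skipped when c is a space. */
int sep_logical(int c)
{
    return (c == ' ') or is_tab(c);
}

/* 'bitor' expands to |, so both operands are always evaluated. */
int sep_bitwise(int c)
{
    return (c == ' ') bitor is_tab(c);
}
```

Both functions give the same answer for every c, because each comparison
yields 0 or 1. Only the number of is_tab() calls differs.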

Nick.
 
Mike Wahler

Alex said:
Unfortunately, this lacks the short-circuit evaluation mechanism :)

But it still gives the correct result. :)


User: "The machine froze up."
Me: "There's a short circuit between the chair and the keyboard"

-Mike
 
Tim Hagan

Malcolm said:
What you want to do is break down the program into discrete functions.

Nit #1: Functions haven't been covered at this point in the book (K&R2 p24);
they are introduced in the next section.
eg to count the number of words, why not write a function

Nit #2: Exercise 1-13 asks for a histogram of the *lengths* of the words in
the input and Exercise 1-14 asks for a histogram of the frequencies of
different *characters* in the input. Neither needs the number of words.

To the OP: There is a lot of good advice in this thread about how to approach
the problem. Another thing to keep in mind is how you are going to *test*
your program to make sure that it does what you intended. Sometimes this can
be more difficult than writing the actual code.
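
As one possible approach to testing (a sketch, not the only way): move the
counting loop into a function that reads from any stream, so a test can
feed it known input and compare the result. Both function names below are
hypothetical, and the helper relies on tmpfile() being able to create a
temporary file on the system at hand.

```c
#include <stdio.h>

/* Same state machine as a getchar() loop, but reading from any stream
   so that a test can supply its own input. */
long count_words_in(FILE *in)
{
    long words = 0;
    int in_word = 0;
    int c;

    while ((c = getc(in)) != EOF) {
        if (c == ' ' || c == '\t' || c == '\n') {
            in_word = 0;
        } else if (!in_word) {
            in_word = 1;
            ++words;
        }
    }
    return words;
}

/* Hypothetical test helper: write text to a temporary file, rewind it,
   and count the words in it. Returns -1 if the file cannot be created. */
long count_words_in_text(const char *text)
{
    FILE *tmp = tmpfile();
    long n;

    if (tmp == NULL)
        return -1;
    fputs(text, tmp);
    rewind(tmp);
    n = count_words_in(tmp);
    fclose(tmp);
    return n;
}
```

Inputs worth trying include empty input, input that is only white space,
leading and trailing white space, and a last word with no newline after it.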
 
ext_u

Ok..
I finally got the word counting program I was working on to work. I
had a problem getting it to count each char and each word separately.
In the original program it was counting words and chars as the same
thing..

Here is my new try .. only took me a couple of hours :)

#include <stdio.h>

int main(void)
{
    int c;
    int nother, nword;
    int i;              /* 1 while inside a word, 0 otherwise */

    i = 0;
    nword = 0;
    nother = 0;

    while ((c = getchar()) != EOF) {
        ++nother;       /* counts every character read */
        if (c == ' ' || c == '\t' || c == '\n')
            i = 0;
        else if (i == 0) {
            i = 1;      /* first character of a new word */
            ++nword;
        }
    }
    printf("words: %d chars: %d\n", nword, nother);
    return 0;
}
 
Alex

But it still gives the correct result. :)

User: "The machine froze up."
Me: "There's a short circuit between the chair and the keyboard"

Indeed. But you may potentially take a horrible performance hit. :)

Alex
 
