Count Words


Foodbank

Hello,

I'm trying to develop a program that will enable me to count the number
of words in a text file. As a plus, I'd like to be able to count how
many different words there are too. I have a decent start on the
program, but am quite unsure of where to move from here. I need to
malloc space for the array, but am not sure how to. Also, I believe
that strlen may come into play. Although I've browsed similar topics,
I'm also still unsure of the loop formatting to actually count the
words.

Any help would be greatly appreciated.

Thanks,
James



#include <stdio.h>
#include <stdlib.h>
#define MAXWORDS 4000 //less than 4000 total words in the
//text file
char *word[MAXWORDS];
int wordcount[MAXWORDS];
#define MAXWLEN 30 //no words larger than 30 characters
char buff[MAXWLEN];
int nwords, totalwords;
main() {
int i;
while(get_word(buff)) {

/**** The part where I am stuck on ****/


}
for(i = 0; i < nwords; i++)
totalwords += wordcount[i]; //if I keep getting
//words, the loop will
//continue

printf("there were %d different words out of %d totalwords\n",
nwords, totalwords);
}

//-----ignore the section below, it defines what a word is to the
//-----program
I already have code to define to the compiler what a word is, so I
don't need help on that end. Therefore, I removed the code to save
space, which would've been at this location.
 

Vimal Aravindashan

Foodbank said:
Hello,

I'm trying to develop a program that will enable me to count the number
of words in a text file. As a plus, I'd like to be able to count how
many different words there are too. I have a decent start on the
program, but am quite unsure of where to move from here. I need to
malloc space for the array, but am not sure how to. Also, I believe
that strlen may come into play. Although I've browsed similar topics,
I'm also still unsure of the loop formatting to actually count the
words.

Any help would be greatly appreciated.

Thanks,
James

[snip]

It would be best if you used the hash_map container provided in the STL
package. Read the STL documentation for more help on hash_map. Also, STL
is C++, so if you do decide to take my advice into consideration, please
make it a point to post any further questions to c.l.c++

Cheers,
Vimal.
 

Foodbank

Thank you for your response, but it has nothing to do with what I'm
looking for. I don't even know what the hash_map container is. Also,
you stated it is for C++, I am using C.

Any more help is greatly appreciated.

Thanks,
James




 

tedu

Foodbank said:
Hello,

I'm trying to develop a program that will enable me to count the number
of words in a text file. As a plus, I'd like to be able to count how
many different words there are too. I have a decent start on the
program, but am quite unsure of where to move from here. I need to
malloc space for the array, but am not sure how to. Also, I believe
that strlen may come into play. Although I've browsed similar topics,
I'm also still unsure of the loop formatting to actually count the
words.

A school assignment I had to do exactly this was broken down into
several parts:
1. a tokenizer (get_word())
2. dynamic array support (insert_at(), get_at(), replace_at(),
append())
3. a hash table using the above.
4. the final program. hash each word, then you can count the number
of times it appears.
 

Gordon Burditt

I'm trying to develop a program that will enable me to count the number
of words in a text file.

Why? (other than to get a good grade on your homework assignment).

What is your definition of a "word"? How many words are there
on the following lines:

don't
3.141592
O'Brien
supercali-\nfragilisticexpialadocious
(where \n represents a newline and - represents a hyphen)
.
#&$(#&$(#

Gordon L. Burditt
 

Emmanuel Delahaye

Vimal Aravindashan wrote on 26/09/05 :
It would be best if you used the hash_map container provided in the STL
package. Read the STL documentation for more help on hash_map. Also, STL is
C++, so if you do decide to take my advice into consideration, please make it
a point to post any further questions to c.l.c++

How is any of this a response to a C-question ?

--
Emmanuel
The C-FAQ: http://www.eskimo.com/~scs/C-faq/faq.html
The C-library: http://www.dinkumware.com/refxc.html

"It's specified. But anyone who writes code like that should be
transmogrified into earthworms and fed to ducks." -- Chris Dollin CLC
 

tedu

Gordon said:
Why? (other than to get a good grade on your homework assignment).

What is your definition of a "word"?

depends on what you pass to strcspn.
How many words are there on the following lines:

assuming strcspn(p, "\n \t.,"):

don't
1
3.141592
2
O'Brien
1
supercali-\nfragilisticexpialadocious
(where \n represents a newline and - represents a hyphen)
2
.
0
#&$(#&$(#
1
 

Richard Bos

tedu said:
depends on what you pass to strcspn.

No, it's precisely the other way around. What you pass to strcspn()
depends on how you define a "word".

I say zero or one.

One, clearly, but only if you know that for a word.

None.

That's my definition; now _your_ job is to write a strcspn() that
matches it. Going the other way puts the cart before the horse. It's a
common error in programmers, and with my sysadmin hat on I would very
much like to eradicate it.

Richard
 

Foodbank

Hi everyone,

I've made some progress, but I'm getting incorrect word counts. Can
anyone check out my code and see what I might be doing wrong?

Thanks.


#include <stdio.h>
#include <stdlib.h>
#define MAXWORDS 4000
char *word[MAXWORDS];
int wordcount[MAXWORDS];
#define MAXWLEN 30
char buff[MAXWLEN];
int nwords, totalwords;
main() {
int i;
while(get_word(buff)) {

for(i = 0; i < nwords; i++)
if(!strcmp(buff, word[i]))
wordcount[i]++;

word[i] = (char *) malloc( strlen(buff) + 1);
strcpy(word[i], buff);
wordcount[i] = 1;
nwords++;
}
for(i = 0; i < nwords; i++)
totalwords += wordcount[i];
printf("there were %d unique words out of %d totalwords\n",
nwords, totalwords);
}

//*************I've deleted the code that tells the compiler what a
// word is, I don't need help on that
 

tedu

Foodbank said:
while(get_word(buff)) {

for(i = 0; i < nwords; i++)
if(!strcmp(buff, word[i]))
wordcount[i]++;
word[i] = (char *) malloc( strlen(buff) + 1);
strcpy(word[i], buff);
wordcount[i] = 1;


i don't think you want to do the above three lines every time you see a
word.
nwords++;
}

it'd also help to make sure your indentation gets posted correctly.
 

Vimal Aravindashan

Emmanuel said:
Vimal Aravindashan wrote on 26/09/05 :



How is any of this a response to a C-question ?

Read the OP's message again:
> Hello,
>
> I'm trying to develop a program that will enable me to count the number
> of words in a text file. As a plus, I'd like to be able to count how
> many different words there are too. I have a decent start on the
> program, but am quite unsure of where to move from here. I need to

In the problem statement the OP does not say that it has to be done in C
(although he has mentioned it in his reply). The fact that he already has
a good start doesn't matter much if he is going to be stuck with
re-inventing the wheel. If it was up to me, then I would first get the
design right, and then figure out which language is best (unless there
is a constraint on the same, as in this case) to translate my design
into code. Moreover, the OP does say "any help" is welcome, so a
re-direction shouldn't really hurt much if it is going to save him quite
some time. ;-)

Cheers,
Vimal.
 

osmium

Foodbank said:
I've made some progress, but I'm getting incorrect word counts. Can
anyone check out my code and see what I might be doing wrong?

Thanks.


#include <stdio.h>
#include <stdlib.h>
#define MAXWORDS 4000
char *word[MAXWORDS];
int wordcount[MAXWORDS];
#define MAXWLEN 30
char buff[MAXWLEN];
int nwords, totalwords;

Shouldn't you give those an initial value?

<snip>
 
