Count Words

Discussion in 'C Programming' started by Foodbank, Sep 26, 2005.

  1. Foodbank

    Foodbank Guest

    Hello,

    I'm trying to develop a program that will enable me to count the number
    of words in a text file. As a plus, I'd like to be able to count how
    many different words there are too. I have a decent start on the
    program, but am quite unsure of where to move from here. I need to
    malloc space for the array, but am not sure how to. Also, I believe
    that strlen may come into play. Although I've browsed similar topics,
    I'm also still unsure of the loop formatting to actually count the
    words.

    Any help would be greatly appreciated.

    Thanks,
    James



    #include <stdio.h>
    #include <stdlib.h>
    #define MAXWORDS 4000 //less than 4000 total words in the
                          //text file
    char *word[MAXWORDS];
    int wordcount[MAXWORDS];
    #define MAXWLEN 30 //no words larger than 30 characters
    char buff[MAXWLEN];
    int nwords, totalwords;
    int get_word(char *buff); //defined further down (code removed)
    int main(void) {
        int i;
        while(get_word(buff)) { //if I keep getting words, the loop
                                //will continue

            /**** The part where I am stuck ****/

        }
        for(i = 0; i < nwords; i++)
            totalwords += wordcount[i];

        printf("there were %d different words out of %d totalwords\n",
               nwords, totalwords);
        return 0;
    }

    //-----ignore the section below, it defines what a word is to the
    //-----program
    I already have code to define to the compiler what a word is, so I
    don't need help on that end. Therefore, I removed the code to save
    space, which would've been at this location.
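On the malloc and strlen part of the question, here is a minimal sketch, assuming get_word() leaves a NUL-terminated string in buff. The idea is to allocate strlen() + 1 bytes (the extra byte holds the terminator) and copy the word into that space. The helper name copy_word is hypothetical and does not appear in the program above.

```c
#include <stdlib.h>
#include <string.h>

/* Return a freshly allocated copy of s, or NULL if malloc fails.
   copy_word is a hypothetical helper name used only for illustration. */
char *copy_word(const char *s)
{
    size_t len = strlen(s) + 1;   /* +1 for the terminating '\0' */
    char *p = malloc(len);

    if (p != NULL)
        memcpy(p, s, len);        /* copies the terminator as well */
    return p;
}
```

A call such as word[nwords] = copy_word(buff); would then store the copy, and the caller should check the result for NULL before using it.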
    Foodbank, Sep 26, 2005
    #1

  2. Foodbank wrote:
    > Hello,
    >
    > I'm trying to develop a program that will enable me to count the number
    > of words in a text file. As a plus, I'd like to be able to count how
    > many different words there are too. I have a decent start on the
    > program, but am quite unsure of where to move from here. I need to
    > malloc space for the array, but am not sure how to. Also, I believe
    > that strlen may come into play. Although I've browsed similar topics,
    > I'm also still unsure of the loop formatting to actually count the
    > words.
    >
    > Any help would be greatly appreciated.
    >
    > Thanks,
    > James
    >
    > [snip]


    It would be best if you used the hash_map container provided in the STL
    package. Read the STL documentation for more help on hash_map. Also, STL
    is C++, so if you do decide to take my advice into consideration, please
    make it a point to post any further questions to c.l.c++

    Cheers,
    Vimal.


    --
    "If you would be a real seeker after truth, it is necessary that at least
    once in your life you doubt, as far as possible, all things."
    -- Rene Descartes
    Vimal Aravindashan, Sep 26, 2005
    #2

  3. Foodbank

    Foodbank Guest

    Thank you for your response, but it has nothing to do with what I'm
    looking for. I don't even know what the hash_map container is. Also,
    you stated it is for C++; I am using C.

    Any more help is greatly appreciated.

    Thanks,
    James




    Vimal Aravindashan wrote:
    > Foodbank wrote:
    > > Hello,
    > >
    > > I'm trying to develop a program that will enable me to count the number
    > > of words in a text file. As a plus, I'd like to be able to count how
    > > many different words there are too. I have a decent start on the
    > > program, but am quite unsure of where to move from here. I need to
    > > malloc space for the array, but am not sure how to. Also, I believe
    > > that strlen may come into play. Although I've browsed similar topics,
    > > I'm also still unsure of the loop formatting to actually count the
    > > words.
    > >
    > > Any help would be greatly appreciated.
    > >
    > > Thanks,
    > > James
    > >
    > > [snip]

    >
    > It would be best if you used the hash_map container provided in the STL
    > package. Read the STL documentation for more help on hash_map. Also, STL
    > is C++, so if you do decide to take my advice into consideration, please
    > make it a point post any further questions to c.l.c++
    >
    > Cheers,
    > Vimal.
    >
    >
    > --
    > If you would be a real seeker after truth, it is necessary that at least
    > once in your life you doubt, as far as possible, all things."
    > -- Rene Descartes
    Foodbank, Sep 26, 2005
    #3
  4. Foodbank

    tedu Guest

    Foodbank wrote:
    > Hello,
    >
    > I'm trying to develop a program that will enable me to count the number
    > of words in a text file. As a plus, I'd like to be able to count how
    > many different words there are too. I have a decent start on the
    > program, but am quite unsure of where to move from here. I need to
    > malloc space for the array, but am not sure how to. Also, I believe
    > that strlen may come into play. Although I've browsed similar topics,
    > I'm also still unsure of the loop formatting to actually count the
    > words.


    A school assignment I had to do exactly this was broken down into
    several parts:
    1. a tokenizer (get_word())
    2. dynamic array support (insert_at(), get_at(), replace_at(),
    append())
    3. a hash table using the above.
    4. the final program. hash each word, then you can count the number
    of times it appears.
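A rough sketch of parts 3 and 4, under stated assumptions: it uses a chained hash table with a fixed number of buckets rather than one built on the dynamic array from part 2, and the names add_word, NBUCKETS, nunique and ntotal are illustrative choices, not anything from the assignment. The hash function is a simple multiply-by-33 string hash; any reasonable string hash would do.

```c
#include <stdlib.h>
#include <string.h>

#define NBUCKETS 1021   /* assumed table size */

struct entry {
    char *word;           /* malloc'd copy of the word */
    int count;            /* times this word has been seen */
    struct entry *next;   /* next entry in the same bucket */
};

static struct entry *table[NBUCKETS];
int nunique, ntotal;      /* distinct words and total words so far */

/* Simple multiply-by-33 string hash, reduced to a bucket index. */
static unsigned long hash(const char *s)
{
    unsigned long h = 5381;

    while (*s != '\0')
        h = h * 33 + (unsigned char)*s++;
    return h % NBUCKETS;
}

/* Record one occurrence of w.
   Returns the word's count so far, or 0 on allocation failure. */
int add_word(const char *w)
{
    unsigned long h = hash(w);
    struct entry *e;

    for (e = table[h]; e != NULL; e = e->next) {
        if (strcmp(e->word, w) == 0) {
            ntotal++;
            return ++e->count;
        }
    }
    e = malloc(sizeof *e);
    if (e == NULL)
        return 0;
    e->word = malloc(strlen(w) + 1);
    if (e->word == NULL) {
        free(e);
        return 0;
    }
    strcpy(e->word, w);
    e->count = 1;
    e->next = table[h];   /* push onto the front of the bucket's chain */
    table[h] = e;
    nunique++;
    ntotal++;
    return 1;
}
```

Calling add_word() once per token from the tokenizer leaves the number of different words in nunique and the overall count in ntotal.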
    tedu, Sep 26, 2005
    #4
  5. >I'm trying to develop a program that will enable me to count the number
    >of words in a text file.


    Why? (other than to get a good grade on your homework assignment).

    What is your definition of a "word"? How many words are there
    on the following lines:

    don't
    3.141592
    O'Brien
    supercali-\nfragilisticexpialadocious
    (where \n represents a newline and - represents a hyphen)
    .
    #&$(#&$(#

    Gordon L. Burditt
    Gordon Burditt, Sep 26, 2005
    #5
  6. Vimal Aravindashan wrote on 26/09/05 :
    > It would be best if you used the hash_map container provided in the STL
    > package. Read the STL documentation for more help on hash_map. Also, STL is
    > C++, so if you do decide to take my advice into consideration, please make it
    > a point post any further questions to c.l.c++


    How is any of this a response to a C-question ?

    --
    Emmanuel
    The C-FAQ: http://www.eskimo.com/~scs/C-faq/faq.html
    The C-library: http://www.dinkumware.com/refxc.html

    "It's specified. But anyone who writes code like that should be
    transmogrified into earthworms and fed to ducks." -- Chris Dollin CLC
    Emmanuel Delahaye, Sep 26, 2005
    #6
  7. Foodbank

    tedu Guest

    Gordon Burditt wrote:
    > >I'm trying to develop a program that will enable me to count the number
    > >of words in a text file.

    >
    > Why? (other than to get a good grade on your homework assignment).
    >
    > What is your definition of a "word"?


    depends on what you pass to strcspn.

    > How many words are there on the following lines:

    assuming strcspn(p, "\n \t.,")
    >
    > don't

    1
    > 3.141592

    2
    > O'Brien

    1
    > supercali-\nfragilisticexpialadocious
    > (where \n represents a newline and - represents a hyphen)

    2
    > .

    0
    > #&$(#&$(#

    1
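The counts above can be reproduced with a small sketch that alternates strspn() and strcspn() over the text. The function name count_words is made up for this example; the delimiter set is the one assumed in the post.

```c
#include <string.h>

/* Count words in text, where a word is any maximal run of
   characters that are not in delims. */
size_t count_words(const char *text, const char *delims)
{
    size_t n = 0;
    const char *p = text;

    for (;;) {
        p += strspn(p, delims);     /* skip any leading delimiters */
        if (*p == '\0')
            break;
        p += strcspn(p, delims);    /* step over one word */
        n++;
    }
    return n;
}
```

With delims set to "\n \t.," the hyphenated line counts as two words because the newline is a delimiter and the hyphen is not, and 3.141592 splits at the period.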
    tedu, Sep 26, 2005
    #7
  8. Foodbank

    Richard Bos Guest

    "tedu" <> wrote:

    > Gordon Burditt wrote:
    > > >I'm trying to develop a program that will enable me to count the number
    > > >of words in a text file.

    > >
    > > Why? (other than to get a good grade on your homework assignment).
    > >
    > > What is your definition of a "word"?

    >
    > depends on what you pass to strcspn.


    No, it's precisely the other way around. What you pass to strcspn()
    depends on how you define a "word".

    > > How many words are there on the following lines:

    > assuming strcspn(p, "\n \t.,")
    > >
    > > don't

    > 1
    > > 3.141592

    > 2


    I say zero or one.

    > > O'Brien

    > 1
    > > supercali-\nfragilisticexpialadocious
    > > (where \n represents a newline and - represents a hyphen)

    > 2


    One, clearly, but only if you already know it to be a single word.

    > > .

    > 0
    > > #&$(#&$(#

    > 1


    None.

    That's my definition; now _your_ job is to write a strcspn() that
    matches it. Going the other way puts the cart before the horse. It's a
    common error in programmers, and with my sysadmin hat on I would very
    much like to eradicate it.
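As an illustration of going from the definition to the code, here is a sketch under one possible definition, which is an assumption and not necessarily the exact rules intended above: a word starts at a letter and continues through letters, an apostrophe that is followed by a letter, and a hyphen plus newline that is followed by a letter. The function name count_defined_words is hypothetical.

```c
#include <ctype.h>
#include <stddef.h>

/* Count words under an assumed definition: a word starts at a letter;
   it continues through letters, through an apostrophe followed by a
   letter, and through "-\n" followed by a letter (a hyphenated line
   break). Everything else separates words. */
size_t count_defined_words(const char *s)
{
    size_t n = 0;

    while (*s != '\0') {
        if (!isalpha((unsigned char)*s)) {
            s++;                    /* not the start of a word */
            continue;
        }
        n++;                        /* a word starts here */
        while (*s != '\0') {
            if (isalpha((unsigned char)*s))
                s++;
            else if (*s == '\'' && isalpha((unsigned char)s[1]))
                s += 2;             /* apostrophe inside a word */
            else if (*s == '-' && s[1] == '\n'
                     && isalpha((unsigned char)s[2]))
                s += 3;             /* word continued on the next line */
            else
                break;              /* end of this word */
        }
    }
    return n;
}
```

Under this definition the six sample lines give 1, 0, 1, 1, 0 and 0. Changing the definition means changing the conditions in the inner loop, not the other way around.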

    Richard
    Richard Bos, Sep 27, 2005
    #8
  9. Foodbank

    Foodbank Guest

    Hi everyone,

    I've made some progress, but I'm getting incorrect word counts. Can
    anyone check out my code and see what I might be doing wrong?

    Thanks.


    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #define MAXWORDS 4000
    char *word[MAXWORDS];
    int wordcount[MAXWORDS];
    #define MAXWLEN 30
    char buff[MAXWLEN];
    int nwords, totalwords;
    int get_word(char *buff);
    int main(void) {
        int i;
        while(get_word(buff)) {

            for(i = 0; i < nwords; i++)
                if(!strcmp(buff, word[i]))
                    wordcount[i]++;

            word[i] = (char *) malloc( strlen(buff) + 1);
            strcpy(word[i], buff);
            wordcount[i] = 1;
            nwords++;
        }
        for(i = 0; i < nwords; i++)
            totalwords += wordcount[i];
        printf("there were %d unique words out of %d totalwords\n",
               nwords, totalwords);
        return 0;
    }

    //*************I've deleted the code that tells the compiler what a
    word is, I don't need help on that
    Foodbank, Sep 27, 2005
    #9
  10. Foodbank

    tedu Guest

    Foodbank wrote:

    > while(get_word(buff)) {
    >
    > for(i = 0; i < nwords; i++)
    > if(!strcmp(buff, word[i]))
    > wordcount[i]++;
    > word[i] = (char *) malloc( strlen(buff) + 1);
    > strcpy(word[i], buff);
    > wordcount[i] = 1;


    i don't think you want to do the above three lines every time you see a
    word.

    > nwords++;
    > }


    it'd also help to make sure your indentation gets posted correctly.
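One way to act on the first point, as a sketch only: stop searching as soon as the word is found, and allocate a new entry only when the search comes up empty. The arrays are the ones from the posted program; the helper name record_word and the bounds check against MAXWORDS are additions for this example.

```c
#include <stdlib.h>
#include <string.h>

#define MAXWORDS 4000

char *word[MAXWORDS];
int wordcount[MAXWORDS];
int nwords;

/* Record one occurrence of buff. Returns 1 on success, or 0 if the
   table is full or malloc fails. record_word is a hypothetical name. */
int record_word(const char *buff)
{
    int i;

    for (i = 0; i < nwords; i++) {
        if (strcmp(buff, word[i]) == 0) {
            wordcount[i]++;         /* seen before: just bump its count */
            return 1;
        }
    }
    /* Only reached when the word was not found: add a new entry. */
    if (nwords >= MAXWORDS)
        return 0;
    word[nwords] = malloc(strlen(buff) + 1);
    if (word[nwords] == NULL)
        return 0;
    strcpy(word[nwords], buff);
    wordcount[nwords] = 1;
    nwords++;
    return 1;
}
```

The body of the while loop then shrinks to a single call, record_word(buff), and nwords ends up holding the number of different words.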
    tedu, Sep 27, 2005
    #10
  11. Emmanuel Delahaye wrote:
    > Vimal Aravindashan wrote on 26/09/05 :
    >
    >> It would be best if you used the hash_map container provided in the
    >> STL package. Read the STL documentation for more help on hash_map.
    >> Also, STL is C++, so if you do decide to take my advice into
    >> consideration, please make it a point post any further questions to
    >> c.l.c++

    >
    >
    > How is any of this a response to a C-question ?
    >


    Read the OP's message again:

    Foodbank wrote:
    > Hello,
    >
    > I'm trying to develop a program that will enable me to count the number
    > of words in a text file. As a plus, I'd like to be able to count how
    > many different words there are too. I have a decent start on the
    > program, but am quite unsure of where to move from here. I need to


    In the problem statement the OP does not say that it has to be done in C
    (although he has mentioned it in his reply). The fact that he already has
    a good start doesn't matter much if he is going to be stuck
    reinventing the wheel. If it were up to me, I would first get the
    design right, and then figure out which language is best (unless there
    is a constraint on the same, as in this case) to translate my design
    into code. Moreover, the OP does say "any help" is welcome, so a
    redirection shouldn't really hurt much if it is going to save him quite
    some time. ;-)

    Cheers,
    Vimal.


    --
    If you would be a real seeker after truth, it is necessary that at least
    once in your life you doubt, as far as possible, all things."
    -- Rene Descartes
    Vimal Aravindashan, Sep 27, 2005
    #11
  12. Foodbank

    osmium Guest

    "Foodbank" wrote:

    > I've made some progress, but I'm getting incorrect word counts. Can
    > anyone check out my code and see what I might be doing wrong?
    >
    > Thanks.
    >
    >
    > #include <stdio.h>
    > #include <stdlib.h>
    > #define MAXWORDS 4000
    > char *word[MAXWORDS];
    > int wordcount[MAXWORDS];
    > #define MAXWLEN 30
    > char buff[MAXWLEN];
    > int nwords, totalwords;


    Shouldn't you give those an initial value?
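For reference, objects declared at file scope like these have static storage duration, and C sets them to zero (null for pointers) when no initializer is given, so the counters do start at 0. Writing = 0 explicitly is harmless and arguably clearer; the same is not true of automatic variables declared inside a function, which start out indeterminate. A small sketch, with a hypothetical check function:

```c
#include <stddef.h>

#define MAXWORDS 4000

char *word[MAXWORDS];      /* every element starts as a null pointer */
int wordcount[MAXWORDS];   /* every element starts as 0 */
int nwords, totalwords;    /* both start as 0 */

/* Returns 1 if the file-scope objects above hold their implicit
   zero values. globals_start_at_zero is a name made up for this sketch. */
int globals_start_at_zero(void)
{
    return nwords == 0 && totalwords == 0
        && wordcount[0] == 0 && wordcount[MAXWORDS - 1] == 0
        && word[0] == NULL && word[MAXWORDS - 1] == NULL;
}
```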

    <snip>
    osmium, Sep 27, 2005
    #12