Create program to count words in a file

aaron

I have a few documents for which I need a total word count.
Can anyone help?
I'd like the program to print the result after I enter a filename,
then let me enter another filename and get its result,
and end when I enter quit.
 
aaron

This seems to accomplish what I need. Is it clean?

#include <fstream>
#include <iostream>
#include <string>
// #include <cassert> for the assert function

using namespace std;

int main()
{
    int count;
    string fileName;
    string astring;
    ifstream inFile;

    cout << "Enter the name of the file "
         << "(type Quit to end): ";
    cin >> fileName;

    while (fileName != "Quit")
    {
        inFile.open(fileName.c_str());

        if (!inFile) // How do I use assert function here???
        {
            cout << "** Can't open the file **" << endl;
            return 1;
        }

        count = 0;

        // Each extraction reads one whitespace-separated word.
        while (inFile >> astring)
            count++;

        cout << "Number of words in file: " << count << endl;

        // Reset the stream so it can be reused for the next file.
        inFile.close();
        inFile.clear();

        cout << "Enter the name of the file "
             << "(type Quit to end): ";
        cin >> fileName;
    }
    return 0;
}
 
Jerry Coffin

aaron said:
This seems to accomplish what I need. Is it clean?

Not particularly.

using namespace std;

Most experienced C++ programmers would consider this a
fairly ugly hack -- it's not necessarily terrible, but
certainly not particularly clean either.

[ ... ]
if (!inFile) // How do I use assert function here???

First of all, assert is a macro, not a function (an
important distinction in this case).

Second, assert isn't really suitable for this situation
anyway. It's basically intended for verifying program
logic -- i.e. if its condition is false, it indicates a
problem in your program, rather than data that's supplied
to the program. This is looking at the input data, so an
'if' is a perfectly reasonable way to handle it.
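
To make that distinction concrete, here is a minimal sketch. The names count_open_stream and count_file are made up for illustration and are not from the original program. The assert guards a promise the calling code is supposed to keep, while the missing file is handled with a plain 'if':

```cpp
#include <cassert>
#include <fstream>
#include <istream>
#include <string>

// Hypothetical helper: counts whitespace-separated words on a stream
// that the caller has already opened successfully.
int count_open_stream(std::istream &in) {
    // Program-logic check: if this fires, the calling code has a bug.
    // It compiles to nothing when NDEBUG is defined.
    assert(in.good() && "caller must pass a usable stream");

    int words = 0;
    std::string word;
    while (in >> word)
        ++words;
    return words;
}

// Hypothetical wrapper: a file that cannot be opened is bad input,
// not a bug, so it is reported with an 'if' and a -1 return value.
int count_file(char const *filename) {
    std::ifstream infile(filename);
    if (!infile)
        return -1;
    return count_open_stream(infile);
}
```

With this split, count_file("missing.txt") simply returns -1, and the assert can only fail if some other part of the program hands count_open_stream a stream that is already in a failed state.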

For real use, an interactive program like this that
requires the user to type in file names as it runs is
rarely as useful as a program that accepts file names on
the command line so the user can specify all of them
up front. Along with that, I'd move the counting into a
separate function:

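#include <fstream>
#include <iostream>
#include <string>

// Returns the number of whitespace-separated words in the file,
// or -1 if the file cannot be opened.
int count(char const *filename) {
    int words = 0;
    std::string word;
    std::ifstream infile(filename);

    if (!infile)
        return -1;

    while (infile >> word)
        ++words;

    return words;
}

int main(int argc, char **argv) {

    // argv[0] is the program name; the file names start at argv[1].
    for (int i=1; i<argc; i++) {
        int words = count(argv[i]);

        std::cout << argv[i] << ": ";

        if (words < 0)
            std::cout << "Unable to open file!\n";
        else
            std::cout << words << "\n";
    }
    return 0;
}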

One other possibility would be to use std::distance
instead of an explicit loop to count the words in the
file:

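#include <fstream>
#include <iterator>
#include <string>

int count(char const *filename) {
    std::ifstream infile(filename);

    if (!infile)
        return -1;

    // Each increment of the iterator extracts one whitespace-separated word.
    return static_cast<int>(std::distance(
        std::istream_iterator<std::string>(infile),
        std::istream_iterator<std::string>()));
}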
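
As a variation on the same idea, the iterator approach is not tied to files. The sketch below assumes two hypothetical helpers, count_words and count_words_in_text (neither is from the code above), and takes any std::istream, which makes the counting easy to try out on an in-memory string:

```cpp
#include <cstddef>
#include <istream>
#include <iterator>
#include <sstream>
#include <string>

// Hypothetical helper: counts whitespace-separated words on any input stream.
std::ptrdiff_t count_words(std::istream &in) {
    // A default-constructed istream_iterator acts as the end-of-stream marker.
    return std::distance(std::istream_iterator<std::string>(in),
                         std::istream_iterator<std::string>());
}

// Hypothetical convenience wrapper for text that is already in memory.
std::ptrdiff_t count_words_in_text(std::string const &text) {
    std::istringstream in(text);
    return count_words(in);
}
```

Opening the file and checking for failure would then stay in the caller, along the lines of the 'if' shown earlier.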

Side note: A typical UNIX shell will expand wildcards and
such, so this will automatically support things like "wc
*.txt". Most shells in Windows don't do that, but most
compilers include code to do it that you can link into
your program if you want it (its name varies from one
compiler to the next though -- e.g. Microsoft calls it
setargv.obj, Borland calls it wildargs.obj, and so on).
 
Fei Liu

Jerry said:
This seems to accomplish what I need. Is it clean?

Not particularly.
using namespace std;

Most experienced C++ programmers would consider this a
fairly ugly hack -- it's not necessarily terrible, but
certainly not particularly clean either.

[ ... ]

do you mean 'using namespace std;' is an ugly hack?...
[ ... remainder of Jerry's post quoted in full ... ]
 
Victor Bazarov

Fei said:
Jerry said:
This seems to accomplish what I need. Is it clean?

Not particularly.

using namespace std;

Most experienced C++ programmers would consider this a
fairly ugly hack -- it's not necessarily terrible, but
certainly not particularly clean either.

[ ... ]


do you mean 'using namespace std;' is an ugly hack?...

if (!inFile) // How do I use assert function here???
[...and more than 60 lines quoted without merit...]

Please learn to quote only the relevant portions. Thanks.

V
 
Phlip

Fei said:
do you mean 'using namespace std;' is an ugly hack?...

The C++ thought leaders formerly said it was, and have since softened their
stance to "mostly harmless".

This repost best describes the issues involved:

Do you remember the scene in Star Trek Old Generation where they could not
get the hatch to a grain silo open, and when Kirk finally opened it
thousands of tribbles rained down all over him?

Kirk represents your source file.

The grain silo represents all the header files your source file includes.

Each tribble represents an identifier declared inside 'namespace std'
in those headers.

Raining down all over Kirk represents all those identifiers polluting your
local namespace.

'using namespace std' represents sliding the hatch open.

The purpose of the 'namespace' keyword is to prevent this pollution. You
must keep all the tribbles in the grain silo, and only take down the one or
two that you need:

using std::cout;
using std::endl;

Folks use 'using namespace std' in this newsgroup because trivial example
code often uses it; the code is not large enough to have enough of its own
identifiers to potentially conflict with the 'std' ones. But nobody should
use 'using namespace std', and those who post sample code to this newsgroup
should set a good example.
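
Applied to a word counter like the one earlier in this thread, that advice might look roughly like the sketch below. The function names count_words and report are hypothetical; only the using-declarations are the point:

```cpp
#include <fstream>
#include <iostream>
#include <string>

// Using-declarations: take down only the names this file needs.
// Everything else in std stays behind the std:: prefix.
using std::cout;
using std::ifstream;
using std::string;

// Hypothetical helper; returns -1 if the file cannot be opened.
int count_words(string const &fileName) {
    ifstream inFile(fileName.c_str());
    if (!inFile)
        return -1;

    int count = 0;
    string word;
    while (inFile >> word)
        ++count;
    return count;
}

// Hypothetical helper that prints one result line for a file.
void report(string const &fileName) {
    int const n = count_words(fileName);
    if (n < 0)
        cout << "** Can't open the file **\n";
    else
        cout << "Number of words in file: " << n << '\n';
}
```

Any other std name, such as std::endl or std::cin, would still need its prefix in this file until it gets a using-declaration of its own.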
 
Jerry Coffin

[ ... ]
Most experienced C++ programmers would consider this a
fairly ugly hack -- it's not necessarily terrible, but
certainly not particularly clean either.

[ ... ]

do you mean 'using namespace std;' is an ugly hack?...

I mean "using namespace std;" is a _fairly_ ugly hack
when it's at file scope. I think without qualifications,
it's overstated, and I think that at other scopes it's
somewhat more acceptable (though still rarely ideal).
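
As a rough illustration of the "other scopes" case, the directive can be confined to a single function body. The names count_words_in and describe below are made up for this sketch:

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: counts whitespace-separated words in a string.
int count_words_in(std::string const &text) {
    // The directive applies only until the closing brace of this function.
    using namespace std;

    istringstream in(text);
    string word;
    int words = 0;
    while (in >> word)
        ++words;
    return words;
}

// Outside that function, std names need their prefix again.
std::string describe(std::string const &text) {
    std::ostringstream out;
    out << count_words_in(text) << " words";
    return out.str();
}
```

The rest of the file, and anything that includes it, is unaffected by the directive, which is why this form tends to draw fewer objections than the file-scope one.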
 
Fei Liu

Phlip said:
The C++ thought leaders formerly said it was, and have since softened their
stance to "mostly harmless".

[ ... rest of Phlip's message quoted in full ... ]

Thanks for your explanation, Phlip. I understand why using
'using namespace std' indiscriminately is a bad idea. But it was the
first time I had heard it referred to as 'an ugly hack'. It is
interesting that they later relaxed their wording.

V, as you may or may not be able to see, I wasn't sure what was being
quoted in Phlip's original post as 'an ugly hack' thus my quote of
Phlip's complete message.
 
