Simple noobish question

T

theronnightstar

I am having a problem with a small bit of code. I can't figure out why
it won't work. Here:


//code:
#include <iostream>
#include <fstream>
#include <ctype.h>
#include <string>

using namespace std;

int main()
{
ifstream file_in;
ofstream file_out;
string word_in;
int old_total = 0;
int new_total = 0;
int x = 0;

file_in.open("words.list");
file_out.open("words.out");

while ( !file_in.eof() )
{
getline(file_in, word_in);
old_total++;
int good = 0;
for ( int i = 0; i < word_in.size(); i++ )
{
if ( isalpha( word_in ) != 0)
{
continue;
}
else
{
x = 1;
break;
}
}
if ( x == 0 )
{
file_out << word_in << endl;
new_total++;
}
else
{
continue;
}
}
cout << old_total << " lines in the original file." << endl;
cout << new_total << " lines in new file." << endl;
cout << old_total - new_total << " lines of difference." << endl;

return 0;
}
//code


All this is supposed to do is run through a list of words from one
file and put the one that only have the standard 26 letters of the
English alphabet into another file. It works fine until it hits the
first word with a non-english character, at which time it finishes and
exits. The output is this:

72411 lines in the original file.
589 lines in new file.
71822 lines of difference.

Obviously the new file should have a whole lot more lines than that.
At line 590 is the word Asunción. It is such a minor thing but it is
driving me up a wall none the less. Any comments would be much
appreciated.

Thanks
 
V

Victor Bazarov

I am having a problem with a small bit of code. I can't figure out why
it won't work. Here:


//code:
#include <iostream>
#include <fstream>
#include <ctype.h>
#include <string>

using namespace std;

int main()
{
ifstream file_in;
ofstream file_out;
string word_in;
int old_total = 0;
int new_total = 0;
int x = 0;
^^^^^^^^^^^^

Drop this declaration of 'x'.
file_in.open("words.list");
file_out.open("words.out");

while ( !file_in.eof() )
{
getline(file_in, word_in);
old_total++;
int good = 0;

I believe here should be the 'x', not 'good'.
for ( int i = 0; i < word_in.size(); i++ )
{
if ( isalpha( word_in ) != 0)
{
continue;
}
else
{
x = 1;
break;
}
}
if ( x == 0 )


Since 'x' is never reset to 0, you're always falling into the
'continue' part.
{
file_out << word_in << endl;
new_total++;
}
else
{
continue;
}
}
cout << old_total << " lines in the original file." << endl;
cout << new_total << " lines in new file." << endl;
cout << old_total - new_total << " lines of difference." << endl;

return 0;
}
//code


All this is supposed to do is run through a list of words from one
file and put the one that only have the standard 26 letters of the
English alphabet into another file. It works fine until it hits the
first word with a non-english character, at which time it finishes and
exits. The output is this:

72411 lines in the original file.
589 lines in new file.
71822 lines of difference.

Obviously the new file should have a whole lot more lines than that.
At line 590 is the word Asunción. It is such a minor thing but it is
driving me up a wall none the less. Any comments would be much
appreciated.

Use tracing to see how your code executes. Or a debugger.

V
 
A

Alf P. Steinbach

* (e-mail address removed):
I am having a problem with a small bit of code. I can't figure out why
it won't work. Here:


//code:
#include <iostream>
#include <fstream>
#include <ctype.h>
#include <string>

using namespace std;

int main()
{
ifstream file_in;
ofstream file_out;
string word_in;

This is really used to get a line: misleading name.

int old_total = 0;
int new_total = 0;
int x = 0;

The variable x is used as a boolean.

Declare it as 'bool'.

Use the values 'true' and 'false' instead of '1' and '0'.


file_in.open("words.list");
file_out.open("words.out");

Here you should check for failure opening the files.

while ( !file_in.eof() )

For the first iteration there has not yet been any attempt to read from
the file, and so the eof flag cannot have been set yet, and so it's
meaningless to check it here.

{
getline(file_in, word_in);

Here you should check whether getline fails, and not proceed if getline
fails (which can be due to end of file).

old_total++;
int good = 0;

Here 'x' should be set to '0' (or rather, 'false').

for ( int i = 0; i < word_in.size(); i++ )
{
if ( isalpha( word_in ) != 0)
{
continue;
}
else
{
x = 1;
break;
}
}


The loop above would better be expressed as a separate function.

if ( x == 0 )
{
file_out << word_in << endl;
new_total++;
}
else
{
continue;
}

The else-part is superfluous.

}
cout << old_total << " lines in the original file." << endl;
cout << new_total << " lines in new file." << endl;
cout << old_total - new_total << " lines of difference." << endl;

return 0;
}
//code


All this is supposed to do is run through a list of words from one
file and put the one that only have the standard 26 letters of the
English alphabet into another file. It works fine until it hits the
first word with a non-english character, at which time it finishes and
exits. The output is this:

72411 lines in the original file.
589 lines in new file.
71822 lines of difference.

Obviously the new file should have a whole lot more lines than that.
At line 590 is the word Asunción. It is such a minor thing but it is
driving me up a wall none the less. Any comments would be much
appreciated.

See above.

By the way, please convert tab characters to spaces before posting code.
 
O

osmium

I am having a problem with a small bit of code. I can't figure out why
it won't work. Here:


//code:
#include <iostream>
#include <fstream>
#include <ctype.h>
#include <string>

using namespace std;

int main()
{
ifstream file_in;
ofstream file_out;
string word_in;
int old_total = 0;
int new_total = 0;
int x = 0;

file_in.open("words.list");
file_out.open("words.out");

while ( !file_in.eof() )
{
getline(file_in, word_in);
old_total++;
int good = 0;
for ( int i = 0; i < word_in.size(); i++ )
{
if ( isalpha( word_in ) != 0)
{
continue;
}
else
{
x = 1;
break;

The character you mention is going to cause that break to exit the loop.
ISTM that this may be the first non-ASCII char in the file and this break
may not be what you want.
}
}
if ( x == 0 )
{
file_out << word_in << endl;
new_total++;
}
else
{
continue;
}
}
cout << old_total << " lines in the original file." << endl;
cout << new_total << " lines in new file." << endl;
cout << old_total - new_total << " lines of difference." << endl;

return 0;
}
//code


All this is supposed to do is run through a list of words from one
file and put the one that only have the standard 26 letters of the
English alphabet into another file. It works fine until it hits the
first word with a non-english character, at which time it finishes and
exits. The output is this:

72411 lines in the original file.
589 lines in new file.
71822 lines of difference.

Obviously the new file should have a whole lot more lines than that.
At line 590 is the word Asunción. It is such a minor thing but it is
driving me up a wall none the less. Any comments would be much
appreciated.

Thanks
 
J

James Kanze

I am having a problem with a small bit of code. I can't figure out why
it won't work. Here:

There are a couple of problems.
//code:
#include <iostream>
#include <fstream>
#include <ctype.h>
#include <string>
using namespace std;
int main()
{
ifstream file_in;
ofstream file_out;
string word_in;
int old_total = 0;
int new_total = 0;
int x = 0;

while ( !file_in.eof() )

This line is definitely wrong. The *only* time it is useful to
check eof() is after a read has failed, to know if it failed
because you were at eof, or because of some other failure.

The standard idiom here would be:

while ( std::getline( file_in, word_in ) ) // ...

In production code, it's generally considered a good idea to
check bad() and eof() *after* the loop: bad() means you had a
hard read error, and !eof() means that there was a format error
in the file---it shouldn't happen with getline().
{
getline(file_in, word_in);
old_total++;
int good = 0;
for ( int i = 0; i < word_in.size(); i++ )
{
if ( isalpha( word_in ) != 0)


This is undefined behavior. You can't legally call the single
argument version of isalpha with a char.

If I were writing this code, I'd use std::find_if, and a
predicate using the std::ctype facet. But I do this sort of
thing often enough that I've got the necessary predicate in my
tool box. For a simple loop, I'd write something like:

for ( std::string::const_iterator it = word_in.begin() ;
it != word_in.end()
&& isalpha( static_cast< unsigned char >( *it ) );
++ it ) {
}
if ( it == word_in.end() ) {
file_out << word_in << std::endl ;
++ new_total ;
}

Note that in general, the use of continue or break in a loop is
a sign of poor programming practice. Especially in this case:
you are using a standard algorithm, called linear search, and it
has a more or less standard implementation, easily recognized
and verified by anyone familiar with the idioms of the language.
{
continue;
}
else
{
x = 1;
break;
}
}
if ( x == 0 )
{
file_out << word_in << endl;
new_total++;
}
else
{
continue;
}
}
cout << old_total << " lines in the original file." << endl;
cout << new_total << " lines in new file." << endl;
cout << old_total - new_total << " lines of difference." << endl;
return 0;}

All this is supposed to do is run through a list of words from one
file and put the one that only have the standard 26 letters of the
English alphabet into another file. It works fine until it hits the
first word with a non-english character, at which time it finishes and
exits. The output is this:
72411 lines in the original file.
589 lines in new file.
71822 lines of difference.
Obviously the new file should have a whole lot more lines than that.
At line 590 is the word Asunción. It is such a minor thing but it is
driving me up a wall none the less. Any comments would be much
appreciated.

Well, there are two obvious problems. The first is that once
you've set x, you never reset it, so it is non-zero for the rest
of the program. Of course, to begin with, it shouldn't be an
int, but a bool, with true and false. And since it is only used
within the loop, it should be declared (and initialized) within
the loop, and not outside. But in the end, if you're writing
idiomatic C++, there's no need for it what so ever.

The second problem is that you are calling isalpha with a char.
If char is signed on your implementation (it usually is), then
it is almost certain that the encoding for ó results in a
negative value, which in turn results in undefined behavior when
calling the single parameter version of isalpha. If you use
this version, you absolutely *must* cast the char to unsigned
char. Otherwise, it's undefined behavior. (With most
implementations, you'll get random results---some accented
characters may return true.)

Note too that whether isalpha( 'ó' ) returns true or false
depends on the locale. By default, here, you are in "C" locale,
where it is guaranteed to return false, but it's something you
generally want to be careful with; at least where I live, real
programs very rarely leave the default locale "C". This is
another advantage of using the std::ctype facet---you can
specify any locale you want at the call site, thus avoiding any
risk of the global locale not being the one you want.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,756
Messages
2,569,533
Members
45,007
Latest member
OrderFitnessKetoCapsules

Latest Threads

Top