Use != rather than < in for loops?


lovecreatesbea...

In C++ Primer 4th, sec 3.3.2, it states that C++ programmers use !=
rather than < in a for loop.

The following small snippet erases punctuation from a string. It works
well with < used in the for loop but it breaks when != is used instead.

#include <string>
#include <iostream>
#include <exception>
#include <cctype>
using namespace std;

int main(){
    string s = "hello, world!";//s: i18n
    //for (string::size_type sz = 0; sz < s.size(); ++sz){
    /*works with < */
    for (string::size_type sz = 0; sz != s.size(); ++sz){
    /*doesn't work with != */
        try{
            if (ispunct(s.at(sz))){
                s.erase(sz, 1);
            }
        }catch(exception &e){
            cout << e.what() << endl;
        }
    }
    cout << s << endl;
}
 
Rolf Magnus

In C++ Primer 4th, sec 3.3.2, it states that C++ programmers use !=
rather than < in a for loop.

They do? Well, I do that when using iterators, but for a loop using an
integer loop counter, I use <.

The problem is that s.size() will be reduced by 1 when you erase a
character. When the last character in your string is removed, your
index "jumps" over the end of the string, so sz will be greater than
s.size(). Therefore the version with < works, but not the one with !=.

Here is a table that shows what it looks like at the beginning of each loop
iteration:

index  size  character
    0    13  h
    1    13  e
    2    13  l
    3    13  l
    4    13  o
    5    13  ,       -> erased
    6    12  w       -> space is jumped over
    7    12  o
    8    12  r
    9    12  l
   10    12  d
   11    12  !       -> erased
   12    11  <none>  -> index is past string end
 
Mark P

In C++ Primer 4th, sec 3.3.2, it states that C++ programmers use !=
rather than < in a for loop.

I don't have that book so I can only comment on what you've typed above.
The problem with the != version is that every time you call s.erase,
s.size() decreases by 1. Because s ends in a punctuation character you
erase the '!', thereby decrementing s.size() (12 to 11). Immediately
thereafter you increment sz (11 to 12), and the two values "pass
through" each other without ever being equal. Then you end up going out
of bounds with sz, throwing an out_of_range exception. In general I'd
prefer < to != for loop continuation conditions.
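
To make the "pass through" concrete, here is a minimal sketch. The function name index_overshoots is made up for illustration and is not from the original post. It runs the same erase-then-increment loop with a < guard, then reports whether the index ended up beyond the shrunken size, which is the case where a != guard would never see the two values equal.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper: repeats the thread's loop body (erase, then always
// increment) and reports whether the index jumped past the end of the string.
bool index_overshoots(std::string s) {
    std::string::size_type i = 0;
    while (i < s.size()) {
        if (std::ispunct(static_cast<unsigned char>(s[i]))) {
            s.erase(i, 1);  // size shrinks by one...
        }
        ++i;                // ...and the index still advances
    }
    assert(i >= s.size());  // all that the < guard guarantees on exit
    return i > s.size();    // true: a != guard would have kept looping
}
```

With "hello, world!" the final '!' is erased at index 11, so the index becomes 12 while the size drops to 11. For a string that does not end in punctuation, such as "a,b", the index and the size meet exactly.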
 
boaz1sade

In C++ Primer 4th, sec 3.3.2, it states that C++ programmers use !=
rather than < in a for loop.

Hi,
Using != is good practice with iterators, since < is only meaningful
for random access iterators (the kind used by vector and string, but
not by list, map, ...). In this case, since you are not using iterators
but comparing integers, it is better to use <, for the reasons the
others pointed out. By the way, I would steer clear of this book. And
another thing: it's better to use STL algorithms here rather than
writing this kind of explicit loop, so that this type of trouble does
not arise in the first place.
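
A small sketch of the iterator case, in case it helps. The helper name sum_all is invented for this example. Because it compares iterators with !=, the same loop compiles for std::list (bidirectional iterators, no operator<) as well as for std::vector.

```cpp
#include <list>
#include <vector>

// Hypothetical helper: sums the elements of any standard container of int.
// The guard uses != because that is the only comparison every iterator
// category supports; `it < values.end()` would not compile for std::list.
template <class Container>
int sum_all(const Container& values) {
    int total = 0;
    for (typename Container::const_iterator it = values.begin();
         it != values.end(); ++it) {
        total += *it;
    }
    return total;
}
```

Note that the loop body does not modify the container, so the end iterator is not a moving target here.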
 
lovecreatesbea...

Rolf said:
They do? Well, I do that when using iterators, but for a loop using an
integer loop counter, I use <.


Thank you, and thank you all.

Yes, the problem in my last code snippet was not caused by those two
operators.
I have written a new one. I hope it doesn't have as many errors as the
last one.

#include <string>
#include <iostream>
#include <exception>
#include <cctype>
using namespace std;

int main(){
    string s = "hello, world!";
    string::size_type sz = s.size();

    cout << s << endl;
    for(string::size_type idx=0; idx!=sz; ++idx){
    //...; idx<sz; ... works
        try{
            if(ispunct(s.at(idx))){
                s.erase(idx, 1);
                idx--;
                sz--;
            }
        }catch(exception &e){
            cout << e.what() << endl;
        }
    }
    cout << s << endl;

    return 0;
}
 
Andre Kostur

You may want to look at the context in which that's used. That "rule"
applies to using iterators in a loop.


[snip]
of bounds with sz, throwing an out_of_range exception. In general I'd
prefer < to != for loop continuation conditions.

I'd amend that statement to clarify what you're using as a loop
continuation condition. Using < when working with iterators may not work.
 
Daniel T.

In C++ Primer 4th, sec 3.3.2, it states that C++ programmers use !=
rather than < in a for loop.

I think the above could be done in a better way:

#include <cctype>
#include <iostream>
#include <string>
using namespace std;

int main() {
    string s = "hello, world!";
    // first skip through the string to the first punct
    string::size_type pos = 0;
    while ( pos != s.size() && !ispunct(s[pos]) )
        ++pos;
    // if a punct was found...
    if ( pos != s.size() )
    {
        // start shifting characters to the left,
        // overwriting any puncts found along the way
        string::size_type index = pos++;
        while ( pos != s.size() ) {
            if ( !ispunct(s[pos]) ) {
                s[index++] = s[pos];
            }
            ++pos;
        }
        s.resize( index );
    }
    cout << s << '\n';
}

Of course, the simplest and most idiomatic way would be:

#include <algorithm>
#include <cctype>
#include <iostream>
#include <string>
using namespace std;

bool is_punct( char c ) {
    return ispunct( c );
}

int main() {
    string s = "hello, world!";
    s.erase( remove_if( s.begin(), s.end(), &is_punct ), s.end() );
    cout << s << '\n';
}
 
Dizzy

Daniel said:
bool is_punct( char c ) {
return ispunct( c );
}

int main() {
string s = "hello, world!";
s.erase( remove_if( s.begin(), s.end(), &is_punct ), s.end() );
cout << s << '\n';
}

I don't understand why you need the is_punct() wrapper. Can't ispunct()
work directly, as in:
s.erase( remove_if( s.begin(), s.end(), &ispunct ), s.end() );
 
Julián Albo

Dizzy said:
I don't understand why do you need the is_punct() wrapper, cannot
ispunct() work directly as in:
s.erase( remove_if( s.begin(), s.end(), &ispunct ), s.end() );

There is a problem with using the functions of the ispunct family directly:
their signature and return type can make it hard to choose the appropriate
template parameters. There can also be other problems if the implementation
uses signed char for the plain char type. The least problematic solution is
a wrapper that calls ispunct(static_cast<unsigned char>(c)).
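
A minimal sketch of such a wrapper, assuming the one-argument ispunct from <cctype> is the one wanted. The name strip_punct is invented for this example; is_punct follows the name used earlier in the thread.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Wrapper with an unambiguous signature: char in, bool out.
// The cast keeps the argument in the range of unsigned char, as the
// <cctype> functions require.
bool is_punct(char c) {
    return std::ispunct(static_cast<unsigned char>(c)) != 0;
}

// Hypothetical helper: erase-remove idiom with the wrapper as predicate.
std::string strip_punct(std::string s) {
    s.erase(std::remove_if(s.begin(), s.end(), is_punct), s.end());
    return s;
}
```

Called as strip_punct("hello, world!"), this should give back the string without the comma and the exclamation mark.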
 
Kaz Kylheku

In C++ Primer 4th, sec 3.3.2, it states that C++ programmers use !=
rather than < in a for loop.

Not smart programmers, who understand what it means for a loop guard
test to check for the correct precondition.

Fact is, if there is some loop variable i, and a limiting value N, and
if it is incorrect to execute the loop body if i is equal to N, or if i
is greater than N, then the proper guard for executing that loop body
is (i < N).

The test (i != N) only works when other logic has already ensured that
(i <= N) is true. That other logic is usually the fact that N is
nonnegative, that i starts at value 0 (and so i <= N is initially
true), that i increments by one in each loop iteration toward the
limiting value, and that N does not change.

The correct test (i < N) doesn't rely on any such assumptions. It
doesn't matter what happens to the value of i in each iteration.
The following small snippet erases punctuations in a string. It works
well with < used in the for loop but it breaks when != is used instead.

That's because the body of the loop changes N. If N is a moving target,
then the invariant (i <= N) does not hold from one iteration to the
next. On one iteration, it could be that i == N - 1. But then, i
increments, and N decrements, so that i is suddenly N + 1.

#include <string>
#include <iostream>
#include <exception>
#include <cctype>
using namespace std;

int main(){
    string s = "hello, world!";//s: i18n
    //for (string::size_type sz = 0; sz < s.size(); ++sz){
    /*works with < */
    for (string::size_type sz = 0; sz != s.size(); ++sz){
    /*doesn't work with != */
        try{
            if (ispunct(s.at(sz))){
                s.erase(sz, 1);
            }

You have another bug here anyway, because if you erase an element of
the string, then you must not increment the sz index. By doing that,
you skip a character, which could be a punctuation character.

Imagine that the action of erase is like that of the Del key on your
keyboard in a typical text editor. When you use Del, the character
under the cursor is eaten, and everything after it moves one position
to the left. If you wanted to delete several punctuation characters in
a row, you'd type Del several times without having to move the cursor.

Similarly, the sz "cursor" must not move while characters are being
deleted; after each erasure, the next character to be processed shifts
into the current position, and so that position of the string must be
reevaluated.

If that logic is corrected, then the != test will work, since in any
given iteration, only the s.size() value or the sz value will change,
not both at the same time. Either the size decrements by one, bringing
it closer to sz, or sz increments by one closer to s.size().
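
One way to write the corrected logic, as a sketch only. The function name erase_punct is made up here, and the cast to unsigned char is an addition that the original snippet does not have.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper: each iteration either erases (size moves one step
// toward i) or advances (i moves one step toward size), never both, so
// the != guard cannot be skipped over.
std::string erase_punct(std::string s) {
    std::string::size_type i = 0;
    while (i != s.size()) {
        if (std::ispunct(static_cast<unsigned char>(s[i]))) {
            s.erase(i, 1);  // i stays put; the next character shifts into slot i
        } else {
            ++i;            // nothing erased, so move the cursor
        }
    }
    assert(i == s.size());  // the negated guard holds on exit
    return s;
}
```

This version also handles several punctuation characters in a row, which the original loop skipped.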
 
Ron House

Thank you, and thank you all.

Yes, the problem in my last code snippet is not because of those two
operators.

The error you had is one reason why I use != rather than <: it goes
dramatically wrong if there is an error in the loop control conditions.
That is a *good* thing because when the program fails dramatically, I
fix the problem. If the program "works" but has a misunderstanding, it
could bite me in a more subtle but more damaging way when the program is
in production use.
 
Dizzy

Julián Albo said:
There is a problem with using the functions of the ispunct family directly:
their signature and return type can make it hard to choose the appropriate
template parameters. There can also be other problems if the implementation
uses signed char for the plain char type. The least problematic solution is
a wrapper that calls ispunct(static_cast<unsigned char>(c)).

I have used ispunct directly and it worked (in my case ispunct has the
signature int ispunct(int)). I don't understand what could be wrong with
"choosing the appropriate template parameters", as long as the standard says
that remove_if is of the form:
template<class ForwardIterator, class Predicate>
ForwardIterator remove_if(ForwardIterator first, ForwardIterator last,
                          Predicate pred);

Thus the Predicate is its own type, and it can be anything that can be
called with a syntax like "pred(*iterator)" and returns a value implicitly
convertible to bool (and has no side effects), where iterator is of
ForwardIterator type. I think functions with the kind of signature ispunct
has should work just fine in this case.

Also, what can be wrong with the signed/unsigned char thing? If the
implementation uses a signed char, then this is true for everything
(std::string, the values that ispunct() checks for, etc.).

I'm asking all this because I want to learn more and find out where I am
wrong. Of course, if you are talking about non-standard-compliant
implementations, then I'm interested :)

Thanks!
 
Rolf Magnus

Dizzy said:
I have used directly with ispunct and it worked (in my case ispunct is a
int ispunct(int) signature). I don't understand what could be wrong with
"chose the appropiate template parameters" as long as the standard says
that remove_if is of the form:
template<class ForwardIterator, class Predicate>
ForwardIterator remove_if(ForwardIterator first, ForwardIterator last,
Predicate pred);

Thus the Predicate is it's own type and it can be anything that can be
called with a syntax like "pred(*iterator)" and returns an implicitely
convertible to bool value (and has no side effects) where iterator is of
ForwardIterator type. I think in this case functions with the kind of
ispunct signature should work just fine.

It might, or it might not. Basically, the behavior is undefined.

Also what can it be wrong with signed/unsigned char thingy, because if the
implementation uses a signed char then this is true for everything
(std::string, the values that ispunct() check for, etc).

The problem is that ispunct needs the value to be in the range of unsigned
char. If char is signed, it could be negative, and that's not allowed. As
an example, ispunct might internally use a table to determine whether a
character is a punctuation character or not. A negative index into that
table could be disastrous.
So the implicit conversion from char to int is not enough. You need to add a
conversion to unsigned char before it.
 
Julián Albo

Dizzy said:
I have used directly with ispunct and it worked (in my case ispunct is a
int ispunct(int) signature). I don't understand what could be wrong with
"chose the appropiate template parameters" as long as the standard says
that remove_if is of the form:
template<class ForwardIterator, class Predicate>
ForwardIterator remove_if(ForwardIterator first, ForwardIterator last,
Predicate pred);

But using a function or functor with the right signature is cleaner: it will
work correctly in all cases, so you don't need to recheck anything if you
change the algorithm or add an adapter in the call.

Also what can it be wrong with signed/unsigned char thingy, because if the
implementation uses a signed char then this is true for everything
(std::string, the values that ispunct() check for, etc).

ispunct and family assume that their int argument contains the conversion to
int of an unsigned char, or EOF. If your implementation has signed plain
chars, passing one directly is bad according to the standard because it does
not follow the rules, and it can also easily be seen to be bad in practice:
for example, in several implementations '\xFF' will have the same value as
EOF when promoted to int. Naturally, if you always use a non-extended ASCII
character set you will never see this problem.
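
To see the difference on a given implementation, a sketch along these lines may help. All three function names are invented for illustration; whether plain char is signed, and what value EOF has, are left to the implementation, so the code only reports them.

```cpp
#include <cstdio>   // EOF
#include <limits>

// Reports whether plain char is a signed type on this implementation.
bool plain_char_is_signed() {
    return std::numeric_limits<char>::is_signed;
}

// The int a <cctype> function would receive if the char were passed as is.
// This can be negative when plain char is signed.
int promoted_directly(char c) {
    return c;
}

// The int it receives after converting to unsigned char first.
// This is never negative, so it cannot be confused with EOF.
int promoted_via_unsigned(char c) {
    return static_cast<unsigned char>(c);
}
```

On an implementation where plain char is signed and EOF is -1, I would expect promoted_directly('\xFF') to compare equal to EOF, which is the collision described above, but I have not checked every platform.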
 
Andrew Koenig

Kaz Kylheku said:
Not smart programmers, who understand what it means for a loop guard
test to check for the correct precondition.
Fact is, if there is some loop variable i, and a limiting value N, and
if it is incorrect to execute the loop body if i is equal to N, or if i
is greater than N, then the proper guard for executing that loop body
is (i < N).

Hey Kaz, I expected better from you -- you're usually not this careless :)

There are two issues here, and you've caught only one of them:

1) Under what conditions is the body of the loop executed?

2) What conditions pertain after the loop completes?

If we write

while (i < N) { /* do something */ }

we are certain that whenever we execute /* do something */, i is less than
N. So far, we agree.

However, if we write

while (i != N) { /* do something */ }

we are certain that after the loop completes, i is equal to N. We do not
have this certainty in the first case, because it is conceivable that i
might be greater than N.

In other words, if you use < for index comparisons, you make it easier to
verify that the loop doesn't do anything undefined, but harder to verify
that the loop produces the correct result.

Personally, I would rather write my programs in a way that makes it easier
to prove correctness.
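
As a sketch of the kind of reasoning meant here (the function sum_first and its invariant comment are an illustration, not code from the thread): with a != guard, the exit condition is exactly i == n, and combining that with the loop invariant gives the result directly.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: sums the first n elements of v.
int sum_first(const std::vector<int>& v, std::size_t n) {
    assert(n <= v.size());  // precondition that makes the != guard safe
    int total = 0;
    std::size_t i = 0;
    // Invariant: total == v[0] + ... + v[i-1]
    while (i != n) {
        total += v[i];
        ++i;
    }
    assert(i == n);         // the negated guard, with nothing more to prove
    return total;           // invariant plus i == n: sum of the first n elements
}
```

With a < guard the exit condition would only be i >= n, so an extra argument is needed to show that i did not end up past n.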
 
Daniel T.

Dizzy said:
I have used directly with ispunct and it worked (in my case ispunct is a int
ispunct(int) signature). I don't understand what could be wrong with "chose
the appropiate template parameters" as long as the standard says that
remove_if is of the form:
template<class ForwardIterator, class Predicate>
ForwardIterator remove_if(ForwardIterator first, ForwardIterator last,
Predicate pred);

Thus the Predicate is it's own type and it can be anything that can be
called with a syntax like "pred(*iterator)" and returns an implicitely
convertible to bool value (and has no side effects) where iterator is of
ForwardIterator type. I think in this case functions with the kind of
ispunct signature should work just fine.

There are two 'ispunct' functions in C++ with different signatures. One
of them is sometimes (allowably) implemented as a macro. There is simply
no way for the compiler to tell which ispunct you wanted to use without
more context, and there is no way for the template code to properly use
a macro.

If you have used it in the past and it worked, great for you. Undefined
behavior sometimes does that.
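
One way to sidestep the ambiguity is to wrap the call so that the overload is chosen by the arguments. The sketch below assumes C++11 lambdas and picks the two-argument version from <locale>; the function name strip_punct_locale is made up for this example.

```cpp
#include <algorithm>
#include <locale>
#include <string>

// Hypothetical helper: removes punctuation as classified by the given locale.
// Calling std::ispunct with (char, locale) selects the <locale> template,
// so no function address has to be taken and no macro is involved.
std::string strip_punct_locale(std::string s,
                               const std::locale& loc = std::locale::classic()) {
    s.erase(std::remove_if(s.begin(), s.end(),
                           [&loc](char c) { return std::ispunct(c, loc); }),
            s.end());
    return s;
}
```

The default argument uses the classic "C" locale; passing another std::locale changes which characters count as punctuation.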
Also what can it be wrong with signed/unsigned char thingy, because if the
implementation uses a signed char then this is true for everything
(std::string, the values that ispunct() check for, etc).

There is nothing wrong. Julián's concern is that if the char is signed,
it will expand out to an int with a bunch of 1s, but I don't think that
is an issue because 'ispunct' and family only look at 7 bits of the
char. I could be wrong on this one...
 
Julián Albo

Daniel said:
There is nothing wrong. Julián's concern is that if the char is signed,
it will expand out to an int with a bunch of 1s, but I don't think that
is an issue because 'ispunct' and family only looks at 7 bits of the
char. I could be wrong on this one...

It looks for EOF, and the value of EOF can be the same as the conversion to
int of the char '\xFF', depending on the implementation and possibly on the
compiler options used. Also, the way it evaluates the valid non-EOF values
depends on the current locale.
 
Tr0n

Andrew said:
Hey Kaz, I expected better from you -- you're usually not this careless :)

There are two issues here, and you've caught only one of them:

1) Under what conditions is the body of the loop executed?

2) What conditions pertain after the loop completes?

If we write

while (i < N) { /* do something */ }

we are certain that whenever we execute /* do something */, i is less than
N. So far, we agree.

However, if we write

while (i != N) { /* do something */ }

we are certain that after the loop completes, i is equal to N. We do not
have this certainty in the first case, because it is conceivable that i
might be greater than N.

You can always make this certainty with a simple one-liner:

if ( i != N ) return(255);
(or a throw is just as valid)

... That is if you NEED i as a certain value - but why would you if you
have N? - Surely you'd be using N because it could change?

I suppose it really depends on what it is used for, but I still think <
is less error-prone than != (any value can be !=, while < limits you
to a range).

In other words, if you use < for index comparisons, you make it easier to
verify that the loop doesn't do anything undefined, but harder to verify
that the loop produces the correct result.

What "correct" result are you after?
The last point in the loop?
... Surely the data affected by the code inside the loop proves whether it
is "correct" or not, and not the result of an incrementing index?

If you wanted the last point using "i != N" then you're after N... Why
not just USE N?

Personally, I would rather write my programs in a way that makes it easier
to prove correctness.

... I really don't like that.
Reminds me of all these buffer over-run errors where something
unexpected means CONSTANT bug-fixing.

It also reminds me of a current position I have at work, using an
un-named application (I don't think it's my place to name-names) which
produced an endless loop because of some incorrect data.. It took them 3
days to track this down on a production environment (the data was
incorrect for 2 weeks over the xmas and only produced errors after
people started again: adding to the difficulty).

I actually support the previous poster's paragraph, which states
"limiting value N" and "if it is incorrect to execute the loop body
if... i is greater than N".

I just thought I'd post my thoughts on this, because I'd prefer my code
to do what I program - rather than produce a program which may do 'other
things'.

 
Daniel T.

Julián Albo said:
It looks for EOF, and the value of EOF can be the same as the conversion to
int of a char '\xFF', depending on the implementation and possibly of
compiler options used.

There is no char '\xFF' in ASCII (which is what ispunct(int) and family
are designed to handle.) The alternative ispunct (which also takes a
locale parameter) is templated to the char type, thus ispunct<char> and
ispunct<unsigned char> might be different functions, but I would expect
each to work for its type, thus not requiring a cast. It seems to me,
that in any case \xFF and EOF would both return false from ispunct, so
the issue is moot.

Is there an authoritative source about this issue?
 
Julián Albo

Daniel said:
There is no char '\xFF' in ASCII (which is what ispunct(int) and family
are designed to handle.)

Are you saying that C++ is intended only for machines with ASCII-compatible
charsets?
 
