The behavior of istream.


somenath

I am trying to understand the concept of iterators from the book "Programming -- Principles and Practice Using C++" by Bjarne Stroustrup, in particular the program in Section 20.6.2.
An exact copy of the book's code crashed, and it was also unable to read newlines from the console.
So I experimented a little and changed the code as follows:

#include <iostream>
#include <list>
#include <vector>
using namespace std;
class Text_iterator;

typedef vector<char> Line;

class Text_iterator {
    list<Line>::iterator ln;
    Line::iterator pos;
public:
    Text_iterator( list<Line>::iterator ll, Line::iterator pp ):ln(ll),pos(pp) {
    }
    char& operator *() {
        return *pos;
    }
    Text_iterator & operator++() {
        if ( pos == ln->end() ){
            ++ln;
            pos = ln->begin();
        }
        else {
            ++pos;
        }
        return *this;
    }
    bool operator ==( const Text_iterator &rhs ) const{
        return ln == rhs.ln && pos ==rhs.pos;
    }

    bool operator != (const Text_iterator &rhs ) const{
        return !(*this == rhs);
    }

};

struct Document {
    list<Line> line;
    Document() {
        line.push_back(Line());
    }
    Text_iterator begin() {
        return Text_iterator(line.begin(),(line.begin())->begin() );
    }
    Text_iterator end() { //This is the main modification to
        list<Line>::iterator last = line.end();
        --last;
        return Text_iterator(last, last->end());
        //return Text_iterator(line.end(),(line.end())->end() );
    }

};

void print(Document &d) {
    for (Text_iterator p = d.begin(); p != d.end(); ++p) {
        cout << *p;
    }
}

//Insert into Document
istream & operator>>(istream &is, Document &d) {
    char ch;
    while (is.get(ch)) { //This is also modified to read '\n' and space
        d.line.back().push_back(ch);
        if (ch == '\n') {
            d.line.push_back(Line());
        }
    }
    return is;
}


int main() {
    Document dd;
    cin>>dd;
    print(dd);
    return 0;
}


But when I run the code with the following input, I get an extra character in the output. That is what perplexes me.

Input
+++++++++
hello
how are you
I am fine
I
++++++++++
Output
hello
how are you
I am fine
I
a
++++++
So I am getting an extra character "a" in the output.
Could you please help me understand the reason for this?
 
Stuart

[snip]

Text_iterator & operator++() {
if ( pos == ln->end() ){
++ln;
pos = ln->begin();
}
else {
++pos;
}

The problem seems to be here. operator++ should not return the ln->end()
iterator, but it does: when pos refers to the last character of a line,
++pos makes it equal to ln->end(), and print() then dereferences it.
Dereferencing a one-past-the-end iterator is undefined behaviour, so you
may read arbitrary garbage or even get an Access Violation (most of the
time, I got an additional question mark symbol at the beginning of each
printed line, but once I also got an Access Violation).
return *this;
}
bool operator ==( const Text_iterator &rhs ) const{
return ln == rhs.ln && pos ==rhs.pos;
}

bool operator != (const Text_iterator &rhs ) const{
return !(*this == rhs);

}

};

struct Document {
list<Line> line;
Document() {
line.push_back(Line());
}
Text_iterator begin() {
return Text_iterator(line.begin(),(line.begin())->begin() );

}
Text_iterator end() { //This is the main modification to

list<Line>::iterator last = line.end();
--last;
return Text_iterator(last, last->end());
//return Text_iterator(line.end(),(line.end())->end() );

}

};

This alteration is fine. However, it changes the behaviour (does not
print the last line) and does not fix the problem of the superfluous
character (see my comment above).

[snip]

What you were experiencing is a fine example of Undefined Behaviour: I
got a different extra character, and sometimes the code even crashed
with an Access Violation. See my comments above.

Regards,
Stuart

PS: It is a pity that few implementations of the STL add extra checks in
debug mode. It would be no problem to use a special iterator for the
one-past-the-end element which would raise an exception if it is
dereferenced. This would mean that iterators need to have a back-link
to the container in order to figure out whether it++ should return that
special iterator, but I think for hunting down bugs (such as yours) this
would be very neat.
 
somenath

The problem seems to be here. operator++ should not return the ln->end()
iterator, but it does. The one-past-the-end element will contain
arbitrary garbage, so you might even get an Access Violation when you
try to print the returned iterator (most of the time, I got an
additional question mark symbol at the beginning of each printed line,
but once I also got an Access Violation).

I am not really able to figure out how to avoid pointing to ln->end() and at the same time move on to the next line. At this point, how do I find out that incrementing "ln" would take it past the last line?

So the idea is that if "pos" points to the end of the current line, then "pos" should be moved to the beginning of the next line. But how can I check whether "ln" is already at the last line, so that we do not do ++ln?
 
Stuart

[snip]

I'd suggest something like this:

Text_iterator & operator++() {

    // Each line but the last ends with a \n character, so
    // if the current position contains a newline character, we
    // know that there will be another line.
    if (*pos != '\n') {
        ++pos;
        // pos might now point to ln->end(), but only if this is the
        // last line and in that case this iterator will be the same
        // as Document::end()
    }
    else {
        ++ln;

        // This is now safe, because we have just seen a newline.
        pos = ln->begin();
    }
    return *this;
}

Regards,
Stuart
 
