Thinking in C++ StreamTokenizer example bug

  • Thread starter jose luis fernandez diaz

jose luis fernandez diaz

Hi,

In "STL Containers & Iterators" chapter there is the example below:

#ifndef TOKENITERATOR_H
#define TOKENITERATOR_H
#include <string>
#include <iterator>
#include <algorithm>
#include <cctype>

struct Isalpha {
bool operator()(char c) {
using namespace std; //[[For a compiler bug]]
return isalpha(c);
}
};

class Delimiters {
std::string exclude;
public:
Delimiters() {}
Delimiters(const std::string& excl)
: exclude(excl) {}
bool operator()(char c) {
return exclude.find(c) == std::string::npos;
}
};

template <class InputIter, class Pred = Isalpha>
class TokenIterator: public std::iterator<
std::input_iterator_tag,std::string,ptrdiff_t>{
InputIter first;
InputIter last;
std::string word;
Pred predicate;
public:
TokenIterator(InputIter begin, InputIter end,
Pred pred = Pred())
: first(begin), last(end), predicate(pred) {
++*this;
}
TokenIterator() {} // End sentinel
// Prefix increment:
TokenIterator& operator++() {
word.resize(0);
first = std::find_if(first, last, predicate);
while (first != last && predicate(*first))
word += *first++;
return *this;
}
// Postfix increment
class Proxy {
std::string word;
public:
Proxy(const std::string& w) : word(w) {}
std::string operator*() { return word; }
};
Proxy operator++(int) {
Proxy d(word);
++*this;
return d;
}
// Produce the actual value:
std::string operator*() const { return word; }
std::string* operator->() const {
return &(operator*());
}
// Compare iterators:
bool operator==(const TokenIterator&) {
return word.size() == 0 && first == last;
}
bool operator!=(const TokenIterator& rv) {
return !(*this == rv);
}
};
#endif // TOKENITERATOR_H ///:~


When I run the program:

#include <fstream>
#include <iostream>
#include <iterator>
#include <string>
#include "TokenIterator.h" // the header above (file name assumed)

using namespace std;

int main() {
ifstream in("/tmp/kk");
ostream_iterator<string> out(cout, "\n");
typedef istreambuf_iterator<char> IsbIt;
IsbIt begin(in), isbEnd;
Delimiters
delimiters(" \t\n~;()\"<>:{}[]+-=&*#.,/\\");
TokenIterator<IsbIt, Delimiters>
wordIter(begin, isbEnd, delimiters), it,
end;
it = wordIter++; // <<--
}


I get the following error:

Error: tok.cxx, line 24: no operator "=" matches these operands
operand types are: TokenIterator<std::istreambuf_iterator<char,
std::char_traits<char>>, Delimiters> =
TokenIterator<std::istreambuf_iterator<char,
std::char_traits<char>>, Delimiters>::proxy
it = wordIter++;




The problem is that the method:


Proxy operator++(int) {
Proxy d(word);
++*this;
return d;
}

doesn't return a TokenIterator object.

Any ideas on how to fix this error?

Thanks,
Jose Luis.
 

Chris Theis

jose luis fernandez diaz said:
Hi,

In "STL Containers & Iterators" chapter there is the example below:
[SNIP]

Proxy operator++(int) {
Proxy d(word);
++*this;
return d;
}

doesn't return a TokenIterator object.

Any ideas on how to fix this error?

I only looked through the code quickly, but as far as I can see, the result of
the postfix operator is not meant to be assigned to a TokenIterator. A
simple and quick solution would be to do what you want in two steps:

wordIter++;
it = wordIter;

Other solutions would involve a lot more machinery to get a TokenIterator
out of its postfix proxy implementation.
IMHO this might simply not be worth the effort.
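
To make the point concrete, here is a minimal sketch of a proxy-returning postfix operator and the two-step workaround. `WordIter` and `proxy_demo` are hypothetical names invented for this illustration; the class is a much simplified stand-in for TokenIterator that splits on whitespace only, so treat it as a sketch under those assumptions rather than the book's code.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for TokenIterator: yields whitespace-separated words.
class WordIter {
    std::istream* in_;
    std::string word_;
public:
    explicit WordIter(std::istream& in) : in_(&in) { ++*this; }

    // Prefix increment reads the next word eagerly.
    WordIter& operator++() {
        if (!(*in_ >> word_)) word_.clear();  // empty word once input runs out
        return *this;
    }

    // Postfix increment hands back the old word in a proxy, as in the book.
    class Proxy {
        std::string word_;
    public:
        explicit Proxy(const std::string& w) : word_(w) {}
        std::string operator*() const { return word_; }
    };
    Proxy operator++(int) {
        Proxy old(word_);
        ++*this;
        return old;
    }

    std::string operator*() const { return word_; }
};

// Collects the word read through the proxy and the word seen by the copy.
std::vector<std::string> proxy_demo() {
    std::istringstream src("alpha beta gamma");
    WordIter wordIter(src);          // positioned on "alpha"
    std::vector<std::string> seen;

    seen.push_back(*wordIter++);     // fine: the Proxy can be dereferenced
    // WordIter bad = wordIter++;    // would not compile: a Proxy is not a WordIter

    wordIter++;                      // step 1: advance, discarding the Proxy
    WordIter it = wordIter;          // step 2: copy the advanced iterator
    assert(*it == *wordIter);        // both now refer to the same word
    seen.push_back(*it);
    return seen;
}
```

Note that with this order `it` refers to the position after the increment. If the position before the increment is what you need, copy first and advance afterwards.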

Chris
 

Jeff

Hi Jose, I made some changes to your class, posted below. Some of
these changes were minor, such as renaming class variables to my own
style, to help me sort out what was going on in this example. Some of
the changes were more substantial, and are mentioned in the code.

Hope this helps. It's a really interesting piece of code. Do you
recommend the book?

Jeff


#ifndef TOKENITERATOR_H
#define TOKENITERATOR_H
#include <string>
#include <iterator>
#include <algorithm>
#include <cctype>

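struct Isalpha {
bool operator()(char c) {
// Qualify the call and cast to unsigned char: passing a negative
// char value to std::isalpha is undefined behavior.
return std::isalpha(static_cast<unsigned char>(c)) != 0;
}
};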


class Delimiters {
private:
std::string exclude;
public:
Delimiters() {}
Delimiters(const std::string& excl)
: exclude(excl) {}
bool operator()(char c) {
// returns true if char should be kept
// (i.e. the char isn't in the exclude string)
return (exclude.find(c) == std::string::npos);
}
};


template <class InputIter, class Pred = Isalpha>
class TokenIterator: public std::iterator<std::input_iterator_tag,
std::string, ptrdiff_t>
{
InputIter current_; // prefer "current_" instead of "first_" because it changes
InputIter last_; // constant, name is OK
Pred predicate_;

// The token is only built when the iterator is dereferenced, not before.
// Tried to make the token and flag mutable to make dereferencing a const
// operator; g++ didn't like it, though it looks fine to me.
/*mutable*/ std::string token_;
/*mutable*/ bool token_built_;

// pointer movement and token building moved to helper functions
void advance_pointer_to_next_token_start() {
current_ = std::find_if(current_, last_, predicate_);
token_.resize(0); // drop the old token so operator== sees an empty one at the end
token_built_ = false; // haven't actually made the token yet
}

void build_token() {
// token only built when iterator deref'd
if (!token_built_) {
token_.resize(0);

// Add the following chars to the token. Don't want to mess with the
// current_ pointer: it should stay where it is until it's explicitly
// advanced to the next token start.
// (Note: with a single-pass iterator such as istreambuf_iterator,
// advancing this copy still consumes characters from the shared stream.)
InputIter it = current_;
while (it != last_ && predicate_(*it))
token_ += *it++;

token_built_ = true;
}
}


public:
TokenIterator(InputIter begin, InputIter end, Pred pred = Pred())
: current_(begin), last_(end), predicate_(pred) {
advance_pointer_to_next_token_start(); }

TokenIterator() : token_built_(false) {} // End sentinel


// Produce the actual value:
// (wanted to make token_ and token_built_ mutable, allowing these ops
// to be const, but g++ didn't like it ... I could be doing something wrong)
std::string operator*() /*const*/ { build_token(); return token_; }
// changed from &(operator*());
std::string* operator->() /*const*/ { build_token(); return &token_; }


// Prefix increment:
TokenIterator& operator++() {
// Prefix increment should only move the pointer to the next place; it
// shouldn't actually build the token, as the original code did. Two
// reasons for this:
// 1. Prefix increment only moves the pointer; building the token in an
// internal buffer would be a "side effect".
// 2. The user may (for some odd reason) call operator++ a number of
// times without dereferencing the pointer, so building the token
// would be a waste of time.
// Step past whatever is left of the current token first; otherwise
// find_if would stop again at the start of the same token.
while (current_ != last_ && predicate_(*current_))
++current_;
advance_pointer_to_next_token_start();
return *this;
}


// Postfix increment

// I'm not sure what the use of Proxy is ...
/*
class Proxy {
std::string prox_word_;
public:
Proxy(const std::string& w) : prox_word_(w) {}
std::string operator*() { return prox_word_; }
};

Proxy operator++(int) {
Proxy d(word_);
++*this;
return d;
}
*/

// postfix operator should still return a TokenIterator
TokenIterator operator++(int) {
build_token(); // cache the token first so the returned copy keeps the old value
TokenIterator ret(*this); // copy
++*this;
return ret;
}

// Compare iterators:
// Minor question: why should the equality operator return true if the
// word is empty and the current pointer is at the end?
// Shouldn't the function be:
// bool operator==(const TokenIterator& rhs) {
//
// some test here which checks whether rhs is a sentinel iterator and,
// if so, checks whether the word is empty and the current_ pointer is
// at last_ (i.e. the same code as yours) ...
//
// otherwise, if rhs is not a sentinel iterator, check the following:
// return (this->word_ == rhs.word_) && (this->...) etc ..
//
// Now, I'm talking a bit out of my depth here. I'm not sure if you ever
// need to compare iterators, except against a sentinel value. Perhaps
// you can enlighten me on this.
//
bool operator==(const TokenIterator&) {
return token_.size() == 0 && current_ == last_;
}

bool operator!=(const TokenIterator& rv) {
return !(*this == rv);
}

};
#endif // TOKENITERATOR_H ///:~


// USAGE ========================================================

#include <fstream>
#include <iostream>
#include <iterator>
#include <string>
#include "TokenIterator.h" // the header above (file name assumed)

using namespace std;

int main() {
typedef istreambuf_iterator<char> IsbIt;
ifstream in("temp.txt");
IsbIt fbegin_it(in), fend_it; // start, end of file

Delimiters d(" \t\n~;()\"<>:{}[]+-=&*#.,/\\");
TokenIterator<IsbIt, Delimiters> wordIter(fbegin_it, fend_it, d),
it, end;


// USAGE
typedef TokenIterator<IsbIt, Delimiters> tok_it_t;
for (tok_it_t iter(fbegin_it, fend_it, d), iter_end; iter!=iter_end;
iter++) { cout << *iter << endl; }
// (can use pre- or post- increment in the loop construct above)

// for(it=wordIter; it!=end; it=wordIter++) { cout << *it << endl; }


return 0;
}
 
