Locating partially matched strings in a file

Dilip

I have a situation where I have a bunch of strings stuffed into a
vector.

I need to run through each of them and locate approximate (or complete)
matches to them in another file (which also contains a string on each
line). If I am searching for CAD I need to locate even something that
looks like ACAD or CADINC (along with the exact match for CAD of
course).

I ran into some examples using istream_iterator and have an idea how to
do this if I am just looking for an exact match.

Can someone point me in the right direction on what I need to do if I
also have to make substring searches?

thanks!
 
Dilip

Dilip said:
I have a situation where I have a bunch of strings stuffed into a
vector.

I need to run through each of them and locate approximate (or complete)
matches to them in another file (which also contains a string on each
line). If I am searching for CAD I need to locate even something that
looks like ACAD or CADINC (along with the exact match for CAD of
course).

I ran into some examples using istream_iterator and have an idea how to
do this if I am just looking for an exact match.

Actually here is what I came up with. Does it work for the situation I
mentioned above (partial matches)?

ifstream ifs("filetosearch.txt");
string strtosearch("CAD");

istream_iterator<string> pos(std::find(istream_iterator<string>(ifs),
istream_iterator<string>(), strtosearch));

// at this point, how do I check if I located what I need?
// and how do I extract what I located in the file?
 
jois.de.vivre

Dilip said:
Actually here is what I came up with. Does it work for the situation I
mentioned above (partial matches)?

ifstream ifs("filetosearch.txt");
string strtosearch("CAD");

istream_iterator<string> pos(std::find(istream_iterator<string>(ifs),
istream_iterator<string>(), strtosearch));

// at this point, how do I check if I located what I need?
// and how do I extract what I located in the file?

Just read in each string from your file into an std::string object.
Then use one of std::string's "find" functions to locate your substr
(google std::string). This isn't necessarily the most efficient way to
do it, but it's one way.
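
For what it's worth, here is a minimal sketch of the find-based check. The helper name contains is made up for illustration; the only library call it relies on is std::string::find, which returns std::string::npos when the substring is absent.

```cpp
#include <string>

// Hypothetical helper (not from the original post): true when `needle`
// occurs anywhere inside `text`. The comparison is case-sensitive.
bool contains(const std::string& text, const std::string& needle) {
    // find() returns the index of the first match, or npos if there is none.
    return text.find(needle) != std::string::npos;
}
```

With this, contains("ACAD", "CAD") and contains("CADINC", "CAD") are both true, as is the exact match contains("CAD", "CAD"), while contains("CA D", "CAD") is false.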
 
Davlet Panech

Dilip said:
Actually here is what I came up with. Does it work for the situation I
mentioned above (partial matches)?

ifstream ifs("filetosearch.txt");
string strtosearch("CAD");

istream_iterator<string> pos(std::find(istream_iterator<string>(ifs),
istream_iterator<string>(), strtosearch));

This assumes that the lines in your file contain no whitespace (IIRC
istream_iterator<string> splits on any whitespace char). IMHO trying to do this
with std algorithms/iterators/binders is likely to give you a brain
hemorrhage... the easiest thing is to read your file one line at a time
(in a loop) and search for each of your substrings in the current line.

D.
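
A rough sketch of that loop, in case it helps. The names matching_lines and demo_matching_lines are invented here, and the in-memory stream stands in for the real file; a real program would pass an std::ifstream opened on something like "filetosearch.txt" instead.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: returns every line of `in` that contains at least
// one of `needles` as a substring. Whitespace inside a line is preserved.
std::vector<std::string> matching_lines(std::istream& in,
                                        const std::vector<std::string>& needles) {
    std::vector<std::string> hits;
    std::string line;
    while (std::getline(in, line)) {          // one line per iteration
        for (const std::string& needle : needles) {
            if (line.find(needle) != std::string::npos) {
                hits.push_back(line);
                break;                        // record each line only once
            }
        }
    }
    return hits;
}

// In-memory demonstration; the sample data is made up.
std::vector<std::string> demo_matching_lines() {
    std::istringstream file("ACAD\nBOX\nCADINC\nCAD\n");
    return matching_lines(file, {"CAD", "XYZ"});
}
```

Here demo_matching_lines() yields the lines ACAD, CADINC and CAD. The cost is one find per needle per line, which is usually fine for modest inputs.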
 
Dilip

Davlet said:
This assumes that the lines in your file contain no whitespace (IIRC
istream_iterator<string> splits on any whitespace char). IMHO trying to do this
with std algorithms/iterators/binders is likely to give you a brain
hemorrhage... the easiest thing is to read your file one line at a time
(in a loop) and search for each of your substrings in the current line.

Oops, there could be whitespace in a line, so istream_iterator is
out.
I already do exactly what you suggested above, but I see all the cool
kids out there have littered their code with iterators and binders and
whatnot. I thought I'd use this opportunity to learn a more C++/STL-ish
way of doing things...
Oh well.
 
Pete Becker

Dilip said:
Oops, there could be whitespace in a line, so istream_iterator is
out.

No, it's not. istream_iterator doesn't do anything special with
whitespace. It just copies one character after another.

--

-- Pete

Author of "The Standard C++ Library Extensions: a Tutorial and
Reference." For more information about this book, see
www.petebecker.com/tr1book.
 
Davlet Panech

Pete said:
No, it's not. istream_iterator doesn't do anything special with
whitespace. It just copies one character after another.

Actually istream_iterator uses operator >> to read things, which in case
of std::string reads one "word", e.g. "abc efg" would stop after "c".
 
Pete Becker

Davlet said:
Actually istream_iterator uses operator >> to read things, which in case
of std::string reads one "word", e.g. "abc efg" would stop after "c".

You're right. Sorry about the confusion. I was thinking of
istreambuf_iterator, which would be the right choice here.
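
For comparison, a sketch of the istreambuf_iterator route. It reads raw characters, whitespace included, so the whole input ends up in one string that can then be searched with find. The names slurp and demo_slurp are made up, and the sample text is invented.

```cpp
#include <istream>
#include <iterator>
#include <sstream>
#include <string>

// Hypothetical helper: reads the rest of the stream into one string,
// copying every character unchanged (spaces and newlines included).
std::string slurp(std::istream& in) {
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}

// In-memory demonstration; a real program would pass an std::ifstream.
std::string demo_slurp() {
    std::istringstream in("A CAD INC\nBOX\n");
    return slurp(in);
}
```

Note that this gives one big buffer rather than a sequence of lines, so recovering the line that contains a match means locating the surrounding '\n' characters yourself.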

 
Jerry Coffin

[ ... ]
Actually istream_iterator uses operator >> to read things, which in case
of std::string reads one "word", e.g. "abc efg" would stop after "c".

I'd use "token" instead of "word", but more or less correct. Keep in
mind, however, that tokens/words are broken (only) at what is defined as
whitespace by the ctype facet of the locale associated with the stream.

You can define a new ctype facet that only defines new-line as white
space, and go from there.
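
A sketch of what such a facet might look like for char streams. The names newline_only_ctype and lines_via_facet are invented for this example, and the table deliberately classifies nothing except '\n', so it is only meant for this kind of line splitting.

```cpp
#include <cstddef>
#include <iterator>
#include <locale>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical facet: '\n' is the only character classified as whitespace.
struct newline_only_ctype : std::ctype<char> {
    static const mask* make_table() {
        // All-zero table: no character has any classification at all...
        static std::vector<mask> table(table_size, mask());
        // ...except newline, which is marked as space.
        table['\n'] = space;
        return table.data();
    }
    explicit newline_only_ctype(std::size_t refs = 0)
        : std::ctype<char>(make_table(), false, refs) {}
};

// Hypothetical helper: splits `text` with istream_iterator<std::string>
// after imbuing the stream with the facet above.
std::vector<std::string> lines_via_facet(const std::string& text) {
    std::istringstream in(text);
    // The locale takes ownership of the facet (refs defaults to 0).
    in.imbue(std::locale(in.getloc(), new newline_only_ctype));
    return std::vector<std::string>(std::istream_iterator<std::string>(in),
                                    std::istream_iterator<std::string>());
}
```

With this, "ACAD INC\nCAD\n" comes back as the two strings "ACAD INC" and "CAD". One caveat: operator>> skips runs of whitespace, so empty lines are silently dropped.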

Alternatively, you can define a string proxy that overloads operator>>
to use std::getline:

class line {
    std::string data;
public:
    operator std::string() const { return data; }

    friend std::istream &operator>>(std::istream &is, line &l) {
        return std::getline(is, l.data);
    }
};

Then use it something like:

std::vector<std::string> lines;

std::copy(std::istream_iterator<line>(wherever),
          std::istream_iterator<line>(),
          std::back_inserter(lines));

The only place we use the 'line' type is to instantiate the iterator.
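
To tie this back to the original substring question, here is one way the proxy might be combined with a standard algorithm. This is only a sketch: the proxy is repeated so the block stands alone, and lines_containing and demo_lines_containing are invented names with made-up sample data.

```cpp
#include <algorithm>
#include <istream>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Same proxy idea as above, repeated so this sketch is self-contained.
class line {
    std::string data;
public:
    operator std::string() const { return data; }

    friend std::istream& operator>>(std::istream& is, line& l) {
        return std::getline(is, l.data);
    }
};

// Hypothetical helper: collects every line of `in` that contains `needle`.
std::vector<std::string> lines_containing(std::istream& in,
                                          const std::string& needle) {
    std::vector<std::string> hits;
    std::for_each(std::istream_iterator<line>(in),
                  std::istream_iterator<line>(),
                  // Each `line` converts implicitly to std::string here.
                  [&](const std::string& s) {
                      if (s.find(needle) != std::string::npos)
                          hits.push_back(s);
                  });
    return hits;
}

// In-memory demonstration; a real program would pass an std::ifstream.
std::vector<std::string> demo_lines_containing() {
    std::istringstream file("ACAD INC\nBOX\nCADINC\nCAD\n");
    return lines_containing(file, "CAD");
}
```

Here demo_lines_containing() yields "ACAD INC", "CADINC" and "CAD", with the embedded space in the first line intact.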
 
