finding an index in an STL vector<>

J

Joe

I have a: vector<string> which contains a few dozen elements.

I want to find the index of the element containing a certain string.

for example:

vector<string> strings;
strings.push_back("abc");
strings.push_back("xyz");
strings.push_back("lmnop");

int index = distance(strings.begin(), find(strings.begin(),
strings.end(), string("xyz")));

cout << index << endl; // should print out a '1', the index of "xyz"

The above line using 'distance' seems pretty terrible to me. Please
tell me that there a better way?

Thanks in advance,
Joe
 
D

David Hilsee

Joe said:
I have a: vector<string> which contains a few dozen elements.

I want to find the index of the element containing a certain string.

for example:

vector<string> strings;
strings.push_back("abc");
strings.push_back("xyz");
strings.push_back("lmnop");

int index = distance(strings.begin(), find(strings.begin(),
strings.end(), string("xyz")));

cout << index << endl; // should print out a '1', the index of "xyz"

The above line using 'distance' seems pretty terrible to me. Please
tell me that there a better way?

Ah, it's not that bad. The explicit construction of the std::string can be
eliminated to make it a little prettier, right? I can't think of anything
better offhand. If you think it's ugly, wrap it.

// optionally a template
std::size_t index_of_element( std::vector<std::string&>& vec, const
std::string& elem );

int index = index_of_element( strings, "xyz" );

But why do you want the index when you already have an iterator?
 
J

Joe Harris

David said:
Ah, it's not that bad. The explicit construction of the std::string can be
eliminated to make it a little prettier, right? I can't think of anything
better offhand. If you think it's ugly, wrap it.

// optionally a template
std::size_t index_of_element( std::vector<std::string&>& vec, const
std::string& elem );

int index = index_of_element( strings, "xyz" );

But why do you want the index when you already have an iterator?

Hi, this I'm the original poster 'Joe' just using a different new client...

Thanks for your quick response!

Why do I want an index when I already have an iterator?

Well, I had a class like this:

class foo {
[other class members]
string _bar;
public:
string bar() { return _bar; } // my reader function
bar([other parameters], bar) _bar(bar) {} // my constructor
}

The problem is that I have ten's of millions of objects instantiated
from this class, and I don't have enough RAM (and my machine can't hold
anymore than the 1GB it has).

So, since I know that there are only about 10 or 20 unique types of
'bar' I would just store them all in a static vector and store an index
(which only needs to be a char because there are only 10 or 20 unique
strings) to that vector in each object. So, I ended up with this:

class foo_new {
[ other class members]
static vector<string> mystrings;
char _bar_index; // this is just an index to the mystrings element
public:
string bar() { return mystrings[bar_index]; } // my reader function
foo_new([other parameters], string bar) { // constructor
int index = distance(strings.begin(), find(strings.begin(),
strings.end(), bar));
if(index != strings.end()) {
_bar_index = index;
} else {
_bar_index = strings.size();
strings.push_back(bar);
}
}
}

So, foo_new objects now have 'char bar_index' instead of 'string bar'
which will save me hundreds of megabytes.

If you can propose a better way of getting a similar result, I will be
both grateful and impressed.

-- Joe
 
D

David Hilsee

Hi, this I'm the original poster 'Joe' just using a different new client...

Thanks for your quick response!

Why do I want an index when I already have an iterator?

Well, I had a class like this:

class foo {
[other class members]
string _bar;
public:
string bar() { return _bar; } // my reader function
bar([other parameters], bar) _bar(bar) {} // my constructor
}

The problem is that I have ten's of millions of objects instantiated
from this class, and I don't have enough RAM (and my machine can't hold
anymore than the 1GB it has).

So, since I know that there are only about 10 or 20 unique types of
'bar' I would just store them all in a static vector and store an index
(which only needs to be a char because there are only 10 or 20 unique
strings) to that vector in each object. So, I ended up with this:

class foo_new {
[ other class members]
static vector<string> mystrings;
char _bar_index; // this is just an index to the mystrings element
public:
string bar() { return mystrings[bar_index]; } // my reader function
foo_new([other parameters], string bar) { // constructor
int index = distance(strings.begin(), find(strings.begin(),
strings.end(), bar));
if(index != strings.end()) {
_bar_index = index;
} else {
_bar_index = strings.size();
strings.push_back(bar);
}
}
}

So, foo_new objects now have 'char bar_index' instead of 'string bar'
which will save me hundreds of megabytes.

If you can propose a better way of getting a similar result, I will be
both grateful and impressed.

Well, I can't think of a better way to do what you're doing. It looks like
a flyweight. However, given that there are tens of millions of these
objects, I would probably verify a few things, if I were you. First, make
sure that there isn't a way you can reduce the number of elements. Can
these objects, or certain member of these objects, be computed on the fly?
Second, make sure that there aren't more members that can benefit from using
that static table. Third, make sure that you are realizing the benefits of
using a char instead of a std::vector<std::string>::size_type and that your
code doesn't accidentally go beyond the range of a char. Other than that,
without knowing anything else, I'd say that the implementation looks pretty
solid.

If the call to distance still looks odd to you, you can use subtraction
instead, or separate it out into two different lines.
 
D

Daniel T.

I have a: vector<string> which contains a few dozen elements.

I want to find the index of the element containing a certain string.

for example:

vector<string> strings;
strings.push_back("abc");
strings.push_back("xyz");
strings.push_back("lmnop");

int index = distance(strings.begin(), find(strings.begin(),
strings.end(), string("xyz")));

cout << index << endl; // should print out a '1', the index of "xyz"

The above line using 'distance' seems pretty terrible to me. Please
tell me that there a better way?

That does look a little yuk. How about this?

// this is a bit ugly, but its use is much cleaner. See below
template < typename InputIterator, typename EqualityComparable >
typename iterator_traits<InputIterator>::difference_type
Index(const InputIterator& begin, const InputIterator& end,
const EqualityComparable& item) {
return distance(begin, find(begin, end, item));
}

int main() {
vector<string> strings;
strings.push_back("abc");
strings.push_back("xyz");
strings.push_back("lmnop");

assert(Index(strings.begin(), strings.end(), "xyz") == 1);
assert(Index(strings.begin(), strings.end(), "WWW") ==
strings.size()); //line 1
assert(Index(strings.rbegin(), strings.rend(), "lmnop") == 0);//line 2
cout << "OK";
}

Note how it handles error conditions (line 1.) Also note how it handles
reverse iterators (line 2)...
 
J

Joe Harris

David said:
Hi, this I'm the original poster 'Joe' just using a different new
client...

Thanks for your quick response!

Why do I want an index when I already have an iterator?

Well, I had a class like this:

class foo {
[other class members]
string _bar;
public:
string bar() { return _bar; } // my reader function
bar([other parameters], bar) _bar(bar) {} // my constructor
}

The problem is that I have ten's of millions of objects instantiated
from this class, and I don't have enough RAM (and my machine can't hold
anymore than the 1GB it has).

So, since I know that there are only about 10 or 20 unique types of
'bar' I would just store them all in a static vector and store an index
(which only needs to be a char because there are only 10 or 20 unique
strings) to that vector in each object. So, I ended up with this:

class foo_new {
[ other class members]
static vector<string> mystrings;
char _bar_index; // this is just an index to the mystrings element
public:
string bar() { return mystrings[bar_index]; } // my reader function
foo_new([other parameters], string bar) { // constructor
int index = distance(strings.begin(), find(strings.begin(),
strings.end(), bar));
if(index != strings.end()) {
_bar_index = index;
} else {
_bar_index = strings.size();
strings.push_back(bar);
}
}
}

So, foo_new objects now have 'char bar_index' instead of 'string bar'
which will save me hundreds of megabytes.

If you can propose a better way of getting a similar result, I will be
both grateful and impressed.


Well, I can't think of a better way to do what you're doing. It looks like
a flyweight. However, given that there are tens of millions of these
objects, I would probably verify a few things, if I were you. First, make
sure that there isn't a way you can reduce the number of elements. Can
these objects, or certain member of these objects, be computed on the fly?
Second, make sure that there aren't more members that can benefit from using
that static table. Third, make sure that you are realizing the benefits of
using a char instead of a std::vector<std::string>::size_type and that your
code doesn't accidentally go beyond the range of a char. Other than that,
without knowing anything else, I'd say that the implementation looks pretty
solid.

If the call to distance still looks odd to you, you can use subtraction
instead, or separate it out into two different lines.

I am now using subtraction and it looks better. I'm still getting up to
speed with STL and I wanted to make sure that what I was doing wasn't
too strange. It seemed odd to me that there was no more concise way to
express my thought, but I guess that -- as you say -- people usually are
happy with an iterator.

-- Joe
 
J

Joe Harris

Daniel said:
That does look a little yuk. How about this?

// this is a bit ugly, but its use is much cleaner. See below
template < typename InputIterator, typename EqualityComparable >
typename iterator_traits<InputIterator>::difference_type
Index(const InputIterator& begin, const InputIterator& end,
const EqualityComparable& item) {
return distance(begin, find(begin, end, item));
}

int main() {
vector<string> strings;
strings.push_back("abc");
strings.push_back("xyz");
strings.push_back("lmnop");

assert(Index(strings.begin(), strings.end(), "xyz") == 1);
assert(Index(strings.begin(), strings.end(), "WWW") ==
strings.size()); //line 1
assert(Index(strings.rbegin(), strings.rend(), "lmnop") == 0);//line 2
cout << "OK";
}

Note how it handles error conditions (line 1.) Also note how it handles
reverse iterators (line 2)...

Interesting... thanks.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
474,432
Messages
2,571,682
Members
48,796
Latest member
Greg L.

Latest Threads

Top