char array null-terminated to std::vector<std::string>

Philipp Kraus · Apr 21, 2011

Hello,

I get a char array (out of another component) like this
abcd*xyz*mnop*

* is a separator (null-terminator) char. How can I separate the char array at
the * and push the parts back into a std;;vector<std::string> ?

Thanks

Phil

Paul · Apr 21, 2011

Philipp Kraus said:
Hello,

I get a char array (out of another component) like this
abcd*xyz*mnop*

* is a seperator nullterm char. How can I seperate the char array at the *
and push back them to a std;;vector<std::string> ?

Thanks

There is a C function called strtok() that does this.

http://www.cplusplus.com/reference/clibrary/cstring/strtok/
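
A minimal sketch of the strtok() approach, assuming the separator is a visible character such as '*' and the buffer is writable (strtok() overwrites each separator it finds). The helper name split_with_strtok is hypothetical and not from the thread.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: splits a writable, null-terminated buffer at any
// character contained in `delims`. strtok() modifies the buffer in place.
std::vector<std::string> split_with_strtok(char* buffer, const char* delims)
{
    std::vector<std::string> parts;
    for (char* token = std::strtok(buffer, delims);
         token != nullptr;
         token = std::strtok(nullptr, delims))
    {
        parts.push_back(token);   // copies the token into a std::string
    }
    return parts;
}
```

Note that the delimiter set is itself a null-terminated string, so passing "\0" gives strtok() an empty set of delimiters. If the real separator is '\0', this approach returns only the first part.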

Paul · Apr 21, 2011

Philipp Kraus said:
Hello,

I get a char array (out of another component) like this
abcd*xyz*mnop*

* is a seperator nullterm char. How can I seperate the char array at the *
and push back them to a std;;vector<std::string> ?

Thanks

There is another way, explained in section 7.3 of this page, which you may prefer.
http://www.oopweb.com/CPP/Documents/CPPHOWTO/Volume/C++Programming-HOWTO-7.html

Virchanza · Apr 21, 2011

Hello,

I get a char array (out of another component) like this
abcd*xyz*mnop*

* is a seperator nullterm char. How can I seperate the char array at
the * and push back them to a std;;vector<std::string> ?

vector<string> MakeVecString(char *p, char const z)
{
assert( p != 0 );
assert( *p != '\0');
assert( z != '\0');

vector<string> vecstr;

for (;;)
{
char *const q = strchr(p, z);

if ( q )
{
*q = '\0';
}

vecstr.push_back(p);

if ( !q )
break;

p = q + 1;
}

return vecstr;
}

Philipp Kraus · Apr 21, 2011

vector<string> MakeVecString(char *p, char const z)
{
assert( p != 0 );
assert( *p != '\0');
assert( z != '\0');

vector<string> vecstr;

for (;;)
{
char *const q = strchr(p, z);

if ( q )
{
*q = '\0';
}

vecstr.push_back(p);

if ( !q )
break;

p = q + 1;
}

return vecstr;
}

This separates the chars. For my example I need a vector like this:
vec[0] = abcd
vec[1] = xyz
vec[2] = mnop

Thanks

Philipp Kraus · Apr 21, 2011

There is a C function called strtok() that does this.

http://www.cplusplus.com/reference/clibrary/cstring/strtok/

I have tried this with the delimiter \0 but it does not work. I will
get only the first part "abcd". On my example above I would like to get
a result vector in this way:
vec[0] = "abcd"
vec[1] = "xyz"
vec[2] = "mnop"

Thanks for help

Phil

Kai-Uwe Bux · Apr 21, 2011

Philipp said:
Hello,

I get a char array (out of another component) like this
abcd*xyz*mnop*

* is a seperator nullterm char. How can I seperate the char array at
the * and push back them to a std;;vector<std::string> ?

Well, the idea would be to search for the first * and then construct the
first string from the initial segment. Then, you would search for the second
* and construct the second string from the segment between the two *. Now,
you go on.

The problem I see is this: how do you tell when to stop; i.e., how can you
detect that a * you find is the last one? It appears that this is not
encoded in your input data. At least, it seems, you have left out that part
in your description.

Best,

Kai-Uwe Bux

Paul · Apr 21, 2011

Philipp Kraus said:
There is a C function called strtok() that does this.

http://www.cplusplus.com/reference/clibrary/cstring/strtok/

I have tried this with the delimiter \0 but it does not work. I will get
only the first part "abcd". On my example above I would like to get a
result vector in this way:
vec[0] = "abcd"
vec[1] = "xyz"
vec[2] = "mnop"

Thanks for help

You could just do something like this:

#include <iostream>
#include <vector>
#include <string>

void splitstring(std::vector<std::string>& v, const char* p, char c){
    std::string str;
    while (*p != '\0'){
        if (*p == c){
            // separator found: store the token collected so far
            v.push_back(str);
            str.clear();
        }
        else {
            str += *p;
        }
        ++p;
    }
    // store a trailing token that is not followed by a separator
    if (!str.empty())
        v.push_back(str);
}

void printvector(const std::vector<std::string>& v){
    for (std::size_t i=0; i<v.size(); ++i){
        std::cout << v[i] << std::endl;
    }
}

int main(){
std::vector<std::string> v1;
std::vector<std::string> v2;
char arr1[] = "abcd*xyz*mnop*";
char arr2[] = "abcd*xyz*mnop*sgegsgsgga";

splitstring(v1, arr1, '*');
splitstring(v2, arr2, '*');
std::cout<<"Printing vector v1:\n";
printvector(v1);
std::cout<<"Printing vector v2:\n";
printvector(v2);

}

The above is not perfect because if your first char is a '*' it would create
an empty string at v[0], but it may be something you can work on.

Paul · Apr 21, 2011

Leigh Johnston said:
You could just do something like this:

#include <iostream>
#include <vector>
#include <string>

void splitstring(std::vector<std::string>& v, char* p, char c){
std::string str;
while (*p != '\0'){

You need to learn to name variables more intelligently; 'p' for 'pointer'
and 'str' for 'string' is all very n00bish; names should be chosen based
on *role* not *type*.

The naming convention god has spoken. Now we all must use Leigh's naming
conventions.

Would you like CamelCase or underscores?

Philipp Kraus · Apr 21, 2011

Well, the idea would be to search for the first * and then construct the
first string from the initial segment. Then, you would search for the second
* and construct the second string from the segment between the two *. Now,
you go on.

Is there a C or C++ function which I can use for separating the parts?
Or should I iterate over all elements and, whenever I read a \0 char, push
an element back to the vector and then create a new string element to which
I append the chars I read?

The problem I see is this: how do you tell when to stop; i.e., how can you
know detect that a * that you find is the last. It appears that this is not
encoded in your input data. At least, it seems, you have left out that part
in your description.

I know the length of the char array, so I can stop if I reach the end.

Phil

Philipp Kraus · Apr 21, 2011

The above is not perfect because if your first char is a '*' it would
create an empty string at v[0], but it may be something you can work on.

The * is only a placeholder for \0, so my input char array has the
structure abcd\0xyz\0mnop\0 and I want to separate it at the \0. In my
opinion I must iterate over all elements of the array to detect the
separators and cut the array. Can I use a C or C++ function for
separating at \0?

Phil

Victor Bazarov · Apr 21, 2011

Is there a C or C++ function which I can use for seperating the parts?
Or should I iterate over all elements, if I read a \0 char, I push back
a element to the vector and
than create a new string element on which I append the readed char.

I know the length of the char array, so I can stop if I reach the end.

You can do it manually, as you describe here. Or you could construct a
std::string object and use the member functions 'find', 'substr', to
locate the parts and extract them.
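
A minimal sketch of the find/substr route, assuming the byte length of the array is known (as stated earlier in the thread). The helper name split_with_find is hypothetical. The key point is the (pointer, length) constructor of std::string, which keeps embedded '\0' characters instead of stopping at the first one.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: `data` points at `length` bytes in which '\0'
// separates the parts.
std::vector<std::string> split_with_find(const char* data, std::size_t length)
{
    // The (pointer, length) constructor copies embedded '\0' characters too.
    const std::string all(data, length);
    std::vector<std::string> parts;
    std::size_t start = 0;
    while (start < all.size()) {
        std::size_t end = all.find('\0', start);
        if (end == std::string::npos)
            end = all.size();              // last part has no trailing separator
        parts.push_back(all.substr(start, end - start));
        start = end + 1;                   // continue after the separator
    }
    return parts;
}
```

This copies the input once into the std::string and once more per part, which is usually acceptable for small buffers.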

V

Kai-Uwe Bux · Apr 21, 2011

Philipp said:
Is there a C or C++ function which I can use for seperating the parts?
Or should I iterate over all elements, if I read a \0 char, I push back
a element to the vector and
than create a new string element on which I append the readed char.

I would just iterate. However, I would not create the strings in the vector<
string > by appending each character separately. Instead, I would do
something like this:

given: char const * from = beginning of the char array
char const * to = past_end for the char array

local: std::vector< std::string > result : initially empty

while ( from != to ) {
char const * next = std::find( from, to, '\0' );
result.push_back( std::string( from, next ) );
if ( next == to ) break; // last piece had no trailing \0
from = next + 1; // skip the \0 separator
}

Note the use of std::find() to search for a \0 character and note the
construction of a string from two char const * cutting out the piece you are
interested in.

I know the length of the char array, so I can stop if I reach the end.

Ah, I see. That way, you could compute the past_end pointer above.
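
Putting the pieces together, here is a runnable sketch of the std::find() idea, assuming the caller passes the array and its byte length. The function name split_at_nul is hypothetical. The past-the-end pointer is computed from the length, and the case next == to is handled explicitly.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: splits `length` bytes starting at `data` at every '\0'.
std::vector<std::string> split_at_nul(const char* data, std::size_t length)
{
    std::vector<std::string> result;
    const char* from = data;
    const char* const to = data + length;      // past-the-end pointer
    while (from != to) {
        const char* next = std::find(from, to, '\0');
        result.push_back(std::string(from, next));   // piece between from and next
        if (next == to)
            break;                              // final piece had no trailing '\0'
        from = next + 1;                        // skip the separator
    }
    return result;
}
```

Two adjacent '\0' characters produce an empty string in the result. Whether that is wanted depends on the component that produces the array.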

Best,

Kai-Uwe Bux

Paul · Apr 21, 2011

Philipp Kraus said:
The * should be a placeholder for \0, so my input char array has this
structure abcd\0xyz\0mnop\0 and I will seperate at the \0. In my opinion I
must iterate over all elements of the array to detect the seperators and
cut the array. Can I use a C or C++ function for seperating at \0?

The problem with that is that the char '\0' is generally used to indicate
the end of the string. So if your strings are all the same length it's easy,
but if they vary in length then you will need a way to identify the length
of the strings for any routine to iterate through them.
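
To illustrate why the length has to be supplied separately, here is a small sketch contrasting the two std::string constructors. Both function names are hypothetical and exist only for the demonstration.

```cpp
#include <cstddef>
#include <string>

// Treats `data` as a C string: copying stops at the first '\0'.
std::string first_part_only(const char* data)
{
    return std::string(data);
}

// Copies exactly `length` bytes, embedded '\0' characters included.
std::string whole_buffer(const char* data, std::size_t length)
{
    return std::string(data, length);
}
```

For an array holding abcd\0xyz\0mnop\0, the first function sees only "abcd", which matches the behaviour reported for strtok() and similar C functions. The second one keeps all 14 bytes.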

Joshua Maurice · Apr 21, 2011

Hello,

I get a char array (out of another component) like this
abcd*xyz*mnop*

* is a seperator nullterm char. How can I seperate the char array at
the * and push back them to a std;;vector<std::string> ?

I presume that the string encoding used is a simple fixed width
encoding where '\0' has no other meaning besides a separator
character? If no, then this may have just become manifestly harder.

I also presume that you are also given the byte length of this input
char array or some equivalent? If not, then the problem is
impossible.

Also, note that it's "std::vector", not "std;;vector".

Finally, this sounds like homework. The solution is easy and
straightforward. Iterate from the beginning until the end of the char
array, keeping track of the start of the current substring. When you
encounter the next '\0' character, you have pointers / iterators to
the beginning of the substring, and one past the end. You can then
construct a new std::string, and push_back it to the vector. Repeat
for each substring. Repeat until you've reached the end of the input
char array.
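
The steps described above can be sketched as a plain index loop, assuming the byte length of the array is given. The function name split_manual is hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: walks the array once and remembers where the
// current substring started.
std::vector<std::string> split_manual(const char* data, std::size_t length)
{
    std::vector<std::string> parts;
    std::size_t start = 0;                        // start of the current substring
    for (std::size_t i = 0; i < length; ++i) {
        if (data[i] == '\0') {
            // [data + start, data + i) is the substring before this separator
            parts.push_back(std::string(data + start, data + i));
            start = i + 1;
        }
    }
    if (start < length)                           // trailing part without a separator
        parts.push_back(std::string(data + start, data + length));
    return parts;
}
```

Each substring is constructed once from an iterator pair, so no per-character appending is needed.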

Thomas J. Gritzan · Apr 22, 2011

Am 21.04.2011 20:18, schrieb Philipp Kraus:

Hello,

I get a char array (out of another component) like this
abcd*xyz*mnop*

* is a seperator nullterm char. How can I seperate the char array at the
* and push back them to a std;;vector<std::string> ?

#include <iostream>
#include <string>
#include <vector>

std::vector<std::string> tokenizeMultiSZ(const char* str, size_t length)
{
std::vector<std::string> result;
for (size_t pos = 0; pos < length; ) {
std::string line(str+pos); /* this reads the string until it
finds a \0 */
pos += line.length() + 1; /* skip to next null-terminated string */
result.push_back(line);
}
return result;
}

int main()
{
const char input[] = "abcd\0xyz\0mnop";
std::vector<std::string> result = tokenizeMultiSZ(input, sizeof(input));

for (std::size_t i = 0; i < result.size(); ++i)
std::cout << result[i] << std::endl;
}

The above code "abuses" the std::string constructor which reads a
null-terminated char array. Because the string knows its size after the
construction, we can use this to skip over those parts of the input
string that we just read.

Pro: We read the input string only once.

Paul · Apr 22, 2011

Leigh Johnston said:
On 21/04/2011 21:52, Philipp Kraus wrote:

You are rather sad.
With your personal homepage that advertises the 2 or 3 programs that you
have ever created. I believe this tokeniser is one of them and a singleton
class or something is another.
Your homepage gives a list of the computer games you play, which is
longer than your arm. You are quite obviously a struggling unemployed person
who sits round playing computer games all day long. I bet you still live
with your parents and you have never had a girlfriend.

Besides your annoying repetitions and your irritating incorrectness, you are
actually quite pitiful. Especially when I see you posting what you consider
one of the greatest creations of all time, and nobody takes any notice,
because it's really a useless pile of shit. I genuinely feel sorry for you.

Philipp Kraus · Apr 22, 2011

Finally, this sounds like homework.

No, it's not homework, but the thread has exploded. I get the char array
back from an external function, so I can't change the returned char value.
My question at the beginning was whether there is a function to separate char
arrays at the \0 char. I've tested it, and all functions stop at the \0.

The solution is easy and
straightforward. Iterate from the beginning until the end of the char
array, keeping track of the start of the current substring. When you
encounter the next '\0' character, you have pointers / iterators to
the beginning of the substring, and one past the end. You can then
construct a new std::string, and push_back it to the vector. Repeat
for each substring. Repeat until you've reached the end of the input
char array.

Yes, I now iterate over all chars, put them into a string, and push the
string into the vector.

Thx

Paul · Apr 22, 2011

Leigh Johnston said:
There are not any standard C or C++ functions that can help you; you
will have to write your own.

If you are lazy you can use my tokenizer function(s):

template <typename C, typename FwdIter1, typename FwdIter2>
inline FwdIter1 do_tokens(FwdIter1 aFirst, FwdIter1 aLast, FwdIter2
aDelimeterFirst, FwdIter2 aDelimiterLast, C& aTokens, std::size_t
aMaxTokens = 0, bool aSkipEmptyTokens = true, bool
aDelimeterIsSubsequence = false)
{
if (aFirst >= aLast)
return aFirst;

typedef typename C::value_type value_type;

FwdIter1 b = aFirst;
FwdIter1 e = aDelimeterIsSubsequence ? std::search(b, aLast,
aDelimeterFirst, aDelimiterLast) : std::find_first_of(b, aLast,
aDelimeterFirst, aDelimiterLast);
std::size_t tokens = 0;
while(e != aLast && (aMaxTokens == 0 || tokens < aMaxTokens))
{
if (b == e && !aSkipEmptyTokens)
{
aTokens.push_back(value_type(b, b));
++tokens;
}
else if (b != e)
{
aTokens.push_back(value_type(b, e));
++tokens;
}
b = e;
for (std::size_t i = aDelimeterIsSubsequence ?
std::distance(aDelimeterFirst, aDelimiterLast) : 1; i > 0; --i)
++b;
e = aDelimeterIsSubsequence ? std::search(b, aLast, aDelimeterFirst,
aDelimiterLast) : std::find_first_of(b, aLast, aDelimeterFirst,
aDelimiterLast);
}
if (b != e && (aMaxTokens == 0 || tokens < aMaxTokens))
{
aTokens.push_back(value_type(b, e));
b = e;
}
return b;
}
template <typename C, typename FwdIter1, typename FwdIter2>
inline FwdIter1 tokens(FwdIter1 aFirst, FwdIter1 aLast, FwdIter2
aDelimeterFirst, FwdIter2 aDelimiterLast, C& aTokens, std::size_t
aMaxTokens = 0, bool aSkipEmptyTokens = true, bool
aDelimeterIsSubsequence = false)
{
return do_tokens(aFirst, aLast, aDelimeterFirst, aDelimiterLast,
aTokens, aMaxTokens, aSkipEmptyTokens, aDelimeterIsSubsequence);
}
template <typename CharT, typename Traits, typename Alloc, typename C>
inline void tokens(const std::basic_string<CharT, Traits, Alloc>& aLine,
const std::basic_string<CharT, Traits, Alloc>& aDelimeter, C& aTokens,
std::size_t aMaxTokens = 0, bool aSkipEmptyTokens = true, bool
aDelimeterIsSubsequence = false)
{
do_tokens(aLine.begin(), aLine.end(), aDelimeter.begin(),
aDelimeter.end(), aTokens, aMaxTokens, aSkipEmptyTokens,
aDelimeterIsSubsequence);
}

How to use it for your particular problem:

int main()
{
typedef std::vector<std::string> words_t;
words_t words;
const char source[] = "abcd\0xyz\0mnop\0";
const char delim[] = "\0";
tokens(source, source + sizeof(source), delim, delim + sizeof(delim),
words);
for (words_t::const_iterator i = words.begin(); i != words.end(); ++i)
std::cout << "[" << *i << "] ";
}

It's an awful lot of code, function calls, include files and general
bulk to do something that only requires a few conditionals and can be done
with something much simpler like this:

const int Max_len = 24;

void splitstr(std::vector<std::string>& v, char* arr, int len){
char temp[Max_len+1]={0};
char* p=temp+1;
for(int i=0; i<len; ++i){
*p = arr[i];
if(arr[i] =='\0'){
if(i!=0) v.push_back(temp+1);
p=temp;
}
++p;
}
}

int main(){
std::vector<std::string> v;
char arr[] = "\0yzbgh\0dfgt\0ghee";
splitstr(v, arr, sizeof(arr));
}

I'm sure you could code something in the same style as the code you have
presented, using iterators etc., which is more efficient than what you
have posted.
To achieve what is essentially one single iteration over a char array, how many
calls does your code make to functions such as std::find_first_of() and
std::search()?
I didn't bother to trace through it. But you probably will have.
Or is it the case that you will just be your usual obnoxious self and behave
with the attitude that yours is the best and only way and so says you
therefore it must be? And there will be no sensible discussion on it.

Paul · Apr 22, 2011

*plonk*

I guess that means your code has so many function calls you don't want to
discuss it.
