Null-terminated char array to std::vector<std::string>

Philipp Kraus

Hello,

I get a char array (out of another component) like this
abcd*xyz*mnop*

* is a separator nullterm char. How can I separate the char array at
the * and push them back to a std;;vector<std::string> ?

Thanks

Phil
 
Virchanza

Hello,

I get a char array (out of another component) like this
abcd*xyz*mnop*

* is a separator nullterm char. How can I separate the char array at
the * and push them back to a std;;vector<std::string> ?

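#include <cassert>
#include <cstring>
#include <string>
#include <vector>

using std::string;
using std::vector;

// Splits the buffer at p in place: every occurrence of z is overwritten
// with '\0'. p must point to a non-empty, null-terminated string.
// A trailing separator yields a final empty string.
vector<string> MakeVecString(char *p, char const z)
{
    assert( p != 0 );
    assert( *p != '\0' );
    assert( z != '\0' );

    vector<string> vecstr;

    for (;;)
    {
        char *const q = std::strchr(p, z);

        if ( q )
        {
            *q = '\0';
        }

        vecstr.push_back(p);

        if ( !q )
            break;

        p = q + 1;
    }

    return vecstr;
}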
 
Philipp Kraus

This separates the chars. For my example I need a vector like this:
vec[0] = abcd
vec[1] = xyz
vec[2] = mnop

Thanks
 
Kai-Uwe Bux

Philipp said:
Hello,

I get a char array (out of another component) like this
abcd*xyz*mnop*

* is a separator nullterm char. How can I separate the char array at
the * and push them back to a std;;vector<std::string> ?

Well, the idea would be to search for the first * and then construct the
first string from the initial segment. Then, you would search for the second
* and construct the second string from the segment between the two *. Now,
you go on.

The problem I see is this: how do you tell when to stop; i.e., how can you
detect that a * that you find is the last one? It appears that this is not
encoded in your input data. At least, it seems, you have left out that part
in your description.


Best,

Kai-Uwe Bux
 
Paul

Philipp Kraus said:
There is a C function called strtok() that does this.

http://www.cplusplus.com/reference/clibrary/cstring/strtok/

I have tried this with the delimiter \0 but it does not work. I will get
only the first part "abcd". On my example above I would like to get a
result vector in this way:
vec[0] = "abcd"
vec[1] = "xyz"
vec[2] = "mnop"


Thanks for help
You could just do something like this:

#include <cstddef>
#include <iostream>
#include <vector>
#include <string>

// Appends the pieces of the null-terminated string p, split at c, to v.
void splitstring(std::vector<std::string>& v, const char* p, char c){
    std::string str;
    while (*p != '\0'){
        if (*p == c){
            v.push_back(str);
            str.clear();
        } else {
            str += *p;   // only non-separator chars are collected
        }
        ++p;
    }
    if (!str.empty())    // last piece when the input does not end in c
        v.push_back(str);
}

void printvector(const std::vector<std::string>& v){
    for (std::size_t i = 0; i < v.size(); ++i){
        std::cout << v[i] << std::endl;
    }
}

int main(){
    std::vector<std::string> v1;
    std::vector<std::string> v2;
    char arr1[] = "abcd*xyz*mnop*";
    char arr2[] = "abcd*xyz*mnop*sgegsgsgga";

    splitstring(v1, arr1, '*');
    splitstring(v2, arr2, '*');
    std::cout<<"Printing vector v1:\n";
    printvector(v1);
    std::cout<<"Printing vector v2:\n";
    printvector(v2);
}


The above is not perfect because if your first char is a '*' it would create
an empty string at v[0], but it may be something you can work on.
 
Paul

Leigh Johnston said:
Hello,

I get a char array (out of another component) like this
abcd*xyz*mnop*

* is a separator nullterm char. How can I separate the char array at
the * and push them back to a std;;vector<std::string> ?

Thanks


There is a C function called strtok() that does this.

http://www.cplusplus.com/reference/clibrary/cstring/strtok/

I have tried this with the delimiter \0 but it does not work. I will
get only the first part "abcd". On my example above I would like to
get a result vector in this way:
vec[0] = "abcd"
vec[1] = "xyz"
vec[2] = "mnop"


Thanks for help
You could just do something like this:

#include <iostream>
#include <vector>
#include <string>

void splitstring(std::vector<std::string>& v, char* p, char c){
std::string str;
while (*p != '\0'){

You need to learn to name variables more intelligently; 'p' for 'pointer'
and 'str' for 'string' is all very n00bish; names should be chosen based
on *role* not *type*.
The naming convention god has spoken. Now we all must use Leigh's naming
conventions. :)
Would you like CamelCase or underscores?
 
Philipp Kraus

Well, the idea would be to search for the first * and then construct the
first string from the initial segment. Then, you would search for the second
* and construct the second string from the segment between the two *. Now,
you go on.

Is there a C or C++ function which I can use for separating the parts?
Or should I iterate over all elements and, whenever I read a \0 char, push
back an element to the vector and then create a new string element to which
I append the chars I read?

The problem I see is this: how do you tell when to stop; i.e., how can you
detect that a * that you find is the last one? It appears that this is not
encoded in your input data. At least, it seems, you have left out that part
in your description.

I know the length of the char array, so I can stop if I reach the end.

Phil
 
Philipp Kraus

Philipp Kraus said:
Hello,

I get a char array (out of another component) like this
abcd*xyz*mnop*

* is a separator nullterm char. How can I separate the char array at
the * and push them back to a std;;vector<std::string> ?

Thanks


There is a C function called strtok() that does this.

http://www.cplusplus.com/reference/clibrary/cstring/strtok/

I have tried this with the delimiter \0 but it does not work. I will
get only the first part "abcd". On my example above I would like to get
a result vector in this way:
vec[0] = "abcd"
vec[1] = "xyz"
vec[2] = "mnop"


Thanks for help

The above is not perfect because if your first char is a '*' it would
create an empty string at v[0], but it may be something you can work on.


The * should be a placeholder for \0, so my input char array has this
structure abcd\0xyz\0mnop\0 and I want to separate it at the \0. In my
opinion I must iterate over all elements of the array to detect the
separators and cut the array. Can I use a C or C++ function for
separating at \0?

Phil
 
Victor Bazarov

Is there a C or C++ function which I can use for separating the parts?
Or should I iterate over all elements and, whenever I read a \0 char, push
back an element to the vector and then create a new string element to which
I append the chars I read?


I know the length of the char array, so I can stop if I reach the end.

You can do it manually, as you describe here. Or you could construct a
std::string object and use the member functions 'find', 'substr', to
locate the parts and extract them.

V
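
The find/substr route mentioned above can be sketched as follows. This is only a sketch under assumptions: the name splitWithFind and its parameters data and length are made up for illustration, and the length is taken to be known, as stated earlier in the thread. The detail that matters is building the std::string from pointer plus length, because the plain const char* constructor would stop at the first \0.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: splits a buffer of `length` bytes at every '\0'.
std::vector<std::string> splitWithFind(const char* data, std::size_t length)
{
    // The (pointer, length) constructor keeps embedded '\0' characters.
    const std::string all(data, length);
    std::vector<std::string> parts;
    std::string::size_type start = 0;
    while (start < all.size()) {
        std::string::size_type stop = all.find('\0', start);
        if (stop == std::string::npos)
            stop = all.size();          // last part has no terminator
        parts.push_back(all.substr(start, stop - start));
        start = stop + 1;               // continue after the '\0'
    }
    return parts;
}
```

Called with the 14 bytes of abcd\0xyz\0mnop\0, this should give the three strings abcd, xyz and mnop. Two consecutive \0 characters produce an empty string, which may or may not be what is wanted.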
 
Kai-Uwe Bux

Philipp said:
Is there a C or C++ function which I can use for separating the parts?
Or should I iterate over all elements and, whenever I read a \0 char, push
back an element to the vector and then create a new string element to which
I append the chars I read?

I would just iterate. However, I would not create the strings in the
vector< string > by appending each character separately. Instead, I would do
something like this:

given: char const * from = beginning of the char array
       char const * to = past_end for the char array

local: std::vector< std::string > result : initially empty

while ( from != to ) {
    char const * next = std::find( from, to, '\0' );  // needs <algorithm>
    result.push_back( std::string( from, next ) );
    if ( next == to ) break;  // last piece has no trailing '\0'
    from = next + 1;          // skip the '\0'
}

Note the use of std::find() to search a \0 character and note the
construction of string from two char const * cutting out the piece you are
interested in.

I know the length of the char array, so I can stop if I reach the end.

Ah, I see. That way, you could compute the past_end pointer above.


Best,

Kai-Uwe Bux
 
Paul

Philipp Kraus said:
Hello,

I get a char array (out of another component) like this
abcd*xyz*mnop*

* is a separator nullterm char. How can I separate the char array at
the * and push them back to a std;;vector<std::string> ?

Thanks


There is a C function called strtok() that does this.

http://www.cplusplus.com/reference/clibrary/cstring/strtok/

I have tried this with the delimiter \0 but it does not work. I will get
only the first part "abcd". On my example above I would like to get a
result vector in this way:
vec[0] = "abcd"
vec[1] = "xyz"
vec[2] = "mnop"


Thanks for help

The above is not perfect because if your first char is a '*' it would
create an empty string at v[0], but it may be something you can work on.


The * should be a placeholder for \0, so my input char array has this
structure abcd\0xyz\0mnop\0 and I want to separate it at the \0. In my opinion I
must iterate over all elements of the array to detect the separators and
cut the array. Can I use a C or C++ function for separating at \0?

The problem with that is that the char '\0' is generally used to indicate
the end of the string. So if your strings are all the same length it's easy,
but if they vary in length then you will need a way to identify the length
of the strings for any routine to iterate through them.
 
Joshua Maurice

Hello,

I get a char array (out of another component) like this
abcd*xyz*mnop*

* is a separator nullterm char. How can I separate the char array at
the * and push them back to a std;;vector<std::string> ?

I presume that the string encoding used is a simple fixed width
encoding where '\0' has no other meaning besides a separator
character? If not, then this may have just become manifestly harder.

I also presume that you are also given the byte length of this input
char array or some equivalent? If not, then the problem is
impossible.

Also, note that it's "std::vector", not "std;;vector".

Finally, this sounds like homework. The solution is easy and
straightforward. Iterate from the beginning until the end of the char
array, keeping track of the start of the current substring. When you
encounter the next '\0' character, you have pointers / iterators to
the beginning of the substring, and one past the end. You can then
construct a new std::string, and push_back it to the vector. Repeat
for each substring. Repeat until you've reached the end of the input
char array.
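
The loop described above might look roughly like this. It is a sketch, not a definitive implementation: the name splitManually and the parameters data and length are hypothetical, and it assumes the byte length of the buffer is known.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: walks the buffer once and cuts it at every '\0'.
std::vector<std::string> splitManually(const char* data, std::size_t length)
{
    std::vector<std::string> parts;
    const char* const end = data + length;
    const char* start = data;                 // start of the current substring
    for (const char* cur = data; cur != end; ++cur) {
        if (*cur == '\0') {
            // [start, cur) is one complete substring
            parts.push_back(std::string(start, cur));
            start = cur + 1;
        }
    }
    if (start != end)                         // trailing piece without a '\0'
        parts.push_back(std::string(start, end));
    return parts;
}
```

The std::string(start, cur) constructor copies the half-open range in one go, so no character-by-character appending is needed.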
 
Thomas J. Gritzan

On 21.04.2011 20:18, Philipp Kraus wrote:
Hello,

I get a char array (out of another component) like this
abcd*xyz*mnop*

* is a separator nullterm char. How can I separate the char array at the
* and push them back to a std;;vector<std::string> ?

#include <cstddef>
#include <iostream>
#include <string>
#include <vector>

// str must hold `length` bytes and end with a '\0'.
std::vector<std::string> tokenizeMultiSZ(const char* str, std::size_t length)
{
    std::vector<std::string> result;
    for (std::size_t pos = 0; pos < length; ) {
        std::string line(str + pos);  /* this reads the string until it finds a \0 */
        pos += line.length() + 1;     /* skip to next null-terminated string */
        result.push_back(line);
    }
    return result;
}

int main()
{
    const char input[] = "abcd\0xyz\0mnop";
    std::vector<std::string> result = tokenizeMultiSZ(input, sizeof(input));

    for (std::size_t i = 0; i < result.size(); ++i)
        std::cout << result[i] << std::endl;
}

The above code "abuses" the std::string constructor which reads a
null-terminated char array. Because the string knows its size after the
construction, we can use this to skip over those parts of the input
string that we just read.

Pro: We read the input string only once.
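
One caveat with this approach: std::string(str + pos) keeps reading until it meets a \0, so the buffer has to end with one. If that cannot be guaranteed, a bounded variant might look like the sketch below. The name tokenizeBounded is hypothetical, and std::memchr is used here (an assumption of this sketch, not part of the code above) so that the search never goes past str + length.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical bounded variant: never reads beyond str + length.
std::vector<std::string> tokenizeBounded(const char* str, std::size_t length)
{
    std::vector<std::string> result;
    std::size_t pos = 0;
    while (pos < length) {
        // Search for the next '\0' inside the remaining bytes only.
        const void* hit = std::memchr(str + pos, '\0', length - pos);
        const std::size_t end = hit
            ? static_cast<std::size_t>(static_cast<const char*>(hit) - str)
            : length;                          // no terminator left
        result.push_back(std::string(str + pos, end - pos));
        pos = end + 1;                         // step over the terminator
    }
    return result;
}
```

For a properly terminated buffer this should return the same pieces as the unbounded version; the difference only shows when the final \0 is missing.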
 
Paul

Leigh Johnston said:
On 21/04/2011 21:52, Philipp Kraus wrote:

You are rather sad.
With your personal homepage that advertises the 2 or 3 programs that you
have ever created. I believe this tokeniser is one of them and a singleton
class or something is another.
Your homepage gives a list of the computer games you play, which is
longer than your arm. You are quite obviously a struggling unemployed person
who sits around playing computer games all day long. I bet you still live
with your parents and you have never had a girlfriend.

Besides your annoying repetitions and your irritating incorrectness, you are
actually quite pitiful. Especially when I see you posting what you consider
one of the greatest creations of all time, and nobody takes any notice,
because it's really a useless pile of shit. I genuinely feel sorry for you.
 
Philipp Kraus

Finally, this sounds like homework.

No, it's not homework, but the thread has exploded. I get the char array
back from an external function, so I can't change the returned char value.
My question at the beginning was whether there is a function that separates
char arrays at the \0 char. I've tested it, and all functions stop at the first \0.
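
The reason the standard functions appear to stop is that strlen, strtok and the std::string(const char*) constructor all treat the first \0 as the end of the data. strtok also receives its delimiter set as a null-terminated string, so a delimiter of "\0" looks like an empty set to it. A small sketch of the difference (the helper names lengthSeenByCFunctions and keepAllBytes are hypothetical, chosen only for this illustration):

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Length as C string functions see it: counting stops at the first '\0'.
std::size_t lengthSeenByCFunctions(const char* data)
{
    return std::strlen(data);
}

// A std::string built from pointer plus explicit length keeps every byte,
// including embedded '\0' characters.
std::string keepAllBytes(const char* data, std::size_t length)
{
    return std::string(data, length);
}
```

For the buffer abcd\0xyz\0mnop the first helper reports 4, while the second yields a string of 13 characters that can then be searched for \0 with ordinary std::string or <algorithm> functions.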

The solution is easy and
straightforward. Iterate from the beginning until the end of the char
array, keeping track of the start of the current substring. When you
encounter the next '\0' character, you have pointers / iterators to
the beginning of the substring, and one past the end. You can then
construct a new std::string, and push_back it to the vector. Repeat
for each substring. Repeat until you've reached the end of the input
char array.

Yes, I now iterate over all chars, append them to a string, and push the
string into the vector.

Thx
 
Paul

Leigh Johnston said:
Hello,

I get a char array (out of another component) like this
abcd*xyz*mnop*

* is a separator nullterm char. How can I separate the char array
at the * and push them back to a std;;vector<std::string> ?

Thanks


There is a C function called strtok() that does this.

http://www.cplusplus.com/reference/clibrary/cstring/strtok/

I have tried this with the delimiter \0 but it does not work. I will
get only the first part "abcd". On my example above I would like to
get a result vector in this way:
vec[0] = "abcd"
vec[1] = "xyz"
vec[2] = "mnop"


Thanks for help


The above is not perfect because if your first char is a '*' it would
create an empty string at v[0], but it may be something you can work
on.


The * should be a placeholder for \0, so my input char array has this
structure abcd\0xyz\0mnop\0 and I want to separate it at the \0. In my opinion
I must iterate over all elements of the array to detect the separators
and cut the array. Can I use a C or C++ function for separating at \0?

There are not any standard C or C++ functions that can help you; you
will have to write your own.

If you are lazy you can use my tokenizer function(s):

#include <algorithm>
#include <cstddef>
#include <iostream>
#include <iterator>
#include <string>
#include <vector>

template <typename C, typename FwdIter1, typename FwdIter2>
inline FwdIter1 do_tokens(FwdIter1 aFirst, FwdIter1 aLast,
    FwdIter2 aDelimeterFirst, FwdIter2 aDelimiterLast, C& aTokens,
    std::size_t aMaxTokens = 0, bool aSkipEmptyTokens = true,
    bool aDelimeterIsSubsequence = false)
{
    if (aFirst >= aLast)
        return aFirst;

    typedef typename C::value_type value_type;

    FwdIter1 b = aFirst;
    FwdIter1 e = aDelimeterIsSubsequence ?
        std::search(b, aLast, aDelimeterFirst, aDelimiterLast) :
        std::find_first_of(b, aLast, aDelimeterFirst, aDelimiterLast);
    std::size_t tokens = 0;
    while (e != aLast && (aMaxTokens == 0 || tokens < aMaxTokens))
    {
        if (b == e && !aSkipEmptyTokens)
        {
            aTokens.push_back(value_type(b, b));
            ++tokens;
        }
        else if (b != e)
        {
            aTokens.push_back(value_type(b, e));
            ++tokens;
        }
        b = e;
        // step over the delimiter that was just found
        for (std::size_t i = aDelimeterIsSubsequence ?
            std::distance(aDelimeterFirst, aDelimiterLast) : 1; i > 0; --i)
            ++b;
        e = aDelimeterIsSubsequence ?
            std::search(b, aLast, aDelimeterFirst, aDelimiterLast) :
            std::find_first_of(b, aLast, aDelimeterFirst, aDelimiterLast);
    }
    if (b != e && (aMaxTokens == 0 || tokens < aMaxTokens))
    {
        aTokens.push_back(value_type(b, e));
        b = e;
    }
    return b;
}

template <typename C, typename FwdIter1, typename FwdIter2>
inline FwdIter1 tokens(FwdIter1 aFirst, FwdIter1 aLast,
    FwdIter2 aDelimeterFirst, FwdIter2 aDelimiterLast, C& aTokens,
    std::size_t aMaxTokens = 0, bool aSkipEmptyTokens = true,
    bool aDelimeterIsSubsequence = false)
{
    return do_tokens(aFirst, aLast, aDelimeterFirst, aDelimiterLast,
        aTokens, aMaxTokens, aSkipEmptyTokens, aDelimeterIsSubsequence);
}

template <typename CharT, typename Traits, typename Alloc, typename C>
inline void tokens(const std::basic_string<CharT, Traits, Alloc>& aLine,
    const std::basic_string<CharT, Traits, Alloc>& aDelimeter, C& aTokens,
    std::size_t aMaxTokens = 0, bool aSkipEmptyTokens = true,
    bool aDelimeterIsSubsequence = false)
{
    do_tokens(aLine.begin(), aLine.end(), aDelimeter.begin(),
        aDelimeter.end(), aTokens, aMaxTokens, aSkipEmptyTokens,
        aDelimeterIsSubsequence);
}

How to use it for your particular problem:

int main()
{
    typedef std::vector<std::string> words_t;
    words_t words;
    const char source[] = "abcd\0xyz\0mnop\0";
    const char delim[] = "\0";
    tokens(source, source + sizeof(source), delim, delim + sizeof(delim),
        words);
    for (words_t::const_iterator i = words.begin(); i != words.end(); ++i)
        std::cout << "[" << *i << "] ";
}

It's an awful lot of code, function calls, include files and general
bulk to do something that only requires a few conditionals and can be done
with something much simpler like this:

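#include <string>
#include <vector>

const int Max_len = 24;

// Copies each '\0'-terminated piece of arr into temp and pushes it to v.
// Pieces longer than Max_len-1 chars are truncated.
void splitstr(std::vector<std::string>& v, char* arr, int len){
    char temp[Max_len+1]={0};
    char* p=temp+1;
    for(int i=0; i<len; ++i){
        *p = arr[i];
        if(arr[i]=='\0'){
            if(i!=0) v.push_back(temp+1);
            p=temp;
        }
        if(p < temp+Max_len) ++p;   // never step past the end of temp
    }
}

int main(){
    std::vector<std::string> v;
    char arr[] = "\0yzbgh\0dfgt\0ghee";
    splitstr(v, arr, sizeof(arr));
}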


I'm sure you could code something in the same style as the code you have
presented, using iterators etc., that is more efficient than what you
have posted.
To achieve what is essentially one single iteration over a char array, how many
calls does your code make to functions such as std::find_first_of() and
std::search()?
I didn't bother to trace through it. But you probably will have.
Or is it the case that you will just be your usual obnoxious self and behave
with the attitude that yours is the best and only way, and so says you,
therefore it must be? And there will be no sensible discussion on it.
:)
 
