char array null-terminated to std::vector<std::string>

Peter Remmers · Apr 22, 2011

On 22.04.2011 17:56, Leigh Johnston wrote:

[criticism to Leigh's code]

*plonk*

I don't endorse Paul's way of attacking your code, but at the core of
his criticism, I actually have to agree. I think that it is way too
verbose, and therefore unreadable, for the simple task it is supposed to do.

Peter

Paul · Apr 22, 2011

Peter Remmers said:
On 22.04.2011 17:56, Leigh Johnston wrote:

[criticism to Leigh's code]

*plonk*

I don't endorse Paul's way of attacking your code, but at the core of his
criticism, I actually have to agree. I think that it is way too verbose,
and therefore unreadable, for the simple task it is supposed to do.

I didn't attack his code; I made a criticism that it perhaps had too many
function calls. And I invited a reasonable debate about it, but I did say
that Leigh would probably be of the mind that his code was so great nobody
was worthy of criticising it in any way, and I did feel he would be
unwilling to discuss any possible imperfections. I was obviously correct in
my premonition about Leigh's response.

And even your very careful attempt to make a small criticism was
responded to in a manner that was quite hostile.

Paul · Apr 22, 2011

Leigh Johnston said:
On 22.04.2011 17:56, Leigh Johnston wrote:
On 22/04/2011 16:25, Paul wrote:

[criticism to Leigh's code]

*plonk*

I don't endorse Paul's way of attacking your code, but at the core of
his criticism, I actually have to agree. I think that it is way too
verbose, and therefore unreadable, for the simple task it is supposed to
do.

I disagree; it is not too verbose when you consider it is a general
utility function designed to solve a general problem (you can specify a
maximum token count; treat the delimiters as a subsequence as well as a
list of delimiters; and to discard empty tokens) rather than the OP's
specific problem even though it can be used to solve that specific
problem. Paul The Troll's criticism was a nonsense as I do not iterate
through the source buffer more than once which he claimed.

I forgot to mention that the same function can also tokenize into pairs of
iterators rather than making sub-string copies, which has obvious performance
benefits; as I said it is a general (iterator based) solution to a general
problem rather than a specific solution to a specific problem. One should
strive to solve problems in as generic a way as possible especially when
designing library functions (which my solution is).

Paul The Troll's alternative "solution" is far from generic having a fixed
sized buffer and such.
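
The iterator-pair idea mentioned above can be sketched roughly as follows. This is not the function under discussion, whose source is not shown in this thread; `token_range` and `split_ranges` are hypothetical names, and the sketch assumes C++11 and a single delimiter character. Each token is recorded as a half-open range pointing into the source buffer, so no characters are copied.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical name: a half-open range [first, second) into the source buffer.
using token_range = std::pair<const char*, const char*>;

// Hypothetical helper: records where each token starts and stops
// instead of copying its characters into a std::string.
std::vector<token_range> split_ranges(const char* first, const char* last, char delim)
{
    assert(first <= last);
    std::vector<token_range> out;
    while (first != last)
    {
        const char* stop = std::find(first, last, delim); // next delimiter or end
        out.emplace_back(first, stop);                    // no characters copied
        if (stop == last)
            break;                                        // final token had no delimiter
        first = stop + 1;                                 // step over the delimiter
    }
    return out;
}
```

A caller that does need an owning string can still build one from a range with `std::string(r.first, r.second)`; the ranges are only valid while the source buffer is alive.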

I wrote a function once; it consisted of 10,000 pages of code, but it could
process absolutely anything.
It's a lot better than your functions.

Peter Remmers · Apr 22, 2011

On 22.04.2011 20:53, Leigh Johnston wrote:

On 22.04.2011 17:56, Leigh Johnston wrote:
On 22/04/2011 16:25, Paul wrote:

[criticism to Leigh's code]

*plonk*

I don't endorse Paul's way of attacking your code, but at the core of
his criticism, I actually have to agree. I think that it is way too
verbose, and therefore unreadable, for the simple task it is supposed to
do.

I disagree; it is not too verbose when you consider it is a general
utility function designed to solve a general problem (you can specify a
maximum token count; treat the delimiters as a subsequence as well as a
list of delimiters; and to discard empty tokens) rather than the OP's
specific problem even though it can be used to solve that specific
problem. Paul The Troll's criticism was a nonsense as I do not iterate
through the source buffer more than once which he claimed.

I forgot to mention that the same function can also tokenize into pairs
of iterators rather than making sub-string copies, which has obvious
performance benefits; as I said it is a general (iterator based)
solution to a general problem rather than a specific solution to a
specific problem. One should strive to solve problems in as generic a
way as possible especially when designing library functions (which my
solution is).

So your function is a "swiss army knife"...

The more generic a function/class/library becomes, the more of a monster
it becomes.
I think a function should focus on a single task. If you need a function
that does something different, write another function.

Why do you think there are different variations of find() in the STL,
such as find_first_of(), find_last_of(), find_if(), etc.?
You would have written a single find() function that you can parametrize
with all sorts of stuff.

KISS.

That said, I think it is too big for what it does. Or it tries to do too
much at once. Pick one.

Paul The Troll's alternative "solution" is far from generic having a
fixed sized buffer and such.

I didn't say his was the ultimate solution

Peter

Peter Remmers · Apr 22, 2011

On 22.04.2011 21:30, Peter Remmers wrote:

On 22.04.2011 20:53, Leigh Johnston wrote:

On 22/04/2011 19:26, Peter Remmers wrote:
On 22.04.2011 17:56, Leigh Johnston wrote:
On 22/04/2011 16:25, Paul wrote:

[criticism to Leigh's code]

*plonk*

I don't endorse Paul's way of attacking your code, but at the core of
his criticism, I actually have to agree. I think that it is way too
verbose, and therefore unreadable, for the simple task it is supposed to
do.

I disagree; it is not too verbose when you consider it is a general
utility function designed to solve a general problem (you can specify a
maximum token count; treat the delimiters as a subsequence as well as a
list of delimiters; and to discard empty tokens) rather than the OP's
specific problem even though it can be used to solve that specific
problem. Paul The Troll's criticism was a nonsense as I do not iterate
through the source buffer more than once which he claimed.

I forgot to mention that the same function can also tokenize into pairs
of iterators rather than making sub-string copies, which has obvious
performance benefits; as I said it is a general (iterator based)
solution to a general problem rather than a specific solution to a
specific problem. One should strive to solve problems in as generic a
way as possible especially when designing library functions (which my
solution is).

So your function is a "swiss army knife"...

The more generic a function/class/library becomes, the more of a monster
it becomes.
I think a function should focus on a single task. If you need a function
that does something different, write another function.

Why do you think there are different variations of find() in the STL,
such as find_first_of(), find_last_of(), find_if(), etc.?
You would have written a single find() function that you can parametrize
with all sorts of stuff.

KISS.

That said, I think it is too big for what it does. Or it tries to do too
much at once. Pick one.

I'd like to add that I think that parameters that only serve to change
the algorithm of the function are a sure sign that you should split that
beast into different functions.
Your template bool parameters are such indicators.

Also, you said:

This, at the very least, is a prime candidate for breaking the function up.

Peter

Paul · Apr 22, 2011

Leigh Johnston said:
On 22.04.2011 17:56, Leigh Johnston wrote:
On 22/04/2011 16:25, Paul wrote:

[criticism to Leigh's code]

*plonk*

I don't endorse Paul's way of attacking your code, but at the core of
his criticism, I actually have to agree. I think that it is way too
verbose, and therefore unreadable, for the simple task it is supposed to
do.

I disagree; it is not too verbose when you consider it is a general
utility function designed to solve a general problem (you can specify a
maximum token count; treat the delimiters as a subsequence as well as a
list of delimiters; and to discard empty tokens) rather than the OP's
specific problem even though it can be used to solve that specific
problem. Paul The Troll's criticism was a nonsense as I do not iterate
through the source buffer more than once which he claimed.

I forgot to mention that the same function can also tokenize into pairs of
iterators rather than making sub-string copies, which has obvious performance
benefits; as I said it is a general (iterator based) solution to a general
problem rather than a specific solution to a specific problem. One should
strive to solve problems in as generic a way as possible especially when
designing library functions (which my solution is).

Paul The Troll's alternative "solution" is far from generic having a fixed
sized buffer and such.

You fail to realise that your solution is actually worse in this respect.
Any character array is going to be passed to this function as a char*, and
your 2nd parameter requires a char* + length to operate.

In no real, practical situation is this going to be a static array; it
will usually be a char*.

const char source[] = "abcd\0xyz\0mnop\0";
const char delim[] = "\0";
tokens(source, source + sizeof(source), delim, delim + sizeof(delim),
words);

Oh look, parameter two requires a sizeof calculation.

Actually I've just noticed so does parameter 4; what the hell is that?
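
A minimal sketch of the sizeof point, with hypothetical helper names (`array_length`, `buffer_end`) that do not come from the thread: the element count can be recovered only while the argument is still an array. Once it has decayed to a `const char*`, the caller has to pass the length alongside it.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: N is deduced only because the argument is still an array.
template <std::size_t N>
std::size_t array_length(const char (&)[N])
{
    return N; // includes the '\0' the compiler appends to a string literal
}

// Hypothetical helper: after decay to const char* the length is no longer
// part of the type, so it has to be supplied by the caller.
const char* buffer_end(const char* p, std::size_t len)
{
    assert(p != nullptr);
    return p + len;
}
```

For the arrays in the call above, `sizeof(source)` is 15 (14 written characters plus the implicit terminator) and `sizeof(delim)` is 2, so the delimiter range passed as parameters three and four holds two null characters.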

Additionally, my code was a 5-minute example. I'm sure if I spent some time
on this I could create something much more professional than yours,
especially now that I have had a further look at your code, which appears
to be rather noobish. Once again I felt pity for you as I looked over your
code.

If you weren't such an obnoxious and hostile character you might actually be
able to have a sensible discussion.

Peter Remmers · Apr 22, 2011

On 22.04.2011 22:30, Leigh Johnston wrote:

On 22.04.2011 20:53, Leigh Johnston wrote:

On 22/04/2011 19:40, Leigh Johnston wrote:
On 22/04/2011 19:26, Peter Remmers wrote:
On 22.04.2011 17:56, Leigh Johnston wrote:
On 22/04/2011 16:25, Paul wrote:

[criticism to Leigh's code]

*plonk*

I don't endorse Paul's way of attacking your code, but at the core of
his criticism, I actually have to agree. I think that it is way too
verbose, and therefore unreadable, for the simple task it is supposed to
do.

I disagree; it is not too verbose when you consider it is a general
utility function designed to solve a general problem (you can specify a
maximum token count; treat the delimiters as a subsequence as well as a
list of delimiters; and to discard empty tokens) rather than the OP's
specific problem even though it can be used to solve that specific
problem. Paul The Troll's criticism was a nonsense as I do not iterate
through the source buffer more than once which he claimed.

I forgot to mention that the same function can also tokenize into pairs
of iterators rather than making sub-string copies, which has obvious
performance benefits; as I said it is a general (iterator based)
solution to a general problem rather than a specific solution to a
specific problem. One should strive to solve problems in as generic a
way as possible especially when designing library functions (which my
solution is).

So your function is a "swiss army knife"...

The more generic a function/class/library becomes, the more of a monster
it becomes.

35 lines of code is hardly a "monster". It was not a case of feature
creep either as the design hasn't changed since I first wrote it.

It all has to be seen in relation, of course. If a non-parametrized
function would do the same in 5 lines, then 35 are pretty much monstrous.
And, as I've written, monstrosities arise at all levels.

Whilst I agree with you in general in this particular case focusing on a
single task would result in multiple functions doing more or less the
same thing and as the solution does not lend itself well to functional
decomposition you would possibly end up with duplicated code. I have
three parameters that can change behaviour which would result in 8
similar functions of similar length i.e. I wouldn't be gaining much. I
could provide 8 functions that forward to a common function but I think
your problem is that you have an irrational aversion to default
parameter values. If I follow your advice I would end up with:

no_max_no_skip_empty_delimeter_list_tokens(...)
no_max_no_skip_empty_delimeter_sequence_tokens(...)
no_max_skip_empty_delimeter_list_tokens(...)
no_max_skip_empty_delimeter_sequence_tokens(...)
max_no_skip_empty_delimeter_list_tokens(...)
max_no_skip_empty_delimeter_sequence_tokens(...)
max_skip_empty_delimeter_list_tokens(...)
max_skip_empty_delimeter_sequence_tokens(...)

This is clearly barmy, verbose, and possibly involves code duplication.
A single function with three extra parameters is *not* verbose IMO.

I agree that would not be any better. Well, actually it would, because
the 3 parameters would have names instead of "true, true, false".

But still, I think there must be other ways to split it up. The general
formula is "Find the orthogonality and model each aspect separately."
And of course, that does not mean encapsulating each aspect in a bool
parameter, much less a template one.
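
One way to read "model each aspect separately", as a rough sketch only: the delimiter test becomes a predicate supplied by the caller, and the empty-token policy becomes a named enum instead of a bare bool. The names `empty_tokens` and `split_if` are hypothetical, C++11 is assumed, and the maximum-token-count and delimiter-subsequence features mentioned earlier are deliberately left out.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical named option replacing a bare true/false argument.
enum class empty_tokens { keep, skip };

// Hypothetical splitter: what counts as a delimiter is decided by the
// predicate, and what happens to empty tokens by the named policy.
template <typename IsDelim>
std::vector<std::string> split_if(const std::string& src, IsDelim is_delim,
                                  empty_tokens policy = empty_tokens::keep)
{
    std::vector<std::string> out;
    std::string current;
    for (std::size_t i = 0; i < src.size(); ++i)
    {
        if (!is_delim(src[i]))
        {
            current += src[i];
            continue;
        }
        // A delimiter closes the current token, which may be empty.
        if (!current.empty() || policy == empty_tokens::keep)
            out.push_back(current);
        current.clear();
    }
    if (!current.empty())
        out.push_back(current); // trailing token without a closing delimiter
    return out;
}

// The call site states the policy by name rather than as "true" or "false".
void usage_example()
{
    auto is_comma = [](char c) { return c == ','; };
    assert(split_if("a,,b", is_comma).size() == 3);
    assert(split_if("a,,b", is_comma, empty_tokens::skip).size() == 2);
}
```

Whether this scales to all three of the original parameters without duplication is exactly the point in dispute here; the sketch only shows that a call can be self-describing.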

No I wouldn't; if the algorithms are different then separate functions
are justified.

Well I have already said that I disagree with your analysis. It is not
particularly "verbose" in my subjective opinion.

Fact is, when I first saw that function, I thought "Uhhh... what's that?
All that just to split a multi-string?? And it uses Templates??" It was
like a slap in the face. And the formatting issues of a usenet post are
only partly to blame.

Of course, the first impression is very subjective. It may indeed turn
out that all of it is necessary, and there is no more elegant way of
doing it.

Peter

James Kanze · Apr 25, 2011

I get a char array (out of another component) like this
abcd*xyz*mnop*

* is a separator null-terminator char. How can I separate the char array at
the * and push the parts back to a std::vector<std::string>?

boost::algorithm::split does exactly what you want.

Tomasz Sowa · Apr 28, 2011

On Thu, 21 Apr 2011 22:46:02 +0200, Philipp Kraus wrote:

Is there a C or C++ function which I can use for separating the parts?
Or should I iterate over all elements and, when I read a \0 char, push back
an element to the vector and
then create a new string element to which I append the chars read?

You don't have to separate, just use operator= from std::string.

#include <cstddef>
#include <string>
#include <vector>

// Precondition: the last byte of in[0..len) is a null char.
void fun(const char * in, std::size_t len, std::vector<std::string> & out)
{
    std::string temp;

    while( len > 0 )
    {
        temp = in;              // copies up to the next null char
        out.push_back(temp);
        in += temp.size() + 1;  // +1 for the null char
        len -= temp.size() + 1;
    }
}

int main()
{
    // the string literal supplies the required null at the end
    const char table[] = "a\0bcd\0xyz\0mnop";
    std::size_t len = sizeof(table) / sizeof(char);
    std::vector<std::string> v;

    fun(table, len, v);
}
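
The loop above relies on the buffer ending with a null char; without one, operator= would read past the end and len would wrap around. A possible variant, sketched here under the hypothetical name `split_nul`, searches only inside [buf, buf + len) with std::find and so never looks beyond the given length. It assumes C++11 for emplace_back.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: split a buffer of '\0'-separated strings without
// reading past buf + len, whether or not the buffer ends with a null char.
std::vector<std::string> split_nul(const char* buf, std::size_t len)
{
    assert(buf != nullptr || len == 0);
    std::vector<std::string> out;
    const char* const end = buf + len;
    const char* first = buf;
    while (first != end)
    {
        const char* stop = std::find(first, end, '\0'); // next separator or end
        out.emplace_back(first, stop);                  // copy [first, stop)
        if (stop == end)
            break;                                      // last piece was unterminated
        first = stop + 1;                               // skip the separator
    }
    return out;
}
```

Called as `split_nul(table, sizeof(table))` on the array from the example above, this should yield the same four strings; two adjacent null chars produce an empty string in the result.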
