C Style Strings

Barry Schwarz · Jun 3, 2006

[ ... ]

[ ... ]

Why do you believe that manually stepping through each character will
be faster when strcpy and strcat can take advantage of any CISC
instructions the hardware might offer?

Click to expand...

The first method steps through the first string once (in
strcpy) to copy it, and then again (in strcat) to find
its end, before concatenating the second string onto it.

His method avoids stepping through the first string the
second time.

True, but some machines, like IBM mainframe I work on, have
specialized string instructions and can perform the "strcpy" and the
first part of the "strcat" faster than manually stepping through each
char can do just the "strcpy".

Remove del for email

persenaama · Jun 3, 2006

So what's the big - O analysis of that '+' operation? Where is this

documented? What if I want to sacrifice a bit of safety for speed, as we did
with C? Can I overload the string '+' operator to achieve this?

It is interesting to note, that you can implement + style string
concenation with c++ strings and templates _very_ efficiently.

Here's the input:

foo::string x = a + b + c;

For simplicity, I omit the foo namespace from this point forward. We
need to state what *we* think is the optimal result (well optimal is
that nothing is done but let's say, with minimum number of operations
which still achieve something, let's not consider lazy evaluation and
similiar for this exercise).

What would be, in my humble opinion, efficient would be in pseudo code:

allocate a.length() + b.length() + c.length() bytes of memory in x, and
copy the a, b and c into the x.

This is possible using the proposed syntax. We need to implement a
string expression class, which encapsulates the expresion which is
being assigned into a string object, in this case x.

a + b is when we look at the types: string + string, result type is
string. Here comes the twist, we implement operator + which instead
returns a new type, which only encapsulates the parameters of the +
operation.

The new type, let's call it, "expr" so the above becomes:

expr operator + (string,string)

The next + operator with rhv of c will be of form:

expr operator + (expr,string)

And last, the assignment to x will be of form:

expr operator = (expr)

(references et cetera omitted for clarity)

When we implement this with templates, the expr object will be a type,
encapsulating a lot of information about the expression on the right
side of the assignment.

At this stage, we can compute a sum of the type tree on the right
(using the length of the string objects), this works so that the expr
object sums both left and right arguments lengths - recursively.
Because this "recursion" is done at code generation time, the recursion
is flattened (we are assuming the c++ implementation isn't braindead,
it could be but that is another matter, then there are bigger things to
worry, I think).

Next thing the operator = does is, that it just concatenates the text
as it best sees fit. Whatever is "optimal", efficient whatever. It's
actually pretty trivial to write, too.

Malcolm · Jun 3, 2006

Barry Schwarz said:
Why do you believe that manually stepping through each character will
be faster when strcpy and strcat can take advantage of any CISC
instructions the hardware might offer?

Ultimately you do have to go to assembly to squeeze every last bit of
efficiency out of the code.
You shouldn't assume that library functions are always optimally written -
memcpy is often a simple byte copying loop.
If there is a copy asciiz assembly instruction then it will be faster to
call it. The code gets rid of memory management overhead, and performs only
one scan of the string. It cannot be speeded up algorithmically, only by
micro-optimisation.

Alex Buell · Jun 3, 2006

void fastconcat(char *out, char *str1, char *str2)
{
while(*str1)
*out++ = *str1++;
while(*str2)
*out++ = *str2++;
*out = 0;
}

If you want this even faster, and the architecture can support it, copy
the largest size per loop, i.e. 32 bits at a time. Or even 64 bits.

websnarf · Jun 3, 2006

Chris said:
I don't believe the complaint was about stack memory. It was about the
incorrect statement regarding C++. The same statement may be considered
valid concerning Java, C#, VB, or C++/CLI, for example; but those are
different languages from C++. (The word "valid" should be taken lightly
there; I haven't verified the hundreds of times.)

C++ perfectly well allows programmers to allocate any "objects" (not
quite, really, since they don't own their identity so they are a sort of
2/3-object... but in C++ vocab they are objects) on the stack, with all
the accompanying performance benefits.

Indeed, I have again made the mistake of calling C++ what I mean to
call C++ but not using the C subset/paradigms. Of course you can do
this in C++ because you can just do it using the C-like subset (where
the resulting data types are still considered "objects".) My point was
just that C++ does not have a blanket advantage over C, since falling
back to ordinary C may still be the best way to do things.

Martin Ambuhl · Jun 3, 2006

persenaama said:
It is interesting to note, that you can implement + style string
concenation with c++ strings and templates _very_ efficiently.

Here's the input:

foo::string x = a + b + c;

Please stop this crap. There seems to be a concerted effort by assholes
from <to crosspost their obscurities to
<news.comp.lang.c>, and in more than one thread. You have been told,
more than once and by more than one poster, that this shit does not
belong in clc, yet you keep it up. The only possible excuses for this
are (a) you are inexcusably stupid or (b) you are inexcusably trying to
start flame wars. Neither is acceptable. Please go away.

Noah Roberts · Jun 3, 2006

Indeed, I have again made the mistake of calling C++ what I mean to
call C++ but not using the C subset/paradigms. Of course you can do
this in C++ because you can just do it using the C-like subset (where
the resulting data types are still considered "objects".) My point was
just that C++ does not have a blanket advantage over C, since falling
back to ordinary C may still be the best way to do things.

typedef ? XType;

int main()
{
X stackX;
}

No matter what ? is above stackX is an automatic variable allocated on
the stack.

I still don't know where you're getting these wield ideas of yours.

websnarf · Jun 3, 2006

Noah said:
typedef ? XType;

int main()
{
X stackX;
}

No matter what ? is above stackX is an automatic variable allocated on
the stack.

I still don't know where you're getting these wield ideas of yours.

/* In C: */

int catHi (bstring b) {
static struct tagbstring hi = bsStatic ("hi"); /* init-time */
bconcat (b, &hi); /* Essentially an addition and a memcpy */
}

// in C++:

void catHi (CBString &b) {
b += "hi"; // Requires implicit construction of CBString("hi") or
some
// sort of strlen() being called on "hi" before
appending.
}

Andrew Poelstra · Jun 3, 2006

typedef ? XType;

int main()
{
X stackX;
}

No matter what ? is above stackX is an automatic variable allocated on
the stack.

That isn't true; the C standard has no concept of a stack and I don't
see any stack libraries included in this code. Not everyone has a Wintel,
you know.

Ian Collins · Jun 4, 2006

Malcolm said:
Ultimately you do have to go to assembly to squeeze every last bit of
efficiency out of the code.
You shouldn't assume that library functions are always optimally written -
memcpy is often a simple byte copying loop.

Profile, don't speculate. Many CPUs have block copy/move instructions
that the library might use.

Profile, don't speculate.

kwikius · Jun 4, 2006

persenaama wrote:

[snip Expression Templates ]

Sounds cool. I hadnt thought of using expression templates for this
task but it fits rather well. I am not sure the implementation would be
totally trivial, but certainly worth investigating.
Note that there is one downside IIRC, that you either need an
assignment or a conversion to prompt the evaluation, which can be
inconvenient in use. There is also the issue of tracking the data which
may causes problems, or maybe not.

Overall though... cool idea!

regards
Andy Little

Noah Roberts · Jun 4, 2006

/* In C: */

int catHi (bstring b) {
static struct tagbstring hi = bsStatic ("hi"); /* init-time */
bconcat (b, &hi); /* Essentially an addition and a memcpy */
}

// in C++:

void catHi (CBString &b) {
b += "hi"; // Requires implicit construction of CBString("hi") or
some
// sort of strlen() being called on "hi" before
appending.
}

The only difference between the two code slices is syntax; they mean
the same thing. Of course, to be truely certain of that we would have
to know what bsStatic() does, what bconcat does, what CBString(const
char*) does and what CBString:

perator += does. What I think they all
mean just looking at the above would result in exactly the same
operations. Of course there is nothing requiring that CBString be
implemented so that += does not accept a const char* directly.

Even at that.... your code example does not prove your point at all.
It does not illustrate any difference between C and C++ with regard to
stack variables.

Malcolm · Jun 4, 2006

Ian Collins said:
Profile, don't speculate. Many CPUs have block copy/move instructions
that the library might use.

Profile, don't speculate.

That's wonderful advice, as long as your code is always running on the same
platform. My programs have to run on anything with reasonable efficiency.

Ian Collins · Jun 4, 2006

Malcolm said:
That's wonderful advice, as long as your code is always running on the same
platform. My programs have to run on anything with reasonable efficiency.

Then you can't use assembly and have to code to the lowest common
denominator.

When I've faced this problem, I included optimisations for known targets
or types (32/16/8 bit) of target and a fall back byte copying loop.

The specific optimisations did result from thorough profiling on those
targets.

Phlip · Jun 4, 2006

Ian said:
Malcolm wrote:
Then you can't use assembly and have to code to the lowest common
denominator.

Profiling isn't just about determining which low-level bits to split.

Profiling is also about detecting where to use brute-force, and where to
work hard to install a good high-level algorithm.

For most of your code, brute-force will not cause a bottleneck under high
load.

This is why "premature optimization is the root of all evil". Brute-force is
most likely the easiest code to read, unit test, and maintain. The
high-level algorithms cost more in programming time, so only put them in
when profiling reveals they will have a benefit.

Ian Collins · Jun 4, 2006

Phlip said:
Ian Collins wrote:

Profiling isn't just about determining which low-level bits to split.

Who said it was?

Phlip · Jun 4, 2006

Ian said:
Who said it was?

Take a cleansing breath, Ian. Sometimes when a post contains a quote, the
next line doesn't contradict but reinforces it. I'm aware this is the
exception with USENET, but it happens.

In this case, the mark you replied to took the position "optimization is
about opcodes, so profiling won't work for portability".

Optimize opcodes only as a last resort. You knew that too, but forgot to
bring the point up.

Ian Collins · Jun 4, 2006

Phlip said:
Ian Collins wrote:

Take a cleansing breath, Ian. Sometimes when a post contains a quote, the
next line doesn't contradict but reinforces it. I'm aware this is the
exception with USENET, but it happens.

In this case, the mark you replied to took the position "optimization is
about opcodes, so profiling won't work for portability".

Optimize opcodes only as a last resort. You knew that too, but forgot to
bring the point up.

Ah

persenaama · Jun 4, 2006

belong in clc, yet you keep it up. The only possible excuses for this

are (a) you are inexcusably stupid or (b) you are inexcusably trying to
start flame wars. Neither is acceptable. Please go away.

You are wrong, is it possible? Yes. Bye. *plonk*

persenaama · Jun 4, 2006

If you want this even faster, and the architecture can support it, copy

the largest size per loop, i.e. 32 bits at a time. Or even 64 bits.

Problem with that is that both source and destination have to be
aligned (for correctness, performance or both depending on the
architechture). Strings are commonly short so the overhead of detecting
this situation might degrade the performance on average use case.

Search and replace c-style strings in c++	17	Apr 25, 2012
c-style string vs std::string	20	Sep 16, 2011
UTF-8 and strings	44	Jun 7, 2011
What is the most astounding C++ syntax construct?	0	Dec 22, 2022
Comparing Multiple Strings More Efficiently	7	Jun 21, 2010
Which happens first: destructors or return?	3	Jan 19, 2010
C++ strings & C strings	23	Oct 15, 2006
Comparing C-Style Strings	5	Feb 18, 2004

C Style Strings

Barry Schwarz

persenaama

Malcolm

Alex Buell

websnarf

Martin Ambuhl

Noah Roberts

websnarf

Andrew Poelstra

Ian Collins

kwikius

Noah Roberts

Malcolm

Ian Collins

Phlip

Ian Collins

Phlip

Ian Collins

persenaama

persenaama

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads