simple StringBuilder proposal

R

Roedy Green

Have you tried writing
sb.append( "<input name=\"cx\" type=\"hidden\" value=\"" + cseAccount + ...)
?

I think, I once read here, that the JIT might even be smart enough
to not create a new StringBuilder for the "+"s, but inline it into
the existing StringBuilder.

If the strings are literals, they will be concatenated at compile
time.

However in my case they are a string of boiler plate with variable
fields t this will be done very clumsily at run time.

The FastCat technique just saves up the constituent strings, then at
toString time, allocates char[] exactly the right size and builds the
String very quickly. This is quite a bit more efficient than what
default + StringBuilder does, growing/copying its buffer over and
over.
 
R

Roedy Green

Have you done timing tests that support your claim that creating
temporary arrays (for the varargs-call) is really faster than simple
"+"-style argument concatenation? If so, how big was the difference?

My application, expanding macros for my website ran about 10% faster
when I optimised StringBuilders to use the proper size. I do more
than just concatenate, so the concatenate was actually faster than
that. I forget now the additional speed for the FastCat wrinkle. The
main thing is it does garbage collection less often. It is much
easier to get an accurate fastcat estimate, and the penalty is less
for being off.
 
R

Roedy Green

Java already has a shorthand for that:

String s = "<input name=\"cx\" type=\"hidden\" value=\""
+ cseAccount + ":" + cseCode + "\">\n";

The difference is you cannot give a size estimate when you use +.
 
J

Jan Burse

markspace said:
Good job actually sussing this out. You did far more work with a
questionable library than I was willing to.

Well it would be of interest to see the test
programs for all the different classes. FastCat
has definitively a problem here:

return new String( buffer ); // Would like
// some way to just hand buffer over to
// String to avoid copy.



https://wush.net/websvn/mindprod/fi...dprod&path=/com/mindprod/fastcat/FastCat.java

But it also depends what the other test programs
do, and whether the result is really comparable. Does
the test program for StringBuilder call toString()?

I guess without calling toString() the test programs
are not comparable, since the use case would differ.

Bye
Bye
 
R

Roedy Green

Okay, if that is enough of a hassle, and the "inefficiency" of a second StringBuilder
is really a concern, then you write the method yourself.

I did. It is called FastCat. It would be nice if FastCat,
StringBuilder, StringBuffer etc. implemented some interface so they
could be more easily substituted, e.g. in places like regex.

I am suggesting it might be a useful tool for others.
 
J

Joerg Meier

Well it would be of interest to see the test
programs for all the different classes. FastCat
has definitively a problem here:

I posted a link to the full test program upthread.
But it also depends what the other test programs
do, and whether the result is really comparable. Does
the test program for StringBuilder call toString()?
I guess without calling toString() the test programs
are not comparable, since the use case would differ.

I did not, but a change of the lines to:

return sbLocal.toString().length() - start;

changed the performance like so (with the reusing ones running loops/250):

Fresh SB/FCs:
Took 00:10.927ms to append 454,895,082 chars total with FastCat.
Took 00:03.980ms to append 454,895,082 chars total with StringBuffer.
Took 00:03.678ms to append 454,895,082 chars total with StringBuilder.
Reused SB/FCs:
Took 02:51.926ms to append 1,623,732 chars total with FastCat.
Took 00:13.920ms to append 1,623,732 chars total with StringBuffer.
Took 00:13.987ms to append 1,623,732 chars total with StringBuilder

So, FastCat is doing even worse if .toString() is used. And much, much
worse if .toString() is used more than once.

Liebe Gruesse,
Joerg
 
J

Jan Burse

Joerg said:
So, FastCat is doing even worse if .toString() is used. And much, much
worse if .toString() is used more than once.

Liebe Gruesse,
Joerg

Ok, so then. Thanks.
 
S

Sven Köhler

Hi Joerg,

Am 27.02.2013 18:16, schrieb Joerg Meier:
Because his incessant advertising irritates me, I spent a few minutes
writing a small benchmark, comparing the performance of StringBuilder,
StringBuffer and FastCat. The results were ... well, pretty much what one
would expect:

Took 00:08.845ms to append 454,895,082 chars total with FastCat.
Took 00:03.939ms to append 454,895,082 chars total with StringBuffer.
Took 00:03.462ms to append 454,895,082 chars total with StringBuilder.

Thanks for that!


Regards,
Sven
 
D

Daniel Pitts

Ok, so then. Thanks.
Now, to be fair to Roedy, you aren't exactly using it "as proscribed",
and you aren't testing the time spent in the constructors.

I've made the Strings you append longer, and I've fixed the estimates
for the FastCat and the StringBu*ers constructors to be spot-on:

Fresh SB/FCs:
Cat> Init: 00.000ms, Loop: 00.175ms, End: 00.000ms, Total: 00.175ms.
Buf> Init: 00.000ms, Loop: 00.227ms, End: 00.000ms, Total: 00.227ms.
Bui> Init: 00.000ms, Loop: 00.236ms, End: 00.000ms, Total: 00.236ms.
Reused SB/FCs:
Cat> Init: 00.000ms, Loop: 00.023ms, End: 00.375ms, Total: 00.398ms.
Buf> Init: 00.101ms, Loop: 00.048ms, End: 00.130ms, Total: 00.279ms.
Bui> Init: 00.092ms, Loop: 00.052ms, End: 00.130ms, Total: 00.274ms.


For Fresh, Init/End do nothing, but toString() is called every loop.

For Reused, Init creates the buffer/builder/fastcat, and End is the
single call to toString().

If I remove the estimate entirely from StringBu*er constructors, FastCat
does indeed win out. FastCat relies on a "good estimate" to start with,
so I left that estimate in.

Fresh SB/FCs:
Cat> Init: 00.000ms, Loop: 00.187ms, End: 00.000ms, Total: 00.187ms.
Buf> Init: 00.000ms, Loop: 00.240ms, End: 00.000ms, Total: 00.240ms.
Bui> Init: 00.000ms, Loop: 00.236ms, End: 00.000ms, Total: 00.236ms.
Reused SB/FCs:
Cat> Init: 00.000ms, Loop: 00.020ms, End: 00.377ms, Total: 00.397ms.
Buf> Init: 00.000ms, Loop: 00.495ms, End: 00.122ms, Total: 00.617ms.
Bui> Init: 00.000ms, Loop: 00.468ms, End: 00.121ms, Total: 00.589ms.


So, depending on its use, FastCat *can* be about twice as fast as
StringBuffer/StringBuilder. It can, however, be significantly slower as
you have shown.

Sorry Roedy, I'm not going to switch to FastCat any time soon. It's not
a bad idea for a specialized use-case, but as a general use-case not as
good.
 
J

Joerg Meier

I've made the Strings you append longer, and I've fixed the estimates
for the FastCat and the StringBu*ers constructors to be spot-on:
Fresh SB/FCs:
Cat> Init: 00.000ms, Loop: 00.175ms, End: 00.000ms, Total: 00.175ms.
Buf> Init: 00.000ms, Loop: 00.227ms, End: 00.000ms, Total: 00.227ms.
Bui> Init: 00.000ms, Loop: 00.236ms, End: 00.000ms, Total: 00.236ms.
Reused SB/FCs:
Cat> Init: 00.000ms, Loop: 00.023ms, End: 00.375ms, Total: 00.398ms.
Buf> Init: 00.101ms, Loop: 00.048ms, End: 00.130ms, Total: 00.279ms.
Bui> Init: 00.092ms, Loop: 00.052ms, End: 00.130ms, Total: 00.274ms.

I'm curious, would you mind sharing the updated code on a pastebin
somewhere ? Writing benchmarks is always tricky, so I'd be happy to learn
from your changes.

Liebe Gruesse,
Joerg
 
D

Daniel Pitts

I'm curious, would you mind sharing the updated code on a pastebin
somewhere ? Writing benchmarks is always tricky, so I'd be happy to learn
from your changes.

Liebe Gruesse,
Joerg

Here it is in all its glory:

import java.text.DateFormat;
import java.text.SimpleDateFormat;
import java.util.Date;
import com.mindprod.fastcat.FastCat;


class CatTester {
public static final String LONG_STRING =
"abcdefgabcdefgabcdefgabcdefgabc" +
"defgabcdefgabcdefgabcdefgabcdefgabcde" +
"defgabcdefgabcdefgabcdefgabcdefgabcde" +
"defgabcdefgabcdefgabcdefgabcdefgabcde" +
"defgabcdefgabcdefgabcdefgabcdefgabcde" +
"defgabcdefgabcdefgabcdefgabcdefgabcde" +
"defgabcdefgabcdefgabcdefgabcdefgabcde" +
"defgabcdefgabcdefgabcdefgabcdefgabcde" +
"defgabcdefgabcdefgabcdefgabcdefgabcde" +
"fg"
;
public static final String ANOTHER_STRING = "xvck" +
"blahblahblahblahxjcvh" +
"blahblahblahblahxjcvh" +
"blahblahblahblahxjcvh" +
"blahblahblahblahxjcvh" +
"blahblahblahblahxjcvh" +
"blahblahblahblahxjcvh" +
"blahblahblahblahxjcvh" +
"blahblahblahblahxjcvh" +
"blahblahblahblahxjcvh" +
"blahblahblahblahxjcvh" +
"blahblahblahblahxjcvh" +
"blahblahblahblahxjcvh" +
"blahblahblahblahxjcvh" +
"blahblahblahblahxjcvh" +
"blahblahblahblahxjcvh" +
"blahblahblahblahxjcvh" +
"blahblahblahblahxjcvh" +
"blahblahblahblahxjcvh" +
"blahblahblahblahxjcvh" +
"";

private interface Tester {
public void init();
public int test(final long junk);
String conclude();
}

private static class TestFastCat implements Tester {
protected FastCat sb;

public TestFastCat() {
}

public void init() {
}

public String conclude() {
if (sb != null) {
return sb.toString();
}
return null;
}


public int test(final long junk) {
final FastCat sbLocal = sb == null ? new FastCat(6) : sb;
final int start = sbLocal.length();
sbLocal.append(LONG_STRING);
sbLocal.append(1234);
sbLocal.append(ANOTHER_STRING);
sbLocal.append(junk);
sbLocal.append(ANOTHER_STRING);
sbLocal.append(junk + 1);
if (sbLocal != sb) {
sbLocal.toString();
}
return sbLocal.length() - start;
}
}

private static class TestStringBuffer implements Tester {
protected StringBuffer sb;

public TestStringBuffer() {
}

public void init() {
}

public int test(final long junk) {
final StringBuffer sbLocal = sb == null ? new StringBuffer() : sb;
final int start = sbLocal.length();
sbLocal.append(LONG_STRING);
sbLocal.append(1234);
sbLocal.append(ANOTHER_STRING);
sbLocal.append(junk);
sbLocal.append(ANOTHER_STRING);
sbLocal.append(junk + 1);
if (sbLocal != sb) {
sbLocal.toString();
}
return sbLocal.length() - start;
}

public String conclude() {
if (sb != null) {
return sb.toString();
}
return null;
}

}

private static class TestStringBuilder implements Tester {
protected StringBuilder sb;

public TestStringBuilder() {
}


public void init() {
}

public int test(final long junk) {
final StringBuilder sbLocal = sb == null ? new StringBuilder() : sb;
final int start = sbLocal.length();
sbLocal.append(LONG_STRING);
sbLocal.append(1234);
sbLocal.append(ANOTHER_STRING);
sbLocal.append(junk);
sbLocal.append(ANOTHER_STRING);
sbLocal.append(junk + 1);
if (sbLocal != sb) {
sbLocal.toString();
}
return sbLocal.length() - start;
}

public String conclude() {
if (sb != null) {
return sb.toString();
}
return null;
}

}

public static void main(final String[] args) {
new CatTester().test();
}

private final DateFormat DF = new SimpleDateFormat("ss'.'SSS'ms'");

private long loopTest(final String name, final Tester tester, final int
loops) {
final long start = System.currentTimeMillis();
tester.init();
final long initTime = System.currentTimeMillis();
long junk = 0;
for (int i = 1; i < loops; i++) {
junk += tester.test(junk);
}
final long appendTime = System.currentTimeMillis();
tester.conclude();
if (name != null) {
System.out.println(String.format("%s> Init: %s, Loop: %s,
End: %s, Total: %s.",
name,
timestamp(start, initTime),
timestamp(initTime, appendTime),
timestamp(appendTime),
timestamp(start))
);
}
return junk;
}

private void test() {
final int loops = 100000;

// To warm up the VM, HotSpot etc
loopTest(null, new TestFastCat(), loops);
final long length = loopTest(null, new TestStringBuilder(),
loops) / 10;
loopTest(null, new TestStringBuffer(), loops);
loopTest(null, new TestFastCatReuse(loops), loops);
loopTest(null, new TestStringBufferReuse(length), loops);
loopTest(null, new TestStringBuilderReuse(length), loops);
for (int i = 0; i < 3; ++i) {
System.out.println("Fresh SB/FCs:");
loopTest("Cat", new TestFastCat(), loops);
loopTest("Buf", new TestStringBuffer(), loops);
loopTest("Bui", new TestStringBuilder(), loops);

System.out.println("Reused SB/FCs:");
loopTest("Cat", new TestFastCatReuse(loops), loops);
loopTest("Buf", new TestStringBufferReuse(length), loops);
loopTest("Bui", new TestStringBuilderReuse(length), loops);
}
}

private String timestamp(final long start) {
return timestamp(start, System.currentTimeMillis());
}

private String timestamp(final long start, long endTime) {
return DF.format(new Date(endTime - start));
}

private static class TestFastCatReuse extends TestFastCat {
private final int loops;

public TestFastCatReuse(int loops) {
this.loops = loops;
}

public void init() { sb = new FastCat(loops * 6); }
}

private static class TestStringBufferReuse extends TestStringBuffer {
private final long length;

public TestStringBufferReuse(long length) {
this.length = length;
}

public void init() { sb = new StringBuffer((int) length);}
}

private static class TestStringBuilderReuse extends TestStringBuilder {
private final long length;

public TestStringBuilderReuse(long length) {
this.length = length;
}

public void init() { sb = new StringBuilder((int) length); }
}
}
 
L

Lew

Daniel said:
Now, to be fair to Roedy, you aren't exactly using it "as proscribed",
and you aren't testing the time spent in the constructors.

I've made the Strings you append longer, and I've fixed the estimates
for the FastCat and the StringBu*ers constructors to be spot-on:

Fresh SB/FCs:
Cat> Init: 00.000ms, Loop: 00.175ms, End: 00.000ms, Total: 00.175ms.
Buf> Init: 00.000ms, Loop: 00.227ms, End: 00.000ms, Total: 00.227ms.
Bui> Init: 00.000ms, Loop: 00.236ms, End: 00.000ms, Total: 00.236ms.
Reused SB/FCs:

Cat> Init: 00.000ms, Loop: 00.023ms, End: 00.375ms, Total: 00.398ms.
Buf> Init: 00.101ms, Loop: 00.048ms, End: 00.130ms, Total: 00.279ms.
Bui> Init: 00.092ms, Loop: 00.052ms, End: 00.130ms, Total: 00.274ms.

For Fresh, Init/End do nothing, but toString() is called every loop.

For Reused, Init creates the buffer/builder/fastcat, and End is the
single call to toString().

If I remove the estimate entirely from StringBu*er constructors, FastCat
does indeed win out. FastCat relies on a "good estimate" to start with,
so I left that estimate in.

Fresh SB/FCs:
Cat> Init: 00.000ms, Loop: 00.187ms, End: 00.000ms, Total: 00.187ms.
Buf> Init: 00.000ms, Loop: 00.240ms, End: 00.000ms, Total: 00.240ms.
Bui> Init: 00.000ms, Loop: 00.236ms, End: 00.000ms, Total: 00.236ms.

Reused SB/FCs:
Cat> Init: 00.000ms, Loop: 00.020ms, End: 00.377ms, Total: 00.397ms.
Buf> Init: 00.000ms, Loop: 00.495ms, End: 00.122ms, Total: 00.617ms.
Bui> Init: 00.000ms, Loop: 00.468ms, End: 00.121ms, Total: 00.589ms.

So, depending on its use, FastCat *can* be about twice as fast as
StringBuffer/StringBuilder. It can, however, be significantly slower as
you have shown.

How does it compare to using String +?

How does it change after a 10,000-loop warmup to kick in HotSpot?
 
J

Joerg Meier

How does it compare to using String +?

I didn't include + in my initial benchmark because I learned some time ago
that both javac as well as HotSpot will just convert it to StringBuilder
behind the scenes in most circumstances.

Liebe Gruesse,
Joerg
 
M

markspace

Now, to be fair to Roedy, you aren't exactly using it "as proscribed",

<pedantic>

proscribed = forbidden
prescribed = recommended

I think you meant the latter.

</pedantic>
 
D

Daniel Pitts

<pedantic>

proscribed = forbidden
prescribed = recommended

I think you meant the latter.

</pedantic>
I learn something new everyday.
Indeed. I actually meant "as intended".
 
A

Arne Vajhøj

Sorry Roedy, I'm not going to switch to FastCat any time soon. It's not
a bad idea for a specialized use-case, but as a general use-case not as
good.

It is very rarely a good idea to replace well tested builtin solutions
with some hacked code that is supposedly faster. Often it is not faster
at all. And if it is then it is often only for special cases. And if
it is more than that there are often problems with correctness.

It can be good fun for students to play around with. But I think the
keyword here is "play".

26 years ago I spent an awful lot of time trying to optimize
Sieve of E in VAX assembler.

Arne
 
A

Arne Vajhøj

So its not in the GoF book?

No. But GoF never claimed to be complete.
But the "pattern" rather feels
to me like a micro pattern, and not a full fledged pattern.
Since it doesn't realy solve a problem, its just cosmetics.

When Fluent interface is fully used, then it is more than just
cosmetics. It almost becomes a DSL.

Arne
 
S

Sven Köhler

Am 27.02.2013 19:31, schrieb Roedy Green:
The difference is you cannot give a size estimate when you use +.

And you dont have to, because the stringbuilder implements exponential
groth. In result, the amount of characters copied during array expansion
is linear in the final length of the stringbuilder.

Regards,
Sven
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,777
Messages
2,569,604
Members
45,227
Latest member
Daniella65

Latest Threads

Top