Initial StringBuilder allocation estimates

Discussion in 'Java' started by Roedy Green, Jul 19, 2008.

  1. Roedy Green

    Roedy Green Guest

    I put in a little bit of code like this to see how good I was at
    estimating the initial size to allocate for a StringBuilder.

    I was embarrassed to discover I badly underestimated in every single
    case. That meant StringBuilder had to pause in the middle of each
    string constructed to double the buffer size, then garbage collect
    twice as many objects as it need have done.

    You might use this code to check out how good you are at estimating.


    /**
     * Used to fine tune initial StringBuilder size estimates. Insert
     * a call to checkEstimate just before toString.
     *
     * @param sb the StringBuilder to check.
     * @param initSize initial size the StringBuilder was allocated.
     */
    static void checkEstimate( StringBuilder sb, int initSize )
    {
        final int size = sb.length();
        if ( size > initSize )
        {
            // es[ 0 ] is this method; es[ 1 ] is the caller to report.
            Throwable t = new Throwable();
            StackTraceElement[] es = t.getStackTrace();
            StackTraceElement e = es[ 1 ];
            System.err.println( "at " + e.getClassName()
                                + "." + e.getMethodName()
                                + " line:" + e.getLineNumber() );
            System.err.println( "StringBuilder initially sized too small "
                                + initSize + " to contain " + size
                                + " without autogrowing." );
        }
        else
        {
            // More than 100 chars of slack counts as an overestimate.
            if ( size + 100 < initSize )
            {
                Throwable t = new Throwable();
                StackTraceElement[] es = t.getStackTrace();
                StackTraceElement e = es[ 1 ];
                System.err.println( "at " + e.getClassName()
                                    + "." + e.getMethodName()
                                    + " line:" + e.getLineNumber() );
                System.err.println( "StringBuilder initially sized needlessly large "
                                    + initSize + " to contain " + size );
            }
        }
    }

    --

    Roedy Green Canadian Mind Products
    The Java Glossary
    http://mindprod.com
     
    Roedy Green, Jul 19, 2008
    #1

  2. In article <>,
    Roedy Green <> wrote:

    > I put in a little bit of code like this to see how good I was at
    > estimating the initial size for StringBuilder size allocation.
    >
    > I was embarrassed to discover I badly underestimated in every single
    > case. That meant StringBuilder had to pause in the middle of each
    > string constructed to double the buffer size, then garbage collect
    > twice as many objects as it need have done.
    >
    > You might use this code to check out how good you are at estimating.
    >
    >
    > /**
    > * Used to fine tune initial StringBuilder size estimates. Insert
    > a call to checkEstimate just before toString.
    > *
    > * @param sb the StringBuilder to check.
    > * @param initSize initial size the StringBuilder was allocated.
    > */
    > static void checkEstimate( StringBuilder sb, int initSize )
    > {
    > final int size = sb.length();
    > if ( size > initSize )
    > {
    > Throwable t = new Throwable();
    > StackTraceElement[] es = t.getStackTrace();
    > StackTraceElement e = es[ 1 ];
    > err.println( "at " + e.getClassName()
    > + "." + e.getMethodName()
    > + " line:" + e.getLineNumber() );
    > err.println( "StringBuffer initially sized too small " +
    > initSize + " to contain " + size + " without autogrowing." );
    > }
    > else
    > {
    > if ( size + 100 < initSize )
    > {
    > Throwable t = new Throwable();
    > StackTraceElement[] es = t.getStackTrace();
    > StackTraceElement e = es[ 1 ];
    > err.println( "at " + e.getClassName()
    > + "." + e.getMethodName()
    > + " line:" + e.getLineNumber() );
    > err.println( "StringBuffer initially sized needlessly
    > large " + initSize + " to contain contain " + size );
    > }
    > }
    > }


    Or put a breakpoint on AbstractStringBuilder.expandCapacity().
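
If a debugger is not handy, the same regrowth can be watched from ordinary code through the public capacity() method. This is only a rough sketch: the class name CapacityGrowthDemo and the method countGrowths are made up for illustration, and the exact growth policy is a JDK implementation detail, so the counts it prints may differ between versions.

```java
public class CapacityGrowthDemo {

    /**
     * Appends charsToAppend characters one at a time and counts how often
     * the reported capacity changes, i.e. how often the buffer was regrown.
     */
    public static int countGrowths(int initialCapacity, int charsToAppend) {
        StringBuilder sb = new StringBuilder(initialCapacity);
        int growths = 0;
        int lastCapacity = sb.capacity();
        for (int i = 0; i < charsToAppend; i++) {
            sb.append('x');
            if (sb.capacity() != lastCapacity) {
                // The builder had to allocate a bigger internal array.
                growths++;
                lastCapacity = sb.capacity();
            }
        }
        return growths;
    }

    public static void main(String[] args) {
        // An undersized builder regrows several times; a well sized one never does.
        System.err.println("initial 16:   " + countGrowths(16, 1000) + " regrowths");
        System.err.println("initial 1000: " + countGrowths(1000, 1000) + " regrowths");
    }
}
```

Each regrowth means a new internal array and a copy of everything appended so far, which is the cost a good initial estimate avoids.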

    --
    Goolge is a pro-spamming service. I will not see your reply if you use Google.
     
    Kevin McMurtrie, Jul 20, 2008
    #2

  3. Roedy Green

    Roedy Green Guest

    On Sat, 19 Jul 2008 18:33:35 GMT, Roedy Green
    <> wrote, quoted or indirectly quoted
    someone who said :

    > /**
    > * Used to fine tune initial StringBuilder size estimates. Insert
    >a call to checkEstimate just before toString.
    > *


    I have posted a somewhat improved version at
    http://mindprod.com/jgloss/stringbuilder.html

    Getting the code to the point where it generates no warnings is pretty
    quick. The key is sorting the warning messages, so you can see the
    typical result sizes at a glance.
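
One way the sorting step might look, as a minimal sketch rather than the version posted at the link above: collect the warnings instead of printing them straight away, then sort them so that all reports for one call site sit together in order of size. The class EstimateLog, its methods, and the message layout are all hypothetical; the 100-character slack is taken from the checkEstimate routine earlier in the thread.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

public class EstimateLog {

    private final List<String> warnings = new ArrayList<>();

    /** Records a warning if the estimate was too small or over 100 chars too big. */
    public void record(String callSite, int initSize, int actualSize) {
        if (actualSize > initSize || actualSize + 100 < initSize) {
            // Fixed-width, right-aligned numbers make plain string sorting
            // order the sizes numerically within a call site
            // (assumes sizes below one million).
            warnings.add(String.format(Locale.ROOT, "%s actual:%6d initial:%6d",
                    callSite, actualSize, initSize));
        }
    }

    /** Returns the warnings grouped by call site, smallest actual size first. */
    public List<String> sorted() {
        List<String> copy = new ArrayList<>(warnings);
        Collections.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        EstimateLog log = new EstimateLog();
        log.record("Macro.expand", 16, 120);
        log.record("Footer.build", 16, 50);
        log.record("Footer.build", 16, 20);
        for (String line : log.sorted()) {
            System.err.println(line);
        }
    }
}
```

With the reports for a call site lined up like this, the range of actual sizes is visible at once, which makes it easier to pick a new initial size.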

    I have a huge amount of code that basically uses StringBuilder to
    build strings that are then cascaded to build even bigger strings.

    I grossly overestimated my ability to generate a good estimate by eye.
    This little optimisation doubles my RAM efficiency.
    --

    Roedy Green Canadian Mind Products
    The Java Glossary
    http://mindprod.com
     
    Roedy Green, Jul 20, 2008
    #3
  4. Roedy Green wrote:
    > On Sat, 19 Jul 2008 18:33:35 GMT, Roedy Green
    > <> wrote, quoted or indirectly
    > quoted
    > someone who said :
    >
    >> /**
    >> * Used to fine tune initial StringBuilder size estimates.
    >> Insert
    >> a call to checkEstimate just before toString.
    >> *

    >
    > I have posted a somewhat improved version at
    > http://mindprod.com/jgloss/stringbuilder.html
    >
    > Getting the code so it generates no warnings is pretty quick. The
    > key
    > to it is sorting the warning messages, so you can see what sort of
    > typical sizes of result there are at a glance.
    >
    > I have a huge amounts of code that basically uses StringBuilder to
    > build strings that are then cascaded to build even bigger strings.
    >
    > I grossly overestimated by ability to by eye generate a good
    > estimate.
    > This little optimisation doubles my RAM efficiency


    When you measure things like elapsed time or CPU usage in the entire
    application, how much difference does it make?
     
    Mike Schilling, Jul 20, 2008
    #4
  5. Roedy Green

    Roedy Green Guest

    On Sun, 20 Jul 2008 09:24:23 -0700, "Mike Schilling"
    <> wrote, quoted or indirectly quoted
    someone who said :

    >When you measure things like elapsed time or CPU usage in the entire
    >application, how much difference does it make?


    I stupidly did not benchmark before the changes.
    --

    Roedy Green Canadian Mind Products
    The Java Glossary
    http://mindprod.com
     
    Roedy Green, Jul 21, 2008
    #5
  6. Tom Anderson

    Tom Anderson Guest

    On Mon, 21 Jul 2008, Roedy Green wrote:

    > On Sun, 20 Jul 2008 09:24:23 -0700, "Mike Schilling"
    > <> wrote, quoted or indirectly quoted
    > someone who said :
    >
    >> When you measure things like elapsed time or CPU usage in the entire
    >> application, how much difference does it make?

    >
    > I stupidly did not benchmark before the changes.


    No, but you have the old version in source control, right?

    Right?

    tom

    --
    Sometimes it takes a madman like Iggy Pop before you can SEE the logic
    really working.
     
    Tom Anderson, Jul 22, 2008
    #6
  7. Mark Space

    Mark Space Guest

    Tom Anderson wrote:

    >
    > No, but you have the old version in source control, right?
    >
    > Right?


    Source code controls are for wimps! Real men just fill their VW bus
    with back-up tapes!
     
    Mark Space, Jul 22, 2008
    #7
  8. Roedy Green

    Roedy Green Guest

    On Tue, 22 Jul 2008 17:46:21 +0100, Tom Anderson
    <> wrote, quoted or indirectly quoted someone who
    said :

    >No, but you have the old version in source control, right?


    OK, I suppose I could revive it. I suspect my code will improve much
    more than average, so it would let you know if there is any hope of
    sufficient improvement to justify the optimisation.

    --

    Roedy Green Canadian Mind Products
    The Java Glossary
    http://mindprod.com
     
    Roedy Green, Jul 22, 2008
    #8
  9. Roedy Green

    Roedy Green Guest

    On Tue, 22 Jul 2008 17:46:21 +0100, Tom Anderson
    <> wrote, quoted or indirectly quoted someone who
    said :

    >> I stupidly did not benchmark before the changes.

    >
    >No, but you have the old version in source control, right?



    I used this method to optimise the initial sizes of the StringBuilders
    used in the static macros program that expands the macros used to
    generate the mindprod.com website. It does a great many
    StringBuilder.append calls, though it also does a fair bit of I/O,
    since it has to read each file in the website. Here are the results:

    Effect of StringBuilder Optimising

           Time Before Optimising   Time After Optimising   % Improvement
    Sun    27.5 sec                 24 sec                  13%
    Jet    25 sec                   22.5 sec                10%
    --

    Roedy Green Canadian Mind Products
    The Java Glossary
    http://mindprod.com
     
    Roedy Green, Jul 22, 2008
    #9
  10. Roedy Green wrote:
    > On Tue, 22 Jul 2008 17:46:21 +0100, Tom Anderson
    > <> wrote, quoted or indirectly quoted someone
    > who
    > said :
    >
    >>> I stupidly did not benchmark before the changes.

    >>
    >> No, but you have the old version in source control, right?

    >
    >
    > I used this method to optimise the initial sizes of the
    > StringBuilders
    > used in the static macros program that expands the macros used to
    > generate the mindprod.com website. It does a great many
    > StringBuilder.
    > appends, though it also does a fair bit of i/o as well, since it has
    > to read each file in the website. Here are the results:
    >
    > Effect of StringBuilder Optimising
    > Time Before Optimising Time After Optimising % improvement
    > Sun 27.5 sec 24 sec 13%
    > Jet 25 sec 22.5 sec 10%


    That is quite significant indeed. Thanks.
     
    Mike Schilling, Jul 23, 2008
    #10
  11. Tom Anderson

    Tom Anderson Guest

    On Tue, 22 Jul 2008, Mike Schilling wrote:

    > Roedy Green wrote:
    >> On Tue, 22 Jul 2008 17:46:21 +0100, Tom Anderson
    >> <> wrote, quoted or indirectly quoted someone
    >> who
    >> said :
    >>
    >>>> I stupidly did not benchmark before the changes.
    >>>
    >>> No, but you have the old version in source control, right?

    >>
    >> I used this method to optimise the initial sizes of the StringBuilders
    >> used in the static macros program that expands the macros used to
    >> generate the mindprod.com website. It does a great many StringBuilder.
    >> appends, though it also does a fair bit of i/o as well, since it has to
    >> read each file in the website. Here are the results:
    >>
    >> Effect of StringBuilder Optimising
    >> Time Before Optimising Time After Optimising % improvement
    >> Sun 27.5 sec 24 sec 13%
    >> Jet 25 sec 22.5 sec 10%

    >
    > That is quite significant indeed. Thanks.


    Yes, thanks. This is interesting stuff.

    Roedy, how did you do the analysis? You showed us the routine which
    detects wrong-sized buffers, but where do you call it from? Every time you
    do a StringBuffer.toString()?

    I'm wondering if a more convenient way would be to hack the StringBuffer
    class itself. You can get the source code from the JDK, modify it, then
    use -Xbootclasspath/p to get it loaded in place of the normal
    StringBuffer.

    You could alter the class to remember its initial size, then to log its
    initial size and final number of characters in its toString, along with a
    bit of stack trace to see where it's being used.

    You could do other stuff too, like counting the number of times toString
    gets called, then having a finalizer which looks at the number, and if
    it's zero, logs the fact that the buffer was never toStringed. You could
    probably think of other things to check too.

    Er, and in the above, s/StringBuffer/StringBuilder/, as appropriate.
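
A lighter-weight variation on the same idea, for code you control, is a thin wrapper instead of a patched JDK class. StringBuilder is final, so it cannot be subclassed; the sketch below delegates to one instead. Everything here (the class TrackedBuilder, its LOG list, the message format) is invented for illustration, and it covers only the logging and toString-counting parts of the suggestion, not the finalizer.

```java
import java.util.ArrayList;
import java.util.List;

public class TrackedBuilder {

    /** Collected reports; a real version might write to a file instead. */
    public static final List<String> LOG = new ArrayList<>();

    private final StringBuilder sb;
    private final int initSize;
    private int toStringCalls;

    public TrackedBuilder(int initSize) {
        this.initSize = initSize;
        this.sb = new StringBuilder(initSize);
    }

    public TrackedBuilder append(String s) {
        sb.append(s);
        return this;
    }

    /** Number of times toString has been called so far. */
    public int toStringCalls() {
        return toStringCalls;
    }

    @Override
    public String toString() {
        toStringCalls++;
        // Frame 0 is this method; frame 1 is whoever called toString,
        // which may be a JDK method if the call was implicit.
        StackTraceElement[] es = new Throwable().getStackTrace();
        String caller = es.length > 1
                ? es[1].getClassName() + "." + es[1].getMethodName()
                  + " line:" + es[1].getLineNumber()
                : "unknown";
        LOG.add(caller + " initial=" + initSize + " final=" + sb.length());
        return sb.toString();
    }
}
```

The drawback compared with replacing the JDK class is that every call site has to be changed to use the wrapper, so this only suits code you are already editing.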

    tom

    --
    We can only see a short distance ahead, but we can see plenty there that
    needs to be done. -- Alan Turing
     
    Tom Anderson, Jul 23, 2008
    #11
  12. Roedy Green

    Roedy Green Guest

    On Wed, 23 Jul 2008 14:35:44 +0100, Tom Anderson
    <> wrote, quoted or indirectly quoted someone who
    said :

    >
    >Roedy, how did you do the analysis? You showed us the routine which
    >detects wrong-sized buffers, but where do you call it from? Every time you
    >to a StringBuffer.toString()?


    I inserted a call just prior to StringBuilder.toString. This generated
    output any time the actual size was outside my estimated range. I
    then typically adjusted the range and the initial size. In a few
    cases I put in code to precalculate the precise size such as this:

    /**
     * Display macro name and parms on error.
     *
     * @return parms as a human-readable string.
     */
    private String showParms()
    {
        // macroName and parms may be null.
        if ( macroName == null )
        {
            macroName = "?";
        }

        // get exact estimate of size of string we will build.
        int estSize = " <!-- macro ".length()
                      + macroName.length()
                      + "\n".length();

        if ( parms != null )
        {
            for ( String parm : parms )
            {
                estSize += " {".length() + parm.length() +
                           "}\n".length();
            }
        }
        estSize += " -->".length();

        final StringBuilder sb = new StringBuilder( estSize );

        sb.append( " <!-- macro " );
        sb.append( macroName );
        sb.append( '\n' );
        if ( parms != null )
        {
            for ( String parm : parms )
            {
                sb.append( " {" );
                sb.append( parm );
                sb.append( "}\n" );
            }
        }
        sb.append( " -->" );
        Tools.checkStringBuilderEstimate( sb, estSize, estSize );
        return sb.toString();
    }
    --

    Roedy Green Canadian Mind Products
    The Java Glossary
    http://mindprod.com
     
    Roedy Green, Jul 23, 2008
    #12