Passing a Method Name to a Method, Redux

Discussion in 'Java' started by Gene Wirchenko, Jun 24, 2011.

  1. Dear Java'ers:

    I have completed my benchmarking. The code is below. Note that
    the difference in the bodies between ParseSequentialSearch(),
    ParseBinarySearch(), and ParseTreesetSearch() is but one line. I
    really would have preferred not having to duplicate the code.

    Oddly, the timings have a LOT of noise in them. In some runs, a
    sequential search has out-performed a binary search. Occasionally, a
    sequential search has beaten both a binary search and a Treeset
    search. The times for sequential searching are only a bit worse than
    for binary searching. Treeset searching is about 20% faster. Any
    explanations?

    I had to kludge this:
    cIdent=""+CurrChar;
    cIdent is a String, CurrChar is a char.
    cIdent=CurrChar;
    does not compile.

    So how would you have written this benchmark?

    ***** Start of Code *****
    // TimingTesting
    // Timing Testing of Character Searching
    // Last Modification: 2011-06-23



    import java.util.*;



    class TimingTesting
    {

    static String cParseString=
    "//identifier//IDENTIFIER//a_b_c abc123
    4b5%$__dbl;one;two;three;END";

    static String IdentChars=
    "0123456789"+
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"+
    "_"+
    "abcdefghijklmnopqrstuvwxyz"; // sorted order!

    static SortedSet<Character> IdentCharsSet=new TreeSet<Character>();

    static int nRepetitions=1000000;



    static boolean SequentialSearch
    (
    char CurrChar
    )
    {
    boolean fFound=false;
    for (int i=0; i<IdentChars.length() && !fFound; i++)
    fFound=IdentChars.charAt(i)==CurrChar;
    return fFound;
    }



    static boolean BinarySearch
    (
    char CurrChar
    )
    {
    int xLow=0;
    int xHigh=IdentChars.length()-1;
    int xTry;
    boolean fFound=false;
    while (xLow<=xHigh)
    {
    xTry=(xLow+xHigh)/2;
    if (CurrChar==IdentChars.charAt(xTry))
    return true;
    if (CurrChar<IdentChars.charAt(xTry))
    xHigh=xTry-1;
    else
    xLow=xTry+1;
    }
    return false;
    }



    static boolean TreesetSearch
    (
    char CurrChar
    )
    {
    return IdentCharsSet.contains(CurrChar);
    }



    static void ParseSequentialSearch()
    {
    int xScan=0;
    boolean fBuildingIdent=false;
    boolean fInIdentChars;
    String cIdent=""; // fussy init
    while (xScan<cParseString.length())
    {
    char CurrChar=cParseString.charAt(xScan);
    fInIdentChars=SequentialSearch(CurrChar);
    if (SequentialSearch(CurrChar)) // different code
    if (fBuildingIdent)
    cIdent+=CurrChar;
    else
    {
    fBuildingIdent=true;
    cIdent=""+CurrChar;
    }
    else
    if (fBuildingIdent)
    {
    fBuildingIdent=false;
    if (nRepetitions==1)
    System.out.println(cIdent);
    }
    else
    {}
    xScan++;
    }
    if (fBuildingIdent)
    if (nRepetitions==1)
    System.out.println(cIdent);
    }



    static void ParseBinarySearch()
    {
    int xScan=0;
    boolean fBuildingIdent=false;
    boolean fInIdentChars;
    String cIdent=""; // fussy init
    while (xScan<cParseString.length())
    {
    char CurrChar=cParseString.charAt(xScan);
    fInIdentChars=SequentialSearch(CurrChar);
    if (SequentialSearch(CurrChar)) // different code
    if (fBuildingIdent)
    cIdent+=CurrChar;
    else
    {
    fBuildingIdent=true;
    cIdent=""+CurrChar;
    }
    else
    if (fBuildingIdent)
    {
    fBuildingIdent=false;
    if (nRepetitions==1)
    System.out.println(cIdent);
    }
    else
    {}
    xScan++;
    }
    if (fBuildingIdent)
    if (nRepetitions==1)
    System.out.println(cIdent);
    }



    static void ParseTreesetSearch()
    {
    int xScan=0;
    boolean fBuildingIdent=false;
    boolean fInIdentChars;
    String cIdent=""; // fussy init
    while (xScan<cParseString.length())
    {
    char CurrChar=cParseString.charAt(xScan);
    fInIdentChars=SequentialSearch(CurrChar);
    if (TreesetSearch(CurrChar)) // different code
    if (fBuildingIdent)
    cIdent+=CurrChar;
    else
    {
    fBuildingIdent=true;
    cIdent=""+CurrChar;
    }
    else
    if (fBuildingIdent)
    {
    fBuildingIdent=false;
    if (nRepetitions==1)
    System.out.println(cIdent);
    }
    else
    {}
    xScan++;
    }
    if (fBuildingIdent)
    if (nRepetitions==1)
    System.out.println(cIdent);
    }



    public static void main(String[] args)
    {
    int i;

    long StartTime;
    long EndTime;
    long Duration;

    System.out.println("Timing Testing of Character Searching");
    System.out.println();

    // Initialise Set.
    for (i=0; i<IdentChars.length(); i++)
    IdentCharsSet.add(IdentChars.charAt(i));

    // Character Sequential
    System.out.print("Character Sequential Search");
    StartTime=System.nanoTime();
    for (i=1; i<=nRepetitions; i++)
    ParseSequentialSearch();
    EndTime=System.nanoTime();
    Duration=EndTime-StartTime;
    System.out.println(" Duration="+Duration);

    // Character Binary Search
    System.out.print("Character Binary Search ");
    StartTime=System.nanoTime();
    for (i=1; i<=nRepetitions; i++)
    ParseBinarySearch();
    EndTime=System.nanoTime();
    Duration=EndTime-StartTime;
    System.out.println(" Duration="+Duration);

    // Character Treeset
    System.out.print("Character Treeset Search ");
    StartTime=System.nanoTime();
    for (i=1; i<=nRepetitions; i++)
    ParseTreesetSearch();
    EndTime=System.nanoTime();
    Duration=EndTime-StartTime;
    System.out.println(" Duration="+Duration);
    }

    }
    ***** End of Code *****

    Sincerely,

    Gene Wirchenko
    Gene Wirchenko, Jun 24, 2011
    #1
    1. Advertising

  2. Gene Wirchenko

    Stefan Ram Guest

    1. Advertising

  3. On 23 Jun 2011 23:11:50 GMT, -berlin.de (Stefan Ram)
    wrote:

    >Gene Wirchenko <> writes:
    >>So how would you have written this benchmark?

    >
    >http://stackoverflow.com/questions/504103/how-do-i-write-a-correct-micro-benchmark-in-java
    >http://www.kdgregory.com/index.php?page=java.microBenchmark
    >http://www.ibm.com/developerworks/java/library/j-jtp02225/index.html
    >...


    Good links. Thank you.

    But my question was really about method calling to
    SequentialSearch(), BinarySearch(), and TreesetSearch().

    Sincerely,

    Gene Wirchenko
    Gene Wirchenko, Jun 24, 2011
    #3
  4. Gene Wirchenko

    markspace Guest

    On 6/23/2011 4:03 PM, Gene Wirchenko wrote:
    >
    > So how would you have written this benchmark?



    Um, realistically? Is this really what you want to do?

    static boolean TreesetSearch( char CurrChar ) {
    return IdentCharsSet.contains( CurrChar );
    }

    All of the identifiers in your "language" are single characters?

    I would have used actual strings, preferably from existing code so you
    could test performance. Although I appreciate you making a
    self-contained example for us'm here on usenet.
    markspace, Jun 24, 2011
    #4
  5. Gene Wirchenko

    markspace Guest

    On 6/23/2011 4:03 PM, Gene Wirchenko wrote:
    > search. The times for sequential searching are only a bit worse than
    > for binary searching. Treeset searching is about 20% faster. Any
    > explanations?



    Why does your TreeSetSearch() call SequentialSearch? Doesn't that
    defeat the purpose of your timing comparisons? I'm also not following
    the parsing you are doing at all. What is the goal of that method?

    static void ParseTreesetSearch()
    {
    int xScan = 0;
    boolean fBuildingIdent = false;
    boolean fInIdentChars;
    String cIdent = ""; // fussy init
    while( xScan < cParseString.length() ) {
    char CurrChar = cParseString.charAt( xScan );
    fInIdentChars = SequentialSearch( CurrChar );
    ^^^^^^^^^^^^^^^^
    if( TreesetSearch( CurrChar ) ) // different code


    Odd call indicated above. Doesn't that just do the exact same thing
    that TreesetSearch does?
    markspace, Jun 24, 2011
    #5
  6. Gene Wirchenko

    markspace Guest

    Some other things I'd do:


    1. Make life easier on the Mark I eyeball and print out your timing in
    seconds:

    System.out.println( " Duration=" + (Duration/1e9) );


    2. Specify your command line arguments. I found this affected the
    timing greatly (over 100% faster than in the IDE) .


    C:\Users\Brenden\Dev\Test2\src>javac -g:none test\CallingTest.java

    C:\Users\Brenden\Dev\Test2\src>
    C:\Users\Brenden\Dev\Test2\src>java -server test.CallingTest
    Timing Testing of Character Searching

    Character Sequential Search Duration=22.251718931
    Character Binary Search Duration=21.492850347
    Character Treeset Search Duration=19.472623928
    Hash Test Duration=0.098741719

    C:\Users\Brenden\Dev\Test2\src>
    markspace, Jun 24, 2011
    #6
  7. On 6/23/2011 4:03 PM, Gene Wirchenko wrote:
    > Dear Java'ers:
    >
    > I have completed my benchmarking. The code is below. Note that
    > the difference in the bodies between ParseSequentialSearch(),
    > ParseBinarySearch(), and ParseTreesetSearch() is but one line. I
    > really would have preferred not having to duplicate the code.
    >
    > Oddly, the timings have a LOT of noise in them. In some runs, a
    > sequential search has out-performed a binary search. Occasionally, a
    > sequential search has beaten both a binary search and a Treeset
    > search. The times for sequential searching are only a bit worse than
    > for binary searching. Treeset searching is about 20% faster. Any
    > explanations?


    Here is probably the best implementation:

    static boolean[] isIdentifierChar;
    static String slowSearch;
    static void initIdentifierChars(String validChars) {
    isIdentifierChar = new boolean[128];
    for (char c : validChars.toCharArray())
    if (c < 128)
    isIdentifierChar[c] = true;
    slowSearch = validChars;
    }

    static boolean isValidChar(char c) {
    if (c >= 128)
    return slowSearch.indexOf(c) >= 0;
    return isIdentifieChar[c];
    }

    If it's really performance critical, you can just make the character
    array the full 64K characters and skip the search, so testing for valid
    identifiers is simply an index lookup.

    There are other performance improvements:
    1. Use StringBuilder. It's mutable, so it avoids copies (kind of like
    ArrayList). Heck, that's what your code is doing, it just keeps copying
    back and forth between strings.

    2. It's probably better to throw out StringBuilder and use token extents
    (i.e., from index 5 to 9 is this identifier). It saves copying and can
    allow for better error messages in parsers by being able to pinpoint
    line/column numbers accurately.

    But yeah, I would have coded my benchmark like that. Since you seem to
    be concerned with microoptimization here, anything which looks different
    from what the real parser implementation would show up as having impact
    in timing numbers.

    Finally, I might add, if you're writing a full parser, it might be
    better to just use an existing Java lexer rather than rolling your own
    unless you have a *very* good reason not to.

    --
    Beware of bugs in the above code; I have only proved it correct, not
    tried it. -- Donald E. Knuth
    Joshua Cranmer, Jun 24, 2011
    #7
  8. On Thu, 23 Jun 2011 17:34:50 -0700, markspace <-@.> wrote:

    >On 6/23/2011 4:03 PM, Gene Wirchenko wrote:
    >> search. The times for sequential searching are only a bit worse than
    >> for binary searching. Treeset searching is about 20% faster. Any
    >> explanations?

    >
    >
    >Why does your TreeSetSearch() call SequentialSearch? Doesn't that


    I goofed.

    >defeat the purpose of your timing comparisons? I'm also not following
    >the parsing you are doing at all. What is the goal of that method?


    I am parsing for identifiers. In this test, I just throw them
    away.

    [snip]

    >Odd call indicated above. Doesn't that just do the exact same thing
    >that TreesetSearch does?


    Cut and paste did me in. Thank you for catching this.

    Sincerely,

    Gene Wirchenko
    Gene Wirchenko, Jun 24, 2011
    #8
  9. On Thu, 23 Jun 2011 17:24:43 -0700, markspace <-@.> wrote:

    >On 6/23/2011 4:03 PM, Gene Wirchenko wrote:
    >>
    >> So how would you have written this benchmark?


    >Um, realistically? Is this really what you want to do?
    >
    > static boolean TreesetSearch( char CurrChar ) {
    > return IdentCharsSet.contains( CurrChar );
    > }


    Yes. I wanted a simple method call in the parser so I could
    cut-and-paste. I did not know if I would need more than one call. I
    am going to go with a Treeset so I will not have a separate method in
    the implementation.

    >All of the identifiers in your "language" are single characters?


    No. An identifier is a sequence of one or more characters that
    are in IdentChars (or IdentCharsSet).

    >I would have used actual strings, preferably from existing code so you
    >could test performance. Although I appreciate you making a
    >self-contained example for us'm here on usenet.


    I wanted an example. I have test files for when I have this
    worked out.

    Sincerely,

    Gene Wirchenko
    Gene Wirchenko, Jun 24, 2011
    #9
  10. On Thu, 23 Jun 2011 18:30:21 -0700, markspace <-@.> wrote:

    >Some other things I'd do:


    >1. Make life easier on the Mark I eyeball and print out your timing in
    >seconds:
    >
    > System.out.println( " Duration=" + (Duration/1e9) );


    Yes, if this were not just test code.

    >2. Specify your command line arguments. I found this affected the
    >timing greatly (over 100% faster than in the IDE) .


    There are none.

    >C:\Users\Brenden\Dev\Test2\src>javac -g:none test\CallingTest.java
    >
    >C:\Users\Brenden\Dev\Test2\src>
    >C:\Users\Brenden\Dev\Test2\src>java -server test.CallingTest
    >Timing Testing of Character Searching
    >
    >Character Sequential Search Duration=22.251718931
    >Character Binary Search Duration=21.492850347
    >Character Treeset Search Duration=19.472623928
    >Hash Test Duration=0.098741719

    ^^^^^^^^^
    What is this one, please?

    Sincerely,

    Gene Wirchenko
    Gene Wirchenko, Jun 24, 2011
    #10
  11. Gene Wirchenko

    markspace Guest

    On 6/23/2011 7:48 PM, Gene Wirchenko wrote:
    > On Thu, 23 Jun 2011 18:30:21 -0700, markspace<-@.> wrote:
    >
    >> Some other things I'd do:

    >
    >> 1. Make life easier on the Mark I eyeball and print out your timing in
    >> seconds:
    >>
    >> System.out.println( " Duration=" + (Duration/1e9) );

    >
    > Yes, if this were not just test code.


    Test code is probably more important to get right than the actual
    production code. You'll be using the test code more than the production. ;)

    I have another question. This code here, most of this is debugging?
    And you take the entire string and concatenate the characters together,
    but you don't actually parse this into tokens:


    while( xScan < cParseString.length() ) {
    char CurrChar = cParseString.charAt( xScan );
    fInIdentChars = SequentialSearch( CurrChar );
    if( TreesetSearch( CurrChar ) ) // different code
    {
    if( fBuildingIdent ) {
    cIdent += CurrChar;
    } else {
    fBuildingIdent = true;
    cIdent = "" + CurrChar;
    }
    } else if( fBuildingIdent ) {
    fBuildingIdent = false;
    if( nRepetitions == 1 ) {
    System.out.println( cIdent );
    }
    } else {
    }
    xScan++;
    }

    Your just double checking that SequentialSearch and TreesetSearch return
    the same thing? And you concat the whole cParseString into cIdent, you
    don't look for white space or anything and break cParseString into tokens?

    Don't worry about Hash Test, I have something faster now.

    BTW I refactored your test that you were copy-and-pasting around into
    one method. Using techniques I mentioned in my first post to you on
    this subject.

    private static void time( TestCase r ) {
    long StartTime = System.nanoTime();
    for( int i = 1; i <= nRepetitions; i++ ) {
    r.parse();
    }
    long EndTime = System.nanoTime();
    long Duration = EndTime - StartTime;
    System.out.println( " Duration=" + (Duration/1e9) );
    }
    markspace, Jun 24, 2011
    #11
  12. Gene Wirchenko

    lewbloch Guest

    markspace wrote:
    > ...
    > BTW I refactored your test that you were copy-and-pasting around into
    > one method.  Using techniques I mentioned in my first post to you on
    > this subject.
    >
    >      private static void time( TestCase r ) {
    >          long StartTime = System.nanoTime();
    >          for( int i = 1; i <= nRepetitions; i++ ) {
    >             r.parse();
    >          }
    >          long EndTime = System.nanoTime();
    >          long Duration = EndTime - StartTime;
    >          System.out.println( " Duration=" + (Duration/1e9) );
    >      }


    You'll want to run each loop a bunch of times (10,000? 100,000? >=
    1M?) before starting the timing loop in order to cancel the effects of
    HotSpot warmup.

    Unless your real-world scenario pretty much guarantees that HotSpot
    won't be a factor.

    Micro-benchmarks in Java are, at best, a dicey basis for any
    performance conclusions.

    --
    Lew
    lewbloch, Jun 24, 2011
    #12
  13. Gene Wirchenko

    markspace Guest

    On 6/24/2011 8:38 AM, lewbloch wrote:

    > You'll want to run each loop a bunch of times (10,000? 100,000?>=
    > 1M?) before starting the timing loop in order to cancel the effects of
    > HotSpot warmup.



    I was hoping that the -server flag would obviate most or all warm-up
    (this is from another post in this tread):

    C:\Users\Brenden\Dev\Test2\src>java -server test.CallingTest

    No?
    markspace, Jun 24, 2011
    #13
  14. On Thu, 23 Jun 2011 21:02:59 -0700, markspace <-@.> wrote:

    >On 6/23/2011 7:48 PM, Gene Wirchenko wrote:
    >> On Thu, 23 Jun 2011 18:30:21 -0700, markspace<-@.> wrote:
    >>
    >>> Some other things I'd do:

    >>
    >>> 1. Make life easier on the Mark I eyeball and print out your timing in
    >>> seconds:
    >>>
    >>> System.out.println( " Duration=" + (Duration/1e9) );

    >>
    >> Yes, if this were not just test code.

    >
    >Test code is probably more important to get right than the actual
    >production code. You'll be using the test code more than the production. ;)


    No. I expect that I will be using the resulting preprocessor for
    years. The test code will be tossed shortly.

    >I have another question. This code here, most of this is debugging?
    >And you take the entire string and concatenate the characters together,
    >but you don't actually parse this into tokens:


    Yes, I do. The only tokens that I am interested in are
    identifiers.

    [snip]

    >Your just double checking that SequentialSearch and TreesetSearch return
    >the same thing? And you concat the whole cParseString into cIdent, you
    >don't look for white space or anything and break cParseString into tokens?


    No. I am just after the timing.

    The only tokens that I am interested in are the identifiers. The
    identifiers are defined as sequences of characters in IdentChars or
    IdentCharsSet. The rest of the input will be echoed.

    >Don't worry about Hash Test, I have something faster now.


    But *I* want something faster. That was the whole pioont of this
    testing: to find out which way was fastest.

    [snip]

    Sincerely,

    Gene Wirchenko
    Gene Wirchenko, Jun 24, 2011
    #14
  15. On Thu, 23 Jun 2011 19:36:53 -0700, Joshua Cranmer
    <> wrote:

    >On 6/23/2011 4:03 PM, Gene Wirchenko wrote:
    >> Dear Java'ers:
    >>
    >> I have completed my benchmarking. The code is below. Note that
    >> the difference in the bodies between ParseSequentialSearch(),
    >> ParseBinarySearch(), and ParseTreesetSearch() is but one line. I
    >> really would have preferred not having to duplicate the code.
    >>
    >> Oddly, the timings have a LOT of noise in them. In some runs, a
    >> sequential search has out-performed a binary search. Occasionally, a
    >> sequential search has beaten both a binary search and a Treeset
    >> search. The times for sequential searching are only a bit worse than
    >> for binary searching. Treeset searching is about 20% faster. Any
    >> explanations?

    >
    >Here is probably the best implementation:


    Nope. I want a configurable character set for identifiers

    [snip]

    >There are other performance improvements:
    >1. Use StringBuilder. It's mutable, so it avoids copies (kind of like
    >ArrayList). Heck, that's what your code is doing, it just keeps copying
    >back and forth between strings.


    Noted.

    >2. It's probably better to throw out StringBuilder and use token extents
    >(i.e., from index 5 to 9 is this identifier). It saves copying and can
    >allow for better error messages in parsers by being able to pinpoint
    >line/column numbers accurately.


    I plan to for the real thing. I have other changes as well.

    >But yeah, I would have coded my benchmark like that. Since you seem to
    >be concerned with microoptimization here, anything which looks different
    >from what the real parser implementation would show up as having impact
    >in timing numbers.


    Nah, I just do not have a sense of how fast various Java things
    are. Benchmark to find out.

    >Finally, I might add, if you're writing a full parser, it might be
    >better to just use an existing Java lexer rather than rolling your own
    >unless you have a *very* good reason not to.


    I am writing a *simple* parser. It is not for grovelling over
    Java code. It is for a preprocessor for SQL Server for better code
    management. I mean for it to be fairly language-agnostic.

    Sincerely,

    Gene Wirchenko
    Gene Wirchenko, Jun 24, 2011
    #15
  16. Gene Wirchenko

    Guest

    In article <iu129n$d0q$>, markspace <-@.> wrote:
    > On 6/23/2011 7:48 PM, Gene Wirchenko wrote:
    > > On Thu, 23 Jun 2011 18:30:21 -0700, markspace<-@.> wrote:


    [ snip ]

    > BTW I refactored your test that you were copy-and-pasting around into
    > one method. Using techniques I mentioned in my first post to you on
    > this subject.
    >
    > private static void time( TestCase r ) {
    > long StartTime = System.nanoTime();
    > for( int i = 1; i <= nRepetitions; i++ ) {
    > r.parse();
    > }
    > long EndTime = System.nanoTime();
    > long Duration = EndTime - StartTime;
    > System.out.println( " Duration=" + (Duration/1e9) );
    > }


    Gene --

    Below is my revision, along similar lines to what markspace suggests
    above but a complete program program based on your code. I tried not
    to make gratuitous additional changes, but I admit that I did move the
    declaration of the loop counter into the "for" construct because, well,
    *one* gratuitous change to make the program look less Java-unidiomatic?
    (Also I admit I ran the whole program through vim's "reindent", but
    that should change only whitespace.)

    Non-whitespace changes marked "// (blmblm)".


    import java.util.*;



    public class TimingTesting2
    {

    // (blmblm) added
    interface Searcher {
    boolean search(char CurrChar);
    }

    static String cParseString=
    "//identifier//IDENTIFIER//a_b_c abc1234b5%$__dbl;one;two;three;END";

    static String IdentChars=
    "0123456789"+
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"+
    "_"+
    "abcdefghijklmnopqrstuvwxyz"; // sorted order!

    static SortedSet<Character> IdentCharsSet=new TreeSet<Character>();

    static int nRepetitions=1000000;



    static boolean SequentialSearch
    (
    char CurrChar
    )
    {
    boolean fFound=false;
    for (int i=0; i<IdentChars.length() && !fFound; i++)
    fFound=IdentChars.charAt(i)==CurrChar;
    return fFound;
    }



    static boolean BinarySearch
    (
    char CurrChar
    )
    {
    int xLow=0;
    int xHigh=IdentChars.length()-1;
    int xTry;
    boolean fFound=false;
    while (xLow<=xHigh)
    {
    xTry=(xLow+xHigh)/2;
    if (CurrChar==IdentChars.charAt(xTry))
    return true;
    if (CurrChar<IdentChars.charAt(xTry))
    xHigh=xTry-1;
    else
    xLow=xTry+1;
    }
    return false;
    }



    static boolean TreesetSearch
    (
    char CurrChar
    )
    {
    return IdentCharsSet.contains(CurrChar);
    }


    // (blmblm) merge three separate Parse routines
    static void Parse(Searcher searcher)
    {
    int xScan=0;
    boolean fBuildingIdent=false;
    boolean fInIdentChars;
    String cIdent=""; // fussy init
    while (xScan<cParseString.length())
    {
    char CurrChar=cParseString.charAt(xScan);
    fInIdentChars=SequentialSearch(CurrChar);
    if (searcher.search(CurrChar)) // (blmblm)
    if (fBuildingIdent)
    cIdent+=CurrChar;
    else
    {
    fBuildingIdent=true;
    cIdent=""+CurrChar;
    }
    else
    if (fBuildingIdent)
    {
    fBuildingIdent=false;
    if (nRepetitions==1)
    System.out.println(cIdent);
    }
    else
    {}
    xScan++;
    }
    if (fBuildingIdent)
    if (nRepetitions==1)
    System.out.println(cIdent);
    }


    // (blmblm) factor out common code, as suggested by markspace
    static void time(Searcher searcher) {
    long StartTime;
    long EndTime;
    long Duration;

    StartTime=System.nanoTime();
    // (blmblm) gratuitous change (declare loop counter within "for")
    for (int i=1; i<=nRepetitions; i++)
    Parse(searcher);
    EndTime=System.nanoTime();
    Duration=EndTime-StartTime;
    System.out.println(" Duration="+Duration);
    }


    public static void main(String[] args)
    {
    System.out.println("Timing Testing of Character Searching");
    System.out.println();

    // Initialise Set.
    // (blmblm) gratuitous change (declare loop counter within "for")
    for (int i=0; i<IdentChars.length(); i++)
    IdentCharsSet.add(IdentChars.charAt(i));

    // Character Sequential
    System.out.print("Character Sequential Search");
    // (blmblm) use common-code method
    time(new Searcher() {
    public boolean search(char CurrChar) {
    return SequentialSearch(CurrChar);
    }
    });

    // Character Binary Search
    System.out.print("Character Binary Search ");
    // (blmblm) use common-code method
    time(new Searcher() {
    public boolean search(char CurrChar) {
    return BinarySearch(CurrChar);
    }
    });

    // Character Treeset
    System.out.print("Character Treeset Search ");
    // (blmblm) use common-code method
    time(new Searcher() {
    public boolean search(char CurrChar) {
    return TreesetSearch(CurrChar);
    }
    });
    }

    }

    --
    B. L. Massingill
    ObDisclaimer: I don't speak for my employers; they return the favor.
    , Jun 24, 2011
    #16
  17. Gene Wirchenko

    markspace Guest

    On 6/24/2011 11:45 AM, Gene Wirchenko wrote:

    > No. I expect that I will be using the resulting preprocessor for
    > years. The test code will be tossed shortly.


    You have a couple of problems with your code, one organizational and the
    other understanding the effeciencies.

    The organizational one relates to the idea that you'll just toss your
    tests away. Don't ever do that! The test code is part of the project,
    and should remain with it. Test code is also put under code control,
    and managed along with the projects. It's important because every time
    you want to change your parser, you'll need to re-run the tests to make
    sure everything is working.

    Are you using an IDE? Most will auto generate a test framework for you.
    It's very handy and you should be doing this regardless how you write
    code. The IDE just makes it very handy.

    > No. I am just after the timing.


    The other thing, efficiency, I'll show you right now. The
    organizational stuff is actually probably a bigger deal, but I think
    you'll be happy to see how to make code faster.

    This line here is the biggest offender.

    > cIdent += CurrChar;


    This is super inefficient inside a loop. To do this, the system has to
    create a new string with one extra character, and then toss away the old
    string. Making a new object and tossing an old one is bound to slow you
    down.

    final public void parse() {
    StringBuilder sb1 = new StringBuilder( 255 );
    for( int xScan = 0; xScan <
    TimingTesting.cParseString.length(); xScan++ ) {
    char c = TimingTesting.cParseString.charAt( xScan );
    if( find( c ) ) {
    sb1.append( c );
    }
    }
    String ... = sb1.toString()

    Here's my adaptation of your loop. Notice I make a StringBuilder once,
    outside the loop, and call append() inside the loop, which is much much
    faster. Then I call toString once outside the loop again, so I only
    create a new String once, not each time inside the loop. Try to
    refactor your code to do this, it will make it much faster.


    One last thing for now: on splitting a string into tokens, look at this:

    String[] tok = TimingTesting.cParseString.split( "[^a-zA-Z0-9]+" );
    System.out.println( Arrays.toString( tok ) );
    markspace, Jun 24, 2011
    #17
  18. Gene Wirchenko

    Jeff Higgins Guest

    On 06/24/2011 02:50 PM, Gene Wirchenko wrote:
    > On Thu, 23 Jun 2011 19:36:53 -0700, Joshua Cranmer


    >> Finally, I might add, if you're writing a full parser, it might be
    >> better to just use an existing Java lexer rather than rolling your own
    >> unless you have a *very* good reason not to.

    >
    > I am writing a *simple* parser. It is not for grovelling over
    > Java code. It is for a preprocessor for SQL Server for better code
    > management. I mean for it to be fairly language-agnostic.
    >

    What means simple? JavaCC is the parser generator that I'm most familiar
    with. <http://javacc.java.net/>
    Jeff Higgins, Jun 24, 2011
    #18
  19. On 6/24/2011 11:50 AM, Gene Wirchenko wrote:
    > On Thu, 23 Jun 2011 19:36:53 -0700, Joshua Cranmer
    > <> wrote:
    >> Here is probably the best implementation:

    >
    > Nope. I want a configurable character set for identifiers


    You can configure it. You configure it by calling the first method. If
    you don't like it, you can have people configure it by setting the array
    value themselves. Note that this is a similar implementation to your
    "TreeSet" idea, just using a raw array instead of a TreeSet.

    If you are really concerned about speeds, it is a higher-cost initial
    overhead and an asymptotically and practically faster per-character
    lookup time.

    --
    Beware of bugs in the above code; I have only proved it correct, not
    tried it. -- Donald E. Knuth
    Joshua Cranmer, Jun 24, 2011
    #19
  20. Gene Wirchenko

    Lew Guest

    On 06/24/2011 12:04 PM, markspace wrote:
    > On 6/24/2011 8:38 AM, lewbloch wrote:
    >
    >> You'll want to run each loop a bunch of times (10,000? 100,000?>=
    >> 1M?) before starting the timing loop in order to cancel the effects of
    >> HotSpot warmup.

    >
    >
    > I was hoping that the -server flag would obviate most or all warm-up (this is
    > from another post in this tread):
    >
    > C:\Users\Brenden\Dev\Test2\src>java -server test.CallingTest
    >
    > No?


    No. How could it?

    --
    Lew
    Honi soit qui mal y pense.
    http://upload.wikimedia.org/wikipedia/commons/c/cf/Friz.jpg
    Lew, Jun 26, 2011
    #20
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Neo Geshel
    Replies:
    5
    Views:
    422
    Neo Geshel
    Mar 13, 2006
  2. kc
    Replies:
    3
    Views:
    8,337
  3. Replies:
    34
    Views:
    1,249
    Neredbojias
    Jun 3, 2006
  4. legere

    SICP in Python, redux

    legere, Jun 29, 2003, in forum: Python
    Replies:
    0
    Views:
    1,767
    legere
    Jun 29, 2003
  5. Guido van Rossum
    Replies:
    21
    Views:
    591
    Steven Bethard
    Jun 6, 2005
Loading...

Share This Page