Symbolic benchmark

J

Jon Harrop

Following recent discussions here about benchmarking and when Java's
performance is worst, I proposed the idea of symbolic computation because
this entails very high allocation rates for short-lived objects when
written in an idiomatic functional style and Java should be uniquely bad at
this. While you might be able to work around the functional style using
imperative constructs, it will be extremely tedious and error-prone for
such applications.

So, perhaps someone would like to translate this simple symbolic simplifier
into Java:

http://www.lambdassociates.org/studies/study10.htm

Here is the OCaml implementation:

let rec ( +: ) f g = match f, g with
| `Int n, `Int m -> `Int (n +/ m)
| `Int (Int 0), e | e, `Int (Int 0) -> e
| f, `Add(g, h) -> f +: g +: h
| f, g -> `Add(f, g)

let rec ( *: ) f g = match f, g with
| `Int n, `Int m -> `Int (n */ m)
| `Int (Int 0), e | e, `Int (Int 0) -> `Int (Int 0)
| `Int (Int 1), e | e, `Int (Int 1) -> e
| f, `Mul(g, h) -> f *: g *: h
| f, g -> `Mul(f, g)

let rec simplify = function
| `Int _ | `Var _ as f -> f
| `Add (f, g) -> simplify f +: simplify g
| `Mul (f, g) -> simplify f *: simplify g

OCaml is not terribly efficient at this benchmark but I believe it will be
many times faster than any Java solution.

The program simply traverses a symbolic expression, reconstructing it whilst
applying some simple rewrite rules to simplify it. For example, "(a+0)*1*b"
becomes "a*b". The applications are obvious: parsers, compilers,
interpreters, scientific computing and computer algebra.

Note that the symbolic expression is immutable: you must not destroy your
input! Also, numbers are all arbitrary precision integers.
 
L

Lew

Jon said:
Following recent discussions here about benchmarking and when Java's
performance is worst, I proposed the idea of symbolic computation because
this entails very high allocation rates for short-lived objects when
written in an idiomatic functional style and Java should be uniquely bad at
this.

Java is very good at high allocation rates for short-lived objects; it's
optimized for it. Don't believe Jon's rants, people.
 
J

Jon Harrop

Lew said:
Java is very good at high allocation rates for short-lived objects; it's
optimized for it. Don't believe Jon's rants, people.

Prove it by implementing that benchmark in Java.
 
D

Daniel Pitts

Jon said:
Prove it by implementing that benchmark in Java.
Actually, why don't you prove it (or disprove it). I'm sure most of the
Java advocates here have better things to do than chase your wild goose.
 
J

Jon Harrop

Daniel said:
Actually, why don't you prove it (or disprove it). I'm sure most of the
Java advocates here have better things to do than chase your wild goose.

Because you would only blame Java's poor performance on me rather than Java.
 
J

Jon Harrop

Steve said:
Since we've been talking about C/C++ and Java, got a version in C or C++?
:)

I posted something similar here (for derivatives rather than
simplification):

http://www.codecodex.com/wiki/index.php?title=Derivative

but this problem really requires a GC to do it properly because you must be
able to handle shared subexpressions.

#include <iostream>

using namespace std;

class Var;

class Base {
public:
virtual ~Base() {};
virtual const Base *clone() = 0;
virtual const Base *d(const string &v) const = 0;
virtual ostream &print(ostream &o) const = 0;
};

ostream &operator<<(ostream &o, const Base &e) { e.print(o); return o; }

class Int : public Base {
const int n;
public:
Int(int m) : n(m) {}
~Int() {}
const Base *clone() { return new Int(n); }
const Base *d(const string &v) const { return new Int(0); }
ostream &print(ostream &o) const { return o << n; }
};

class Var : public Base {
const string var;
public:
Var(string v) : var(v) {}
~Var() {}
const Base *clone() { return new Var(var); }
const Base *d(const string &v) const { return new Int(var == v ? 1 : 0); }
ostream &print(ostream &o) const { return o << var; }
};

class Plus : public Base {
const Base *e1, *e2;
public:
Plus(const Base *a, const Base *b) : e1(a), e2(b) {}
~Plus() { delete e1; delete e2; }
const Base *clone() { return new Plus(e1, e2); }
const Base *d(const string &v) const { return new Plus(e1->d(v),
e2->d(v)); }
ostream &print(ostream &o) const
{ return o << "(" << *e1 << " + " << *e2 << ")"; }
};

class Times : public Base {
const Base *e1, *e2;
public:
Times(const Base *a, const Base *b) : e1(a), e2(b) {}
~Times() { delete e1; delete e2; }
const Base *clone() { return new Times(e1, e2); }
const Base *d(const string &v) const
{ return new Plus(new Times(e1, e2->d(v)), new Times(e1->d(v), e2)); }
ostream &print(ostream &o) const { return o << "(" << *e1 << " * " << *e2
<< ")"; }
};

class Expr {
public:
Base *e;
Expr(Base *a) : e(a) {}
};

const Expr operator+(const Expr e1, const Expr e2)
{ return Expr(new Plus(e1.e->clone(), e2.e->clone())); }
const Expr operator*(const Expr e1, const Expr e2)
{ return Expr(new Times(e1.e->clone(), e2.e->clone())); }

ostream &operator<<(ostream &o, const Expr e) { return o << e.e; }

int main() {
Var vx("x"), va("a"), vb("b"), vc("c");
Expr x(&vx), a(&va), b(&vb), c(&vc);
Expr e = a*x*x + b*x + c;
cout << "d(" << *(e.e) << ", x) = " << *(e.e->d("x")) << endl;
return 0;
}
 
L

Lew

Jon said:
Because you would only blame Java's poor performance on me rather than Java.

Oh, so when it's time to shit or get off the pot, you just attack the person
and (likely inaccurately) predict bad behavior on their part to excuse your
lack of intellectual honesty, Mr. Cambridge Don.
 
J

Jon Harrop

Lew said:
Oh, so when it's time to shit or get off the pot, you just attack the
person and (likely inaccurately) predict bad behavior on their part to
excuse your lack of intellectual honesty, Mr. Cambridge Don.

Sounds like you can't write an efficient Java implementation of this
benchmark either.
 
R

Roger Lindsjö

Jon said:
Following recent discussions here about benchmarking and when Java's
performance is worst, I proposed the idea of symbolic computation because
this entails very high allocation rates for short-lived objects when
written in an idiomatic functional style and Java should be uniquely bad at
this. While you might be able to work around the functional style using
imperative constructs, it will be extremely tedious and error-prone for
such applications.

So, perhaps someone would like to translate this simple symbolic simplifier
into Java:

I'll post the code tonight. I have not tried to reduce the line count.

My program first prints the expression, then run 10 loops with 10000000
iterations and prints the duration for each loop. At the end it prints
the final reduction. The machine is a 2.4GHz Core 2 Quad.

[* x [+ [+ [* 12 0] [+ 23 8]] y]]
1355
1281
1275
1273
1277
1284
1279
1279
1272
1274
[* x [+ 31 y]]

I'm used to Ocaml, but it looked like Haskell, so I think my simplify is
correct.
The program simply traverses a symbolic expression, reconstructing it whilst
applying some simple rewrite rules to simplify it. For example, "(a+0)*1*b"
becomes "a*b". The applications are obvious: parsers, compilers,
interpreters, scientific computing and computer algebra.

Note that the symbolic expression is immutable: you must not destroy your
input! Also, numbers are all arbitrary precision integers.

The last rule, that it should handle arbitrary precision integers made
me use BigIntegers, with int the program ran about twice as fast.

//Roger Lindsjö
 
D

Daniel Pitts

Jon said:
Sounds like you can't write an efficient Java implementation of this
benchmark either.
What's with the personal attacks here? No one accused you of being unable.

I personally said I was unwilling to at this time.
 
R

Roger Lindsjö

Roger said:
I'm used to Ocaml, but it looked like Haskell, so I think my simplify is
correct.

Of course I meant to say "I'm not used to Ocaml..."

//Roger Lindsjö
 
R

Roger Lindsjö

Before being too enthusiastic, I did not have the associative rules
implemented, now my times are about 1.6 seconds for 10000000 iterations.

Also, while testing I notice that with the rules (at least the way I
implemented them) [+ [+ x 3] [+ 1 y]] would get simplified to
[+ [+ [+ x 3] 1] y]. Jon, it that true for your implementation also? (No
need posting code if it does not fully work).

Jon, feel free to email me, just use my first name instead of news.nospam.

//Roger Lindsjö
 
J

Jon Harrop

Roger said:
Before being too enthusiastic, I did not have the associative rules
implemented, now my times are about 1.6 seconds for 10000000 iterations.

Also, while testing I notice that with the rules (at least the way I
implemented them) [+ [+ x 3] [+ 1 y]] would get simplified to
[+ [+ [+ x 3] 1] y]. Jon, it that true for your implementation also? (No
need posting code if it does not fully work).

That is the same, yes.
 
C

Chris Dollin

Jon said:
Following recent discussions here about benchmarking and when Java's
performance is worst, I proposed the idea of symbolic computation because
this entails very high allocation rates for short-lived objects when
written in an idiomatic functional style and Java should be uniquely bad at
this. While you might be able to work around the functional style using
imperative constructs, it will be extremely tedious and error-prone for
such applications.

So, perhaps someone would like to translate this simple symbolic simplifier
into Java:

http://www.lambdassociates.org/studies/study10.htm

[[CAVEAT: hacked together over a lunchtime or so, in no way any
official statement of anything from anyone, etc.]]

OK, my broken-handed implementation running on a 3.06GHz Xeon with 2GB
memory (not that that matters much, since the JVM is only using the
default which is more like 64M) using FC5 with

Java(TM) SE Runtime Environment (build 1.6.0-b105)
Java HotSpot(TM) Client VM (build 1.6.0-b105, mixed mode, sharing)

and running (of course) with -server takes 2.498s, 2.94s, 3.002s,
for the 10000000-times loop (ie neglecting start-up times, which
according to `time` are about 0.2s). I don't know why there's so
much variability in the reported times.

Drat. OK, those times are using BigDecimal, not BigInteger, which
gets me 3.258s, 3.281s, 3.207s as the loop times. (Using plain
`int` gets those times down to 1.986s, 1.62s, 1.987s, as a measure
of the work done by the big-integer arithmetics.)

-server about halves the runtime. My initial -server-free code took
about 9s [with BigDecimal], but some straightforward optimisation work
(none of which comes close to the obvious caching cheat; just reordering
the tests and ensuring that the important values 0 and 1 were testable
with ==) reduced that.

I tried the obvious tactic of going for double-dispatches to avoid
multiple consecutive instanceof tests, but that made it /slower/,
perhaps because hotspot couldn't collapse code across method
boundaries? I don't really mind because in this case the multiple
dispatches made the code rather more long-winded than it already
was.

Count the line-size yourselves and subtract lines you deem not
relevant to the benchmark.

;;; -- code starteth here ----------------------------------

package temp;

import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertSame;

import java.math.BigDecimal;
import java.util.StringTokenizer;

import org.junit.Test;

/*
let rec ( +: ) f g = match f, g with
| `Int n, `Int m -> `Int (n +/ m)
| `Int (Int 0), e | e, `Int (Int 0) -> e
| f, `Add(g, h) -> f +: g +: h
| f, g -> `Add(f, g)

let rec ( *: ) f g = match f, g with
| `Int n, `Int m -> `Int (n * / m)
| `Int (Int 0), e | e, `Int (Int 0) -> `Int (Int 0)
| `Int (Int 1), e | e, `Int (Int 1) -> e
| f, `Mul(g, h) -> f *: g *: h
| f, g -> `Mul(f, g)

let rec simplify = function
| `Int _ | `Var _ as f -> f
| `Add (f, g) -> simplify f +: simplify g
| `Mul (f, g) -> simplify f * : simplify g

*/
public class SimplifyViaInstanceOf
{ @test public void ensureParseCompleteConstants()
{
assertSame( zero, parse( "0" ) );
assertSame( one, parse( "1" ) );
assertEquals( new Val( new BigDecimal( "17" ) ), parse( "17" ) );
assertEquals( new Val( new BigDecimal( "1234567890" ) ), parse( "1234567890" ) );
}

@test public void ensureParseCompleteNames()
{
assertEquals( new Sym( "x" ), parse( "x" ) );
assertEquals( new Sym( "y" ), parse( "y" ) );
assertEquals( new Sym( "calico" ), parse( "calico" ) );
assertEquals( new Sym( "whoot" ), parse( "whoot" ) );
}

@test public void ensureParseSimpleExpression()
{
assertEquals( new Add( new Sym( "x" ), new Sym( "y" ) ), parse( "(x+y)" ) );
assertEquals( new Add( one, new Sym( "y" ) ), parse( "(1+y)" ) );
assertEquals( new Add( zero, new Sym( "y" ) ), parse( "(0+y)" ) );
assertEquals( new Add( new Sym( "x" ), zero ), parse( "(x+0)" ) );
assertEquals( new Mul( new Sym( "x" ), new Sym( "y" ) ), parse( "(x*y)" ) );
assertEquals( new Mul( new Sym( "z" ), zero ), parse( "(z*0)" ) );
assertEquals( new Mul( zero, new Sym( "x" ) ), parse( "(0*x)" ) );
}

@test public void ensureParseCompoundExpression()
{
assertEquals( new Mul( parse( "(x+y)" ), one ), parse( "((x+y)*1)" ) );
assertEquals( new Mul( parse( "(x+y)" ), zero ), parse( "((x+y)*0)" ) );
assertEquals( new Mul( parse( "(x+y)" ), new Sym( "x" ) ), parse( "((x+y)*x)" ) );
assertEquals( new Mul( parse( "(x+y)"), parse( "(0*1)" ) ), parse( "((x+y)*(0*1))" ) );
}

private Exp parse( String string )
{
StringTokenizer st = new StringTokenizer( string, "+*()", true );
return parse( st, st.nextToken() );
}

private Exp parse( StringTokenizer st, String current )
{
char ch = current.charAt(0);
if (ch == '(')
{
Exp L = parse( st, st.nextToken() );
ch = st.nextToken().charAt( 0 );
Exp R = parse( st, st.nextToken() );
String ket = st.nextToken();
if (!ket.equals( ")" )) throw new RuntimeException( "ket needed: " + ket );
return ch == '*' ? new Mul( L, R ) : new Add( L, R );
}
else if (Character.isDigit( ch ))
return Val.create( new BigDecimal( current ) );
else if (current.equals( "*" ) || current.equals( "+" ) || current.equals( " )" ))
throw new RuntimeException( "misplaced punctuation: " + current );
else
return new Sym( current );
}

@test public void ensureSimplifiesAddZero()
{
assertEquals( parse( "x" ), parse( "(x+0)" ).simplify() );
assertEquals( parse( "x" ), parse( "(0+x)" ).simplify() );
}

@test public void ensureAddsLiterals()
{
assertEquals( parse( "3" ), parse( "(1+2)" ).simplify() );
assertSame( zero, parse( "0+0" ).simplify() );
}

@test public void ensureMultipliesLiterals()
{
assertEquals( parse( "15" ), parse( "(5*3)" ).simplify() );
assertSame( zero, parse( "(0*0)" ).simplify() );
assertSame( zero, parse( "(x*0)" ).simplify() );
assertSame( zero, parse( "(0*y)" ).simplify() );
}

@test public void ensureSimplifiesMulZero()
{
assertSame( zero, parse( "(x*0)" ).simplify() );
assertSame( zero, parse( "(0*x)" ).simplify() );
}

@test public void ensureSimplifiesMulOne()
{
assertEquals( parse( "x" ), parse( "(x*1)" ).simplify() );
assertEquals( parse( "x" ), parse( "(1*x)" ).simplify() );
}

@test public void ensureLeftAssociatesAdds()
{
assertEquals( parse( "((x+y)+z)" ), parse( "(x+(y+z))" ).simplify() );
}

@test public void ensureLeftAssociatesMuls()
{
assertEquals( parse( "((x*y)*z)" ), parse( "(x*(y*z))" ).simplify() );
}

static abstract class Exp
{
abstract Exp simplify();
}

static final Val zero = new Val( BigDecimal.ZERO );
static final Val one = new Val( BigDecimal.ONE );

static class Val extends Exp
{
final BigDecimal val;

Val( BigDecimal val ) { this.val = val; }

Exp simplify() { return this; }

public String toString() { return val.toString(); }

public boolean equals( Object other )
{ return other instanceof Val && val.equals( ((Val) other).val ); }

public static Exp create( BigDecimal d )
{ return d.equals( BigDecimal.ZERO ) ? zero : d.equals( BigDecimal.ONE ) ? one : new Val( d ); }
}

static class Sym extends Exp
{
String name;
Sym( String name ) { this.name = name; }
Exp simplify() { return this; }
public boolean equals( Object other )
{ return other instanceof Sym && name.equals( ((Sym) other ).name ); }
public String toString() { return name; }
}

static abstract class Inf extends Exp
{
final Exp L, R;
Inf( Exp L, Exp R ) { this.L = L; this.R = R; }
}

static class Add extends Inf
{
Add( Exp L, Exp R ) { super( L, R ); }
Exp simplify()
{
return sum( L.simplify(), R.simplify() );
}
private Exp sum( Exp sL, Exp R )
{
if (sL ==zero) return R;
if (R == zero) return sL;
if (sL instanceof Val && R instanceof Val) return Val.create( ((Val) sL).val.add( ((Val) R).val ) );
if (R instanceof Add) return sum( sum( sL, ((Add) R).L ), ((Add) R).R );
return new Add( sL, R );
}
public String toString() { return "(" + L + " + " + R + ")"; }

public boolean equals( Object other )
{ return other instanceof Add && same( (Add) other ); }

private boolean same( Add other )
{ return L.equals( other.L ) && R.equals( other.R ); }
}

static class Mul extends Inf
{
Mul( Exp L, Exp R ) { super( L, R ); }
Exp simplify(){ return prod( L.simplify(), R.simplify() ); }
private Exp prod( Exp L, Exp R )
{
if (L == zero || R ==zero) return zero;
if (L == one) return R;
if (R == one) return L;
if (L instanceof Val && R instanceof Val) return Val.create( ((Val) L).val.multiply( ((Val) R).val ) );
if (R instanceof Mul) return prod( prod( L, ((Mul) R).L ), ((Mul) R).R );
return new Mul( L, R );
}
public String toString() { return "(" + L + " * " + R + ")"; }

public boolean equals( Object other )
{ return other instanceof Mul && same( (Mul) other ); }

private boolean same( Mul other )
{ return L.equals( other.L ) && R.equals( other.R ); }
}

public static void main( String [] args )
{
// [* x [+ [+ [* 12 0] [+ 23 8]] y]]
Exp a28_8 = new Add( Val.create( new BigDecimal( 23 ) ), Val.create( new BigDecimal( 8 ) ) );
Exp m12_0 = new Mul( Val.create( new BigDecimal( 12 ) ), Val.create( new BigDecimal( 0 ) ) );
Exp aAB = new Add( m12_0, a28_8 );
Exp aABY = new Add( aAB, new Sym( "y" ) );
Exp it = new Mul( new Sym( "x" ), aABY );
Exp itx = new SimplifyViaInstanceOf().parse( "(x*(((12*0)+(23+8))+y))" );
System.err.println( ">> original expression: " + it );
System.err.println( ">> original expression: " + itx );
System.err.println( ">> simplified expression: " + itx.simplify() );
long base = System.currentTimeMillis();
for (int i = 0; i < 10000000; i += 1) it.simplify();
long after = System.currentTimeMillis();
System.err.println( ">> took " + (after - base)/1000.0 + "s" );
}
}
 
J

Jon Harrop

Chris said:
OK, my broken-handed implementation running on a 3.06GHz Xeon with 2GB
memory (not that that matters much, since the JVM is only using the
default which is more like 64M) using FC5 with

Java(TM) SE Runtime Environment (build 1.6.0-b105)
Java HotSpot(TM) Client VM (build 1.6.0-b105, mixed mode, sharing)

and running (of course) with -server takes 2.498s, 2.94s, 3.002s,
for the 10000000-times loop (ie neglecting start-up times, which
according to `time` are about 0.2s). I don't know why there's so
much variability in the reported times.
Great.

Drat. OK, those times are using BigDecimal, not BigInteger, which
gets me 3.258s, 3.281s, 3.207s as the loop times. (Using plain
`int` gets those times down to 1.986s, 1.62s, 1.987s, as a measure
of the work done by the big-integer arithmetics.)

Let's forget about the arbitrary-precision stuff and focus on the rest for
now. On my machine, your original code gives 2.853. Changing to
machine-precision ints that drops to 1.508s. The equivalent OCaml takes
only 0.711s.
-server about halves the runtime. My initial -server-free code took
about 9s [with BigDecimal], but some straightforward optimisation work
(none of which comes close to the obvious caching cheat; just reordering
the tests and ensuring that the important values 0 and 1 were testable
with ==) reduced that.

Yes, the OCaml compiler will automatically factor common subexpressions like
the constant integer expressions.
I tried the obvious tactic of going for double-dispatches to avoid
multiple consecutive instanceof tests, but that made it /slower/,
perhaps because hotspot couldn't collapse code across method
boundaries? I don't really mind because in this case the multiple
dispatches made the code rather more long-winded than it already
was.
Interesting.

Count the line-size yourselves and subtract lines you deem not
relevant to the benchmark.

I get 175 LOC for Roger's Java code, 117 LOC for your Java and 40 LOC for my
OCaml.
 
R

Roger Lindsjö

I have been talking to Don off line and I used Chris Dollin's idea of
using cached values for 0 and 1. (Chris and my code were actually quite
similar from start, and on par for speed, but after using Cris ideo of
cached 0 and 1 my end up 10% faster or so). I have changed some names to
look more like Chris code to make them easier to compare. Initially I
did not use instanceof but instead had a type for each element which I
could use to check the type. This was actually slower than instanceof
(surprised me).

Using BigInteger or BigDecimal made no difference in my test, so I used
BigDecimal.

The machine I used was a 2.4GHz Core 2 Quad with 2GB memory, and while
running any of the programs they seemed to use 100% of one core.

java Simplify 10000000 3
[* x [+ [+ [* 12 0] [+ 23 8]] y]]
Bench 0
Took 1.16 seconds
Bench 1
Took 1.11 seconds
Bench 2
Took 1.104 seconds
[* x [+ 31 y]]

However, after installing Ocaml I compiled Dan's program and ran. The
output was (I modified it to run several times):
time ./simplify
Took 1.514769s
Took 1.515770s
Took 1.518769s

So now I get confused, according to Don this code should be about twice
as fast as the Java code, but in my tests Java is slightly faster. Why
is that? Perhaps I have the wrong compiler, mine says 3.09.3. There
seemed to be a 3.10 out, but is wasn't part of Fedora 7 regular
repository. Could upgrading to 3.10 actually make the program at least
twice as fast?

Can anyone explain? Could it be that on my architecture the HotSpot is
better at optimizing than the Ocaml compiler?

I then modified my program to use long instead of BigDecimal and got the
following:
java SimplifyLong 10000000 3
[* x [+ [+ [* 12 0] [+ 23 8]] y]]
Bench 0
Took 0.61 seconds
Bench 1
Took 0.549 seconds
Bench 2
Took 0.558 seconds
[* x [+ 31 y]]

Simplify.java
import java.math.BigDecimal;

public class Simplify {
public static void main(String[] args) {
int lim = Integer.parseInt(args[0]);
int nbench = Integer.parseInt(args[1]);

Expr expr =
mul(var("x"),
add(
add(mul(rat(12),rat(0)), add(rat(23),rat(8))),
var("y")
)
);

System.out.println(expr);
Expr reduced = null;
for (int batch = 0; batch < nbench; batch++) {
long start = System.currentTimeMillis();
System.err.println("Bench " + batch);
for (int i = 0; i < lim; i++) {
reduced = expr.simplify();
}
System.err.println("Took " +
(System.currentTimeMillis() - start) / 1000.0 + " seconds");
}
System.out.println(reduced);
}

private static Expr mul(Expr a, Expr b) {
return new Mul(a,b);
}
private static Expr add(Expr a, Expr b) {
return new Add(a,b);
}
private static Expr rat(long a) {
return Val.create(BigDecimal.valueOf(a));
}
private static Expr var(String s) {
return new Sym(s);
}

public static abstract class Expr {
public abstract Expr simplify();
}

public static final class Add extends Expr {
private final Expr l, r;

public Add(Expr l, Expr r) {
this.l = l;
this.r = r;
}

public Expr simplify() {
return add(l.simplify(), r.simplify());
}

private Expr add(Expr l, Expr r) {
if (Val.ZERO == l) {
return r;
} else if (Val.ZERO == r) {
return l;
} else if (l instanceof Val && r instanceof Val) {
return add((Val) l, (Val) r) ;
} else if (r instanceof Add) {
return add(l, (Add) r);
} else {
return new Add(l, r);
}
}

private Expr add(Val l, Val r) {
return Val.create(l.getValue().add(r.getValue()));
}

private Expr add(Expr l, Add r) {
return new Add(new Add(l, r.l), r.r).simplify();
}

public String toString() {
return "[+ " + l + " " + r +"]";
}
}

public static class Mul extends Expr {
private final Expr l, r;
public Mul(Expr l, Expr r) {
this.l = l;
this.r = r;
}

public Expr simplify() {
return mul(l.simplify(), r.simplify());
}

private Expr mul(Expr l, Expr r) {
if (Val.ZERO == l || Val.ZERO == r) {
return Val.ZERO;
} else if (Val.ONE == l) {
return r;
} else if (Val.ONE == r) {
return l;
} else if (l instanceof Val && r instanceof Val) {
return mul((Val) l, (Val) r) ;
} else if (r instanceof Mul) {
return mul(l, (Mul) r);
} else {
return new Mul(l, r);
}
}

private Expr mul(Val l, Val r) {
return Val.create(l.getValue().multiply(r.getValue()));
}

private Expr mul(Expr l, Mul r) {
return new Mul(new Mul(l, r.l), r.r).simplify();
}

public String toString() {
return "[* " + l + " " + r +"]";
}
}

public static class Sym extends Expr {
private final String symbol;

public Sym(String symbol) {
this.symbol = symbol;
}

public Expr simplify() {
return this;
}

public String toString() {
return symbol;
}
}

public static class Val extends Expr {
public static final Val ZERO = new Val(BigDecimal.ZERO);
public static final Val ONE = new Val(BigDecimal.ONE);

private final BigDecimal value;

private Val(BigDecimal value) {
this.value = value;
}

public Expr simplify() {
return this;
}

public static Val create(BigDecimal value) {
if (BigDecimal.ZERO == value || BigDecimal.ZERO.equals(value)) {
return ZERO;
} else if (BigDecimal.ONE == value && BigDecimal.ONE.equals(value)) {
return ONE;
} else {
return new Val(value);
}
}

public BigDecimal getValue() {
return value;
}

public String toString() {
return String.valueOf(value);
}
}
}

//Roger Lindsjö
 
S

Steve Wampler

Jon said:
Following recent discussions here about benchmarking and when Java's
performance is worst, I proposed the idea of symbolic computation because
this entails very high allocation rates for short-lived objects when
written in an idiomatic functional style and Java should be uniquely bad at
this. While you might be able to work around the functional style using
imperative constructs, it will be extremely tedious and error-prone for
such applications.

So, perhaps someone would like to translate this simple symbolic simplifier
into Java:

http://www.lambdassociates.org/studies/study10.htm

Here's another Java one. I didn't bother with parsing an
input string because it looks like people were timing
just the time to walk the parse tree and simplify it.

I've been hesitant to post it because it feels like cheating
(and the code isn't very OO - I actually tend to think in
a different language for these puzzles and then cast the
result into Java).

Here are times on my dual-Opteron@2GHz, using BigInteger,
each run does 10,000,000 simplifications and then outputs
the original string and the simplified version (taken
from the last run):

------------------------------------------------------
time java -server Simplify 10000000 10
10000000 '[* x [+ [+ [* 12 0][+ 23 8]] y ]]' -> '[* x [+ 31 y]]':: 0.0090
10000000 '[* x [+ [+ [* 12 0][+ 23 8]] y ]]' -> '[* x [+ 31 y]]':: 0.0
10000000 '[* x [+ [+ [* 12 0][+ 23 8]] y ]]' -> '[* x [+ 31 y]]':: 0.0
10000000 '[* x [+ [+ [* 12 0][+ 23 8]] y ]]' -> '[* x [+ 31 y]]':: 0.0
10000000 '[* x [+ [+ [* 12 0][+ 23 8]] y ]]' -> '[* x [+ 31 y]]':: 0.0
10000000 '[* x [+ [+ [* 12 0][+ 23 8]] y ]]' -> '[* x [+ 31 y]]':: 0.0
10000000 '[* x [+ [+ [* 12 0][+ 23 8]] y ]]' -> '[* x [+ 31 y]]':: 0.0
10000000 '[* x [+ [+ [* 12 0][+ 23 8]] y ]]' -> '[* x [+ 31 y]]':: 0.0
10000000 '[* x [+ [+ [* 12 0][+ 23 8]] y ]]' -> '[* x [+ 31 y]]':: 0.0010
10000000 '[* x [+ [+ [* 12 0][+ 23 8]] y ]]' -> '[* x [+ 31 y]]':: 0.0
java -server Simplify 10000000 10 0.11s user 0.03s system 103% cpu 0.135 total
------------------------------------------------------

Why so fast? Because the simplified string is built up as the parse
is constructed. So the simplification loop just outputs the result!
I'll grant that this isn't in the spirit of things, but I couldn't see
where it was prohibited :) After all, if you're allowed to simplify
while building the parse tree, this isn't much more of a stretch...

(Naturally, this would work when parsing of input strings is done instead
of hand-building the parse tree, as well.)

-----------------------------------------------------
import java.math.BigInteger;
import java.util.Date;

public class Simplify {

private static BigInteger zero = new BigInteger("0");
private static BigInteger one = new BigInteger("1");

private class Expr {
public String type;
public Expr left;
public Expr right;
public BigInteger iVal;
public String sVal;

public Expr(int nVal) {
type = "int";
iVal = new BigInteger(""+nVal);
sVal = iVal.toString();
}

public Expr(String var) {
type = "var";
sVal = var;
}

public Expr(String nType, Expr nLeft, Expr nRight) {
if (nType == "+") {
if (nLeft.eq(zero)) { mkCopy(nRight); }
else if (nRight.eq(zero)) { mkCopy(nLeft); }
else if ((nLeft.type == "int") && (nRight.type == "int")) {
type = "int";
iVal = nLeft.iVal.add(nRight.iVal);
sVal = iVal.toString();
}
else lassoc(nType, nLeft, nRight);
}
if (nType == "*") {
if (nLeft.eq(one)) { mkCopy(nRight); }
else if (nRight.eq(one)) { mkCopy(nLeft); }
else if ((nLeft.type == "int") && (nRight.type =="int")) {
type = "int";
iVal = nLeft.iVal.multiply(nRight.iVal);
sVal = iVal.toString();
}
else lassoc(nType, nLeft, nRight);
}
}

private void lassoc(String nType, Expr nLeft, Expr nRight) {
if (nRight.type == nType) {
type = nType;
left = new Expr(nType, nLeft, nRight.left);
right = nRight.right;
sVal = "["+nType+" "+left+" "+right+"]";
}
else mkCopy(nType, nLeft, nRight, zero,
"["+nType+" "+nLeft+" "+nRight+"]");
}

private void mkCopy(Expr e) {
mkCopy(e.type, e.left, e.right, e.iVal, e.sVal);
}

private void mkCopy(String nType, Expr nLeft, Expr nRight,
BigInteger nVal, String nsVal) {
type = nType;
left = nLeft;
right = nRight;
iVal = nVal;
sVal = nsVal;
}

public boolean eq(BigInteger i) {
return (type == "int") && (iVal.equals(i));
}

public String toString() { return sVal; }

public String simplify() { return sVal; }
}

public static String simplify(Expr expr) {
return expr.simplify();
}

public void runTest(int limit) {
String s = "[* x [+ [+ [* 12 0][+ 23 8]] y ]]";
System.out.print(""+limit+" '"+s+"' -> ");
Expr e = new Expr("*",
new Expr("x"),
new Expr("+",
new Expr("+",
new Expr("*", new Expr(12), new Expr(0) ),
new Expr("+", new Expr(23), new Expr(8) ) ) ,
new Expr("y") ) );
Date start = new Date();
String newS = null;
for (int i = 0; i < limit; ++i) {
newS = simplify(e);
}
Date stop = new Date();
System.out.println("'"+newS+"':: "+
(stop.getTime() - start.getTime())/1000.0);
}

public static void main(String[] args) {
int limit = 10000000;
if (args.length > 0) limit = Integer.parseInt(args[0]);
int nRuns = 10;
if (args.length > 1) nRuns = Integer.parseInt(args[1]);

Simplify simplify = new Simplify();
for (int i = 0; i < nRuns; ++i) {
simplify.runTest(limit);
}
System.exit(0);
}
}
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,769
Messages
2,569,582
Members
45,065
Latest member
OrderGreenAcreCBD

Latest Threads

Top