Immutable Datastructures with good Sharing

J

Jan Burse

markspace said:
Cloning or copying has got to be slow.

Yes, thats why I wrote (21:33):

"When you clone you need to go through the whole array and copy
each element, this takes time. When you share, you don't need
to spend this time."

Probably there is a problem with the newsreader
used by you guys. I see this over and over by
posts here from various people, they only use
a very small view of the thread as context.

Bye
 
J

Jan Burse

Jussi said:
There is a book titled PURELY FUNCTIONAL DATA
STRUCTURES by Chris Okasaki.

Instead of buying the book, one might have a look
at his thesis. Found here:

http://www.cs.cmu.edu/~rwh/theses/okasaki.pdf

I had a look yesterday. And had the feeling that some
parts are a little bit over the top for implementation
in Java, namely the lazy data structure stuff.

The lazy data structure stuff would need to
first figure out a way to easily and efficiently
implement this in Java, maybe for some special
cases only. Guess some closure would be needed.

But otherwise you find the double stack
queue again there.

Bye
 
A

Arved Sandstrom

Instead of buying the book, one might have a look
at his thesis. Found here:

http://www.cs.cmu.edu/~rwh/theses/okasaki.pdf

I had a look yesterday. And had the feeling that some
parts are a little bit over the top for implementation
in Java, namely the lazy data structure stuff.

The lazy data structure stuff would need to
first figure out a way to easily and efficiently
implement this in Java, maybe for some special
cases only. Guess some closure would be needed.

But otherwise you find the double stack
queue again there.

Bye

Okasaki is a great read (I have the hard copy myself), but his work on
functional data structures (either his online thesis or the book) is -
oddly enough - about data structures in *functional* languages. :) It's
not necessary for getting the general concepts and implementing in an
imperative language like Java [1]. And in order to get the most out of
Okasaki's work one had best have a better than adequate grasp of
university-level CS-type math. If you can immediately understand him
(assuming you read the prior material) when he proclaims that:

"We have just shown that lazy evaluation is necessary to implement
amortized data structures purely functionally."

[Start of section 3.3.2 in online thesis; start of Section 6.2.2 in
book] then you're doing alright. :)

Given your questions, might I suggest looking at
http://java.dzone.com/articles/fast-immutable-persistent? This guy is
working in Groovy. I won't vouch for the quality of his code but he
appears to be talking about exactly what interests you.

I expect that you have been looking at the Eric Lippert links that Peter
Duniho has pointed you at. These are useful articles.

AHS

1. Java is imperative OOP, Scala and F# can be functional OOP if you try
hard enough.
--
You should know the problem before you try to solve it.
Example: When my son was three he cried about a problem with his hand. I
kissed it several times and asked him about the problem. He peed on his
hand.
-- Radia Perlman, inventor of spanning tree protocol
 
J

Jan Burse

Arved said:
"We have just shown that lazy evaluation is necessary to implement
amortized data structures purely functionally."

I you strike out the phrase "purely functionally", then I
read it as "sufficient".

I can imagine that a little destructive mechanism under
the hood of a datastructure, does not need to violate it
beeing immutable. Here is an example, take the hashCode()
function java.lang.String. Usually the value is cached, so
code reads:

if (hash==0) {
/* compute the hash in h */
hash = h;
}
return hash;

This does not destroy immutability of the String. So I
would be interested in Java solutions for the problem
at hand. And this solution could be inspired by solutions
coming from the purely functional world, but they don't
need to be based on purely functional programming.

The only requirement is "immutable" and "good sharing".

Bye
 
A

Andreas Leitgeb

Jan Burse said:
Thanks for the link, looks like he is implementing
the queue with two stacks, with the extra that
he has addFirst() and addLast(), besides only enQueue().

But then again, the queue(s) before and after re-arranging the
stacks do not share much data...
 
J

Jan Burse

Andreas said:
But then again, the queue(s) before and after re-arranging the
stacks do not share much data...

Thats why I don't particularly like this solutions,
and hope that there will come up some other suggestions...
 
J

Jan Burse

Andreas said:
But then again, the queue(s) before and after re-arranging the
stacks do not share much data...

Thats why I don't particularly like this solutions,
and hope that there will come up some other suggestions...

There is only a sharing between subsequent enQueue(),
and also between subsequent deQueue(). But as soon as
a deQueue() its the empty stack the other stack is reversed.

Well that is not bad, but a little bit arbitrary. Maybe
there are datastructures where something happens based
on some bound, similarly like the elements allocated ahead
in an ArrayList. When you create an ArrayList it will have a
default allocation of 10 elements (implementation dependent)
for use, and it then grows by 150% (implementation dependent)
when additional elements are needed.

But an ArrayList is not immutable. You cannot keep a pointer
on it, it will get modified by subsequent operations...

Bye
 
G

Giovanni Azua

Hello,

I haven't read/followed the whole OP Thread discussion but I thought it was
worth mentioning/recommending the Google "Guava Project":
http://code.google.com/p/guava-libraries/

I think they really "went to town" on that one. You have strongly typed
Immutable Collection definitions for all Java Collection types e.g.
ImmutableMap.Builder, ImmutableList.Builder. The design is awesome e.g. the
Builder pattern as prescribed in Effective Java (last edition) and the
performance gains are noticeable as well e.g. In a distributed system I have
been working on lately, switching the attribute instances of the DTO Beans
from ArrayList Java Collection to use instead the implementation from Guava
gave some noticeable 15-20% Serialization performance gain e.g. Test case
involving 1-Echo-Server and 1K-Clients.

I really enjoy the improvement in code readability as well, it suits the
appropriate template types e.g.

// from
List<RequestData> requests = new ArrayList<RequestData>();

// to
List<RequestData> requests = Lists.newArrayList();

Guava really rocks! Would this be the Java Collections design debt now
finally paid by Joshua Bloch?

HTH,
Best regards,
Giovanni
 
J

Jan Burse

Giovanni said:
Hello,

I haven't read/followed the whole OP Thread discussion but I thought it was
worth mentioning/recommending the Google "Guava Project":
http://code.google.com/p/guava-libraries/

I think they really "went to town" on that one. You have strongly typed
Immutable Collection definitions for all Java Collection types e.g.
ImmutableMap.Builder, ImmutableList.Builder. The design is awesome e.g. the
Builder pattern as prescribed in Effective Java (last edition) and the
performance gains are noticeable as well e.g. In a distributed system I have
been working on lately, switching the attribute instances of the DTO Beans
from ArrayList Java Collection to use instead the implementation from Guava
gave some noticeable 15-20% Serialization performance gain e.g. Test case
involving 1-Echo-Server and 1K-Clients.

I really enjoy the improvement in code readability as well, it suits the
appropriate template types e.g.

// from
List<RequestData> requests = new ArrayList<RequestData>();

// to
List<RequestData> requests = Lists.newArrayList();

Guava really rocks! Would this be the Java Collections design debt now
finally paid by Joshua Bloch?

HTH,
Best regards,
Giovanni

Thank you very much for the hint. Just googled the google
code library for "Immutable". Found another project as well:

http://code.google.com/p/pcollections/

And one based on Scala:

http://code.google.com/p/scala-deque/

And some not projects which do not yet show some download.

Bye
 
J

Jan Burse

Giovanni said:
I think they really "went to town" on that one. You have strongly typed
Immutable Collection definitions for all Java Collection types e.g.
ImmutableMap.Builder, ImmutableList.Builder.

Somebody here, who could throw a spot light on how they do a Queue?

Bye
 
G

Giovanni Azua

Hi Jan,

Thank you very much for the hint. Just googled the google
code library for "Immutable". Found another project as well:

http://code.google.com/p/pcollections/

And one based on Scala:

http://code.google.com/p/scala-deque/

And some not projects which do not yet show some download.

Bye
Be careful with what you re-use. What I recommended is __Google's core
libraries__ "The Guava project contains several of Google's core libraries
that we rely on in our Java-based projects: collections, caching, primitives
support, concurrency libraries, common annotations, string processing, I/O,
and so forth."

What you found are OS projects from independent non-Google developers that
are just _hosted_ in Google code. I haven't used those and can't recommend
them. The difference might be somewhat similar to watching Hollywood movies
vs. Alternative "artsy" low cost movies :) so to speak.

HTH,
Best regards,
Giovanni
 
J

Jan Burse

Giovanni said:
Be careful with what you re-use. What I recommended is __Google's core
libraries__ "The Guava project contains several of Google's core libraries
that we rely on in our Java-based projects: collections, caching, primitives
support, concurrency libraries, common annotations, string processing, I/O,
and so forth."

I don't need all this, I only need
some collection.
What you found are OS projects from independent non-Google developers that
are just _hosted_ in Google code. I haven't used those and can't recommend
them. The difference might be somewhat similar to watching Hollywood movies
vs. Alternative "artsy" low cost movies so to speak.

Well being a Googler might imply profes-
sionality. But being a non-Googler then
does not necessarily imply non-profes-
sionality.

This is the fallacy of:

If

A -> B

Then

~A -> ~B

You can easily verify this conclusion
to be false by a truth table.

Bye
 
A

Arne Vajhøj

I don't need all this, I only need
some collection.


Well being a Googler might imply profes-
sionality. But being a non-Googler then
does not necessarily imply non-profes-
sionality.

This is the fallacy of:

If

A -> B

Then

~A -> ~B

You can easily verify this conclusion
to be false by a truth table.

I think that was exactly what Giovanni said.

Arne
 
L

Lew

Giovanni said:
I haven't read/followed the whole OP Thread discussion but I thought it was
worth mentioning/recommending the Google "Guava Project":
http://code.google.com/p/guava-libraries/

I think they really "went to town" on that one. You have strongly typed
Immutable Collection definitions for all Java Collection types e.g.
ImmutableMap.Builder, ImmutableList.Builder. The design is awesome e.g. the
Builder pattern as prescribed in Effective Java (last edition) and the
performance gains are noticeable as well e.g. In a distributed system I have
been working on lately, switching the attribute instances of the DTO Beans
from ArrayList Java Collection to use instead the implementation from Guava
gave some noticeable 15-20% Serialization performance gain e.g. Test case
involving 1-Echo-Server and 1K-Clients.

I really enjoy the improvement in code readability as well, it suits the
appropriate template types e.g.

// from
List<RequestData> requests = new ArrayList<RequestData>();

// to
List<RequestData> requests = Lists.newArrayList();

Guava really rocks! Would this be the Java Collections design debt now
finally paid by Joshua Bloch?

I'm not saying anything against Guava, but I fail to see the advantage of that particular idiom. 'Lists.newArrayList' vs. 'new ArrayList<>' - eh, mezza mezz'.

Whatever floats your boat.
 
A

Arne Vajhøj

I'm not saying anything against Guava, but I fail to see the advantage of that particular idiom. 'Lists.newArrayList' vs. 'new ArrayList<>' - eh, mezza mezz'.

Whatever floats your boat.

I guess it is Block consider factory over constructor.

But I om with you on this one.

I don't understand this weird trend to try and write "new free"
code.

Arne
 
T

Tom Anderson

I haven't read/followed the whole OP Thread discussion but I thought it was
worth mentioning/recommending the Google "Guava Project":
http://code.google.com/p/guava-libraries/

I think they really "went to town" on that one. You have strongly typed
Immutable Collection definitions for all Java Collection types e.g.
ImmutableMap.Builder, ImmutableList.Builder.

If you had read the whole thread, or indeed just the OP's initial post,
you would understand that these classes are not helpful. They are
certainly immutable, they don't have the sharing properties that Jan is
after. They also don't expose any methods for doing the quasi-mutation he
needs to do.

To recap, Jan wants to be able to say:

ImmutableStack<String> a = new ImmutableStack<String>(); // empty
ImmutableStack<String> b = a.push("foo");
assert a.isEmpty();
assert b.size() == 1;

That is, the quasi-mutation creates a new instance exhibiting the change,
rather than changing an existing instance.

The Guava Immutable* classes do not have such methods.
The design is awesome e.g. the Builder pattern as prescribed in
Effective Java (last edition)

I find the builder pattern immensely annoying, myself.
and the performance gains are noticeable as well e.g. In a distributed
system I have been working on lately, switching the attribute instances
of the DTO Beans from ArrayList Java Collection to use instead the
implementation from Guava gave some noticeable 15-20% Serialization
performance gain e.g. Test case involving 1-Echo-Server and 1K-Clients.

That is a respectable gain!
I really enjoy the improvement in code readability as well, it suits the
appropriate template types e.g.

// from
List<RequestData> requests = new ArrayList<RequestData>();

// to
List<RequestData> requests = Lists.newArrayList();

I see absolutely no improvement in the code here, only unhelpful
obfuscation.

tom
 
J

Jan Burse

Jan said:
Dear All,

Value objects are some well known immutable
datastructures, when an operation is applied,
typically simply a new object is created. So
for example doing:

3 = 1 + 2

With java.lang.Integer objects would involve
creating a new Integer with value left.intValue()
+ right.intValue().

But instead of call the constructor, one can also call
Integer.valueOf(), and there you have some sharing.
Namely the class Integer caches the value objects for
small values (at least the last time I looked into
the Oracle source, it was there...).

More elaborate sharing is seen in the String class
by the substring() operation. This operation does
reuse the original character array and only encapsulates
the offset and the length in a new object. I wouldnt
recomende that to get one character from a 1 MB
long string, since it will prevent the original
character array from GC. But in some situation this
is quite handy.

I am now looking for immutable datastructures with
a similar sharing. In the String class example, the
operation was substring() that produced a new immutable
object. I am looking for:

Some stack class, where: (Easy)
pop() creates a new immutable stack
push() creates a new immutable stack
With good sharing.

Some queue class, where: (Hard?)
enqueue() creates a new immutable queue
dequeue() creates a new immutable queue
With good sharing.

What is your favorite datastructure?

Bye

Here is some explanation how an immutable set would work,
based on some tree where cloning is only performed along
the insert or delete path:

http://stackoverflow.com/questions/3233473/immutable-data-structures-performance

Could I derive a queue from this?

Bye
 
J

Jan Burse

Tom said:
I see absolutely no improvement in the code here, only unhelpful
obfuscation.

I guess it requires JDK 1.7 or so for the parameter
type inference. Not sure, picked something up like
that...
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,774
Messages
2,569,598
Members
45,149
Latest member
Vinay Kumar Nevatia0
Top