Class Constants - pros and cons


Alan Gutierrez

Lew said:
Andreas said:
On some deeper level, a relational DB seems to actually use the "separate
arrays" approach, too. Otherwise I cannot explain the relatively low cost
of adding another column to a table of 100 million entries already in it.

There's a big difference between a database with tens of thousands,
maybe far more, man-hours of theory, development, testing, user feedback,
optimization efforts, commercial competition and evolution behind it,
and an ad-hoc use of in-memory arrays by a solo programmer.

A database system is far, far more than a simple "separate arrays"
approach. There are B[+]-trees, caches, indexes, search algorithms,
stored procedures, etc., etc., etc. Your comment is like saying that
"on some deeper level" a steel-and-glass skyscraper is like the
treehouse you built for your kid in the back yard.

In other words, nobody ever got fired for buying IBM.
 

Alan Gutierrez

Andreas said:
Except, if the machine is already RAM-stuffed to its limits...

Even if the machine wasn't yet fully RAM'ed, then buying more RAM
*and* using the arrays kludge (yes, that's what it is, after all) would
allow even larger galaxies to be simulated.

The "RAM is cheaper than programmer time" argument is useful to salt the
tail of the newbie who seeks to dive down every micro-optimization
rabbit hole that he comes across on the path to the problems that truly
deserve such intense consideration. You have to admire the moxie of the
newbie who wants to catenate last name first as fast as possible, but
you explain to them that there are plenty of dragons to slay further
down the road.

It is not a good argument for someone who brings a problem that is truly
limited by available memory. Memory management is an appropriate
consideration for the problem. Memory management is the problem.

Memory procurement is the non-programmer solution. Throw money at it.
Scale up rather than scaling out, because we can scale up with cash, but
scaling out requires programmers who understand algorithms.

You're right that scaling up hits a foreseeable limit. I like to have
the limitations of my program be unforeseeable. That is, if I'm going to
read something into memory, say, every person in the world who would
loan money to me personally without asking questions, I'd like to know
that hitting the limits of the finite resource employed on a
contemporary computer system correlates to a situation in reality that
is unimaginable.

Moore's Law does not excuse brute force.

Which is why I am similarly taken aback to hear RAM prices quoted for
something that has obvious solutions in plain old Java.
Andreas said:
On some deeper level, a relational DB seems to actually use the "separate
arrays" approach, too. Otherwise I cannot explain the relatively low cost
of adding another column to a table of 100 million entries already in it.

On some deeper level, a relational database through an object relational
mapping layer will be paging information in and out of memory, on and
off of disk, as you need it. That is the feature you need to address
your memory problem.

Lately, I've been mucking about with `MappedByteBuffer`, so I imagine
for your (hypothetical) problem of modeling the Galaxy, you would model
it by keeping the primitives you describe in the `MappedByteBuffer` and
creating objects from them as needed. This is not `Flyweight` to my
mind. In `Flyweight` you keep objects that map to a finite set of
values, and these values are assembled into a larger structure in an
infinite number of permutations. These atomic components exist within
the larger structure, but they are reused. An interned `String` is a
flyweight to my mind.

I'm not sure what the pattern is for the short-term objectification of a
record, but that is a lot of what Hibernate is about. Making objecty
that which is stringy, just long enough for you to do your CRUD in the
security of your type-safe world.
Andreas said:
100% agree to these points.

You create a `Star` object that can read the information from a
`MappedByteBuffer` at a particular index, and you can simply change the
`read` and `write` methods of the star.

You've reached down to the deeper level of the ORM+RDBMS stack and
extracted the only design pattern you need to address the problem of
reading the Universe into memory.
 

Martin Gregorie

Alan said:
In other words, nobody ever got fired for buying IBM.

Regardless of what you might think of their business methods, and in the
past they didn't exactly smell of roses, their software quality control
and their hardware build quality are both hard to beat. I've used S/38
and AS/400 quite a bit and never found bugs in their system software or
lost work time due to hardware problems.

For elegant systems design ICL had them beat hands down, but although ICL
quality was OK by IT standards, the IBM kit was more reliable.

IME anyway.
 

Lew

Alan said:
The "RAM is cheaper than programmer time" argument is useful to salt the
tail of the newbie who seeks to dive down every micro-optimization
rabbit hole that he comes across on the path to the problems that truly
deserve such intense consideration. You have to admire the moxie of the
newbie who wants to catenate last name first as fast as possible, but
you explain to them that there are plenty of dragons to slay further
down the road.

It is not a good argument for someone who brings a problem that is truly
limited by available memory. Memory management is an appropriate
consideration for the problem. Memory management is the problem.

Memory procurement is the non-programmer solution. Throw money at it.
Scale up rather than scaling out, because we can scale up with cash, but
scaling out requires programmers who understand algorithms.

You're right that scaling up hits a foreseeable limit. I like to have
the limitations of my program be unforeseeable. That is, if I'm going to
read something into memory, say, every person in the world who would
loan money to me personally without asking questions, I'd like to know
that hitting the limits of the finite resource employed on a
contemporary computer system correlates to a situation in reality that
is unimaginable.

Moore's Law does not excuse brute force.

Which is why I am similarly taken aback to hear RAM prices quoted for
something that has obvious solutions in plain old Java.

I'm pretty surprised to hear a clean object model described as "brute force",
but OK. The point of a spirited discussion is to expose all sides of an issue.

I'd go with clean design first, which to my mind an object model is, then play
around with non-expandable, hard-to-maintain, bug-prone parallel-array
solutions if the situation truly demanded it, but I just don't see that demand
in the scenario under discussion.
 

Alan Gutierrez

Lew said:
I'm pretty surprised to hear a clean object model described as "brute
force", but OK. The point of a spirited discussion is to expose all
sides of an issue.

I'd go with clean design first, which to my mind an object model is,
then play around with non-expandable, hard-to-maintain, bug-prone
parallel-array solutions if the situation truly demanded it, but I just
don't see that demand in the scenario under discussion.

The scenario under discussion is: I want to do something that will reach
the limits of system memory. Your solution is to procure memory. My
solution is to use virtual memory.

Again, it seems to me that `MappedByteBuffer` and a bunch of little
facades to the contents of the `MappedByteBuffer` is a preferred
solution that respects memory usage. The design is as expandable, easy
to maintain, and bug-free as a great big array of objects, without
having to think much about memory management at all.

I don't know where "parallel" arrays come into play in the problem
described. I'm imagining that, if the records consist entirely of
numeric values, you can treat them as fixed-length records.
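
To make that concrete, here is a minimal sketch of how such a mapping
might be set up; the file name, record layout, and record count are all
hypothetical, not anything from this thread.

import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class MappedRecords {
    // Hypothetical layout: three doubles per record (x, y, z), 24 bytes.
    static final int RECORD_LENGTH = 24;

    public static void main(String[] args) throws Exception {
        int recordCount = 1000000;
        RandomAccessFile file = new RandomAccessFile("stars.dat", "rw");
        MappedByteBuffer bytes = file.getChannel().map(
                FileChannel.MapMode.READ_WRITE, 0,
                (long) recordCount * RECORD_LENGTH);
        // Write the record at index 42; the OS pages the bytes to and
        // from disk as needed, so the heap stays small.
        int offset = 42 * RECORD_LENGTH;
        bytes.putDouble(offset, 1.0);
        bytes.putDouble(offset + 8, 2.0);
        bytes.putDouble(offset + 16, 3.0);
        // Read the same fixed-length record back through the buffer.
        double x = bytes.getDouble(offset);
        System.out.println("x = " + x);
        file.close();
    }
}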
 

Alan Gutierrez

Martin said:
Regardless of what you might think of their business methods, and in the
past they didn't exactly smell of roses, their software quality control
and their hardware build quality are both hard to beat. I've used S/38
and AS/400 quite a bit and never found bugs in their system software or
lost work time due to hardware problems.

For elegant systems design ICL had them beat hands down, but although ICL
quality was OK by IT standards, the IBM kit was more reliable.

IME anyway.

I wasn't really picking on IBM.

I was addressing the fallacy of the appeal to authority: the argument
that a monolithic system contains institutionalized knowledge that is
superior to any other solution offered to a problem that the monolithic
system could conceivably address.
 

Lew

Alan said:
The scenario under discussion is: I want to do something that will reach
the limits of system memory. Your solution is to procure memory. My
solution is to use virtual memory.

Again, it seems to me that `MappedByteBuffer` and a bunch of little
facades to the contents of the `MappedByteBuffer` is a preferred
solution that respects memory usage. The design is as expandable, easy
to maintain, and bug-free as a great big array of objects, without
having to think much about memory management at all.

I like that idea.
 

Alan Gutierrez

Lew said:
I like that idea.

Oh, yeah! Well another thing mister... You, I, uh, but... Wait...

Well, golly gee. Thanks.

I'd run off to write some code to illustrate my point.

package comp.lang.java.programmer;

import java.nio.ByteBuffer;

public interface ElementIO<T> {
    public void write(ByteBuffer bytes, int index, T item);
    public T read(ByteBuffer bytes, int index);
    public int getRecordLength();
}

package comp.lang.java.programmer;

import java.nio.MappedByteBuffer;
import java.util.AbstractList;

public class BigList<T> extends AbstractList<T> {
    private final ElementIO<T> io;

    private final MappedByteBuffer bytes;

    private int size;

    public BigList(ElementIO<T> io, MappedByteBuffer bytes, int size) {
        this.io = io;
        this.bytes = bytes;
        this.size = size;
    }

    // result is not `==` to the value `set`, so only use element types
    // that define `equals` (and `hashCode`).
    @Override
    public T get(int index) {
        return io.read(bytes, index * io.getRecordLength());
    }

    @Override
    public T set(int index, T item) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException();
        }
        T result = get(index);
        io.write(bytes, index * io.getRecordLength(), item);
        return result;
    }

    @Override
    public void add(int index, T element) {
        size++;
        // shift everything from `index` up one slot to open a hole
        for (int i = size - 2; i >= index; i--) {
            set(i + 1, get(i));
        }
        set(index, element);
    }

    // and `remove` and the like, but of course only `get`, `set`
    // and `add` to the very end can be counted on to be performant.

    @Override
    public int size() {
        return size;
    }
}

Create the above with however much `MappedByteBuffer` you need for your
Universe. Define `ElementIO` to read and write your `Star` type. Each
time you read a `Star` in `ElementIO` you do mint a new `Star`, so it is
like Flyweight in some way, but it seems more like a little `Bridge` or
`Adapter`.
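
For instance, a minimal `ElementIO` for the `Star` might look like the
sketch below; the three-double record layout and the `Star` accessors
are assumptions of mine, not anything specified upthread.

package comp.lang.java.programmer;

import java.nio.ByteBuffer;

public class StarIO implements ElementIO<Star> {
    // Hypothetical layout: x, y, z stored as doubles, 24 bytes per record.
    public int getRecordLength() {
        return 24;
    }

    public void write(ByteBuffer bytes, int index, Star star) {
        bytes.putDouble(index, star.getX());
        bytes.putDouble(index + 8, star.getY());
        bytes.putDouble(index + 16, star.getZ());
    }

    public Star read(ByteBuffer bytes, int index) {
        // Mint a fresh Star from the bytes at this offset.
        return new Star(bytes.getDouble(index),
                bytes.getDouble(index + 8),
                bytes.getDouble(index + 16));
    }
}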

If you shut down softly and record the size, you can reopen the list. If
you change `Star` you need to update `ElementIO` and rebuild your
list, but probably not your code that references `Star` or the
`BigList`.

I see it now. Searching the long thread for the word "parallel" didn't
turn it up for me, but that's what is described here. That does sound a
bit fragile.

Anyway, it seems like there is a middle ground between ORM+RDBMS and
everything in memory. My hobby horse. (Rock, rock, rock.)
 

Andreas Leitgeb

Lew said:
I'd go with clean design first, which to my mind an object model is,
then play around with non-expandable, hard-to-maintain, bug-prone
parallel-array solutions if the situation truly demanded it,

Not sure whether this "if the situation truly demanded it" is actually
an "if (false)" for you, but in case it isn't, then we have reached agreement.

We might still disagree for certain real situations, though ;-)
 

Martin Gregorie

Alan said:
I wasn't really picking on IBM.

Fair point. The 'nobody got fired for buying IBM' hit my reaction button.
It had rather dire connotations in the past, as in 'if you DON'T buy IBM,
our senior execs will visit your senior execs and you *will* be fired and
put on our black list'. I knew one or two people whose bosses had
received that visit when they bought 3rd party disks.

Alan said:
I was addressing the fallacy of the appeal to authority: the argument
that a monolithic system contains institutionalized knowledge that is
superior to any other solution offered to a problem that the monolithic
system could conceivably address.

Sure: a myth that's perpetuated by said monoliths and bought into by
their adherents: it saves the adherents from having to think.
 

Tom McGlynn

Alan Gutierrez wrote:
Did you read this thread? Like, say, yesterday, when Tom McGlynn
wrote:

I'm a little intrigued by the discussion of the appropriate choices
for the architecture for an n-body calculation by a group which likely
has little experience, if any, in the field. Note that this has been
an area of continuous study for substantially longer than the concept
of relational databases has existed: the first N-body calculations by
digital computers were made in the 1950s.

My own experience here is woefully out of date, but below are a couple
of reasons why I might consider an architecture similar to what I gave
as an illustration earlier. The motivation for that example was to
illustrate the orthogonality of my understanding of singleton and
flyweight, but there could be reasons to go this route. E.g.,

Direct n-body calculations need to compute the distance between pairs
of objects. The distances between nearby objects need to be calculated
orders of magnitude more frequently than between distant objects. If
the data can be organized such that stars nearby in [simulated] space
tend to be nearby in memory, then cache misses may be substantially
reduced. This can improve performance by an order of magnitude or
more. In Java the only structure you have available that allows for
managing this (since nearby pairs change with time) is a primitive
array. Java gives no way, as far as I know, to manage the memory
locations of distinct objects.

Since the actual n-body calculation will often have been highly
optimized in some other language, the role of Java code in an n-body
system may be to provide initial conditions to, show the status of, or
analyze the results of the calculation. Communication with the core
calculation might use JNI, shared memory, or other I/O techniques. In
each of these the fact that one-dimensional primitive arrays share a
common model between languages makes them an attractive way of passing
the data.

Note that I'm not saying that this approach must or even should be
used: just that it can make sense in realistic circumstances. However,
personally -- and given the GoF's endorsement there is some support
more broadly -- I don't see the use of this kind of simple flyweight
as a particularly odious approach.


Regards,
Tom McGlynn
 

Lew

Martin said:
Fair point. The 'nobody got fired for buying IBM' hit my reaction button.
It had rather dire connotations in the past, as in 'if you DON'T buy IBM,
our senior execs will visit your senior execs and you *will* be fired and
put on our black list'. I knew one or two people whose bosses had
received that visit when they bought 3rd party disks.

Sure: a myth that's perpetuated by said monoliths and bought into by
their adherents: it saves the adherents from having to think.

Yeah, because people with multi-million dollar/euro/yuan budgets never, ever
think about how they spend their money, and there's just no chance that IBM
got where it is by not delivering what they promise for mission-critical systems.

I thought this was supposed to be a group of intelligent, educated, technical
people.
 

Martin Gregorie

Lew said:
Yeah, because people with multi-million dollar/euro/yuan budgets never,
ever think about how they spend their money, and there's just no chance
that IBM got where it is by not delivering what they promise for
mission-critical systems.

I wasn't even thinking about the big shots (and remember the 360/195 that
never was?) - I was thinking more of back in the day, when the Big Blue
SEs were considered to be the gods of system design and implementation.
 

Robert Klemme

Magnus Warker wrote:

Don't top-post!


Lew said:
No. That would be a bug. You'd write 'if ( color.equals( Color.WHITE ) )'.

That depends on the rest of Color's class definition (whether there are
public constructors, whether the class is serializable, and whether
custom deserialization is in place - all stuff someone who makes this an
enum does not have to take care of manually, btw). For enums (whether as
a language construct or properly implemented by hand, although the
latter is a bad idea since Java 5, IMHO) I would rather use "==" here
because it is more efficient and stands out visually.
Since enums are classes, they can contain behavior. That means you won't
need if-chains or case constructs to select behavior; just invoke the
method directly from the enum constant itself and voilà!

One can even have custom code *per enum value*, which makes implementing
state patterns a breeze. See

http://java.sun.com/j2se/1.5.0/docs/guide/language/enums.html
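
As a quick illustration of per-constant behavior (the traffic-signal
enum is my own example, not something from this thread):

public enum Signal {
    GREEN {
        @Override public Signal next() { return YELLOW; }
    },
    YELLOW {
        @Override public Signal next() { return RED; }
    },
    RED {
        @Override public Signal next() { return GREEN; }
    };

    // Each constant supplies its own body, so callers never need to
    // switch on the value to select behavior.
    public abstract Signal next();
}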

Kind regards

robert
 

Lew

Robert said:
That depends on the rest of Color's class definition (whether there are
public constructors, whether the class is serializable, and whether
custom deserialization is in place - all stuff someone who makes this an
enum does not have to take care of manually, btw). For enums (whether as
a language construct or properly implemented by hand, although the
latter is a bad idea since Java 5, IMHO) I would rather use "==" here
because it is more efficient and stands out visually.

Yeah, I was corrected on that some time upthread and already
acknowledged the error. I had it in my head that we were discussing
String constants rather than final instances of the 'Color' class.

As for using == for enums, that is guaranteed to work by the language
and is best practice.

As for writing type-safe enumeration classes that are not enums,
there's nothing wrong with that if you are in one of those corner use
cases where Java features don't quite give you what you want. The
only one I can think of is inheritance from an enumeration. However,
I agree with you in 99.44% of cases - it's almost always bad to extend
an enumeration and Java enums pretty much always do enough to get the
job done. So when one is tempted to write a type-safe enumeration
class that is not an enum, one is almost certainly making a design
mistake and an implementation faux-pas.
Robert said:
One can even have custom code *per enum value*, which makes implementing
state patterns a breeze.

I have not so far encountered a real-life situation where a 'switch'
on an enum value is cleaner or more desirable than using enum
polymorphism. For state machines I'm more likely to use a Map (or
EnumMap) to look up a consequent state than a switch. Have any of you
all found good use cases for a 'switch' on an enum?
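
For what it's worth, here is a minimal sketch of the EnumMap lookup I
mean, with made-up states:

import java.util.EnumMap;
import java.util.Map;

public class Transitions {
    enum State { IDLE, RUNNING, DONE }

    // Map each state to its consequent state instead of switching on it.
    private static final Map<State, State> NEXT =
            new EnumMap<State, State>(State.class);
    static {
        NEXT.put(State.IDLE, State.RUNNING);
        NEXT.put(State.RUNNING, State.DONE);
        NEXT.put(State.DONE, State.DONE);
    }

    static State advance(State current) {
        return NEXT.get(current);
    }
}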
 

Robert Klemme

Lew said:
Yeah, I was corrected on that some time upthread and already
acknowledged the error. I had it in my head that we were discussing
String constants rather than final instances of the 'Color' class.

I hadn't made it through all branches of the thread. I should have read
to the end before posting. Sorry for the additional noise.

Lew said:
As for using == for enums, that is guaranteed to work by the language
and is best practice.

As for writing type-safe enumeration classes that are not enums,
there's nothing wrong with that if you are in one of those corner use
cases where Java features don't quite give you what you want. The
only one I can think of is inheritance from an enumeration. However,
I agree with you in 99.44% of cases - it's almost always bad to extend
an enumeration and Java enums pretty much always do enough to get the
job done. So when one is tempted to write a type-safe enumeration
class that is not an enum, one is almost certainly making a design
mistake and an implementation faux-pas.


I have not so far encountered a real-life situation where a 'switch'
on an enum value is cleaner or more desirable than using enum
polymorphism. For state machines I'm more likely to use a Map (or
EnumMap) to look up a consequent state than a switch. Have any of you
all found good use cases for a 'switch' on an enum?

Personally I cannot remember a switched usage of an enum. The only
reasons to do it that come to mind right now are laziness and special
environments (e.g. where you must reduce the number of classes defined
for resource reasons, maybe Java mobile). But then again, you would
probably rather use ints instead of an enum type...

Kind regards

robert
 

Tom Anderson

Lew said:
I have not so far encountered a real-life situation where a 'switch' on
an enum value is cleaner or more desirable than using enum polymorphism.
For state machines I'm more likely to use a Map (or EnumMap) to look up
a consequent state than a switch. Have any of you all found good use
cases for a 'switch' on an enum?

We've done it once. We have some internationalisation code where,
approximating wildly, items can be internationalised by locale or by
country (i.e. shared across all languages in a locale - prices are the
classic example). We have an enum called something like LocalisationType
with values LOCALE and COUNTRY to identify which is being done. We do have
some polymorphism around it (not actually in the enum, although for the
purposes of this story, that's not interesting), related to the core
business of finding out the right localisation key (locale code or country
code) and resolving the item value for it.

But we also have other bits of code which are not core functionality which
need to do different things for locale- and country-keyed items. The one
that springs to mind is a locale copying utility - if you're copying fr_FR
into CA to create fr_CA (obviously, only as a starting point for manual
editing), you want to copy locale-mapped items (which will be
French-language text) but not location-mapped items (which will be prices
in euros and so on). We could have put a shouldCopyWhenCopyingLocale()
method on the enum, or even a copyIntoNewLocale() method which did nothing
in the location case, but this seemed like polluting the enum with
behaviour that belonged in the copier. So we put a switch in the copier
instead.
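
A sketch of roughly what that switch looks like; the enum matches the
description above, but the copier method is my guess at the shape, not
our actual code:

public class LocaleCopier {
    enum LocalisationType { LOCALE, COUNTRY }

    // The copy decision lives in the copier, not the enum, so the enum
    // stays free of behaviour that belongs to this one utility.
    boolean shouldCopy(LocalisationType type) {
        switch (type) {
        case LOCALE:
            return true;  // language text: copy fr_FR into the new fr_CA
        case COUNTRY:
            return false; // prices and the like stay with the country
        default:
            throw new AssertionError(type);
        }
    }
}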

There's probably a more OO-clean way to break the decision up, but this
was simple and worked, so that was good enough for us.

tom
 

Tom Anderson

Tom McGlynn said:
Here's a bit of what the GoF has to say about flyweights. (Page 196
in my version)....

"A flyweight is a shared object that can be used in multiple contexts
simultaneously. The flyweight acts as an independent object in each
context--it's indistinguishable from an instance of the object that's
not shared.... The key concept here is the distinction between intrinsic
and extrinsic state. Intrinsic state is stored in the flyweight. It
consists of information that's independent of the flyweight's context,
thereby making it shareable. Extrinsic state depends on and varies with
the flyweight's context and therefore can't be shared. Client objects
are responsible for passing extrinsic state to the flyweight when it
needs it."

That's reasonably close to what I had in mind.

Yes, point taken. I'm still not happy with your usage, though.

IIRC, the example in GoF is of a Character class in a word processor. So,
a block of text is a sequence of Character objects. Each has properties
like width, height, vowelness, etc., and methods like paintOnScreen. But
because every lowercase q behaves much the same as every other lowercase
q, rather than having a separate instance for every letter in the text, we
have one for every distinct letter.

The extrinsic state in this example is the position in the text, the
typeface, the style applied to the paragraph, etc. Certainly, things that
are not stored in the Character. But also not things that intrinsically
belong in the Character anyway; rather, things inherited from enclosing
objects.

Whereas in your case, the array offset *is* something intrinsic to the
Star. If it had been something else, say the coordinates of the centre of
mass of the local cluster, then I'd agree that that was Flyweightish. But
I'm not so sure about the array index.

It might well be that I have an over-narrow idea of what a Flyweight is.
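
To pin down the distinction, here is a toy version of that
word-processor example, with intrinsic state held in the shared glyph
and extrinsic state passed in by the client; all the names are mine:

import java.util.HashMap;
import java.util.Map;

public class Glyphs {
    // Flyweight: only intrinsic state (the character itself) is stored.
    static class Glyph {
        final char ch;
        Glyph(char ch) { this.ch = ch; }

        // Extrinsic state (position, point size) arrives from the client.
        void paint(int x, int y, int pointSize) {
            System.out.printf("'%c' at (%d,%d) in %dpt%n", ch, x, y, pointSize);
        }
    }

    // One shared Glyph per distinct character, however long the text.
    private static final Map<Character, Glyph> POOL =
            new HashMap<Character, Glyph>();

    static Glyph glyph(char ch) {
        Glyph g = POOL.get(ch);
        if (g == null) {
            g = new Glyph(ch);
            POOL.put(ch, g);
        }
        return g;
    }
}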
Tom McGlynn said:
Getting back to my original concern, I don't think enumeration is a good
word for the concept either. Enumerations are often used for an
implementation of the basis set -- favored in Java by special syntax.
However, the word enumeration strongly suggests a list. In general the
set of special values may have a non-list relationship (e.g., they could
form a hierarchy). I like the phrase 'basis set' I used above, but that
suggests that other elements can be generated by combining the elements
of the basis, so it's not really appropriate either.

I can't think of a good word for this. Do we need one? What are some
examples of this pattern in the wild?

tom
 

Tom Anderson

Tom McGlynn said:
I'm a little intrigued by the discussion of the appropriate choices for
the architecture for an n-body calculation by a group which likely has
little experience, if any, in the field.

Direct n-body calculations need to compute the distance between pairs of
objects. The distances between nearby objects need to be calculated
orders of magnitude more frequently than between distant objects. If
the data can be organized such that stars nearby in [simulated] space
tend to be nearby in memory, then cache misses may be substantially
reduced. This can improve performance by an order of magnitude or more.
In Java the only structure you have available that allows for managing
this (since nearby pairs change with time) is a primitive array. Java
gives no way, as far as I know, to manage the memory locations of
distinct objects.

True, although it doesn't prevent cache locality - whereas the parallel
arrays approach immediately rules out locality of the coordinates of a
single star, because they'll be in different arrays. If you want locality,
you'd have to pack the values of all three coordinates into one big array,
which of course is possible.

The dual of the fact that Java doesn't let you control locality is that
JVMs are free to control it. There is research going back at least ten
years now into allocation and GC strategies that improve locality. Indeed,
for some popular kinds of collectors, locality is a standard side-effect -
any moving collector where objects are moved in a depth-first traversal of
(some subgraph of) the object graph will tend to put objects shortly after
some other object that refers to them, and thus also close to objects that
are also referred to by that object. It may not help enormously for
mesh-structured object graphs, but it works pretty well for the trees that
are common in real life. If these Stars are held in an octree, for
example, we might expect decent locality.

tom
 

Tom McGlynn

I'm a little intrigued by the discussion of the appropriate choices for
the architecture for an n-body calculation by a group which likely has
little experience, if any, in the field.
Direct n-body calculations need to compute the distance between pairs of
objects. The distances between nearby objects need to be calculated
orders of magnitude more frequently than between distant objects. If
the data can be organized such that stars nearby in [simulated] space
tend to be nearby in memory, then cache misses may be substantially
reduced. This can improve performance by an order of magnitude or more.
In Java the only structure you have available that allows for managing
this (since nearby pairs change with time) is a primitive array. Java
gives no way, as far as I know, to manage the memory locations of
distinct objects.

Tom Anderson said:
True, although it doesn't prevent cache locality - whereas the parallel
arrays approach immediately rules out locality of the coordinates of a
single star, because they'll be in different arrays. If you want locality,
you'd have to pack the values of all three coordinates into one big array,
which of course is possible.

The dual of the fact that Java doesn't let you control locality is that
JVMs are free to control it. There is research going back at least ten
years now into allocation and GC strategies that improve locality. Indeed,
for some popular kinds of collectors, locality is a standard side-effect -
any moving collector where objects are moved in a depth-first traversal of
(some subgraph of) the object graph will tend to put objects shortly after
some other object that refers to them, and thus also close to objects that
are also referred to by that object. It may not help enormously for
mesh-structured object graphs, but it works pretty well for the trees that
are common in real life. If these Stars are held in an octree, for
example, we might expect decent locality.

Putting everything in a single array is certainly fine. It doesn't
change the basic flyweight idea here.

Since in most n-body codes stars are never destroyed, garbage
collection as such may not come into play much. I'd be curious how
much storage reallocation is done in current JVMs where there is very
little creation or destruction of objects. Of course if one built a
changeable star hierarchy then one would be continually creating and
destroying branch nodes of the hierarchy (as stars move) even though
the leaf nodes would be immortal. Perhaps the churn of branch nodes
would be enough for the GC to move the leaves appropriately.

It is certainly possible that a clever enough JVM could address this
automatically and efficiently. The n-body code may have the advantage
in that it can do things predictively rather than reactively, but a
system approach would likely be able to adapt to changes in the local
environment better.

But I'm not trying to persuade people that using flyweights in the
sense that I suggested in my example is necessarily the right thing to
do in all circumstances, merely to note that it may be a rational or
even desirable approach in some. That example was given to provide a
concrete realization of an object that simultaneously implemented
flyweight and singleton, not for its intrinsic merit, but I was a
little bemused by the reaction to it.

Regards,
Tom McGlynn
 
