impl a collection

Roedy Green

Not the char[] array inside a String. There is method that exists to
modify it and no way to add one.


Gad those typos are atrocious. What I meant to say was:

Note the char[] array inside a String. There is no method that exists
to modify or replace it, and no way to add one.
 
Roedy Green

New String is smart enough to avoid the copy if the substring and the
full string are identical. There is no benefit in breaking the pin,
and a penalty of creating a duplicate.

But in either case, you still get a brand new String object. Just the
inner char[] is sometimes recycled.
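
A minimal sketch of the part of this that ordinary code can observe, assuming nothing about how the JVM stores the characters: `new String(s)` always hands back a distinct object with equal contents. The class name `NewStringDemo` and the helper `copyOf` are made up for illustration, and whether the inner char[] is shared or copied is not visible from here.

```java
public class NewStringDemo {

    /** Hypothetical helper: wraps the copy constructor discussed above. */
    static String copyOf(String s) {
        return new String(s);
    }

    public static void main(String[] args) {
        String full = "hello world";
        String sub = full.substring(0, 5);   // "hello"

        String copyOfFull = copyOf(full);
        String copyOfSub = copyOf(sub);

        // A new object every time, regardless of what happens to the inner array.
        System.out.println(copyOfFull == full);        // false
        System.out.println(copyOfFull.equals(full));   // true
        System.out.println(copyOfSub == sub);          // false
        System.out.println(copyOfSub.equals("hello")); // true
    }
}
```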
 
Tony Morris

Roedy Green said:
One thing that I might also point out is that an "immutable char array"
simply does not exist - never has - Arrays are always mutable.

Not the char[] array inside a String. There is method that exists to
modify it and no way to add one.

They are as immutable as the primitive ints inside an Integer, which
normally also are mutable.

Incorrect.
An array is a mutable type.
A java.lang.Integer isn't.
There is a very strong distinction here.

Granted, any "immutable" type can have its private state data altered by
use of reflection
(http://www.xdweb.net/~dibblego/java/trivia/answers.html#q1), but this is
purely academic, and for the sake of this explanation, assume it doesn't
exist.

The fact that java.lang.String holds an internal char[] (which is, again,
an implementation detail that doesn't hold true for all VMs) does not
make that array "immutable". The char[] member is declared private, so
clients can't access it; if they could, they could mutate it, because
arrays are always mutable types.
So to reiterate, declaring an array private does not make it immutable,
merely inaccessible under the access scope rules. An array is always
mutable.
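
A small sketch of that distinction, with made-up names (`ArrayMutabilityDemo`, `Leaky`, `raw`): a private array becomes changeable as soon as a reference to it escapes, whereas `String.toCharArray()` only ever hands out a copy. It says nothing about how any particular VM implements String internally.

```java
public class ArrayMutabilityDemo {

    /** Hypothetical holder whose private array escapes through a getter. */
    static final class Leaky {
        private final char[] value;

        Leaky(String s) {
            this.value = s.toCharArray();
        }

        // Returns the live array, so callers can change the "private" state.
        char[] raw() {
            return value;
        }

        @Override
        public String toString() {
            return new String(value);
        }
    }

    /** Mutates the leaked array and returns what the holder now contains. */
    static String mutateThroughLeak() {
        Leaky leaky = new Leaky("abc");
        leaky.raw()[0] = 'X';
        return leaky.toString();
    }

    /** Mutates the copy handed out by String and returns the original String. */
    static String mutateCopyOfString() {
        String s = "abc";
        char[] copy = s.toCharArray(); // String.toCharArray() returns a fresh copy
        copy[0] = 'X';
        return s;
    }

    public static void main(String[] args) {
        System.out.println(mutateThroughLeak());  // Xbc
        System.out.println(mutateCopyOfString()); // abc
    }
}
```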

--
Tony Morris
(BInfTech, Cert 3 I.T.)
Software Engineer
(2003 VTR1000F)
Sun Certified Programmer for the Java 2 Platform (1.4)
Sun Certified Developer for the Java 2 Platform
 
Scott Ellsworth

Roedy Green said:
XML is flagrant conspicuous waste. See Veblen's Theory of the Leisure
Class.
http://www.amazon.com/exec/obidos/ASIN/0140187952/canadianmindprod

XML is waste for the sake of waste.

I find it morally repugnant.

I find it useful as all hell.

Programs that use XML internally for transmission and communication tend
to use it externally for save files and connections to other programs,
and vice versa. Those who use binary tend to use their own opaque
binary serialization/communication for everything. Given how well XML
gzips, the difference in size is just not that great.
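
A rough way to check the gzip claim for yourself, using only java.util.zip. The sample document, the class name `XmlGzipDemo` and the record layout are invented for illustration. Real files will compress by different ratios, so treat the output as a measurement of this toy input only.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

public class XmlGzipDemo {

    /** Builds a repetitive, made-up XML document with the given number of records. */
    static byte[] sampleXml(int records) {
        StringBuilder sb = new StringBuilder("<records>\n");
        for (int i = 0; i < records; i++) {
            sb.append("  <record id=\"").append(i).append("\">")
              .append("<name>item").append(i).append("</name>")
              .append("</record>\n");
        }
        sb.append("</records>\n");
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    /** Gzips a byte array in memory and returns the compressed bytes. */
    static byte[] gzip(byte[] raw) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(raw);
        } catch (IOException e) {
            // Not expected for an in-memory stream; rethrown unchecked for brevity.
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] raw = sampleXml(1000);
        byte[] packed = gzip(raw);
        System.out.println("raw bytes:     " + raw.length);
        System.out.println("gzipped bytes: " + packed.length);
    }
}
```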

Real world example: a client of mine has a few hundred interlocking
projects with all sorts of dependencies.

Eclipse creates binary files that can be edited only by their own tools,
so I cannot easily create project files for Eclipse that match their
layout without figuring out how to write a rather convoluted plugin.

IDEA creates XML with a DTD. I can create an IDEA project with a
transform or some Java code. I have an XSLT sheet that does this, and it did
not take long to write.

I realize you were complaining about wasted bandwidth when it is used as
a communication protocol, not a file, but I really have not seen that
many places where people use different tools for those different tasks,
at least for the clients I work with.

Further, I have read your "standard binary" proposals, but being able to
edit the data with BBEdit is useful. I can watch SOAP go over the wire,
and make some sense of it without needing to write any tools.

Compressed, optimized binary makes a lot of sense for some tasks, but
XML has got a lot going for it, at least from where I sit. It is NOT
waste for the sake of waste.

Scott
 
Roedy Green

Those who use binary tend to use their own opaque
binary serialization/communication for everything. Given how well xml
gzips, the difference is just not that great.

Why do you think the two are incompatible? Why not have a standard
predigested, preparsed binary format for XML, just as flexible, just
more rigidly defined, and naturally error free, more compact, and
faster to process, with smaller classes?
 
Roedy Green

Not entirely accurate - since, you certainly can.
"Java has a design intention of ..." is more accurate.

If you do, it is a bug, not a feature. Can you think of an example?

Exec is one place, and file naming is another, but in both cases the
intent is to communicate with the local OS, so there is not much hope of
avoiding some platform dependency.

The usual ones (endianness, the sizes of primitives, the formats of
various internal primitives, alignment, padding, address structure,
how the String class is implemented) are all hidden.
 
Roedy Green

That's the point - there is no concept of "you [sic] current JVM" anywhere
in this thread, except for an assumption with no real basis.

I assumed Java - correct me if I am wrong, but that is the topic of this
forum.

That was never in question.

The Java language spec and the JVM spec give the designer considerable
latitude. Java is deliberately designed as a very black box to allow
maximum flexibility of implementation. I suppose you are surprised to
find out the pinning is STILL there and Java so carefully hid the
implementation details from you, that you never suspected the
feature's existence.
 
Tony Morris

If you do, it is a bug, not a feature. Can you think of an example?
Exec is one place, and file naming is another, but in both cases the
intent is to communicate with the local OS, so there is not much hope of
avoiding some platform dependency.

Neither of these are examples of relying on an implementation detail.
They rely on OS-specific behaviour - very distinct from implementation
behaviour.

I can think of a zillion ways of writing code that relies on an
implementation detail.
Here are two:
-An optimisation that relies on an implementation detail (such as how the
Sun VM (not Java) implements the substring method).

-http://www.xdweb.net/~dibblego/java/trivia/answers.html#q1 - That relies on
the java.lang.String class being implemented by holding an internal char
array (which again, is not Java, but an implementation detail).

 
Tony Morris

Roedy Green said:
That's the point - there is no concept of "you [sic] current JVM" anywhere
in this thread, except for an assumption with no real basis.

I assumed Java - correct me if I am wrong, but that is the topic of this
forum.

That was never in question.

The Java language spec and the JVM spec give the designer considerable
latitude. Java is deliberately designed as a very black box to allow
maximum flexibility of implementation. I suppose you are surprised to
find out the pinning is STILL there and Java so carefully hid the
implementation details from you, that you never suspected the
feature's existence.

I'm not surprised - in fact, I think you may be misunderstanding exactly
what that implementation detail is.

"Java so carefully hid the implementation details ..."
No it didn't - the Sun VM might have (hypothetically), but Java certainly
didn't - this is the point.
When were we assuming that we were referring to the Sun VM? This assumption
seems to have appeared from "thin air".
Again, I was referring to Java, not any particular vendor's VM.
I still don't think you are seeing the important distinction.

 
Roedy Green

I can think of a zillion ways of writing code that relies on an
implementation detail.
Here are two:
-An optimisation that relies on an implementation detail (such as how the
Sun VM (not Java) implements the substring method).

I think we are having trouble with the word "rely". To me it implies
the code will fail if the underlying implementation changes.

I see nothing wicked about choosing code that is equally readable that
you happen to know is implemented more efficiently. That is never
likely to be harmful. That is just common sense. Why deliberately poke
a stick in your eye choosing the slower implementation?
 
Roedy Green

When were we assuming that we were referring to the Sun VM? This assumption
seems to have appeared from "thin air".

I have been talking about Java and Sun's JVMs all along. I think that is a
reasonable assumption unless you specify otherwise.

But he black boxness applies to all JMS.
 
Chris Smith

Tony said:
That's the point - there is no concept of "you [sic] current JVM" anywhere
in this thread, except for an assumption with no real basis.

I assumed Java - correct me if I am wrong, but that is the topic of this
forum.

Tony,

Yes, this is technically an implementation detail. However, there are
implementation details and then there are implementation details. This
particular "implementation detail" is well-documented enough, not to
mention strongly implied by the API, that someone would need to be crazy
to implement the Java core API in a way that exhibits substantial
performance differences from this one. Just out of curiosity, I checked
GNU classpath and discovered that it, too, implements String in this way
(at least in 0.8; and I'd bet money it doesn't change in 0.9 either).
Do you have any other VMs in mind that are worth checking?

--
www.designacourse.com
The Easiest Way to Train Anyone... Anywhere.

Chris Smith - Lead Software Developer/Technical Trainer
MindIQ Corporation
 
Chris Smith

Tony said:
The fact that java.lang.String holds an internal char[] (which is again,
another implementation detail that doesn't hold true for all VMs) does not
make it "immutable" - the fact that the char[] member is declared private
means that clients can't access it - since if they could, they could mutate it
because arrays are mutable types, always.
So to reiterate, an array that is declared private does not imply that it is
immutable, merely accessible according to access scope rules - an array is
always mutable.

I'm pretty sure everyone following this discussion knows this fact about
arrays already.

Nevertheless, it's convenient to refer to a char[] that's held as a
private data member of String as immutable, in the context where it is
found, and I don't see how that could be called incorrect. The fact
that it would be mutable if it were directly accessible is moot. All
common uses of the "mutable" classification -- for example, reasoning
about thread-safety, or reasoning about security managers -- are
entirely possible on the guarantee that the array is only referenced by
private members of String, and that String has no code to modify it.

Do you have an authority that insists that the word "immutable" can only
be used to refer to direct language enforcement of immutability? The
word is used more broadly in quite a few places.

 
Roedy Green

Incorrect.
An array is a mutable type.
A java.lang.Integer isn't.
There is a very strong distinction here.

The analogy is this: An immutable String contains a theoretically
mutable, but practically immutable char[].

An immutable Integer contains a theoretically mutable, but practically
immutable int.


The whole pinning business can only work because the base char[] will
never change.
 
Roedy Green

But he black boxness applies to all JMS.

Let me try that again. I was typing it as my partner was pulling my arm to
leave for a meeting.

But the black-boxness applies to all JVMs because they all implement
the same Sun JVM standard, which gives them an incredible amount of
leeway in how to implement it.
 
Roedy Green

All
common uses of the "mutable" classification -- for example, reasoning
about thread-safety, or reasoning about security managers -- are
entirely possible on the guarantee that the array is only referenced by
private members of String, and that String has no code to modify it.

Mutable means changeable. The char[] inside a String is not
changeable, either to a new reference or to have the individual chars
of the array change. Therefore it is in the ordinary English sense
immutable.

Mutable and immutable are not Java keywords, so I think ordinary
English prevails. The ordinary English meaning is what matters from a
practical point of view as well.


It would be correct to say that arrays are usually mutable, but the
char[] value array inside a String is not.


This thread sounds like a philosophy or religious discussion of
argument for the sake of argument, more of a game than serious
misunderstanding. I am rapidly losing enthusiasm for it.
 
Tony Morris

This thread sounds like a philosophy or religious discussion of
argument for the sake of argument, more of a game than serious
misunderstanding. I am rapidly losing enthusiasm for it.

Agreed.
Let's move on.

 
Scott Ellsworth

Roedy Green said:
Why do you think the two are incompatible? Why not have a standard
predigested, preparsed binary format for XML, just as flexible, just
more rigidly defined, and naturally error free, more compact, and
faster to process, with smaller classes?

My objection is more from experience with opaque formats than anything
else.

In my experience, and in the files I have seen, most people doing XML
tend to use it for everything, and to be fairly transparent about it.
There are exceptions, such as Microsoft's file formats, and some I have
seen where the files are just chunks of base 64 encoded glop, but in the
main, elements get stored human readable. As long as that property is
preserved, the actual representation on disk is less important.

Again, in my experience, the majority of binary file formats seem to
store the memory image of their data, without any metadata to make it
parsable. Predigesting XML would be far better, but I have a suspicion
that the programmers who would be most attracted to it are those who
would write binary glop unless the tools were very compelling. Very,
very compelling. Again, not impossible to overcome, but I am not sure
the will is there.

The one big concern I have is tools - damn near anything can read an XML
text file, and produce something I can figure out, without needing any
extra tools installed on servers, other people's computers, and the
like. If the undigester tools were a required part of the parsers, then
I might have the same level of access by default.

Put another way, by being plain text, XML lets me watch what goes over
the wire and on the disk without needing any collusion by the developers
and system administrators involved. Alternative formats that have the
same property would keep me just as happy, but my experiences make me
suspicious.

Scott
 
Roedy Green

My objection is more from experience with opaque formats than anything
else.

So is mine.

Finding a correct HTML document is rare. Finding an incorrect binary one
is also rare.

The problem is that decent binary viewers are also rare. When people
think of a binary format, they keep thinking of the times they tried to
decode an undocumented ad hoc binary format with a hex viewer. A binary XML
would be far from undocumented, and would have thousands of possible
viewers and editors. It would have a binary DTD to PRECISELY specify
it. The only difference would be that the exchange documents would be
compact and 99.9% conforming. Any deviation would be a clear bug, not
ordinary human error.

You would not even give up human readability. These things could be
converted back to XML or any of a hundred more readable
representations at the click of a button.

I think some people imagine you would have to write a custom binary
viewer for every DTD. Not so!!!
 
NOBODY

You can get 512 MB of RAM for about $60 US on eBay.
The cost of that RAM overhead you are so upset about is $.0000018


Hi,

When you are forced to buy a special motherboard that is not limited
to 1.5 gigs of RAM, you will understand.

When you have exhausted your 2 gigs of RAM and must swap into another
1 gig of swap, you will understand.

The things you will understand are:

-GC'ing is CPU waste, so you must limit it to the bare minimum to keep a
clear design.

-Using fluffy data structures is very taxing on alloc/GC.


Trust me, I code on an appliance that swallows thousands of elements per
second and must hold them for 30 seconds. That's a cache of about 60000
elements, each of about 10k. That is 60 megs of RAM, just for 1 queue.
I have many queues, the code, the web interface and ~100 threads.

I was given -Xmx512m (512 megs) to do all that, because there is a DB and
many other processes on the box. 2 gigs is our limit.

So when you discover those String problems, you must do something about
them. When you find that LinkedList is bigger than ArrayList, which is
bigger than your own code for a singly linked list, you change the
structure.
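
For what it's worth, a hand-rolled singly linked queue of the kind meant here might look roughly like the sketch below. The class name `SinglyLinkedQueue` and its methods are hypothetical, not the code from the appliance. Each node carries one `next` reference instead of the two links a doubly linked list needs, and whether that actually saves memory against an array-backed list in a given workload is something to measure rather than assume.

```java
public class SinglyLinkedQueue<E> {

    /** One link per node: the item plus a single forward reference. */
    private static final class Node<E> {
        final E item;
        Node<E> next;

        Node(E item) {
            this.item = item;
        }
    }

    private Node<E> head; // next element to be removed
    private Node<E> tail; // most recently added element
    private int size;

    /** Appends an element at the tail in constant time. */
    public void add(E item) {
        Node<E> node = new Node<>(item);
        if (tail == null) {
            head = node;
        } else {
            tail.next = node;
        }
        tail = node;
        size++;
    }

    /** Removes and returns the oldest element, or null if the queue is empty. */
    public E poll() {
        if (head == null) {
            return null;
        }
        E item = head.item;
        head = head.next;
        if (head == null) {
            tail = null;
        }
        size--;
        return item;
    }

    public int size() {
        return size;
    }

    public static void main(String[] args) {
        SinglyLinkedQueue<String> queue = new SinglyLinkedQueue<>();
        queue.add("first");
        queue.add("second");
        System.out.println(queue.poll()); // first
        System.out.println(queue.size()); // 1
    }
}
```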

When you find out that your performance is 5x better when not swapping,
and your boss cannot give you more RAM because the whole platform would have
to be redesigned, you roll up your sleeves and do not let the computer do
what you wouldn't do yourself...


Have fun optimizing!
 
