Why String is Immutable?

Tim Tyler

Roedy Green said:
That inhibits any sort of optimisation on the read side since the
underlying substring address could change at any instant.

The address of any substring of a shared substring of a
string would remain fixed. Unless of course the substring
itself is modified.
That also adds overhead to every potential change to check for
dependent substrings.

IIRC, StringBuffer uses this strategy anyway, so this can't
be described as added overhead - this overhead is there in
the existing implementation today.
I could see you inventing a subStringBuilder, but I think
you would still want the various benefits of an immutable String.

The problems are all to do with concurrency. Those problems
could be dealt with in other ways besides making your
language's main string class immutable.

Immutability is such a fundamental notion that you probably
ought to be able to request an immutable object of *any* type.

This could be done by using an "immutable" modifier when
creating the object. In a more dynamic language you could
simply pass the object to a cryogenic cloning factory, to
obtain a frozen copy of it.

Immutable objects would never change - except when they are
deleted completely.

All non-primitive immutable objects would consist entirely
of other immutable objects.

Any attempts to modify them would produce exceptions.
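
Something close to the "frozen copy" idea can be sketched in today's Java with the collections API. This is only an illustration: freeze() is a hypothetical helper, not part of any library, and the result is shallow, so it only matches the proposal above if the elements are themselves immutable.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class FrozenCopyDemo {
    // Hypothetical helper: returns a snapshot that rejects modification at run time.
    static <T> List<T> freeze(List<T> source) {
        // Copy first, so later changes to 'source' cannot show through the wrapper.
        return Collections.unmodifiableList(new ArrayList<>(source));
    }

    public static void main(String[] args) {
        List<String> live = new ArrayList<>();
        live.add("a");

        List<String> frozen = freeze(live);
        live.add("b"); // the original stays mutable and independent

        System.out.println(frozen.size()); // prints 1

        try {
            frozen.add("c"); // any attempt to modify the frozen copy fails
        } catch (UnsupportedOperationException e) {
            System.out.println("modification rejected");
        }
    }
}
```

The check happens at run time, as in the proposal: the compiler accepts frozen.add("c") and the exception is thrown when the call executes.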
 
Tim Tyler

Tor Iver Wilhelmsen said:
StringBuffer b = new StringBuffer("Why not? This works.");

That makes a String and then converts it to a StringBuffer - it's not:

StringBuffer b = "It is clear what this means - but it doesn't compile";

It's also an example of Java's verbosity. I don't want to tell the
compiler *twice* that I'm using a StringBuffer on the same line -
it just makes Java code more painful and boring to read.
The String + does exactly the same as StringBuffer.append().

...while being much neater and shorter.
Syntactic sugar is not that important.

Without it we'd be somewhere around: LET two = 1.plus(1);
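
For comparison, here is a small sketch of the two spellings discussed in this post: the + operator and explicit append() calls. The class and method names are illustrative only. How the compiler translates + depends on the compiler version, so the sketch compares only the resulting strings, not the generated code.

```java
class ConcatDemo {
    // Concatenation written with the + operator.
    static String withPlus(String name, int n) {
        return "Hello, " + name + " (" + n + ")";
    }

    // The same result spelled out with explicit append() calls.
    static String withAppend(String name, int n) {
        StringBuffer b = new StringBuffer();
        b.append("Hello, ").append(name).append(" (").append(n).append(")");
        return b.toString();
    }

    public static void main(String[] args) {
        // Both spellings produce equal strings.
        System.out.println(withPlus("Tim", 2).equals(withAppend("Tim", 2)));
    }
}
```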
 
Tim Tyler

Mark Thornton said:
It used to do this, but I believe it doesn't any more. It caused too
many problems.

It's not very good if Java can't manage to implement the copy-on-write
pattern in a fundamental case like this one.

These days StringBuffer toString() reads:

public synchronized String toString() {
    return new String(value, 0, count);
}

That apparently represents significantly less efficient string handling
than is possible in theory. It looks like a painful step to have to
take to get thread safety - assuming that is why it was done.
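
Whichever strategy is used internally (copy-on-write, or the eager copy in the toString() quoted above), the visible contract is the same: the String that comes back must never change afterwards. A minimal sketch of that contract follows; the class and method names are made up for illustration.

```java
class ToStringSnapshotDemo {
    // Returns the String taken before the append and the one taken after it.
    static String[] snapshotThenAppend() {
        StringBuffer buf = new StringBuffer("abc");
        String before = buf.toString(); // snapshot of the current contents
        buf.append("def");              // later change to the buffer
        return new String[] { before, buf.toString() };
    }

    public static void main(String[] args) {
        String[] r = snapshotThenAppend();
        System.out.println(r[0]); // prints abc
        System.out.println(r[1]); // prints abcdef
    }
}
```

The sketch cannot tell whether the character array was shared or copied; that difference only shows up in profiling, which is what the thread is arguing about.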
 
Mark Thornton

Tim said:
These days StringBuffer toString() reads:

public synchronized String toString() {
    return new String(value, 0, count);
}

That apparently represents significantly less efficient string handling
than is possible in theory. It looks like a painful step to have to
take to get thread safety - assuming that is why it was done.

I don't think the problem was thread safety. Rather it was meeting
conflicting performance expectations in the use of StringBuffer. The
current code may be less efficient in some cases but it corresponds more
accurately with many users' expectations. Fewer surprises mean a reduced
chance of misuse leading to very poor performance. Have a look at
'closed' bugs relating to StringBuffer to see the problems. In
particular 4259569, 4724129, and related bugs.

Mark Thornton
 
Roedy Green

All non-primitive immutable objects would consist entirely
of other immutable objects.

Internally such an object would have to be considered part of a
different class. Otherwise every set of a variable would have to be
preceded by an immutability check on the object.

If method a calls b calls c and c modifies the object, you really
should not start out calling a, but most of the time a does not modify
the object. Just when and how is the lock invoked? With Java's
immutable objects the methods are entirely missing so the check is
compile time.

I think it is trickier to implement what you want than you think.
 
Roedy Green

It's also an example of Java's verbosity. I don't want to tell the
compiler *twice* that I'm using a StringBuffer on the same line -
it just makes Java code more painful and boring to read.
Somebody else noticed!

In Bali I suggested a declare like this

Dalmatian( parms ) d;

as a shortcut for Dalmatian d = new Dalmatian( parms );

Sometimes classes have very long names, especially when qualified. It
just creates opportunities for deceptive code where the two names are
not quite identical.
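
For what it's worth, later Java versions (10 and up) added local variable type inference, which removes the repeated type name for locals. The sketch below assumes such a compiler; VarDemo and build() are illustrative names.

```java
class VarDemo {
    static String build() {
        // The type is written once; the compiler infers StringBuilder for 'buffer'.
        var buffer = new StringBuilder("foo");
        buffer.append("bar");
        return buffer.toString();
    }

    public static void main(String[] args) {
        System.out.println(build()); // prints foobar
    }
}
```

Note that var infers the concrete type of the initializer, so it does not help when the variable should be typed as a more general interface.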
 
Oliver Wong

Roedy Green said:
Internally such an object would have to be considered part of a
different class. Otherwise every set of a variable would have to be
preceded by an immutability check on the object.

If method a calls b calls c and c modifies the object, you really
should not start out calling a, but most of the time a does not modify
the object. Just when and how is the lock invoked? With Java's
immutable objects the methods are entirely missing so the check is
compile time.

I think it is trickier to implement what you want than you think.

I don't think so: you'd just add two optional new keywords: mutable and
immutable. E.g.:


mutable String myMutableString = "hello";
immutable String myImmutableString = "world";
...
/* The signature of this method indicates that it does not care whether the
   String it receives is mutable or not. */
public void writeLine(String s) {
    System.out.println(s);
}
/* The signature of this method indicates that it requires the String it
   receives to be mutable. */
public void reverseString(mutable String s) {
    s.reverseMe();
}
/* The signature of this method indicates that it requires the String it
   receives to be immutable. */
public void cacheMe(immutable String key, Object value) {
    hashMap.put(key, value);
}

- Oliver
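
The mutable and immutable keywords above are hypothetical. The nearest equivalent in real Java uses the existing disjoint types: CharSequence for "don't care", StringBuilder for "must be mutable", String for "must be immutable". A rough sketch, with class and method names chosen only to mirror the proposal:

```java
import java.util.HashMap;
import java.util.Map;

class MutabilityBySignature {
    private static final Map<String, Object> cache = new HashMap<>();

    // Accepts String or StringBuilder: does not care about mutability.
    static String render(CharSequence s) {
        return s.toString();
    }

    // Requires a mutable buffer, because it changes the argument in place.
    static void reverseInPlace(StringBuilder s) {
        s.reverse();
    }

    // Requires an immutable key, so the map entry cannot be corrupted later.
    static void cacheMe(String key, Object value) {
        cache.put(key, value);
    }

    static Object lookup(String key) {
        return cache.get(key);
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder("hello");
        reverseInPlace(sb);
        System.out.println(render(sb));      // prints olleh
        System.out.println(render("world")); // prints world
        cacheMe("key", 42);
        System.out.println(lookup("key"));   // prints 42
    }
}
```

Passing a StringBuilder to cacheMe(), or a String to reverseInPlace(), is rejected at compile time. The cost is the one raised elsewhere in the thread: the two types are not interchangeable, and only String gets the literal and operator sugar.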
 
Tim Tyler

Roedy Green said:
Internally such an object would have to be considered part of a
different class. Otherwise every set of a variable would have to be
preceded by an immutability check on the object.

I prefer prototype-based languages. They tend not to have
classes in the first place - so the issue doesn't come up in that way.
http://en.wikipedia.org/wiki/Prototype-based_programming
If method a calls b calls c and c modifies the object, you really
should not start out calling a, but most of the time a does not modify
the object. Just when and how is the lock invoked?

If you call a, and a calls b and b calls c, which attempts to modify
an immutable object, that would generate an exception.
With Java's immutable objects the methods are entirely missing so the
check is compile time.

Unfortunately, the fact that they are completely disjoint classes stops
you from being able to use mutable strings and immutable strings
interchangeably - and none of the "String" syntax sugar can be used with
StringBuffers.

An intelligent lint program may well be able to figure out
from the context that a call to a is going to call b in this
case, which will call c and is therefore likely to fail.
If not it certainly ought to be able to tell you that a
call to a might call b, which could call c - which /could/ fail.

Information about immutability would be just as available at
compile time as any other type information - if immutable
objects were constructed using an "immutable" modifier.

Besides, you can't check everything at compile time, and the philosophy
that you always ought to check as much as you can is seriously misguided
- since that way you get static languages, types wired into your program
all over the place, generic classes, refactoring hell - and you get SAD,
rather than RAD.

Lastly, if you *really* want to make an immutable object the more
traditional way - by making an object with all the modification
methods missing - nothing would stop you from doing that.
 
Tim Tyler

Roedy Green said:
Somebody else noticed!

I've certainly read your complaints about this one before.

You have a similar problem with XML - IIRC ;-)
In Bali I suggested a declare like this

Dalmatian( parms ) d;

as a shortcut for Dalmatian d = new Dalmatian( parms );

Yes. Even in C++ you would go:

StringBuffer buffer("foo");

...instead of:

StringBuffer buffer = new StringBuffer("foo");

The type of the constructor call is implicit.
 
Tim Tyler

Mark Thornton said:
I don't think the problem was thread safety. Rather it was meeting
conflicting performance expectations in the use of StringBuffer. The
current code may be less efficient in some cases but it corresponds more
accurately with many users' expectations. Fewer surprises mean a reduced
chance of misuse leading to very poor performance. Have a look at
'closed' bugs relating to StringBuffer to see the problems. In
particular 4259569

Sun's evaluation back in 1999 - "not a bug".

I agree with them - this is not a bug.
4724129 [...]

A real bug. However it was a programming mistake. Things worked
before it was introduced - and were subsequently fixed again before
Java 1.5 came out.

Copy on write is generally a performance/memory usage optimisation.
Users may not be expecting it when profiling, but nine times out of ten
it is faster anyway - because duplicating memory every time you make a
String out of a StringBuffer is so wasteful.
 
Tor Iver Wilhelmsen

Tim Tyler said:
That makes a String and then converts it to a StringBuffer

No, it takes the interned String in the constant pool and creates a
StringBuffer using that.
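
A short sketch of what "interned" means here: equal string literals refer to one shared String object, while new String(...) always creates a distinct one. The class and method names are illustrative.

```java
class InternDemo {
    // Two equal literals refer to the same interned String object.
    static boolean literalsShared() {
        String a = "hello";
        String b = "hello";
        return a == b;
    }

    // An explicit copy is a different object with equal contents.
    static boolean copyIsDistinct() {
        String a = "hello";
        String c = new String(a);
        return a != c && a.equals(c);
    }

    public static void main(String[] args) {
        System.out.println(literalsShared()); // prints true
        System.out.println(copyIsDistinct()); // prints true
    }
}
```

So no new String is built at run time for the literal itself; the StringBuffer constructor copies the literal's characters into its own buffer.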
StringBuffer b = "It is clear what this means - but it doesn't compile";

Yes it is, except with a few more characters.
It's also an example of Java's verbosity. I don't want to tell the
compiler *twice* that I'm using a StringBuffer on the same line -
it just makes Java code more painful and boring to read.

Ah, someone who feels C++'s confusing way of creating objects is good.
But in C++, too, you use "new classname()" when creating references.
...while being much neater and shorter.

If shortness is your goal, try PL/1 or J.
Without it we'd be somewhere around: LET two = 1.plus(1);

Readability beats "writability" any day as long as you write something
that is supposed to be maintained later on, perhaps even by someone
other than you.

Do you comment your code? If so, why?
 
Tor Iver Wilhelmsen

Tim Tyler said:
StringBuffer buffer("foo");

...instead of:

StringBuffer buffer = new StringBuffer("foo");

No, in addition to.

StringBuffer buffer("foo");

creates a local object, while

StringBuffer *buffer = new StringBuffer("foo");

creates it in the "heap". Java only has the latter form of object
instantiation.

But in C++, if StringBuffer had a non-explicit constructor taking a
char[], then

StringBuffer buffer = "Implicit use of a constructor";

works, but again: For the local object creation that Java does not have.
The type of the constructor call is implicit.

Yes, because the type of the variable *has* to match the type of the
object.

But you are ignoring this case:

Map m = new HashMap();

where the type of the reference isn't the type of the object. In a
language that actually has "real" inheritance (instead of C++'s "keep
this long list of caveats in mind" inheritance), such use (writing to
the most general interface needed) is common.
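
A minimal sketch of why the declared type is often deliberately more general than the constructed one. The helper below is hypothetical and is written against Map only, so the caller can swap HashMap for TreeMap without touching it.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

class InterfaceTypedDemo {
    // Works with any Map implementation; returns the number of distinct words.
    static int countDistinct(Map<String, Integer> counts, String[] words) {
        for (String w : words) {
            counts.merge(w, 1, Integer::sum); // add 1 to the count for this word
        }
        return counts.size();
    }

    public static void main(String[] args) {
        String[] words = { "a", "b", "a" };

        Map<String, Integer> m = new HashMap<>(); // the only line naming the implementation
        System.out.println(countDistinct(m, words)); // prints 2

        Map<String, Integer> sorted = new TreeMap<>(); // swapped implementation, same code
        System.out.println(countDistinct(sorted, words)); // prints 2
    }
}
```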
 
Tim Tyler

Tor Iver Wilhelmsen said:

[Concise construction in C++]
StringBuffer buffer("foo");

...instead of:

StringBuffer buffer = new StringBuffer("foo");
[...]
The type of the constructor call is implicit.

Yes, because the type of the variable *has* to match the type of the
object.

But you are ignoring this case:

Map m = new HashMap();

where the type of the reference isn't the type of the object. In a
language that actually has "real" inheritance (instead of C++'s "keep
this long list of caveats in mind" inheritance), such use (writing to
the most general interface needed) is common.

I reckon about 9 times out of 10 - across all the Java code in existence,
when you see code of the form:

Foo var = new Bar(...)

Foo and Bar are the same string of characters.

IMO, the best solution to this one is to use a more dynamic language -
so you don't have to specify or constrain Foo if you don't want to.
 
Tor Iver Wilhelmsen

Tim Tyler said:
I reckon about 9 times out of 10 - across all the Java code in
existence, when you see code of the form:

That doesn't mean it's good form. Or rather, it's because people don't
use the type system in ways that will give benefits later. You can
either use concrete types all over the place, needing to search and
replace in the whole source tree when you decide to change it, or you
can use an interface/abstract base and change it one place.
IMO, the best solution to this one is to use a more dynamic language
- so you don't have to specify or constrain Foo if you don't want
to.

Oh, definitely. But Python jobs are few.
 
Brendan Guild

Tim Tyler wrote:
In the case of double, the syntax sugar of operators can be applied
to the mutable version - and not to the immutable version.

With Strings it's the other way around - it's the mutable version
that is awkward to use; and it's the immutable version that has all
the syntax sugar.

What mutable version of Double do you mean? I find it hard to believe
that there is any syntactic sugar for such a thing. I'm not entirely
up-to-date on the recent improvements in Java, but I haven't been able to
find what you are referring to.

Do you mean org.apache.commons.lang.mutable.MutableDouble?
 
Roedy Green

What mutable version of Double do you mean? I find it hard to believe
that there is any syntactic sugar for such a thing. I'm not entirely
up-to-date on the recent improvements in Java, but I haven't been able to
find what you are referring to.

He is referring to the boxing/unboxing that will interconvert it to
double.
 
Tim Tyler

Brendan Guild said:
Tim Tyler wrote:

What mutable version of Double do you mean? I find it hard to believe
that there is any syntactic sugar for such a thing. I'm not entirely
up-to-date on the recent improvements in Java, but I haven't been able to
find what you are referring to.

Do you mean org.apache.commons.lang.mutable.MutableDouble?

I mean double - as in:

double d = 2.0;
d = d * d;

I regard double variables as mutable.

As evidence in favour of that interpretation, I present the fact that
operations on them are not atomic - and inspecting their value from a
different thread can expose their value changing in the middle of an
update.
 
Harry Bosch

You are right, operations on a double are not thread-safe.
So:
make your classes thread-safe.
- or -
synchronize access to these variables
- or -
mark the double/long as volatile
- or -
check out the java.util.concurrent.atomic package in JDK 5

I believe I am correct on the volatile thing, you may want to check.
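
On the volatile point: the language specification does guarantee that individual reads and writes of a volatile double or long are atomic, but a compound update such as d = d * d is still a separate read and write. One way to make the whole update atomic is sketched below. AtomicDoubleSketch is a made-up class, not a JDK type; it stores the double's bit pattern in an AtomicLong and retries with compare-and-set.

```java
import java.util.concurrent.atomic.AtomicLong;

class AtomicDoubleSketch {
    // The double is kept as its raw 64-bit pattern so AtomicLong can manage it.
    private final AtomicLong bits;

    AtomicDoubleSketch(double initial) {
        bits = new AtomicLong(Double.doubleToLongBits(initial));
    }

    double get() {
        return Double.longBitsToDouble(bits.get());
    }

    // Atomically replaces the value with its square and returns the new value.
    double square() {
        while (true) {
            long oldBits = bits.get();
            double old = Double.longBitsToDouble(oldBits);
            double next = old * old;
            // Succeeds only if no other thread changed the value in between.
            if (bits.compareAndSet(oldBits, Double.doubleToLongBits(next))) {
                return next;
            }
        }
    }

    public static void main(String[] args) {
        AtomicDoubleSketch d = new AtomicDoubleSketch(2.0);
        System.out.println(d.square()); // prints 4.0
    }
}
```

A synchronized block around the update would do the same job with less code; the compare-and-set loop is shown only because the atomic package was suggested above.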
 
Kenneth P. Turvey

You are right, operations on a double are not thread-safe.
So: [Snip]
- or -
mark the double/long as volatile

This doesn't really solve the synchronization issue, does it?

 
Brendan Guild

Tim Tyler wrote:
I mean double - as in:

double d = 2.0;
d = d * d;

I regard double variables as mutable.

As evidence in favour of that interpretation, I present the fact
that operations on them are not atomic - and inspecting their value
from a different thread can expose their value changing in the middle
of an update.

I don't understand how your evidence supports your belief, but I
certainly agree that double variables are mutable. I had considered
that you might have meant double variables but rejected it because
Double variables are also mutable in exactly the same way. In that
case, what is the 'immutable version' you were talking about? 'final'
variables still have syntactic sugar.
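
The distinction between a mutable variable and an immutable value can be shown with the boxed type. In this sketch (illustrative names only), reassigning a rebinds the variable to a new Double object; the object that b still refers to is unchanged.

```java
class BoxedDoubleDemo {
    // Returns { value of a, value of b } after a has been reassigned.
    static double[] reassign() {
        Double a = 2.0; // autoboxing creates a Double object
        Double b = a;   // b refers to the same object as a
        a = a * a;      // unbox, multiply, box the result; only the variable a changes
        return new double[] { a.doubleValue(), b.doubleValue() };
    }

    public static void main(String[] args) {
        double[] r = reassign();
        System.out.println(r[0]); // prints 4.0
        System.out.println(r[1]); // prints 2.0
    }
}
```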

I thought we were talking about values, where it makes perfect sense to
have mutable and immutable objects, but it is absurd to have mutable
numbers. I have seen Scheme interpreters that allowed you to reassign
the numeral symbols. We might do that like this:

2.set(3);

Which would make 1+2==5, but probably not 1+1==3. But for numbers to be
truly mutable, we'd want the latter. I can't imagine all the madness
that would lead to.
 
