Are JavaScript strings mutable?

Water Cooler v2

Are JavaScript strings mutable? How're they implemented -

1. char arrays
2. linked lists of char arrays
3. another data structure

I see that the + operator is overloaded for the string class and hence
it is possible to do:

txt += "foo bar";

This suggests that strings are mutable. However, kindly give me your
informed opinion.
 
VK

Water said:
Are JavaScript strings mutable? How're they implemented -

As a sequence of Unicode-16 characters. That also means that, unlike in
low-level languages, in JavaScript there is no direct relation byte <>
character. Also JavaScript doesn't have a Char datatype (nor Byte for
that matter); it knows only strings containing a single Unicode-16
character.

This is the max one can read out of specs IMHO because the internal
implementation was left totally up to engines' producers. So are they
mutable, immutable or carried by little green gnomes :) depends I
guess on the particular browser.

Say in JScript and JScript.Net (IE) many parts are borrowed from the
system. In particular, the JScript String object is layered on
System.String and it is immutable, as System.String itself is.
So in the proposed case:
txt += "foo bar";
the engine creates an anonymous string object for "foo bar", creates a new
joined string, sets the reference from txt to this new string and marks
both the former txt and "foo bar" as GC ready.
If you conclude from this that string concatenations are relatively
slow in JScript - you are a hundred times right :)

For Firefox one needs to read the source code I guess (at mozilla.org).
 
Richard Cornford

Water said:
Are JavaScript strings mutable?

There are no operations or functions in javascript that can modify a
string primitive.
How're they implemented -

It is important to realise that javascript is specified in terms of
behaviour and any implementation that behaves as required by the
specification is acceptable.
1. char arrays
2. linked lists of char arrays
3. another data structure

All of the above, and any other way the implementers thought suitable in
their implementation.
I see that the + operator is overloaded for the string class
and hence it is possible to do:

txt += "foo bar";

This suggests that strings are mutable.

No, it suggests that the string value of a variable or object property
may be replaced with a new string value. In this case a new string value
that is the result of concatenating 'foo bar' to the original value.
However, kindly give me your
informed opinion.

If you really are interested in this level of detail it would be a good
idea to start reading ECMA 262.

Richard.
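
A minimal sketch of the distinction drawn above, between replacing the value held by a variable and modifying a string. The variable names are illustrative assumptions, not part of the thread; the try/catch is there only because assigning to a string index is silently ignored in non-strict code but throws a TypeError in strict code.

```javascript
// Sketch only: names are hypothetical, chosen for illustration.
var original = "foo";
var txt = original;          // txt and original hold the same string value

txt += " bar";               // txt now holds a NEW string value
console.log(txt);            // "foo bar"
console.log(original);       // "foo" (the old value was not modified)

var upper = original.toUpperCase();  // string methods return new strings
console.log(upper);          // "FOO"
console.log(original);       // "foo"

try {
  original[0] = "F";         // ignored in sloppy mode, TypeError in strict mode
} catch (e) {
  // strict mode: assignment to a read-only index throws
}
console.log(original);       // "foo"
```

Every operation either yields a new string or has no effect; none changes the characters of an existing string value.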
 
RobG

VK said on 19/04/2006 6:19 AM AEST:
As a sequence of Unicode-16 characters. That also means that, unlike in
low-level languages, in JavaScript there is no direct relation byte <>
character. Also JavaScript doesn't have a Char datatype (nor Byte for
that matter); it knows only strings containing a single Unicode-16
character.

This is the max one can read out of specs IMHO because the internal
implementation was left totally up to engines' producers. So are they
mutable, immutable or carried by little green gnomes :) depends I
guess on the particular browser.

Say in JScript and JScript.Net (IE) many parts are borrowed from the
system. In particular, the JScript String object is layered on
System.String and it is immutable, as System.String itself is.
So in the proposed case:
txt += "foo bar";
the engine creates an anonymous string object for "foo bar", creates a new
joined string, sets the reference from txt to this new string and marks
both the former txt and "foo bar" as GC ready.

If txt has not already been given a string value (say the empty string ""
or some other value), the result will be:

"undefinedfoo bar"
If you conclude from this that string concatenations are relatively
slow in JScript - you are a hundred times right :)

You need to define relative to what. JavaScript is much slower than
compiled languages, but it isn't built for speed. Consider:


Method 1: txt += 'more text';

Method 2: txt = txt + 'more text';

Method 3: txt = [txt, 'more text'].join('');
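
For reference, a small sketch showing that the three methods build the same string, so any difference between them is one of cost, not of result. The variable names here are assumptions made for illustration.

```javascript
// Sketch only: base, m1, m2 and m3 are hypothetical names.
var base = "some text, ";

var m1 = base;
m1 += "more text";                        // Method 1

var m2 = base + "more text";              // Method 2

var m3 = [base, "more text"].join("");    // Method 3

console.log(m1 === m2 && m2 === m3);      // true
console.log(base);                        // "some text, "
```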


In Firefox, method 1 is fastest, but all 3 methods take about the same
time for, say, fewer than 10,000 concatenations.

In IE, method 3 is about as fast as Firefox method 3, but 1 takes 20
times longer than 3 and method 2 about 6 times longer than that. A test
case is provided below (careful, IE takes over a minute to run it, Firefox
a couple of seconds).


<script type="text/javascript">

var ipsum = ['Facilisis ', 'illum ', 'et ', 'qui ', 'wisi ',
'nonummy ', 'sit, ', 'dolore ', 'delenit ', 'in ', 'ad ', 'at, ',
'vel ', 'wisi. ', 'Ut ', 'dolor ', 'nisl ', 'laoreet ', 'odio, ',
'delenit. ', 'Facilisi ', 'esse ', 'elit ', 'eu ', 'vel '];

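function getRand(r){
  // Random integer in the range 0..r-1
  return (Math.random()*r)|0;
}

var iterations = 30000;
var catString = '';
var catArray = [];
var j = ipsum.length;

// Method 1: compound assignment
var s = new Date();
var i = iterations;
while (i--){
  catString += ipsum[getRand(j)];
}
var f = new Date();
var txt = 'Using += ' + (f-s);

// Method 2: explicit concatenation. Start again from an empty string
// so that both loops build strings of comparable length.
catString = '';
s = new Date();
i = iterations;
while (i--){
  catString = catString + ipsum[getRand(j)];
}
f = new Date();
txt += '<br>Using = + ' + (f-s);

// Method 3: collect the pieces in an array and join once at the end
s = new Date();
i = iterations;
while (i--){
  catArray.push(ipsum[getRand(j)]);
}
var x = catArray.join('');
f = new Date();
txt += '<br>Using push/join ' + (f-s);

// Report the elapsed time in milliseconds for each method
document.write(txt);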

</script>


Incidentally, there is a lorem ipsum generator here:

<URL:http://www.lindquist.dk/tools/LorumIpsumGenerator.asp>
 
Douglas Crockford

Water said:
Are JavaScript strings mutable? How're they implemented -

1. char arrays
2. linked lists of char arrays
3. another data structure

I see that the + operator is overloaded for the string class and hence
it is possible to do:

txt += "foo bar";

This suggests that strings are mutable. However, kindly give me your
informed opinion.

Why are you asking for opinions?

Strings are immutable.

http://javascript.crockford.com/survey.html
 
Dr John Stockton

JRS: In article <[email protected]>, dated Wed,
19 Apr 2006 01:58:50 remote, seen in RobG
In IE, method 3 is about as fast as Firefox method 3, but 1 takes 20
times longer than 3 and method 2 about 6 times longer than that. A test
case is provided below (careful, IE takes over a minute to run it, Firefox
a couple of seconds).

Careful : different versions of the same browser may differ in speed,
and different computers certainly do.
 
VK

Dr said:
Careful : different versions of the same browser may differ in speed,
and different computers certainly do.

Right. But the system environment remains the same - so say
System.String on Windows is still immutable, so JScript using it as
allocator is immutable too, whether it's IE4, IE5, IE6 or IE7.

And I checked MSDN on it (JScript String <> System.String) by looking
into jscript.dll - in the name of pure science of course :)
btw there are no more undocumented methods except CollectGarbage in
JScript :-(, I searched through the entire name table.

This way the use of the overloaded + operator for string concatenation
is relatively inefficient in JScript - but this has already been
demonstrated experimentally by RobG.

concat() method is specially adjusted for immutable string handling
(like in Java), so I would expect - but not guarantee of course - to
have roughly the same behavior across scripting platforms. And there is
of course the old "back slash trick" to instantiate long multiline
literals.
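
A minimal sketch of what concat() does at the language level, with illustrative variable names. It shows only the observable behaviour (a new string is returned and the receiver is left as it was); it says nothing about how fast any particular engine makes it.

```javascript
// Sketch only: base and joined are hypothetical names.
var base = "foo";
var joined = base.concat(" ", "bar");   // concat accepts any number of arguments

console.log(joined);                         // "foo bar"
console.log(base);                           // "foo" (receiver unchanged)
console.log(joined === base + " " + "bar");  // true
```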

At the same time I am not sure why Douglas Crockford thinks that
JavaScript strings are guaranteed to be immutable *everywhere*. The
specs do not pose any explicit requirements on it, and the issue seems
more system-dependent rather than language-dependent (?).
 
VK

Thomas said:
There is no such thing as "Unicode-16".

"16-bit Unicode value"

"Unsigned 16-bit value representing a Unicode character"

"Unsigned 16-bit value representing a character from a table as defined
by The Unicode Consortium".

Happy now? :)
 
Thomas 'PointedEars' Lahn

VK said:
"16-bit Unicode value"

I could accept that as an informal term, but see below.
"Unsigned 16-bit value representing a Unicode character"

There is such a thing, however those values would only represent the lower
subset of the ca. 1112064 possible characters defined by The Unicode
Standard, Version 4.0.
"Unsigned 16-bit value representing a character from a table as defined
by The Unicode Consortium".

Same here.
Happy now? :)

No. You are still confusing character set and encoding.

(The) Unicode (Standard, Version 4.0) specifies a character set, including
a specification of several encodings for characters of that character set,
the Unicode Consortium's implementation and improvement of the Universal
Character Set (UCS, ISO/IEC 10646).

UTF-16, which is used to encode string data in conforming implementations of
ECMAScript (ECMA-262), among others, is one of several possible encodings.
With UTF-16, _one_ Unicode character is encoded with one _or more_ code
units of 16 bits length each.

<URL:http://unicode.org/faq/>
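
A short sketch of the "one or more code units" point, using U+1D11E (MUSICAL SYMBOL G CLEF) as an example character chosen here for illustration. It lies outside the 16-bit range, so a string holding it contains two 16-bit code units.

```javascript
// Sketch only: clef, high, low and codePoint are hypothetical names.
var clef = "\uD834\uDD1E";   // U+1D11E written as its two UTF-16 code units

console.log(clef.length);                      // 2
console.log(clef.charCodeAt(0).toString(16));  // "d834" (high surrogate)
console.log(clef.charCodeAt(1).toString(16));  // "dd1e" (low surrogate)

// Recombine the surrogate pair into the code point it encodes
var high = clef.charCodeAt(0);
var low = clef.charCodeAt(1);
var codePoint = (high - 0xD800) * 0x400 + (low - 0xDC00) + 0x10000;
console.log(codePoint.toString(16));           // "1d11e"
```

So length and charCodeAt count code units, not characters.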


PointedEars
 
Alexis Nikichine

VK said:
Dr John Stockton wrote:

At the same time I am not sure why Douglas Crockford thinks that
JavaScript strings are guaranteed to be immutable *everywhere*. The
specs do not pose any explicit requirements on it, and the issue seems
more system-dependent rather than language-dependent (?).

Just taking a guess, let's imagine, for instance, that some clever
implementation makes use of 'aliasing' for its string implementation (or
even interned strings):

var a, b;
a = b = "long"; // you might expect that a and b refer
// to the "same" string

there's no way you can modify both a and b in a single statement. Let me
try:

a += "er";
alert( a ); // "longer"
alert( b ); // "long" (same player, shoot again)

So things behave as if a new string is created out of the one that a
used to refer to and the appended one: that new string is then the
one referred to by a. All this because the original "long" string could
not be mutated into "longer".

Of course implementations are not required internally to alias identical
strings, even in trivial cases, as long as the behavior conforms to the
ECMA standard.

Interestingly, I tried to mutate a String object (not a javascript
string per se) along with all of its aliases, but couldn't. It seems ECMA
has made enough provision for this "immutability" property to be
preserved for String objects.
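
A minimal sketch of that experiment, with illustrative names. The wrapper object accepts new properties like any other object, but its character content stays fixed: the index properties and length are read-only, so the assignments are ignored in non-strict code and throw a TypeError in strict code (hence the try/catch).

```javascript
// Sketch only: s, alias and note are hypothetical names.
var s = new String("long");
var alias = s;                 // the same String object under two names

s.note = "expando";            // the object itself accepts new properties
console.log(alias.note);       // "expando"

try {
  s[0] = "L";                  // index properties are read-only
  s.length = 99;               // and so is length
} catch (e) {
  // strict mode: assigning to a read-only property throws
}
console.log(alias.valueOf());  // "long"
console.log(alias.length);     // 4
```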

Cheers,

Alexis
 
Dr John Stockton

JRS: In article <[email protected]>
, dated Thu, 20 Apr 2006 04:01:01 remote, seen in
news:comp.lang.javascript said:
Right. But the system environment remains the same - so say
System.String on Windows is still immutable, so JScript using it as
allocator is immutable too, whether it's IE4, IE5, IE6 or IE7.


If you had read what I QUOTED - I quoted it for that purpose - you
should have realised that I was referring to the absolute durations
given therein. YWII.
 
John G Harris

Douglas said:
Why are you asking for opinions?

Strings are immutable.

Strings are primitive values so it's not possible for a javascript
program to tell if they are mutable or immutable. They might be
implemented either way, or both.

Equally, it's not possible for a javascript program to tell if numbers
are mutable or immutable.

In fact, for primitive values it's doubtful if 'mutable' and 'immutable'
can be given a useful meaning.


This could do with some tidying.

John
 
