Iterating sparse arrays

  • Thread starter Christopher Benson-Manica
  • Start date
C

Christopher Benson-Manica

If an array is sparse, say something like

var foo=[];
foo[3]=4;
foo['bar']='baz';
foo['quux']='moo';

is there a way to iterate through the entire array?
 
R

Richard Cornford

Christopher said:
If an array is sparse, say something like

var foo=[];
foo[3]=4;
foo['bar']='baz';
foo['quux']='moo';

is there a way to iterate through the entire array?
<snip>

for (var c = foo.length; c--; ){
... // using foo[c]
}

The values at indexes 0 to 2 will return Undefined values. The named
properties are not related to the Arrayness of the Array object but
rather to its Objectness. As such they should not be considered relevant
to a desire to iterate through the "entire array".

Richard.
 
D

Douglas Crockford

If an array is sparse, say something like
var foo=[];
foo[3]=4;
foo['bar']='baz';
foo['quux']='moo';

is there a way to iterate through the entire array?

It is incorrect to use an array when the keys are non-integers. If you
have keys like 'bar' and 'quux', you should be using an object. You can
then use the for..in statement to iterate through them.

var foo = {};
foo[3] = 4;
foo.bar = 'baz';
foo['quux'] = 'moo';

for (key in foo) {
...
}

See http://www.crockford.com/javascript/survey.html
 
R

Robert

Christopher Benson-Manica said:
If an array is sparse, say something like

var foo=[];
foo[3]=4;
foo['bar']='baz';
foo['quux']='moo';

is there a way to iterate through the entire array?

Yes.

Note: The length property includes the numeric indexes. Since three is
the highest index, the length property has a value of four.

Robert

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<title>Sparse array</title>

<script type="text/javascript">

var foo=[];
foo[3]=4;
foo['bar']='baz';
foo['quux']='moo';

var accumulate = "";

for (var i in foo)
{
accumulate += "foo["+ i + "] = " + foo + "; ";
}

alert(accumulate);

alert("number of elements in foo = " + foo.length);
</script>
</head>
<body>
<p>Demonstrate retrieving all set array values.</p>
</body>
 
R

Robert

If an array is sparse, say something like

var foo=[];
foo[3]=4;
foo['bar']='baz';
foo['quux']='moo';

is there a way to iterate through the entire array?

It is incorrect to use an array when the keys are non-integers.[/QUOTE]

I think this is too strong of a statement since the array declaration
supports string values as references. See my post to this thread for
the example code.

When you declare an object an array your javascript supports methods
like pop and the length property with the array object. When you
declare a variable an object you get fewer methods and properties. The
author of the javascript code should decide what is more appropriate.

I do not think we can tell from the segment of code shown what is more
appropriate.

The object declaration does allow for a more compact declaration:

var foo= { 3: 4, bar: "baz", "quux": "moo" };


Robert
 
D

Douglas Crockford

It is incorrect to use an array when the keys are non-integers.
I think this is too strong of a statement since the array declaration
supports string values as references.

Wrong is wrong even if you do not understand the difference.
 
R

Robert

With better designed programming languages, associative arrays distinguish
between methods and data, so you can have a key "push" which doesn't
conflict with the method push(). Alas, Javascript doesn't seem to have this
concept, which is a shame.

This is unfortunate. Seems like 'informal' languages win out in market
share over 'formal' one.
It's usually considered good practice (in other languages with similar
semantics to Javascript) that associative arrays are used for storing

People freak out over the term 'associative arrays' in this forum.
'cause it's not in the standard they stay.
stupid data and objects are used for storing smart data; that is, it's
intended that the data gets modified by calling methods on the object.

Robert
 
C

Christopher Benson-Manica

Douglas Crockford said:
It is incorrect to use an array when the keys are non-integers. If you
have keys like 'bar' and 'quux', you should be using an object. You can
then use the for..in statement to iterate through them.
var foo = {};
foo[3] = 4;
foo.bar = 'baz';
foo['quux'] = 'moo';
for (key in foo) {
...
}

That is a beautiful thing - thank you for showing me the light!
 
M

Michael Winter

In fact...

for (i in a) alert(i.value)

...produces 'undefined',

It should in every case. The value assigned to i on each iteration is the
property name as a string, and strings don't have a value property. If you
want the value of the enumerated property, use

a
Using an object instead of an array for 'a' does the same thing.

Again, it should. This whole business has nothing to do with arrays. See
my other post (and the link therein).
I'm going to need a real associative array;

There are several in the archives. Mine is at
<URL:http://www.mlwinter.pwp.blueyonder.co.uk/clj/hash.js> (14.4KB). This
particular implementation allows objects to be true keys (rather than only
their toString value). However, it's also more readable than efficient (or
small). If you do end up using it, at the very least I'd advise you to
make a copy, remove methods that you aren't using (including the Iterator,
if appropriate) and run it through JSMin
(<URL:http://www.crockford.com/javascript/jsmin.html>). That'll take out
about 10.5KBs of comments and whitespace. If you want to be adventurous,
you could also shorten the variable names - that'll shave off another
kilobyte or so.

I'm terrible at formal testing procedures, so the script hasn't undergone
any, but playing with it hasn't uncovered any errors (unlike the last time
I uploaded it[1] :-().

Mike


[1] My apologies for asking for a check of a blatently non-functioning
script.
 
M

Michael Winter

[...] looking at the code, you seem to be using toString() as your hash
function.

Yes, as their's no native alternative (answering your later question). One
could force client code to implement a hashCode method like Java, but that
would add (*way*) too much of a burden.
[...] I'm going to end up with a vast linear list [...]

True, iff all of the objects have the same toString result[1], but I seem
to remember good performance (see the end of
<URL:http://groups.google.co.uk/groups?selm=opsilrs6aax13kvk@atlantis>).

[snip]
a = {}
b = {}
ASSERT(a != b)

Object equality is based on references to the object itself, not its
content. References are only equal if they refer to the same object or a
joined object (algorithm: 11.9.3; joined objects 13.1.2).
ASSERT(!(a < b))
ASSERT(!(a > b))

In relational comparisons (11.8.5), the internal ToPrimitive operator
(9.1) is called on both objects with the hint, Number. If a valueOf method
is defined, the return value is used when comparing, otherwise toString is
used (8.6.2.6).

Mike

And here I was thinking I'd take a break from Usenet...


[1] If they're your own objects, you could implement a toString method
that generates a return based on the content of that object.
Alternatively, you could just generate random numbers:

function MyObject() {
var id = String((Math.random() % 1) * 100000);

this.toString = function() {return id;};
}

Although there would be some clashes, it would prevent a single list and
should have relatively little overhead.
 
R

Richard Cornford

Michael Winter wrote:
[1] If they're your own objects, you could implement a toString
method that generates a return based on the content of that object.
Alternatively, you could just generate random numbers:

function MyObject() {
var id = String((Math.random() % 1) * 100000);

this.toString = function() {return id;};
}

Although there would be some clashes, it would prevent a single
list and should have relatively little overhead.

I would have thought a non-repeating sequence would avoid clashes:-

(function(){
var baseNum = 1; //or maybe (new Date()).getTime();
Object.prototype.toString = function(){
var id = ' _$'+(++baseNum);
return (this.toString = function(){
return id;
})();
};
})();

Richard.
 
M

Michael Winter

On Thu, 20 Jan 2005 21:48:46 -0000, Richard Cornford

[snip]
I would have thought a non-repeating sequence would avoid clashes:-

It would, but the issue isn't really avoiding clashes altogether but
reducing the length of the internal linked lists. A random return value
just happened to be the first thing that entered my mind.

[snip]

Mike
 
R

rh

David said:
Michael Winter wrote:
[...]
Here's an approach:

_id = 0
_ids = []
_data = []

function put(key, value)
{
var i = _id++;
key[" id"] = i;
_ids = value;
_data = value;
}

function get(key)
{
var i = key[" id"];
return _data;
}

function remove(key)
{
var i = key[" id"];
delete key[" id"];
delete _ids;
delete _data;
}

Only works with object keys, but it'd be easy enough to strings and integers
by using a conventional array for keys of those type. The " id" string
would need to be unique per associative array so that you could use the
same key in multiple associative arrays. The _ids array is necessary to
prevent garbage collection of forgotten keys.


Similar an updated version of the Hashtable constructor that I put
together yesterday. It should handle keys of all types, based on a
modification of Richard's *very neat* idea of assigning a univesal
unique id to the objects. Richards id creation appeared to be limited
to objects that didn't have their own version of "toString", which
would eliminate "Array", "Date", etc. objects as keys, if I followed it
correctly.

[I'm out for the day, but will perhaps get something posted later].
.... /rh
 
R

rh

rh wrote:
Similar an updated version of the Hashtable constructor that I put
together yesterday. It should handle keys of all types, based on a
modification of Richard's *very neat* idea of assigning a univesal
unique id to the objects. Richards id creation appeared to be limited
to objects that didn't have their own version of "toString", which
would eliminate "Array", "Date", etc. objects as keys, if I followed it
correctly.

[Sorry for the bad grammar/spelling - I was in a rush :)]

This (very lightly tested) version can be found at:
<url:http://www3.telus.net/rhall/Misc/DGHash.js>

../rh
 
R

rh

Michael said:
[snip]
People freak out over the term 'associative arrays' in this forum.

No-one has "freaked out" over the term, however there is disagreement with
regard to its suitability.

It seems evident from the discussions on the issue, though, that there
are a number of different conceptions of what constitutes an
"associative array". That in itself is generally sufficient to ensure
there won't be agreement.
ECMAScript provides a syntax that resembles associative array usage and
does, to a certain extent, meet what is expected of the data type.

Some would even say to a "high" extent. ;-)
However, all objects in ECMAScript have predefined members and these may
cause problems, especially as you cannot determine (without supporting
code) whether a member is predefined or part of the "associative array".
There are other issues:

- You cannot reliably enumerate elements within the "associative
array".

A fundamental definition of associative array (at least mine) doesn't
include enumeration -- the supported operations are insert, replace,
remove and lookup. So the question, it seems to me, is whether you can
reliably do lookup.
Whilst a for..in statement will enumerate all properties within an
object, it may not be possible to determine what has been added and
what was predefined. Though I haven't encountered a user agent that
allows a host- or specification-defined property to be enumerated,
that (1) doesn't mean the former would never happen, and (2) doesn't
account for any properties added to the prototype of Object or a
more specific object (such as Array, if you decided to use that for
some reason).

In the case of lookup, it means that one cannot blithely assume that a
retrieval from an object corresponds to a [key,data] pair within the
"associative array". That is a major problem, and it is something that
all users must be aware of (and of course we know they are not).

Can it be reliably determined whether a key-lookup returns a
pre-defined property outside the "associative array"? Yes, provided
that hasOwnProperty is supported, and provided that string has not been
used as a key in the "associative array". And those conditions are
easy to check (and, if necessary, but perhaps not quite so easy, to
supplant).
- You cannot determine the number of elements within an "associative
array".

Though this might not be important in all instances, this feature
is not available in ECMAScript.

The main objection I have to the term is that it leads to confusion.
Examples such as

var hash = new Array();
hash['property'] = value;

do nothing but instill the mistaken belief that arrays can be subscripted
using arbitrary strings.

Agreed, that is entirely wrong.
The same effect could be achieved with

var hash = new Object();
hash.property = value;

or

var hash = {property : value};

but this isn't shown.

Instead of understanding the actual mechanics - which really are quite
simple - new programmers believe a feature exists where in fact it does
not.

On the other hand, if programmers wish to understand Objects in
ECMAScript, they need to understand the close relationship between
Objects and their foundation in associative arrays. Objects are, in
fundamental concept, simply linked associative arrays.

The following demonstrates the battle you're waging is likely lost:

<url:http://en.wikipedia.org/wiki/Associative_array#JavaScript>

The good news, perhaps, is that you can edit it. :)
<...>

Regards,

.../rh
 
R

Richard Cornford

rh wrote:
Can it be reliably determined whether a key-lookup returns a
pre-defined property outside the "associative array"? Yes,
provided that hasOwnProperty is supported, and provided that
string has not been used as a key in the "associative array".
And those conditions are easy to check (and, if necessary,
but perhaps not quite so easy, to supplant).
<snip>

I have never seen - hasOwnProperty - as a solution to this problem as it
still leaves the code subject to the possibility that the language
implementation is exposing internal properties of the object instances
(rather than their prototypes). If that was an unusual, or unexpected,
situation then it probably could be disregarded and the transgressing
implementation dismissed as fatally broken, but ECMA 262 doesn't
explicate forbid the possibility and Mozilla demonstrates its reality.
Try:-

var t = {};
alert(
t.hasOwnProperty('__proto__')+'\n'+
t.hasOwnProperty('__parent__')+'\n'+
'');

- in a Mozilla/Gecko browser and see two properties of an object
instance that were never explicitly created by script code.
Konqueror/Safari offers another set of similar object instance
properties.

Unfortunately, using a javascript object for the storage of name/value
(or key/data, if you prefer) pairs is not safe if the names/keys are
expected to be arbitrary. Hence the preference for implementing such
storage objects in a way that takes control of the arbitrary name/key
strings and shifts then into a 'safe' set. How safe that 'safe' set is
remains a subject for debate.

Richard.
 
R

rh

Richard said:
rh wrote:

<snip>

I have never seen - hasOwnProperty - as a solution to this problem as it
still leaves the code subject to the possibility that the language
implementation is exposing internal properties of the object instances
(rather than their prototypes). If that was an unusual, or unexpected,
situation then it probably could be disregarded and the transgressing
implementation dismissed as fatally broken, but ECMA 262 doesn't
explicate forbid the possibility and Mozilla demonstrates its reality.
Try:-

var t = {};
alert(
t.hasOwnProperty('__proto__')+'\n'+
t.hasOwnProperty('__parent__')+'\n'+
'');

- in a Mozilla/Gecko browser and see two properties of an object
instance that were never explicitly created by script code.
Konqueror/Safari offers another set of similar object instance
properties.

Interesting, and how dastardly of them :-(! So in your interpretation
of ECMA-262/3:

[15.2.5 Properties of Object Instances

Object instances have no special properties beyond those inherited from
the Object prototype object.]

are you saying because the word "special" is there, that brower
manufacturer's implementations can add properties to Object instances,
as long as they're not "special", and remain in conformance with the
standard?

Surely, anything added by the language implementation that affects the
usage behaviour of an object instance is "special".

Nonetheless, I note that an attempt to delete the properties given
above returns false (presumabley because they have a DontDelete
attribute). That makes these particular properties very much "special".


Moreover, if you make an assignment to these properties, no assignment
is made. That makes these properties "special".

And further, these properties are not enumerable. That also makes them
"special".

Clearly it's a transgression of the standard!
Unfortunately, using a javascript object for the storage of name/value
(or key/data, if you prefer) pairs is not safe if the names/keys are
expected to be arbitrary. Hence the preference for implementing such
storage objects in a way that takes control of the arbitrary name/key
strings and shifts then into a 'safe' set. How safe that 'safe' set is
remains a subject for debate.

I'm not sure there's any great argument about using a controlled
approach, after all I did produce a couple of versions of Hashtable.
For a completely arbitrary set of keys (presumably, with control and
some work you can tranform the names of the "bad" keys to something
non-conflicting and use some separate storage).

However, it should be kept in mind that the problem of key conflicts
doesn't just affect the presumption of "associative arrays" at the
language level. It affects all Object usage in ECMAScript. The fact
that you may assign a key/value (such as the key __proto__, or some
other yet to be revealed key) and have nothing happen is a matter for
some concern.

So, facing certain realities, that now leaves the question of how to
determine which keys are going to be problematic, in the event you
actually want to test. I'd still be inclined to use "hasOwnProperty",
but now in combination with "isEnumerable". No guarantee there, but
should be within epsilon (some current or future crap added to the
object instances, potentially remaining an issue).

../rh
 
R

Richard Cornford

rh said:
Richard Cornford wrote:
Interesting, and how dastardly of them :-(! So in your
interpretation of ECMA-262/3:

[15.2.5 Properties of Object Instances

Object instances have no special properties beyond those
inherited from the Object prototype object.]

are you saying because the word "special" is there, that
brower manufacturer's implementations can add properties
to Object instances, as long as they're not "special", and
remain in conformance with the standard?

No, I am referring to:-

<quote cite="ECMA 262 3rd edition: Section 2 "Conformance",
3rd paragraph">
| A conforming implementation of ECMAScript is permitted to
| provide additional types, values, objects, properties, and
| functions beyond those described in this specification. In
| particular, a conforming implementation of ECMAScript is
| permitted to provide properties not described in this
| specification, and values for those properties, for objects
| that are described in this specification.
</quote>

- Which implies that Mozilla/Gecko javascript implementations are fully
conformant with ECMA 262 in this respect.
Surely, anything added by the language implementation that
affects the usage behaviour of an object instance is
"special".
<snip>

__proto__ and __parent__ are very 'special' properties, without any
doubt, but they don't appear to be ruled out by the specification.
And further, these properties are not enumerable. That also
makes them "special".

Clearly it's a transgression of the standard!

My reading of ECMA 262 suggests that they are allowed.

However, it should be kept in mind that the problem of key
conflicts doesn't just affect the presumption of "associative
arrays" at the language level. It affects all Object usage
in ECMAScript. The fact that you may assign a key/value (such
as the key __proto__, or some other yet to be revealed key)
and have nothing happen is a matter for some concern.

The fact that objects may have unspecified and unexpected properties
certainly should be taken into account. However, the implementers of
ECMAScirpt do not appear to be suicidal in the property names they add.
Generally, a normal camel-case property name will not prove problematic,
it seems only necessary to avoid certain unusual character combinations
such as those starting and ending with '__' or '[[' and ']]'. Which is
why the problem becomes significant when ECMAScript objects are to be
used to store values with truly arbitrary name/key strings.
So, facing certain realities, that now leaves the question
of how to determine which keys are going to be problematic,
in the event you actually want to test. I'd still be inclined
to use "hasOwnProperty", but now in combination with
"isEnumerable". No guarantee there, but should be within
epsilon (some current or future crap added to the object
instances, potentially remaining an issue).

I still prefer the other strategy.

Richard.
 
R

rh

No, I am referring to:-

<quote cite="ECMA 262 3rd edition: Section 2 "Conformance",
3rd paragraph">
| A conforming implementation of ECMAScript is permitted to
| provide additional types, values, objects, properties, and
| functions beyond those described in this specification. In
| particular, a conforming implementation of ECMAScript is
| permitted to provide properties not described in this
| specification, and values for those properties, for objects
| that are described in this specification.
</quote>

- Which implies that Mozilla/Gecko javascript implementations are fully
conformant with ECMA 262 in this respect.

Ah, but I think great care is required in interpretation of the
specification. Yes, the above says implementers of the language can add
properties/values to all objects. But some objects, not all, are
restricted by the body of the specification. In particular, 15.2.5,
says that if an implementer cannot add properties/values directly to an
Object object -- that can only be done through additions to the Object
prototype.

In other words, both must apply. Any other interpretation, it seems to
me, throws the standard into contradiction, because there's nothing
that says one has precedence over the other.

Therefore my continued contention would be that Mozilla/Gecko
javascript implementations **are not** in conformance with ECMA 262
with respect to __proto__ and __parent__.

[Non-special objects can't be added to an Object object by the
implementer because the built-in --> native object specification
doesn't allow for it (although my check here is at best cursory).]

The fact that objects may have unspecified and unexpected properties
certainly should be taken into account. However, the implementers of
ECMAScirpt do not appear to be suicidal in the property names they add.
Generally, a normal camel-case property name will not prove problematic,
it seems only necessary to avoid certain unusual character combinations
such as those starting and ending with '__' or '[[' and ']]'. Which is
why the problem becomes significant when ECMAScript objects are to be
used to store values with truly arbitrary name/key strings.

They may not be suicidal -- they remain a bunch of dastards in my book
:).

Given that propertyIsEnumerable doesn't venture into the prototype
chain, a hasOwnProperty check would seem to redundant. So my later
suggestion would be to simply replace the hasOwnProperty with a
propertyIsEnumerable check;
I still prefer the other strategy.

Well everone gets to choose their preference, but I haven't seen much
of a case for a more complex solution.

../rh
 
M

Michael Winter

[snip]
A fundamental definition of associative array (at least mine) doesn't
include enumeration

I know, which is why I didn't say enumeration was part of the definition.
I remember the previous debate well. However, it seems there is a common
desire to enumerate them and this may be unreliable unless you take the
operation into your own hands.
So the question, it seems to me, is whether you can reliably do lookup.

Yes, provided that supporting code is used to perform it. However, I'd
argue that this necessity clearly means that "associative arrays" cannot
be deemed part of the language.

[snip]
The following demonstrates the battle you're waging is likely lost:

I'm certain that there is a number of sites masquerading as references
which "teach" this topic in a misleading way. That in itself would already
signal the battle is lost.

I wouldn't say that's particularly devastating. It even mentions that a
prototype will interfere affect the properties available, however it would
probably take some careful thinking for the reader to realise what the
implications are.
The good news, perhaps, is that you can edit it. :)

If the aim was to actively teach the concept, perhaps I might (though I
wouldn't feel particularly comfortable doing so).

Mike
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,754
Messages
2,569,521
Members
44,995
Latest member
PinupduzSap

Latest Threads

Top