Arrays and Lists ?

E

Emil Macarie

I have recently started playing around with Ruby, and now I am starting
to use it for more serious projects.
I am working on a program that works with quite a bit of data that is
originally stored in array format.
I want to be able to loop through my original data and add or remove
elements.
In another language I would create a linked list, loop through the
original array and add what I need to the list.
However, Ruby seems to lump Arrays and Lists into one object.
I do see the Array class has an insert method, however assuming that
the implementation of an Array is actually a contiguous array in memory,
the insert method will create a new copy of an array with the insert
element appended.
That means that if I want to start with an empty list and insert
10,000 elements, 10,000 arrays will be created out of which I would only
care about the final one.

If that is the case how should I (starting with 0 elementS) append and
create a large linked list ?

This is my impression of what is happening and please help me clear up
any misconceptions I may have. I have scoured Google to some sort of
article about the intention of the Ruby Array class with no luck.
 
H

Hassan Schroeder

=A0I do see the Array class has an insert method, however assuming that
the implementation of an Array is actually a contiguous array in memory,
the insert method will create a new copy of an array with the insert
element appended.
=A0That means that if I want to start with an empty list and insert
10,000 elements, 10,000 arrays will be created out of which I would only
care about the final one.

Forgive me if I'm misunderstanding your point, but --
things =3D Array.new =3D> []
things.object_id =3D> 2163798080
things << Object.new
=3D> [# said:
=3D> [# said:
things.object_id =3D> 2163798080

No new Array is being created by adding an element to the Array.

--=20
Hassan Schroeder ------------------------ (e-mail address removed)
twitter: @hassan
 
B

Brian Candler

Emil Macarie wrote in post #969413:
I do see the Array class has an insert method, however assuming that
the implementation of an Array is actually a contiguous array in memory,
the insert method will create a new copy of an array with the insert
element appended.

Nope. It grows the memory space, and shuffles the existing elements if
necessary (if you're inserting into the middle rather than at the end).

I believe it also grows the memory space by more than is needed, so that
10,000 inserts don't require 10,000 grows.
If that is the case how should I (starting with 0 elementS) append and
create a large linked list ?

Just append to an empty Array. And relax :)
 
E

Emil Macarie

Brian Candler wrote in post #969420:
Emil Macarie wrote in post #969413:

Nope. It grows the memory space, and shuffles the existing elements if
necessary (if you're inserting into the middle rather than at the end).

I believe it also grows the memory space by more than is needed, so that
10,000 inserts don't require 10,000 grows.


Just append to an empty Array. And relax :)

Thanks ! That does make me feel better about using an array as a list
for now. I will have to take a look at the source code as I am curious
it all works.
As for relaxing, I do try :), but the amount of times I've seen people
make these kinds of mistakes has made me rather thin skinned.
 
P

Peter Vandenabeele

Brian Candler wrote in post #969420:
Emil Macarie wrote in post #969413:

Nope. It grows the memory space, and shuffles the existing elements if
necessary (if you're inserting into the middle rather than at the end).

I believe it also grows the memory space by more than is needed, so that
10,000 inserts don't require 10,000 grows.

These are the relevant pieces of code in array.c (ruby trunk)

static VALUE
rb_ary_push_1(VALUE ary, VALUE item)
{
long idx = RARRAY_LEN(ary);

if (idx >= ARY_CAPA(ary)) {
ary_double_capa(ary, idx);
/* dirty hack ... sorry couldn't get cleaner solution to work */
rb_define_global_const("ARRAY_CAPACITY",
INT2FIX(ARY_CAPA(ary)));
/* end of dirty hack */
}
RARRAY_PTR(ary)[idx] = item;
ARY_SET_LEN(ary, idx + 1);
return ary;
}
...
static void
ary_double_capa(VALUE ary, long min)
{
long new_capa = ARY_CAPA(ary) / 2;

if (new_capa < ARY_DEFAULT_SIZE) {
new_capa = ARY_DEFAULT_SIZE;
}
if (new_capa >= ARY_MAX_SIZE - min) {
new_capa = (ARY_MAX_SIZE - min) / 2;
}
new_capa += min;
ary_resize_capa(ary, new_capa);
}

And with this little bit of dirty hacking I found that the
growing of the array (ary_double_capa) is called at these
values of idx:
007:0> a = Array(0)
008:0> (0..10000).each do |i|
009:1* a << ARRAY_CAPACITY
010:1> end
011:0> b = a.uniq #=> [0, 19, 35, 52, 78, 117, 175, 262, 393, 589, 883,
1324, 1986, 2979, 4468, 6702, 10053]
...
023:0> c = []
024:0> b.each_with_index{|v,i| c << ((v*1.0)/b[i-1]) if i > 1 }
...
033:0> c.map{|ratio| "%5.3f" % ratio}
=> ["1.842", "1.486", "1.500", "1.500", "1.496", "1.497", "1.500",
"1.499", "1.499", "1.499", "1.500", "1.500", "1.500", "1.500", "1.500",
"1.842", "1.486", "1.500", "1.500", "1.496", "1.497", "1.500", "1.499",
"1.499", "1.499", "1.500", "1.500", "1.500", "1.500", "1.500"]

So, it is the array capacity is growing exponentially with approx.
50% growth per call (which could be expected from the implementation
of ary_double_capa).

If I understand correctly, this means that the cost of moving the
data around as the array grows is O(n.log(n)).

The more I see of this, the more I love Ruby :)

HTH,

Peter
 
W

Walton Hoops

If I understand correctly, this means that the cost of moving the
data around as the array grows is O(n.log(n)).

The more I see of this, the more I love Ruby :)
Also, Ruby is using realloc to resize the array, meaning it tries to
grow it in place and only copy if it has to. I won't even try to
speculate on how often it manages to grow in place, but it does mean in
the best-case scenario there is no moving cost.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,756
Messages
2,569,540
Members
45,024
Latest member
ARDU_PROgrammER

Latest Threads

Top