Effective perl function to remove one element from array?

J

josh

Hi all:

I am trying to find out the most efficient way to remove an element
from an array, and have the array size shrink by one.

pop, push, and splice won't work too well for me, I am trying to
remove an element that could be in any position.

Here is my current implementation (may not actually compile, just a
example)

<psuedo perl code>
# this array is purposely unsorted
@array = ['bob', 'alice', 'david', 'christy', 'edward', 'henry',
'frank'];
@array = removeFromArray ( @array, "edward" );

sub removeFromArray {
@array = $_[0];
$name = $_[1];

for ( $i = 0; $i < @$array; $i++ ) {
if ( $array[$i] eq $name ) {
unset $array[$i];
}
}
}
</psuedo perl code>

As you can see, this is not exactly the most efficient way of removing
an element as the size of my array grows. And especially when I run
this in a nested loop, say, when I want to compare two arrays and
remove the differences, it will result in a close to BigO(n^2)
efficiency.

Is there a faster, more efficient way to remove an element from an
array (and preferrably not BigO(n) )?

Will it help if I am performinig this on an already sorted array? Then
I can perhaps use some sort of binary search function to find the item
I am looking for?
 
M

Matt Garrish

josh said:
Hi all:

I am trying to find out the most efficient way to remove an element
from an array, and have the array size shrink by one.

pop, push, and splice won't work too well for me, I am trying to
remove an element that could be in any position.

Here is my current implementation (may not actually compile, just a
example)

<psuedo perl code>
# this array is purposely unsorted
@array = ['bob', 'alice', 'david', 'christy', 'edward', 'henry',
'frank'];
@array = removeFromArray ( @array, "edward" );

sub removeFromArray {
@array = $_[0];
$name = $_[1];

for ( $i = 0; $i < @$array; $i++ ) {
if ( $array[$i] eq $name ) {
unset $array[$i];
}
}
}
</psuedo perl code>

As you can see, this is not exactly the most efficient way of removing
an element as the size of my array grows.

Which begs the question, why use an array when a hash would work better?

my %hash = (bob => 1, alice => 1, david => 1, christy =>1, edward => 1,
henry => 1);

delete $hash{'edward'};

It would also make removing duplicates simple because you could just assign
all the keys from your existing hashes to a new hash. And sorting would be
faster, too.

Matt
 
J

Jürgen Exner

josh said:
I am trying to find out the most efficient way to remove an element
from an array, and have the array size shrink by one.

pop, push, and splice won't work too well for me, I am trying to
remove an element that could be in any position.

I would doubt that you can hand code anything that is faster than the
build-in function splice().If there were anything faster, then chances are
that algorithm would have been used to implement splice() in the first
place.
Here is my current implementation (may not actually compile, just a
example)

<psuedo perl code>
# this array is purposely unsorted
@array = ['bob', 'alice', 'david', 'christy', 'edward', 'henry',
'frank'];
@array = removeFromArray ( @array, "edward" );

sub removeFromArray {
@array = $_[0];
$name = $_[1];

for ( $i = 0; $i < @$array; $i++ ) {
if ( $array[$i] eq $name ) {
unset $array[$i];
}
}
}
</psuedo perl code>

As you can see, this is not exactly the most efficient way of removing
an element as the size of my array grows. And especially when I run
this in a nested loop, say, when I want to compare two arrays and
remove the differences, it will result in a close to BigO(n^2)
efficiency.
Is there a faster, more efficient way to remove an element from an
array (and preferrably not BigO(n) )?

This code screams for a better data structure. Why aren't you using a hash?
Then all you need to do is
sub removeFromHash {
delete $hash{$_[0]};
}
I would guess this should be somewhere near O(log n) (access to hash table
is not quite linear).
Will it help if I am performinig this on an already sorted array? Then
I can perhaps use some sort of binary search function to find the item
I am looking for?

Well, sure, but why? I thought your problem was deleting, not finding. A
sorted array won't help you with deleting an element.
Use a proper data structure and your problem becomes trivial.

jue
 
J

John Bokma

Jürgen Exner said:
I would guess this should be somewhere near O(log n) (access to hash table
is not quite linear).

Access to a hash table is normally O(1). That's the whole idea behind a
hash :-D.
 
J

Jürgen Exner

Ooopps, that should have been "not quite constant"
Access to a hash table is normally O(1). That's the whole idea behind
a hash :-D.

True, but only as long as the hash is sparsely populated.
As the hash begins to fill up you will get hash conflicts and then you are
loosing O(1).

jue
 
U

Uri Guttman

JE> Ooopps, that should have been "not quite constant"

JE> True, but only as long as the hash is sparsely populated.
JE> As the hash begins to fill up you will get hash conflicts and then you are
JE> loosing O(1).

and perl will grow the hash as needed to keep it efficient. one of the
nice little behind the scenes things. true it is not exactly O(1) but
you can assume that for common uses. the real killer is when it grows so
large that you have to swap :). then you might as well use a real db
which will be more efficient since they are designed to handle this
whereas an in ram hash is not.

uri
 
J

John Bokma

Jürgen Exner said:
Ooopps, that should have been "not quite constant"


True, but only as long as the hash is sparsely populated.
As the hash begins to fill up you will get hash conflicts and then you are
loosing O(1).

Ok, you get O(k), which is still O(1) as long as k << n, which is
hopefully the case :-D.
 
J

John Bokma

Uri said:
and perl will grow the hash as needed to keep it efficient. one of the
nice little behind the scenes things. true it is not exactly O(1) but

The fun part behind the big O notation is that O(k) for k << n, is still
O(1). So "not exactly O(1)" does not exists.
 
T

Tassilo v. Parseval

Also sprach Jürgen Exner:
josh wrote:
Is there a faster, more efficient way to remove an element from an
array (and preferrably not BigO(n) )?

This code screams for a better data structure. Why aren't you using a hash?
Then all you need to do is
sub removeFromHash {
delete $hash{$_[0]};
}
I would guess this should be somewhere near O(log n) (access to hash table
is not quite linear).

Hash-access is O(n) in the absolutely worst-case (for open hashing where
linear lists are used). However, the average case can be considered
constant, too, which has to do with the fact that hashes are resized
(normally their amount of buckets is doubled). So you have a complexity
of O(1 + a/2), where 'a' is the average list-length with a = m/n. 'm' is
the number of keys whereas 'n' is the number of buckets. In case of
Perl, some empirical tests suggest that 'a' is between 0.35 and 0.7.
When treating it as a random variable, I am quite sure that it will have
an expected value of 0.5. The standard deviation, I would guess, is then
probably around 0.1 or even less.

It is therefore reasonable to assume that access to Perl hashes is
constant in all but the contrived cases.

Tassilo
 
T

Tassilo v. Parseval

Also sprach John Bokma:
Jürgen Exner wrote:

Ok, you get O(k), which is still O(1) as long as k << n, which is
hopefully the case :-D.

Why would that be the case? If you have O(ln(n)) and set n to 10^16,
then k is around 37. That should qualify as k << n but still, it is not
constant. It can't be constant as long as it is parametrized with n and
the function is monotonously increasing with n (log is such a function).

Tassilo
 
A

Anno Siegel

Jürgen Exner said:
josh said:
I am trying to find out the most efficient way to remove an element
from an array, and have the array size shrink by one.

pop, push, and splice won't work too well for me, I am trying to
remove an element that could be in any position.

I would doubt that you can hand code anything that is faster than the
build-in function splice().If there were anything faster, then chances are
that algorithm would have been used to implement splice() in the first
place.
Here is my current implementation (may not actually compile, just a
example)

<psuedo perl code>
# this array is purposely unsorted
@array = ['bob', 'alice', 'david', 'christy', 'edward', 'henry',
'frank'];
@array = removeFromArray ( @array, "edward" );

sub removeFromArray {
@array = $_[0];
$name = $_[1];

for ( $i = 0; $i < @$array; $i++ ) {
if ( $array[$i] eq $name ) {
unset $array[$i];
}
}
}
</psuedo perl code>

As you can see, this is not exactly the most efficient way of removing
an element as the size of my array grows. And especially when I run
this in a nested loop, say, when I want to compare two arrays and
remove the differences, it will result in a close to BigO(n^2)
efficiency.
Is there a faster, more efficient way to remove an element from an
array (and preferrably not BigO(n) )?

This code screams for a better data structure. Why aren't you using a hash?
Then all you need to do is
sub removeFromHash {
delete $hash{$_[0]};
}

The OP's pseudo-code (heavy on the pseudo, light on the code) looks like
an attempt to re-implement grep(). In real (but untested) code:

sub remove_from_array {
my ( $array, $name) = @_;
@$array = grep $_ ne $name, @$array;
$array; # return ref to changed array
}

Depending on what the OP means with "unset $array[ $i]", the grep
line could also be

@$array = map { $_ eq $name ? undef : $_ } @$array;

It is probably true that a hash is a better data structure for the
problem -- "remove the differences" is a well-known keyword -- but
"grep" seems to be what OP was aiming at.

Anno
 
U

Uri Guttman

JB> The fun part behind the big O notation is that O(k) for k << n, is
JB> still O(1). So "not exactly O(1)" does not exists.

well, it does in a way. extending hashes and handling dense ones will
cause the behaviour to be mostly O(1) with a touch of O(N) in it. you
have to factor in the cost of extending the hash. so it is some other
O() function thing which is between 1 and N. i am not about to calculate
it. it is not constant as the size of the hash does affect things. but
the effect is not just linear with N. but saying it is close to O(1) is
fine for practical uses.

uri
 
J

John W. Krahn

josh said:
I am trying to find out the most efficient way to remove an element
from an array, and have the array size shrink by one.

pop, push, and splice won't work too well for me, I am trying to
remove an element that could be in any position.

Here is my current implementation (may not actually compile, just a
example)

<psuedo perl code>
# this array is purposely unsorted
@array = ['bob', 'alice', 'david', 'christy', 'edward', 'henry',
'frank'];
@array = removeFromArray ( @array, "edward" );

sub removeFromArray {
@array = $_[0];
$name = $_[1];

for ( $i = 0; $i < @$array; $i++ ) {
if ( $array[$i] eq $name ) {
unset $array[$i];
}
}
}
</psuedo perl code>

As you can see, this is not exactly the most efficient way of removing
an element as the size of my array grows. And especially when I run
this in a nested loop, say, when I want to compare two arrays and
remove the differences, it will result in a close to BigO(n^2)
efficiency.

Is there a faster, more efficient way to remove an element from an
array (and preferrably not BigO(n) )?

Will it help if I am performinig this on an already sorted array? Then
I can perhaps use some sort of binary search function to find the item
I am looking for?

This will work if there is only one of $name in @array:

for my $i ( 0 .. $#array ) {
if ( $array[ $i ] eq $name ) {
splice @array, $i, 1;
last;
}
}

However, if you have multiple $name entries in @array then you need this
instead:

for my $i ( reverse 0 .. $#array ) {
if ( $array[ $i ] eq $name ) {
splice @array, $i, 1;
}
}


John
 
J

John Bokma

Tassilo said:
Also sprach John Bokma:


Why would that be the case?

In the above context it is. k is some constant, not related to n, so k
is not f(n), otherwise I would have written that.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,755
Messages
2,569,536
Members
45,020
Latest member
GenesisGai

Latest Threads

Top