Effective perl function to remove one element from array?

josh · May 30, 2004

Hi all:

I am trying to find out the most efficient way to remove an element
from an array, and have the array size shrink by one.

pop, push, and splice won't work too well for me, I am trying to
remove an element that could be in any position.

Here is my current implementation (may not actually compile, just a
example)

<psuedo perl code>
# this array is purposely unsorted
@array = ['bob', 'alice', 'david', 'christy', 'edward', 'henry',
'frank'];
@array = removeFromArray ( @array, "edward" );

sub removeFromArray {
@array = $_[0];
$name = $_[1];

for ( $i = 0; $i < @$array; $i++ ) {
if ( $array[$i] eq $name ) {
unset $array[$i];
}
}
}
</psuedo perl code>

As you can see, this is not exactly the most efficient way of removing
an element as the size of my array grows. And especially when I run
this in a nested loop, say, when I want to compare two arrays and
remove the differences, it will result in a close to BigO(n^2)
efficiency.

Is there a faster, more efficient way to remove an element from an
array (and preferrably not BigO(n) )?

Will it help if I am performinig this on an already sorted array? Then
I can perhaps use some sort of binary search function to find the item
I am looking for?

Matt Garrish · May 30, 2004

josh said:
Hi all:

I am trying to find out the most efficient way to remove an element
from an array, and have the array size shrink by one.

pop, push, and splice won't work too well for me, I am trying to
remove an element that could be in any position.

Here is my current implementation (may not actually compile, just a
example)

<psuedo perl code>
# this array is purposely unsorted
@array = ['bob', 'alice', 'david', 'christy', 'edward', 'henry',
'frank'];
@array = removeFromArray ( @array, "edward" );

sub removeFromArray {
@array = $_[0];
$name = $_[1];

for ( $i = 0; $i < @$array; $i++ ) {
if ( $array[$i] eq $name ) {
unset $array[$i];
}
}
}
</psuedo perl code>

As you can see, this is not exactly the most efficient way of removing
an element as the size of my array grows.

Which begs the question, why use an array when a hash would work better?

my %hash = (bob => 1, alice => 1, david => 1, christy =>1, edward => 1,
henry => 1);

delete $hash{'edward'};

It would also make removing duplicates simple because you could just assign
all the keys from your existing hashes to a new hash. And sorting would be
faster, too.

Matt

Jürgen Exner · May 30, 2004

josh said:
I am trying to find out the most efficient way to remove an element
from an array, and have the array size shrink by one.

pop, push, and splice won't work too well for me, I am trying to
remove an element that could be in any position.

I would doubt that you can hand code anything that is faster than the
build-in function splice().If there were anything faster, then chances are
that algorithm would have been used to implement splice() in the first
place.

Here is my current implementation (may not actually compile, just a
example)

<psuedo perl code>
# this array is purposely unsorted
@array = ['bob', 'alice', 'david', 'christy', 'edward', 'henry',
'frank'];
@array = removeFromArray ( @array, "edward" );

sub removeFromArray {
@array = $_[0];
$name = $_[1];

for ( $i = 0; $i < @$array; $i++ ) {
if ( $array[$i] eq $name ) {
unset $array[$i];
}
}
}
</psuedo perl code>

As you can see, this is not exactly the most efficient way of removing
an element as the size of my array grows. And especially when I run
this in a nested loop, say, when I want to compare two arrays and
remove the differences, it will result in a close to BigO(n^2)
efficiency.
Is there a faster, more efficient way to remove an element from an
array (and preferrably not BigO(n) )?

This code screams for a better data structure. Why aren't you using a hash?
Then all you need to do is
sub removeFromHash {
delete $hash{$_[0]};
}
I would guess this should be somewhere near O(log n) (access to hash table
is not quite linear).

Will it help if I am performinig this on an already sorted array? Then
I can perhaps use some sort of binary search function to find the item
I am looking for?

Well, sure, but why? I thought your problem was deleting, not finding. A
sorted array won't help you with deleting an element.
Use a proper data structure and your problem becomes trivial.

jue

John Bokma · May 30, 2004

Jürgen Exner said:
I would guess this should be somewhere near O(log n) (access to hash table
is not quite linear).

Access to a hash table is normally O(1). That's the whole idea behind a
hash :-D.

Jürgen Exner · May 30, 2004

Ooopps, that should have been "not quite constant"

Access to a hash table is normally O(1). That's the whole idea behind
a hash :-D.

True, but only as long as the hash is sparsely populated.
As the hash begins to fill up you will get hash conflicts and then you are
loosing O(1).

jue

Uri Guttman · May 30, 2004

JE> Ooopps, that should have been "not quite constant"

JE> True, but only as long as the hash is sparsely populated.
JE> As the hash begins to fill up you will get hash conflicts and then you are
JE> loosing O(1).

and perl will grow the hash as needed to keep it efficient. one of the
nice little behind the scenes things. true it is not exactly O(1) but
you can assume that for common uses. the real killer is when it grows so
large that you have to swap

. then you might as well use a real db
which will be more efficient since they are designed to handle this
whereas an in ram hash is not.

uri

John Bokma · May 30, 2004

Jürgen Exner said:
Ooopps, that should have been "not quite constant"

True, but only as long as the hash is sparsely populated.
As the hash begins to fill up you will get hash conflicts and then you are
loosing O(1).

Ok, you get O(k), which is still O(1) as long as k << n, which is
hopefully the case :-D.

John Bokma · May 30, 2004

Uri said:
and perl will grow the hash as needed to keep it efficient. one of the
nice little behind the scenes things. true it is not exactly O(1) but

The fun part behind the big O notation is that O(k) for k << n, is still
O(1). So "not exactly O(1)" does not exists.

Tassilo v. Parseval · May 30, 2004

Also sprach Jürgen Exner:

josh wrote:

Is there a faster, more efficient way to remove an element from an
array (and preferrably not BigO(n) )?

Click to expand...

This code screams for a better data structure. Why aren't you using a hash?
Then all you need to do is
sub removeFromHash {
delete $hash{$_[0]};
}
I would guess this should be somewhere near O(log n) (access to hash table
is not quite linear).

Hash-access is O(n) in the absolutely worst-case (for open hashing where
linear lists are used). However, the average case can be considered
constant, too, which has to do with the fact that hashes are resized
(normally their amount of buckets is doubled). So you have a complexity
of O(1 + a/2), where 'a' is the average list-length with a = m/n. 'm' is
the number of keys whereas 'n' is the number of buckets. In case of
Perl, some empirical tests suggest that 'a' is between 0.35 and 0.7.
When treating it as a random variable, I am quite sure that it will have
an expected value of 0.5. The standard deviation, I would guess, is then
probably around 0.1 or even less.

It is therefore reasonable to assume that access to Perl hashes is
constant in all but the contrived cases.

Tassilo

Tassilo v. Parseval · May 30, 2004

Also sprach John Bokma:

Jürgen Exner wrote:

Ok, you get O(k), which is still O(1) as long as k << n, which is
hopefully the case :-D.

Why would that be the case? If you have O(ln(n)) and set n to 10^16,
then k is around 37. That should qualify as k << n but still, it is not
constant. It can't be constant as long as it is parametrized with n and
the function is monotonously increasing with n (log is such a function).

Tassilo

Anno Siegel · May 30, 2004

Jürgen Exner said:
josh said:

I am trying to find out the most efficient way to remove an element
from an array, and have the array size shrink by one.

pop, push, and splice won't work too well for me, I am trying to
remove an element that could be in any position.

Click to expand...

I would doubt that you can hand code anything that is faster than the
build-in function splice().If there were anything faster, then chances are
that algorithm would have been used to implement splice() in the first
place.

Here is my current implementation (may not actually compile, just a
example)

<psuedo perl code>
# this array is purposely unsorted
@array = ['bob', 'alice', 'david', 'christy', 'edward', 'henry',
'frank'];
@array = removeFromArray ( @array, "edward" );

sub removeFromArray {
@array = $_[0];
$name = $_[1];

for ( $i = 0; $i < @$array; $i++ ) {
if ( $array[$i] eq $name ) {
unset $array[$i];
}
}
}
</psuedo perl code>

As you can see, this is not exactly the most efficient way of removing
an element as the size of my array grows. And especially when I run
this in a nested loop, say, when I want to compare two arrays and
remove the differences, it will result in a close to BigO(n^2)
efficiency.
Is there a faster, more efficient way to remove an element from an
array (and preferrably not BigO(n) )?

Click to expand...

This code screams for a better data structure. Why aren't you using a hash?
Then all you need to do is
sub removeFromHash {
delete $hash{$_[0]};
}

The OP's pseudo-code (heavy on the pseudo, light on the code) looks like
an attempt to re-implement grep(). In real (but untested) code:

sub remove_from_array {
my ( $array, $name) = @_;
@$array = grep $_ ne $name, @$array;
$array; # return ref to changed array
}

Depending on what the OP means with "unset $array[ $i]", the grep
line could also be

@$array = map { $_ eq $name ? undef : $_ } @$array;

It is probably true that a hash is a better data structure for the
problem -- "remove the differences" is a well-known keyword -- but
"grep" seems to be what OP was aiming at.

Anno

Uri Guttman · May 30, 2004

JB> The fun part behind the big O notation is that O(k) for k << n, is
JB> still O(1). So "not exactly O(1)" does not exists.

well, it does in a way. extending hashes and handling dense ones will
cause the behaviour to be mostly O(1) with a touch of O(N) in it. you
have to factor in the cost of extending the hash. so it is some other
O() function thing which is between 1 and N. i am not about to calculate
it. it is not constant as the size of the hash does affect things. but
the effect is not just linear with N. but saying it is close to O(1) is
fine for practical uses.

uri

John W. Krahn · May 30, 2004

josh said:
I am trying to find out the most efficient way to remove an element
from an array, and have the array size shrink by one.

pop, push, and splice won't work too well for me, I am trying to
remove an element that could be in any position.

Here is my current implementation (may not actually compile, just a
example)

<psuedo perl code>
# this array is purposely unsorted
@array = ['bob', 'alice', 'david', 'christy', 'edward', 'henry',
'frank'];
@array = removeFromArray ( @array, "edward" );

sub removeFromArray {
@array = $_[0];
$name = $_[1];

for ( $i = 0; $i < @$array; $i++ ) {
if ( $array[$i] eq $name ) {
unset $array[$i];
}
}
}
</psuedo perl code>

As you can see, this is not exactly the most efficient way of removing
an element as the size of my array grows. And especially when I run
this in a nested loop, say, when I want to compare two arrays and
remove the differences, it will result in a close to BigO(n^2)
efficiency.

Is there a faster, more efficient way to remove an element from an
array (and preferrably not BigO(n) )?

Will it help if I am performinig this on an already sorted array? Then
I can perhaps use some sort of binary search function to find the item
I am looking for?

This will work if there is only one of $name in @array:

for my $i ( 0 .. $#array ) {
if ( $array[ $i ] eq $name ) {
splice @array, $i, 1;
last;
}
}

However, if you have multiple $name entries in @array then you need this
instead:

for my $i ( reverse 0 .. $#array ) {
if ( $array[ $i ] eq $name ) {
splice @array, $i, 1;
}
}

John

John Bokma · Jun 6, 2004

Tassilo said:
Also sprach John Bokma:

Why would that be the case?

In the above context it is. k is some constant, not related to n, so k
is not f(n), otherwise I would have written that.

Program to find the largest integer element of an array.	1	Mar 2, 2022
Accessing array index addresses with custom datatype in a function	0	Jun 2, 2022
Remove a specific element from an Array	14	Jul 9, 2007
No more than N element of an array	12	Jul 25, 2013
How to remove an empty line which is created when i deleted a element from my xml file?	0	Oct 1, 2016
Perl array bug?	30	Aug 19, 2010
size of an array via a ref to it	3	Feb 23, 2012
matching array element to hash key	2	Sep 17, 2009

Effective perl function to remove one element from array?

josh

Matt Garrish

Jürgen Exner

John Bokma

Jürgen Exner

Uri Guttman

John Bokma

John Bokma

Tassilo v. Parseval

Tassilo v. Parseval

Anno Siegel

Uri Guttman

John W. Krahn

John Bokma

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads