counting number of uniques in a multidimensional array column

Jack

Hi, I have data in a multidimensional array and DON'T want to create
another array representing just one column of it. I want to determine
the number of unique values in a column. I did this easily with a
regular array (code below). Does anyone know how to do this over just
one column of a multidimensional array (in other words, the number of
uniques across one column, defined as $multidim[0][0], $multidim[1][0],
$multidim[2][0], etc.)?

sort @$columnarray;
@out = grep($_ ne $prev && ($prev = $_, 1), @$columnarray);
if ($#out == -1) { $#out = 0; }
$out = $#out +1; # makes $#out of 0 = 1 so it gets counted !
push @distinctcounts, $out;

Thanks!
Jack
 
Paul Lalli

Jack said:
Hi I have data in a multidim array and DONT want to create another
array representing just 1 column from this multidim array..

Why?

I want to
determine the number of uniques, I did this easily with just a regular
array (code below), does anyone know how to do this over just 1 column
of a multidim array (in other words, number of uniques across 1 column
of the multi dim defined as: multidim[0][0],multidim[1][0],
multidim[2][0].... etc)

sort @$columnarray;

This does nothing at all: sort returns the sorted list rather than
sorting in place, and here the result is thrown away. You are clearly
not enabling warnings in your development. Please start doing so.

@out = grep($_ ne $prev && ($prev = $_, 1), @$columnarray);
if ($#out == -1) { $#out = 0; }

"if @out is empty, create one undefined element in @out"

Why? Under what circumstances do you believe @out could ever be empty
from the above code (assuming you had sorted @$columnarray correctly)?
Well, I suppose it could be if your array had nothing but undefined
values in it. Is that the circumstance you were going for?

$out = $#out +1; # makes $#out of 0 = 1 so it gets counted !

Now you're assigning $out to be the size of @out. Why not just use the
size of @out?

push @distinctcounts, $out;

The above code looks remarkably like the first answer to
perldoc -q duplicate

Have you seen the other answers?

Have you considered using map to generate a list of the first "columns"
of each array, and using that as your list rather than @{$columnarray}?

map { $_->[0] } @$columnarray

will give you that.

Paul Lalli
 
Jack

Just ignore the @$ (this represents a variable) - assume the code is
this:
sort @columnarray;
@out = grep($_ ne $prev && ($prev = $_, 1), @columnarray);
if ($#out == -1) { $#out = 0; }
print $out;

Are you saying the above doesn't work? It works great on a single
array. Do you have better code? If so, what is it? Also, can you
please answer the question about how to get the distinct count of a
multidim column with an actual example? Appreciate your response.
Thanks, Jack
 
Paul Lalli

Jack said:
Just ignore the @$ (this represents a variable)

There was no @$ in your original snippet, so ignoring it is a no-op.
There was, however, @$columnarray, which is a perfectly valid array. I
have no idea why you're saying to get rid of it now.

- assume the code is this:
sort @columnarray;

Once again, THIS LINE DOES NOTHING. You still have not bothered to
turn warnings on? Why? You are asking for help, help is being given
to you, and you're ignoring it. That's really very annoying.

@out = grep($_ ne $prev && ($prev = $_, 1), @columnarray);
if ($#out == -1) { $#out = 0; }
print $out;

Are you saying the above doesnt work ??

I did not say that at all. What part of my post implies that the code
doesn't work? I said that the first line of it does nothing at all,
and the messing about with $#out is pointless.

It works great on a single
array. Do you have a better code, if so, what is it ?

Once again, I point you to the other responses in the FAQ that you
apparently saw to get this code:
perldoc -q duplicate
(Or did you never see that FAQ, and are instead just copy/pasting some
other code you found lying around somewhere?)
Once again, why are you ignoring what I've already told you to do,
preferring instead to believe that I'm just not bothering to help?

Also, can you
please answer the question about how to get the distinct count of a
multidim column with an actual example

I *did*! Why are you ignoring my entire response?! I told you
precisely how to change your example to use a list of the first
columns, rather than a single array. The fact that you ignored that
advice is your problem, not mine.

Appreciate your response.

Really doesn't appear that way.

Paul Lalli
 
Jack

Forgive me if I am limited to some degree. I am just asking if someone
can provide some sample code that takes $multidimarray[1][0],
$multidimarray[2][0], ... (a column) and produces a distinct count.

I don't know how to take your suggestion of
map { $_->[0] } @columnarray
and convert it into a solution that counts the distinct entries
for the first column in a multidimensional array.

Would you consider elaborating, or perhaps someone who is willing to
help/share.

Thank you,
Jack
 
xhoster

Jack said:
Hi I have data in a multidim array and DONT want to create another
array representing just 1 column from this multidim array.. I want to
determine the number of uniques, I did this easily with just a regular
array (code below),

I don't know if the code below actually does work, but I will assume it
does.

does anyone know how to do this over just 1 column
of a multidim array (in other words, number of uniques across 1 column
of the multi dim defined as: multidim[0][0],multidim[1][0],
multidim[2][0].... etc)

my $col_number=0; # or whatever column you want
my $columnarray=[map $_->[$col_number], @multidim];

Now proceed as before with $columnarray.
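
For readers following along, here is a minimal sketch of that whole pipeline. It assumes the data is an array of array references named @multidim; the sample rows are made up for illustration. Unlike the original snippet, it keeps the result of sort and initialises the comparison explicitly, so it runs cleanly under strict and warnings:

```perl
use strict;
use warnings;

# Hypothetical sample data: four rows, three columns.
my @multidim = ( [1, 2, 3], [1, 2, 3], [4, 5, 6], [7, 8, 9] );

my $col_number = 0;    # or whatever column you want

# Pull out one column, then sort it. sort returns a new list,
# so the result has to be stored somewhere.
my @column = map { $_->[$col_number] } @multidim;
my @sorted = sort @column;

# Keep an element only when it differs from the one before it.
my $prev;
my @out = grep {
    my $is_new = !defined $prev || $_ ne $prev;
    $prev = $_;
    $is_new;
} @sorted;

my $distinct = @out;    # an array in scalar context gives its size
print "Distinct values in column $col_number: $distinct\n";
```

Because an array evaluated in scalar context already yields its size, no adjustment of $#out is needed.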

Xho
 
Ted Zlatanov

Jack said:
Forgive me if I am limited to some degree. I am just asking if someone
can provide some sample code that works takes $multidimarray[1][0],
$multidimarray[2][0], (a column) and produces a distinct count...

I dont know how to take your suggestion of
map { $_->[0] } @columnarray
and convert that into a solution for that counts the distinct entires
for the first column in a multidimensional array ..

I'll try to help you. Keep in mind that the advice Paul gave was
useful, I'm just restating it and elaborating. Don't feel bad about
missing things here and there, everyone has to start somewhere.

That map call will return the first (0) column of the array as a list.

Your original question was how to find unique elements in a column.

You posted:
sort @$columnarray;
@out = grep($_ ne $prev && ($prev = $_, 1), @$columnarray);
if ($#out == -1) { $#out = 0; }
$out = $#out +1; # makes $#out of 0 = 1 so it gets counted !
push @distinctcounts, $out;

The first line does nothing at all. Paul mentioned that too. Use
warnings and strict mode, if possible, to avoid such code. Sort
*returns* the sorted list; it doesn't modify the array in place.
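
A small sketch of that point, using made-up sample values: the original array is untouched unless the result of sort is assigned back to it.

```perl
use strict;
use warnings;

my @values = qw(pear apple fig apple);    # illustrative data only

# sort does not modify its argument; it returns a new, sorted list.
my @sorted = sort @values;

print "original: @values\n";    # still in the original order
print "sorted:   @sorted\n";

# To sort "in place", assign the result back to the same array.
@values = sort @values;
print "after assignment: @values\n";
```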

In addition, your 'uniques' code is not very good. It only removes
duplicates that sit next to each other, so it depends on the list
really being sorted; you should use a hash instead. Look at 'perldoc -q
duplicate' and 'perldoc perldata' to get started. Actually all of
the perldoc info is good :)

Here's a (very simple) function to give you the unique items from a
list you pass:

sub uniques
{
    my %unique = ();
    $unique{$_}++ foreach @_;
    return keys %unique;
}

Now use it like this:

my @columnarray = ( [1,2,3], [1,2,3], [4,5,6], [7,8,9], );

foreach my $column (1 .. scalar @{$columnarray[0]})
{
    print "Unique elements in column $column: ";
    print join ', ',
        uniques(map { $_->[$column-1] }
                @columnarray
        );
    print "\n";
}

I formatted this to be easy to understand, and I tested it with the
data above under

use warnings;
use strict;

and it worked correctly. Please learn from the code posted above - it
shows many useful techniques.
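
The loop above prints the unique elements; the original question asked for the number of them. A possible adaptation, reusing the same uniques function and sample data and collecting one count per column into @distinctcounts (the name used in the first post), might look like this:

```perl
use strict;
use warnings;

sub uniques {
    my %unique;
    $unique{$_}++ foreach @_;
    return keys %unique;
}

my @columnarray = ( [1, 2, 3], [1, 2, 3], [4, 5, 6], [7, 8, 9] );

my @distinctcounts;
foreach my $column ( 0 .. $#{ $columnarray[0] } ) {
    # Assigning the list to an array first makes the count unambiguous.
    my @found = uniques( map { $_->[$column] } @columnarray );
    push @distinctcounts, scalar @found;
}

print "Distinct counts per column: @distinctcounts\n";
```

This sketch assumes every row has the same number of columns as the first row.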

Ted
 
Paul Lalli

Forgive me if I am limited to some degree.

Being new to Perl is not something that requires forgiveness. Being
unwilling to put forth effort of your own, and only accepting solutions
that are spoonfed to you, is not worthy of forgiveness.

I am just asking if someone can provide some sample code

I know exactly what you're asking. I have answered it 3 times now.
The answer is "No, I will not write code for you. I will, however,
give you all the information you need to do it yourself." If that's
not good enough for you, I strongly suggest you hire a consultant.

that works takes $multidimarray[1][0],
$multidimarray[2][0], (a column) and produces a distinct count...

I dont know how to take your suggestion of
map { $_->[0] } @columnarray
and convert that into a solution for that counts the distinct entires
for the first column in a multidimensional array ..

I told you to take that expression, and operate on that, rather than on
@columnarray itself. What part of that is confusing to you?

Take that expression right there, and put that where you currently have
'@columnarray' in the first quoted line of this message.

Would you consider elaborating, or perhaps someone who is willing to
help/share.

Implying that I am *not* willing to help or share? You have a very
bizarre definition of "help".

*PLONK*

Paul Lalli
 
Mumia W.

Paul said:
[ snipped ]
[...]
I dont know how to take your suggestion of
map { $_->[0] } @columnarray
and convert that into a solution for that counts the distinct entires
for the first column in a multidimensional array ..

Would you consider elaborating, or perhaps someone who is willing to
help/share.

Thank you,
Jack

Paul Lalli gave you half of the answer. You're supposed to
figure out the other half. The other half is storing the data
in a hash where the keys are the column data returned from the
map, and the values are incremented once for each entry in the
column.

Hashes have a "magical" quality that makes their keys unique.
Using a hash, you can count the number of unique items in an
array, because each key in a hash appears only once.

1: use Data::Dumper;
2: my @temps = (30, 38, 26, 38, 39);
3: my %hash;
4: for my $tp (@temps) { $hash{$tp} += 1 }
5: print Dumper(\%hash);

Line 4 increments a hash value each time it's found[0] in the
array. Notice that 38 only appears once in the hash, despite
the fact that it appears twice in @temps.
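
To turn that into a distinct count, one option is to evaluate keys in scalar context, which returns the number of keys. A short sketch using the same temperatures:

```perl
use strict;
use warnings;

my @temps = (30, 38, 26, 38, 39);

my %hash;
$hash{$_}++ for @temps;

# keys in scalar context returns the number of keys in the hash.
my $distinct = keys %hash;

print "$distinct distinct temperatures\n";
print "38 was seen $hash{38} times\n";
```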


:-O UNTESTED CODE :-O
 
Jack

Ted, great job that works killer... can you tell me, I want to exclude
from the counting any null values, I tried adding this without
success..any reply would be appreciated..thanks, Jack

sub uniques
{
my %unique = ();
if (@_ != /^\z/) { $unique{$_}++ foreach @_ } ;
return keys %unique;
}
 
axel

Jack said:
sort @columnarray;
@out = grep($_ ne $prev && ($prev = $_, 1), @columnarray);
if ($#out == -1) { $#out = 0; }
print $out;

Are you saying the above doesnt work ?? It works great on a single
array. Do you have a better code, if so, what is it ?

It doesn't work, even if the last line is a typo for

print $#out;

With

my @columnarray = qw(a b c d b e c);

it results in:

Useless use of sort in void context at q1.pl line 11.
Use of uninitialized value in string ne at q1.pl line 12.
6

Also, can you
please answer the question about how to get the distinct count of a
multidim column with an actual example. Appreciate your response.

People on this group do not take kindly to being told to answer
questions or write code.

Axel
 
DJ Stunks

Jack said:
Ted, great job that works killer... can you tell me, I want to exclude
from the counting any null values, I tried adding this without
success..any reply would be appreciated..thanks, Jack

sub uniques
{
my %unique = ();
if (@_ != /^\z/) { $unique{$_}++ foreach @_ } ;

1) != only operates on scalars; thus
2) the array @_ is forced into scalar context; and
3) an array evaluated in scalar context yields the count of
the number of elements in the array; but
4) with != you mistyped the negated binding operator (!~); therefore
5) /^\z/ attempts to match against whatever is
currently contained in $_; and
6) if the return value for this test (1 or 0) is not equal to the
number of elements in @_ (likely > 1); then
7) the block will be evaluated

return keys %unique;
}

if only there were some way to test the value
foreach element of the array...
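
One way to do that, sketched here as an assumption about what was being hinted at, is to move the test inside the loop so that it runs once per element. The sample arguments are invented for the demonstration:

```perl
use strict;
use warnings;

sub uniques {
    my %unique;
    foreach my $item (@_) {
        # Test each element on its own; skip undef and the empty string.
        next unless defined $item && length $item;
        $unique{$item}++;
    }
    return keys %unique;
}

my @found = uniques( 'a', '', 'b', undef, 'a' );
@found = sort @found;    # hash key order is not predictable, so sort
print "@found\n";        # prints "a b"
```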

-jp
 
Tad McClellan

Paul Lalli said:
Jack wrote:
sort @$columnarray;

This does nothing at all.


It is useful only during the winter.

If you keep your tower under your desk, it will help to keep your feet warm!
 
Tad McClellan

Jack said:
I want to exclude
from the counting any null values,
sub uniques
{
my %unique = ();
if (@_ != /^\z/) { $unique{$_}++ foreach @_ } ;
return keys %unique;


return grep length, keys %unique;

or, since there can only be one anyway:

delete $unique{''};
return keys %unique;
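
For comparison, here are both suggestions written out as complete functions. The names uniques_grep and uniques_delete, and the sample data, are hypothetical and only serve to show the two variants side by side:

```perl
use strict;
use warnings;

# Variant 1: filter out the empty-string key on the way out.
sub uniques_grep {
    my %unique;
    $unique{$_}++ foreach @_;
    return grep { length } keys %unique;
}

# Variant 2: remove the single possible empty-string key first.
sub uniques_delete {
    my %unique;
    $unique{$_}++ foreach @_;
    delete $unique{''};
    return keys %unique;
}

my @data       = ( 'x', '', 'y', '', 'x' );
my @via_grep   = uniques_grep(@data);
my @via_delete = uniques_delete(@data);

print scalar(@via_grep), " ", scalar(@via_delete), "\n";    # prints "2 2"
```

The second variant works because hash keys are unique, so the empty string can appear as a key at most once.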
 
Ted Zlatanov

Ted, great job that works killer... can you tell me, I want to exclude
from the counting any null values, I tried adding this without
success..any reply would be appreciated..thanks, Jack

sub uniques
{
my %unique = ();
if (@_ != /^\z/) { $unique{$_}++ foreach @_ } ;
return keys %unique;
}

Tad's solution is great, but I just wanted to clarify something. When
you say "null" that actually doesn't mean anything in Perl. Perl
calls undefined values "undef" - this is different from NULL in
C/C++. There are rules about undef and how it's converted to a string
or numeric context, but what's important is to realize that what
you're filtering above is the empty string "", and that's why Tad used
length() in his test.

Interestingly, length(undef) is also 0, which makes Tad's length()
test eliminate undef values as well. For extra credit and fun, figure
out why length(undef) is 0 - you'll learn about the rules I mentioned
above, and you'll be a better Perl programmer for it.
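
A small sketch of what happens to undef in the hash-based approach. Hash keys are always strings, so an undef element is stored under the empty-string key (with a warning under "use warnings"), and keys never returns undef. Note also that from Perl 5.12 onward length(undef) returns undef rather than 0; both are false, so the length test behaves the same either way. The sample list is made up:

```perl
use strict;
use warnings;

my %unique;
{
    # Using undef as a hash key stringifies it to '' and warns,
    # so that one warning is silenced for this demonstration.
    no warnings 'uninitialized';
    $unique{$_}++ foreach ( 'a', undef, '', 'a' );
}

my @all_keys = sort keys %unique;               # '' and 'a'
my @nonempty = grep { length } keys %unique;    # just 'a'

print scalar(@all_keys), " keys in total, ", scalar(@nonempty), " non-empty\n";
print "empty-string key seen $unique{''} times\n";
```

Here undef and the empty string end up sharing one key, which is why that key's count is 2.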

Also, it's good that you posted what you tried even though it didn't
work. People on this newsgroup are very, very helpful when they see
you've tried something on your own. They generally dislike open-ended
questions with vague requirements. This is why you got a good response
from Tad. There are some posting guidelines (look them up on Google
News) posted here regularly, which explain this and more.

Good luck :)

Ted
 
Ben Morrow

Quoth Ted Zlatanov said:
Interestingly, length(undef) is also 0, which makes Tad's length()
test eliminate undef values as well. For extra credit and fun, figure
out why length(undef) is 0

.... with a warning...

Ben
 
Jack

Ted - this is excellent stuff. How exactly can I capture, in a
variable, an example of 2 elements that are duplicates of each other
from this code?

thanks again,

Jack
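
The thread ends without a reply to this. One possible approach, assuming "duplicate" here means a value that occurs more than once in a column, is to keep the counts that the hash already records and select the keys whose count exceeds 1. The variable names below are illustrative and the data is the sample from earlier in the thread:

```perl
use strict;
use warnings;

my @columnarray = ( [1, 2, 3], [1, 2, 3], [4, 5, 6], [7, 8, 9] );
my $column = 0;    # first column

# Count how often each value appears in the chosen column.
my %count;
$count{ $_->[$column] }++ foreach @columnarray;

# A value is a duplicate if it was seen more than once.
my @duplicates = grep { $count{$_} > 1 } sort keys %count;

print "Values seen more than once in column $column: @duplicates\n";
```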
 
