finding common words

Anno Siegel

Tore Aursand said:
I know this approach, but I also know that one shouldn't use 'map' in void
context.

This is a contested issue. In particular, there is a patch in the pipeline
so that "map" in scalar context no longer builds a list. A pity Abigail
seems to have left us, she would have had something to say on the subject :)

However, this one is easily repaired. Either don't call "map" in void
context:

@common = map { exists $ListB{ $_} ? $_ : () } keys %ListA;

Or use "for" instead:

exists $ListB{ $_} and push @common, $_ for keys %ListA;

Both are probably best expressed as

@common = grep exists $ListB{ $_}, keys %ListA;

which we already know.
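
For anyone who wants to try the three variants side by side, here is a minimal self-contained sketch. The sample hashes are the ones posted elsewhere in this thread; the names @keys, @via_map, @via_for and @via_grep are made up for the illustration, and the keys are sorted only so the printed order is stable.

```perl
use strict;
use warnings;

# Sample data as posted elsewhere in this thread.
my %ListA = ( apple => 1.1, banana => 2.2, cat => 3.3 );
my %ListB = ( apple => 100, boy => 500, cat => 1000 );

# Sorted so that all three results come out in the same order.
my @keys = sort keys %ListA;

# 1. map, returning the empty list for keys that are not in %ListB
my @via_map = map { exists $ListB{$_} ? $_ : () } @keys;

# 2. the "for" statement modifier with an explicit push
my @via_for;
exists $ListB{$_} and push @via_for, $_ for @keys;

# 3. grep, which expresses the filtering directly
my @via_grep = grep { exists $ListB{$_} } @keys;

print "map:  @via_map\n";
print "for:  @via_for\n";
print "grep: @via_grep\n";
```

All three lists should hold the same keys; the grep form simply says "filter" without the detour.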

Anno
 
Uri Guttman

AS> This is a contested issue. In particular, there is a patch in the
AS> pipeline so that "map" in scalar context no longer builds a list.
AS> A pity Abigail seems to have left us, she would have had something
AS> to say on the subject :)

well, i say it is more a semantic issue than an efficiency issue. but
abigail disagrees with that as well. i like to not mislead readers of my
code with red herrings like a map with no returned value. the for
modifier is meant for that.

AS> However, this one is easily repaired. Either don't call "map" in void
AS> context:

AS> @common = map { exists $ListB{ $_} ? $_ : () } keys %ListA;

and that is just grep implemented in map :)

AS> Both are probably best expressed as

AS> @common = grep exists $ListB{ $_}, keys %ListA;

AS> which we already know.

yep.

uri
 
Anno Siegel

Uri Guttman said:
AS> This is a contested issue. In particular, there is a patch in the
AS> pipeline so that "map" in scalar context no longer builds a list.
AS> A pity Abigail seems to have left us, she would have had something
AS> to say on the subject :)

well, i say it is more a semantic issue than an efficiency issue. but
abigail disagrees with that as well. i like to not mislead readers of my
code with red herrings like a map with no returned value. the for
modifier is meant for that.

In any case, the efficiency point is moot after the patch.

I know better than to try to persuade you by observing that the original
semantics of "map" in Lisp did *not* include a return value, you needed
"maplist" for that. You'd rightly reply that this is Perl, not Lisp.
(And I know what you think of Lisp :)

I wouldn't object to map in void context if I could see a need for
it. The only obvious thing "for" can't do is "map"s way of accepting
a block in the indirect object slot. That can be repaired by putting
"do" in front of the block (for the "for"-version), though I prefer to
do it differently if possible.
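
A minimal sketch of the "do" trick mentioned above, assuming a made-up word list and counter hash (@words and %count are names invented for the illustration). A bare block cannot take a statement modifier, but a do-block is an expression, so it can:

```perl
use strict;
use warnings;

# Hypothetical sample data for the illustration.
my @words = qw(Apple boy apple cat);
my %count;

# A multi-statement body driven by the "for" modifier.
# No list is built and none is expected by the reader.
do {
    my $word = lc $_;
    $count{$word}++;
} for @words;

print "$_ => $count{$_}\n" for sort keys %count;
```

Whether this reads better than an ordinary foreach loop with a block is a matter of taste.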

Anno
 
Gunnar Hjalmarsson

Anno said:
As it seems, there are no original hashes, that was an artifact of
the first reply in the thread.

Yeah, that's my 'fault'. Since OP's initial post didn't provide any
info about the file structure, I put the example data into two hashes
to get something to work with.
Assuming filehandles $listA and $listB opened (and $/ set to ',',
IIRC), I'd do the obvious (untested):

my %listA;
while ( <$listA> ) {
    chomp; # non-standard $/, split won't do the job
    my( $key, $val) = split;
    $listA{ $key} = $val;
}

my %new_listB;
while ( <$listB> ) {
    chomp;
    my ( $key, $val) = split;
    $new_listB{ $key} = $val if exists $listA{ $key};
}

Note that if you compare that with my initial suggestion, the
difference as regards memory consumption is small. They both involve
two hashes with word lists.

When posting the initial unsophisticated solution, I sure did not
anticipate this long thread. Guess I should have known better. ;-)

Wonder what OP is doing, btw. Last reported being "reading some Perl
introductory books".
 
Tassilo v. Parseval

Also sprach Anno Siegel:
This is a contested issue. In particular, there is a patch in the pipeline
so that "map" in scalar context no longer builds a list. A pity Abigail
seems to have left us, she would have had something to say on the subject :)

When I submitted this patch a couple of months ago, I expected it to go
into blead. But as I see right now, it already went into 5.8.1. That
means that the map-returns-list-in-void-context argument no longer
counts for recent perls.

Tassilo
 
Anno Siegel

Tassilo v. Parseval said:
Also sprach Anno Siegel:


When I submitted this patch a couple of months ago, I expected it to go
into blead. But as I see right now, it already went into 5.8.1. That
means that the map-returns-list-in-void-context argument no longer
counts for recent perls.

Ah, that's good to know. I didn't notice yet. Does it in fact (as I
inadvertently claimed) optimize any scalar map, or only void context?

Anno
 
Uri Guttman

TvP> When I submitted this patch a couple of months ago, I expected it to go
TvP> into blead. But as I see right now, it already went into 5.8.1. That
TvP> means that the map-returns-list-in-void-context argument no longer
TvP> counts for recent perls.

i disagree. see my other post and anno's response.

uri
 
Anno Siegel

Uri Guttman said:
TvP> When I submitted this patch a couple of months ago, I expected it to go
TvP> into blead. But as I see right now, it already went into 5.8.1. That
TvP> means that the map-returns-list-in-void-context argument no longer
TvP> counts for recent perls.

i disagree. see my other post and anno's response.

What's to disagree about, Uri? Tassilo doesn't claim other arguments
don't count. The map-returns-list-in-void-context one certainly doesn't
anymore, but that was never your main case, if I understand you.

Anno
 
Uri Guttman

TvP> When I submitted this patch a couple of months ago, I expected it to go
TvP> into blead. But as I see right now, it already went into 5.8.1. That
TvP> means that the map-returns-list-in-void-context argument no longer
TvP> counts for recent perls.
AS> What's to disagree about, Uri? Tassilo doesn't claim other
AS> arguments don't count. The map-returns-list-in-void-context one
AS> certainly doesn't anymore, but that was never your main case, if I
AS> understand you.

i was never really upset about the efficiency issue which has been
changed. i didn't like the semantic problem of throwing away the list
that map is supposed to make. now the docs will say there is no
generated list in void context so that may be harder to argue. but i
still say that map tells the reader a list is being made so look for
it. for modifier doesn't make such a list. someone (possibly abigail)
also pointed out a subtle difference between map and for, map provides
list context and for provides void context to their expression. that can
be worked around IMO.
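
The context difference can be made visible with wantarray. This is only a sketch: note_context is a hypothetical helper written for the illustration, and it relies on perlfunc's statement that map evaluates its block in list context, while a statement under the for modifier runs in void context.

```perl
use strict;
use warnings;

my @seen;

# Hypothetical helper: records the context it was called in.
sub note_context {
    push @seen, wantarray ? 'list' : defined(wantarray) ? 'scalar' : 'void';
    return;
}

# map is itself in void context here, yet its block is called in list context.
map { note_context() } 1;

# The statement governed by the "for" modifier is called in void context.
note_context() for 1;

print "inside map: $seen[0]\n";
print "inside for: $seen[1]\n";
```

So a function that behaves differently depending on context may do something else when a void map is rewritten as a for modifier, or the other way round.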

i just teach that you code to reflect your design and not just implement
it. using map in void context is an implementation that generally
doesn't reflect your design IMO.

but we should end this thread now. i won't convert any others here and i
won't change my mind on why i don't like it.

uri
 
Tassilo v. Parseval

Also sprach Anno Siegel:
Ah, that's good to know. I didn't notice yet. Does it in fact (as I
inadvertently claimed) optimize any scalar map, or only void context?

Only void context. The optimization for scalar context is a little
more work than the one for void context. Might look into it
tomorrow. My one-minute attempt right now didn't produce the desired
results.

Tassilo
 
Matt Garrish

Uri Guttman said:
i just teach that you code to reflect your design and not just implement
it. using map in void context is an implementation that generally
doesn't reflect your design IMO.

I agree with you completely. I've never actually used a map in void context
in any of my scripts, as there have always been better ways to write the
code. I can see the temptation to get a little map-happy in the old noggin
in efforts to reduce one's code (or obfuscate), but I wouldn't advocate it
as a replacement for a for loop in cases where it's not warranted. I was
only offering a possible (granted ugly) solution on using map to build the
array.

Matt
 
Anno Siegel

Tassilo v. Parseval said:
Also sprach Anno Siegel:



Only void context. The optimization for scalar context is a little
more work than the one for void context.

I'm sure it is, that's why I asked.

Might look into it tomorrow. My one-minute attempt right now didn't
produce the desired results.

Thanks for the effort!

Anno
 
Bill Smith

my %ListA = ( apple => 1.1, banana => 2.2, cat => 3.3 );
my %ListB = ( apple => 100, boy => 500, cat => 1000 );
my @common = grep { $ListB{$_} } keys %ListA;
for ( sort @common ) {
    print "$_\t$ListA{$_}\t$_\t$ListB{$_}\n";
}

Anything wrong with this one? I thought about using 'map' for the 'grep'
part above, but couldn't find a nice way to have it _not_ return anything
when there's no match, ie;

my @common = map { (exists $ListB{$_}) ? $_ : undef } keys %ListA;
foreach ( sort @common ) {
    next unless defined;
    # ...
}

Comments/suggestions/corrections are appreciated!

I think that you have to tell "map" to return a null list when
there is no match.

print map +(exists $ListB{$_}) ?
    "$_\t$ListA{$_}\t$_\t$ListB{$_}\n" : (),
    sort keys %ListA;

This is not my idea of "easy to read", but it does solve your problem.

Bill
 
Tassilo v. Parseval

Also sprach Anno Siegel:
I'm sure it is, that's why I asked.

Meanwhile, I have figured out why scalar-map performs more poorly than
necessary. Since it goes through the same branch as list-map, it
unnecessarily makes mortal copies of the values returned. This is
wasteful and therefore can be removed.

This gives an interesting result: scalar-map performs better than
void-map:

ethan@ethan:~/Projects/perls/perl-p-5.8.0@22352$ ./perl ~/map.pl
         Rate   list   void scalar
list   4.74/s     --   -57%   -60%
void   11.1/s   134%     --    -6%
scalar 11.8/s   148%     6%     --

as opposed to the non-optimized behaviour:

ethan@ethan:~/Projects/perls/perl-p-5.8.0@22352$ ../installed-perls/perl/pOWn8jv/perl-5.8.0\@22352/bin/perl ~/map.pl
         Rate   list scalar   void
list   4.72/s     --   -33%   -57%
scalar 7.04/s    49%     --   -36%
void   11.0/s   133%    57%     --

The advantage of scalar is not statistical noise. It is consistently
better than void, by 6% to 8%.

The code:

use Benchmark qw/cmpthese/;

cmpthese(100, {
    list   => sub { my @a = map $_+1, 1 .. 100000 },
    scalar => sub { my $a = map $_+1, 1 .. 100000 },
    void   => sub { map $_+1, 1 .. 100000 },
});

Since scalar-map runs through considerably more C code in pp_mapwhile
than void-map does, it makes me wonder whether the void case is really
optimally optimized.

Anyway, time to submit the patch now.

Tassilo
 
Hunter Johnson

Uri Guttman said:
i was never really upset about the efficiency issue which has been
changed. i didn't like the semantic problem of throwing away the list
that map is supposed to make. now the docs will say there is no
generated list in void context so that may be harder to argue. but i
still say that map tells the reader a list is being made so look for
it.

If the docs say there is no list generated, how is map going to tell
the reader that the docs are wrong?

If a reader reads something extra into what's written (like "here
comes a list" when none is coming), how is that more the writer's
problem than the reader's? Either the writer can write it differently
(using 'for' in this case) or the reader can read it differently. I
don't see how the former is the only right answer.

(And since she's been mentioned here several times, here's a link to
Abigail's post on the topic, from November: http://tinyurl.com/2po9x )

Hunter
 
Uri Guttman

HJ> If the docs say there is no list generated, how is map going to tell
HJ> the reader that the docs are wrong?

HJ> If a reader reads something extra into what's written (like "here
HJ> comes a list" when none is coming), how is that more the writer's
HJ> problem than the reader's? Either the writer can write it
HJ> differently (using 'for' in this case) or the reader can read it
HJ> differently. I don't see how the former is the only right answer.

it is a semantic communication to the reader of the code. it is your
(the coder's) responsibility to convey as much accurate information to
the reader as possible. map in a void context is not as accurate as a
for modifier even with the optimization. sure the docs will say it won't
generate a list but its history has always been that way. map is clearer
for generating new stuff and for modifier is clearer for side effects.
remember map has the same signature as grep which also returns a
list which will confuse newbies about this issue. and just because
something works a certain way doesn't mean you have to use it. i choose
not to use map in void context and teach that as well.

uri
 
Brad Baxter

viv2k

Thanks to all. I know many of you here are experts in Perl.. sorry for
those 'unsophisticated' questions Gunnar. But it's the first time I'm
playing with Perl and geez.. it seems to be very powerful in text
manipulation... I'll definitely learn Perl in more detail in future

Can I ask another newbie question please? The solutions provided are
working fine but sometimes depending on the lengths of words, I don't
get aligned results, e.g.

apple 2.2
boy 9.0
definite 2.5
eel 5.0

Is there a way of aligning these comments?

many thanks in advance
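
One common way to get aligned columns is printf/sprintf with a field width. A minimal sketch, assuming the words and numbers sit in a hash (the names %value, $width and @lines are made up for the illustration, and the data is the sample shown above):

```perl
use strict;
use warnings;

# Hypothetical data, taken from the sample output above.
my %value = ( apple => 2.2, boy => 9.0, definite => 2.5, eel => 5.0 );

# Find the length of the longest word so the first column fits all of them.
my $width = 0;
for my $word ( keys %value ) {
    $width = length($word) if length($word) > $width;
}

# %-*s left-justifies the word in a field whose width is taken from $width;
# %4.1f right-justifies the number with one decimal place.
my @lines = map { sprintf "%-*s %4.1f", $width, $_, $value{$_} } sort keys %value;

print "$_\n" for @lines;
```

If the longest word is known in advance, a fixed width such as "%-10s" does the same job without the first loop.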
 
