Populate a hash from a list elegantly

usenet

Kindly consider this sample code, if you will, which illustrates my
question. This code works just fine and does exactly what I want,
but... I dunno... I just don't like the approach I've taken. At first,
I thought to approach this with a split() (using a limit of 1) instead
of a regexp, but I couldn't figure out how to make that work in
anything other than a convoluted manner. I'm interested in maybe
learning different techniques from others who may approach the task
differently.

#!/usr/bin/perl
use strict; use warnings;

my %user; # keys are userid's, values are names
while (my $line = <DATA>) {
    $line =~ m{^(\w+) +(.*)$} and $user{$1} = $2;
}
print map { "'$_'\t=>\t'$user{$_}'\n" } sort keys %user;

__DATA__
fredf Fred Flintstone
Barn Barney Rubble
bogus
WF Wilma Flintstone
betty Betty Rubble
 
robic0

Nothing wrong with what you have, but give yourself some diagnostic leeway...

my ($line);
while ($line = <DATA>) {
    if ($line =~ /^\s+(\w+)\s+(.*?)\s+$/) {
        $user{$1} = $2;
    } else {
        print "no match for: <$line>\n";
    }
}
 
robic0

Excuse me, use this:

if ($line =~ /^\s*(\w+)\s*(.*?)\s*$/) {
 
robic0

Too much good wine, use this:

if ($line =~ /^\s*(\w+)\s+(.*?)\s*$/) {

As a general rule in your circumstance, use "split" when a nearly
"homogeneous" pattern is assured. Homogeneous in the sense that only the
delimiter can be described as a pattern. The source has to be known
to 99.999% assurance, like something output from an Excel CSV file.

What you did with the regex was to introduce a restriction on what
the non-delimited data should be. Quality assurance is preferred over
speed.
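
To illustrate the distinction, here is a minimal sketch. The sample lines and the names @lines, %by_split and %by_regex are made up for illustration and do not come from the posts above. split accepts whatever the delimiter leaves behind, while the regex also constrains what the userid may look like.

```perl
use strict;
use warnings;

# Illustrative input: a well-formed line, a line with no name,
# and a line whose userid contains a non-word character.
my @lines = ( "fredf Fred Flintstone", "bogus", "a-b Some Name" );

my ( %by_split, %by_regex );
for my $line (@lines) {
    # split only knows about the delimiter; any two fields are accepted.
    my @fields = split ' ', $line, 2;
    $by_split{ $fields[0] } = $fields[1] if @fields == 2;

    # The regex also insists that the userid consists of word characters.
    $by_regex{$1} = $2 if $line =~ /^\s*(\w+)\s+(.*?)\s*$/;
}

print "split: ", join( ',', sort keys %by_split ), "\n";
print "regex: ", join( ',', sort keys %by_regex ), "\n";
```

Both approaches skip "bogus", but only the split version stores the "a-b" entry.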

robic0
 
Gunnar Hjalmarsson

This is one idea:

my %user = map
{ chomp; local @_; no warnings; split(' ', $_, 2) == 2 ? @_ : () }
<DATA>;

According to perldoc -f split, use of split in scalar context is
deprecated. "local @_" and "no warnings" take care of that, but the
solution may still not be advisable.
 
robic0

I can't fathom....
 
DJ Stunks

I would use a map instead of a while, but adjust the regex slightly to
ensure it fails (i.e., won't return a partial match) for bogus entries.

Observe:

#!/usr/bin/perl
use strict; use warnings;

my %user = map { m{^(\w+) +(.+)$} } <DATA>;

print map { "'$_'\t=>\t'$user{$_}'\n" } sort keys %user;


__DATA__
fredf Fred Flintstone
Barn Barney Rubble
bogus
WF Wilma Flintstone
betty Betty Rubble

-jp

PS: sorry for the triple-posting in that other thread. Damn you,
Google Groups!
PPS: good newsreader for winXP suggestions?
 
Uri Guttman

use File::Slurp ;

my %user = read_file( \*DATA ) =~ /^(\w+)\s+(.*)$/mg ;

uri
 
John W. Krahn

> Kindly consider this sample code, if you will, which illustrates my
> question. This code works just fine and does exactly what I want,
> but... I dunno... I just don't like the approach I've taken. At first,
> I thought to approach this with a split() (using a limit of 1) instead
> of a regexp, but I couldn't figure out how to make that work in
> anything other than a convoluted manner.

That is probably because split()'s limit describes the number of fields to
return, and you want two fields (the hash key and the hash value), not one field.

my ( $key, $val ) = split / +/, $line, 2;
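
A small sketch of how the limit changes the result. The sample line is taken from the thread's data, while the array names are only illustrative.

```perl
use strict;
use warnings;

my $line = "fredf Fred Flintstone";    # sample line from the thread's data

my @limit1  = split / +/, $line, 1;    # one field: the whole line, unsplit
my @limit2  = split / +/, $line, 2;    # two fields: userid, then the rest
my @nolimit = split / +/, $line;       # every space-separated word

printf "limit 1:  %d field(s)\n", scalar @limit1;
printf "limit 2:  %d field(s): '%s' and '%s'\n", scalar @limit2, @limit2;
printf "no limit: %d field(s)\n", scalar @nolimit;
```

With a limit of 2 the name stays in one piece, spaces included, which is what the hash value needs.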



John
 
usenet

John said:
> That is probably because split()'s limit describes the number of fields to
> return and you want two fields (the hash keys and the hash value) not one field.

Ah, I didn't realize that. Thanks, but that wasn't really my problem
(though it would have become a problem)...

> my ( $key, $val ) = split / +/, $line, 2;

Yeah, that's where I was actually having trouble, because I can't see
how to translate that into hash-ish (without ugly intermediate scalars
or an intermediate array), such as:

$user{$dunno_what_to_put_here} = (split / +/, $line, 2)[1];
 
John W. Krahn

Gunnar said:
> This is one idea:
>
> my %user = map
> { chomp; local @_; no warnings; split(' ', $_, 2) == 2 ? @_ : () }
> <DATA>;
>
> According to perldoc -f split, use of split in scalar context is
> deprecated. "local @_" and "no warnings" take care of that, but the
> solution may still not be advisable.

So why not use a lexically scoped array?

my %user = map
{ chomp; my @array; ( @array = split( ' ', $_, 2 ) ) == 2 ? @array : () }
<DATA>;



John
 
DJ Stunks

Uri said:
> use File::Slurp ;
>
> my %user = read_file( \*DATA ) =~ /^(\w+)\s+(.*)$/mg ;

This regex does not filter the bogus entry.

Try /^(\w+) +(.+)$/mg instead.
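
A sketch of why the two patterns differ on slurped input. To keep it self-contained, the thread's sample data is built as an in-memory string instead of being read with File::Slurp, and the names $text, %loose and %tight are illustrative. The key point is that \s+ can match the newline after "bogus", while a literal space and .+ cannot cross a line boundary.

```perl
use strict;
use warnings;

# The thread's sample data as one string, as a slurp would return it.
my $text = join '', map { "$_\n" }
    'fredf Fred Flintstone', 'Barn Barney Rubble', 'bogus',
    'WF Wilma Flintstone', 'betty Betty Rubble';

# \s+ may match the newline after "bogus", so the next line becomes its value.
my %loose = $text =~ /^(\w+)\s+(.*)$/mg;

# A literal space and .+ keep every match on a single line.
my %tight = $text =~ /^(\w+) +(.+)$/mg;

print "loose: $_ => $loose{$_}\n" for sort keys %loose;
print "tight: $_ => $tight{$_}\n" for sort keys %tight;
```

With the \s+ pattern, "bogus" ends up as a key whose value is "WF Wilma Flintstone" and the WF entry is lost. The stricter pattern skips "bogus" and keeps WF.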

-jp
 
Gunnar Hjalmarsson

John said:
> So why not use a lexically scoped array?
>
> my %user = map
> { chomp; my @array; ( @array = split( ' ', $_, 2 ) ) == 2 ? @array : () }
> <DATA>;

Thanks, John. And that made me realize that assigning _explicitly_ to @_
is enough to get rid of the warning:

my %user = map
{ chomp; ( local @_ = split ' ', $_, 2 ) == 2 ? @_ : () } <DATA>;
 
Dr.Ruud

DJ Stunks wrote:
> PPS: good newsreader for winXP suggestions?


I assume you mean 'text articles' (did you just read 'testicles'?),
since you didn't say 'binary'.

Many are nice to work with:

40tude Dialog
(multi-server, multi-threaded, Unicode)

(MicroPlanet) Gravity, Super Gravity

Forte (Free) Agent

slrn http://slrn.sourceforge.net/
(use an NTFS compressed folder as spool)

Hamster Playground + Outlook Express + OE QuoteFix

Thunderbird

http://www.newsreaders.com/win/clients.html
 
John Bokma

Dr.Ruud said:
> 40tude Dialog
> (multi-server, multi-threaded, Unicode)

I use Xnews, and it's probably not the most user-friendly program, and has
some minor issues (or major, YMMV), but I still haven't switched to Dialog
(which I have wanted to do for some time) :-D
 
Dr.Ruud

John Bokma:
Dr.Ruud:

> I use Xnews, and it's probably not the most user friendly program,
> and has some minor issues (or major, YMMV), but I still haven't
> switched to Dialog (which I want for some time) :-D

Also very nice is Pimmy, because it has almost no dependencies.

I use an old one, as a sort of watchdog, connected to many pop- and
IMAP-boxes on many servers.
http://www.geminisoft.com/en/pimmy/
 
