Populate a hash from a list elegantly

Discussion in 'Perl Misc' started by usenet@DavidFilmer.com, Mar 9, 2006.

  1. Guest

    Kindly consider this sample code, if you will, which illustrates my
    question. This code works just fine and does exactly what I want,
    but... I dunno... I just don't like the approach I've taken. At first,
    I thought to approach this with a split() (using a limit of 1) instead
    of a regexp, but I couldn't figure out how to make that work in
    anything other than a convoluted manner. I'm interested in maybe
    learning different techniques from others who may approach the task
    differently.

    #!/usr/bin/perl
    use strict; use warnings;

    my %user; # keys are userid's, values are names
    while (my $line = <DATA>) {
        $line =~ m{^(\w+) +(.*)$} and $user{$1} = $2;
    }
    print map { "'$_'\t=>\t'$user{$_}'\n" } sort keys %user;

    __DATA__
    fredf Fred Flintstone
    Barn Barney Rubble
    bogus
    WF Wilma Flintstone
    betty Betty Rubble

    --
    http://DavidFilmer.com
     
    , Mar 9, 2006
    #1

  2. robic0 Guest

    On 8 Mar 2006 17:37:10 -0800, wrote:

    >Kindly consider this sample code, if you will, which illustrates my
    >question. This code works just fine and does exactly what I want,
    >but... I dunno... I just don't like the approach I've taken. At first,
    >I thought to approach this with a split() (using a limit of 1) instead
    >of a regexp, but I couldn't figure out how to make that work in
    >anything other than a convoluted manner. I'm interested in maybe
    >learning different techniques from others who may approach the task
    >differently.
    >
    >#!/usr/bin/perl
    > use strict; use warnings;
    >
    > my %user; # keys are userid's, values are names
    > while (my $line = <DATA>) {
    > $line =~ m{^(\w+) +(.*)$} and $user{$1} = $2;
    > }
    > print map { "'$_'\t=>\t'$user{$_}'\n" } sort keys %user;
    >
    >__DATA__
    >fredf Fred Flintstone
    >Barn Barney Rubble
    >bogus
    >WF Wilma Flintstone
    >betty Betty Rubble


    Nothing wrong with what you have, but give yourself some diagnostic leeway...

    my ($line);
    while ($line = <DATA>) {
        if ($line =~ /^\s+(\w+)\s+(.*?)\s+$/) {
            $user{$1} = $2;
        } else {
            print "no match for: <$line>\n";
        }
    }
     
    robic0, Mar 9, 2006
    #2

  3. robic0 Guest

    On Wed, 08 Mar 2006 18:07:17 -0800, robic0 wrote:

    > my ($line);
    > while ($line = <DATA>) {
    > if ($line =~ /^\s+(\w+)\s+(.*?)\s+$/) {
    > $user{$1} = $2;
    > } else {
    > print "no match for: <$line>\n";
    > }
    > }

    excuse me, use this:

    if ($line =~ /^\s*(\w+)\s*(.*?)\s*$/) {
     
    robic0, Mar 9, 2006
    #3
  4. robic0 Guest

    On Wed, 08 Mar 2006 18:10:51 -0800, robic0 wrote:

    >excuse me, use this:
    >
    > if ($line =~ /^\s*(\w+)\s*(.*?)\s*$/) {

    Too much good wine, use this:

    if ($line =~ /^\s*(\w+)\s+(.*?)\s*$/) {
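
    For reference, a minimal sketch of the diagnostic loop using this last pattern. The helper name parse_users and its return shape are assumptions made for illustration, not something from the thread. Note the chomp: without it, the trailing newline of a line like "bogus\n" satisfies the \s+ and the line is accepted with an empty name.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: takes input lines, returns (\%user, \@rejected).
sub parse_users {
    my @lines = @_;
    my (%user, @rejected);
    for my $line (@lines) {
        chomp $line;    # otherwise "\n" alone would satisfy \s+
        if ($line =~ /^\s*(\w+)\s+(.*?)\s*$/) {
            $user{$1} = $2;
        }
        else {
            push @rejected, $line;    # keep the line for diagnostics
        }
    }
    return (\%user, \@rejected);
}

my ($user, $rejected) = parse_users(
    "fredf Fred Flintstone\n", "Barn Barney Rubble\n", "bogus\n",
);
print "no match for: <$_>\n" for @$rejected;
print "'$_'\t=>\t'$user->{$_}'\n" for sort keys %$user;
```

    Collecting the rejects instead of printing them inside the loop is only one design choice; it lets the caller decide whether a bad line is a warning or a fatal error.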
     
    robic0, Mar 9, 2006
    #4
  5. robic0 Guest

    On Wed, 08 Mar 2006 18:13:16 -0800, robic0 wrote:

    >Too much good wine, use this:
    >
    > if ($line =~ /^\s*(\w+)\s+(.*?)\s*$/) {


    As a general rule in your circumstance, use "split" when a nearly
    "homogeneous" pattern is assured. Homogeneous in the sense that only the
    delimiter can be described as a pattern. The source has to be known
    to 99.999% assurance, something like the output from an Excel CSV file.

    What you did with the regex was to introduce a restriction on what
    the non-delimited data should be. Quality assurance is preferred over
    speed.

    robic0
     
    robic0, Mar 9, 2006
    #5
  6. Gunnar Hjalmarsson Guest

    wrote:
    > Kindly consider this sample code, if you will, which illustrates my
    > question. This code works just fine and does exactly what I want,
    > but... I dunno... I just don't like the approach I've taken. At first,
    > I thought to approach this with a split() (using a limit of 1) instead
    > of a regexp, but I couldn't figure out how to make that work in
    > anything other than a convoluted manner. I'm interested in maybe
    > learning different techniques from others who may approach the task
    > differently.
    >
    > my %user; # keys are userid's, values are names
    > while (my $line = <DATA>) {
    > $line =~ m{^(\w+) +(.*)$} and $user{$1} = $2;
    > }


    This is one idea:

    my %user = map
    { chomp; local @_; no warnings; split(' ', $_, 2) == 2 ? @_ : () }
    <DATA>;

    According to perldoc -f split, use of split in scalar context is
    deprecated. "local @_" and "no warnings" take care of that, but the
    solution may still not be advisable.

    --
    Gunnar Hjalmarsson
    Email: http://www.gunnar.cc/cgi-bin/contact.pl
     
    Gunnar Hjalmarsson, Mar 9, 2006
    #6
  7. robic0 Guest

    On Thu, 09 Mar 2006 03:23:06 +0100, Gunnar Hjalmarsson <> wrote:

    > wrote:
    >> Kindly consider this sample code, if you will, which illustrates my
    >> question. This code works just fine and does exactly what I want,
    >> but... I dunno... I just don't like the approach I've taken. At first,
    >> I thought to approach this with a split() (using a limit of 1) instead
    >> of a regexp, but I couldn't figure out how to make that work in
    >> anything other than a convoluted manner. I'm interested in maybe
    >> learning different techniques from others who may approach the task
    >> differently.
    >>
    >> my %user; # keys are userid's, values are names
    >> while (my $line = <DATA>) {
    >> $line =~ m{^(\w+) +(.*)$} and $user{$1} = $2;
    >> }

    >
    >This is one idea:
    >
    > my %user = map
    > { chomp; local @_; no warnings; split(' ', $_, 2) == 2 ? @_ : () }
    > <DATA>;
    >
    >According perldoc -f split, use of split in scalar context is
    >deprecated. "local @_" and "no warnings" take care of that, but the
    >solution may still not be advisable.

    I can't fathom....
     
    robic0, Mar 9, 2006
    #7
  8. DJ Stunks Guest

    wrote:
    > #!/usr/bin/perl
    > use strict; use warnings;
    >
    > my %user; # keys are userid's, values are names
    > while (my $line = <DATA>) {
    > $line =~ m{^(\w+) +(.*)$} and $user{$1} = $2;
    > }
    > print map { "'$_'\t=>\t'$user{$_}'\n" } sort keys %user;
    >
    > __DATA__
    > fredf Fred Flintstone
    > Barn Barney Rubble
    > bogus
    > WF Wilma Flintstone
    > betty Betty Rubble


    I would use a map instead of a while, but adjust the regex slightly to
    ensure it fails (i.e., won't return a partial match) for bogus entries.

    observe:

    #!/usr/bin/perl
    use strict; use warnings;

    my %user = map { m{^(\w+) +(.+)$} } <DATA>;

    print map { "'$_'\t=>\t'$user{$_}'\n" } sort keys %user;


    __DATA__
    fredf Fred Flintstone
    Barn Barney Rubble
    bogus
    WF Wilma Flintstone
    betty Betty Rubble
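
    The map works because a match in list context returns its captures on success and an empty list on failure, so a non-matching line contributes nothing to the hash. A small sketch of that behaviour; pair_from_line is a hypothetical helper name used only for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (id, name) for a good line, () for a bad one.
sub pair_from_line {
    my ($line) = @_;
    return $line =~ m{^(\w+) +(.+)$};    # list context: captures or ()
}

for my $line ("fredf Fred Flintstone\n", "bogus\n") {
    my @pair = pair_from_line($line);
    printf "%d element(s) from %s", scalar @pair, $line;
}
```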

    -jp

    PS: sorry for the triple-posting in that other thread. Damn you,
    Google Groups!
    PPS: good newsreader for winXP suggestions?
     
    DJ Stunks, Mar 9, 2006
    #8
  9. Uri Guttman Guest

    >>>>> "u" == usenet <> writes:

    u> Kindly consider this sample code, if you will, which illustrates my
    u> question. This code works just fine and does exactly what I want,
    u> but... I dunno... I just don't like the approach I've taken. At first,
    u> I thought to approach this with a split() (using a limit of 1) instead
    u> of a regexp, but I couldn't figure out how to make that work in
    u> anything other than a convoluted manner. I'm interested in maybe
    u> learning different techniques from others who may approach the task
    u> differently.

    u> #!/usr/bin/perl
    u> use strict; use warnings;

    u> my %user; # keys are userid's, values are names
    u> while (my $line = <DATA>) {
    u> $line =~ m{^(\w+) +(.*)$} and $user{$1} = $2;
    u> }
    u> print map { "'$_'\t=>\t'$user{$_}'\n" } sort keys %user;

    u> __DATA__
    u> fredf Fred Flintstone
    u> Barn Barney Rubble
    u> bogus
    u> WF Wilma Flintstone
    u> betty Betty Rubble

    use File::Slurp ;

    my %user = read_file( \*DATA ) =~ /^(\w+)\s+(.*)$/mg ;

    uri

    --
    Uri Guttman ------ -------- http://www.stemsystems.com
    --Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
    Search or Offer Perl Jobs ---------------------------- http://jobs.perl.org
     
    Uri Guttman, Mar 9, 2006
    #9
  10. Guest

    Uri Guttman wrote:
    > use File::Slurp ;
    > my %user = read_file( \*DATA ) =~ /^(\w+)\s+(.*)$/mg ;


    That's great! Thanks.
     
    , Mar 9, 2006
    #10
  11. John W. Krahn Guest

    wrote:
    > Kindly consider this sample code, if you will, which illustrates my
    > question. This code works just fine and does exactly what I want,
    > but... I dunno... I just don't like the approach I've taken. At first,
    > I thought to approach this with a split() (using a limit of 1) instead
    > of a regexp, but I couldn't figure out how to make that work in
    > anything other than a convoluted manner.


    That is probably because split()'s limit describes the number of fields to
    return, and you want two fields (the hash key and the hash value), not one field.

    my ( $key, $val ) = split / +/, $line, 2;
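
    To see what the limit does, here is a short sketch that splits the same line with limits of 1, 2, and 0 (0 behaves as if no limit were given). The helper name fields is an assumption for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: split a line on runs of spaces with a given limit.
# A limit of 0 behaves as if no limit had been given.
sub fields {
    my ($line, $limit) = @_;
    return split / +/, $line, $limit;
}

my $line = "fredf Fred Flintstone";
for my $limit (1, 2, 0) {
    my @f = fields($line, $limit);
    print "limit $limit: ", join(' | ', map { "'$_'" } @f), "\n";
}
```

    With a limit of 1 the whole line comes back as a single field, which is why that attempt could not separate the userid from the name.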



    John
    --
    use Perl;
    program
    fulfillment
     
    John W. Krahn, Mar 9, 2006
    #11
  12. Guest

    John W. Krahn wrote:
    > That is probably because split()'s limit describes the number of fields to
    > return and you want two fields (the hash keys and the hash value) not one field.


    Ah, I didn't realize that. Thanks, but that wasn't really my problem
    (though it would have become a problem)...

    > my ( $key, $val ) = split / +/, $line, 2;


    Yeah, that's where I was actually having trouble, because I can't see
    how to translate that into hash-ish (without ugly intermediate scalars
    or an intermediate array), such as:

    $user{$dunno_what_to_put_here} = (split / +/, $line, 2)[1];
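
    One possible way around this, sketched under the assumption that two short-lived scalars are acceptable when they are confined to a small sub. The name add_user is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: add one "id name" line to the hash referenced by $user.
# Returns true if the line was stored, false if it had no name part.
sub add_user {
    my ($user, $line) = @_;
    chomp $line;
    my ($id, $name) = split / +/, $line, 2;
    return unless defined $name && length $name;    # e.g. "bogus"
    $user->{$id} = $name;
    return 1;
}

my %user;
add_user(\%user, $_) for "fredf Fred Flintstone\n", "bogus\n", "WF Wilma Flintstone\n";
print "'$_'\t=>\t'$user{$_}'\n" for sort keys %user;
```

    The temporaries go out of scope as soon as the sub returns, so the calling code stays free of intermediate variables.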

    --
    http://DavidFilmer.com
     
    , Mar 9, 2006
    #12
  13. John W. Krahn Guest

    Gunnar Hjalmarsson wrote:
    > wrote:
    >> Kindly consider this sample code, if you will, which illustrates my
    >> question. This code works just fine and does exactly what I want,
    >> but... I dunno... I just don't like the approach I've taken. At first,
    >> I thought to approach this with a split() (using a limit of 1) instead
    >> of a regexp, but I couldn't figure out how to make that work in
    >> anything other than a convoluted manner. I'm interested in maybe
    >> learning different techniques from others who may approach the task
    >> differently.
    >>
    >> my %user; # keys are userid's, values are names
    >> while (my $line = <DATA>) {
    >> $line =~ m{^(\w+) +(.*)$} and $user{$1} = $2;
    >> }

    >
    > This is one idea:
    >
    > my %user = map
    > { chomp; local @_; no warnings; split(' ', $_, 2) == 2 ? @_ : () }
    > <DATA>;
    >
    > According perldoc -f split, use of split in scalar context is
    > deprecated. "local @_" and "no warnings" take care of that, but the
    > solution may still not be advisable.


    So why not use a lexically scoped array?

    my %user = map
    { chomp; my @array; ( @array = split( ' ', $_, 2 ) ) == 2 ? @array : () }
    <DATA>;
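
    A runnable sketch of the lexical-array variant, wrapped in a hypothetical sub (users_from_lines) so it can be fed a plain list instead of <DATA>. As far as I know, newer perls (5.12 and later, if memory serves) no longer fill @_ when split is called in scalar context, so a version that does not depend on the implicit @_ should be the more portable one.

```perl
use strict;
use warnings;

# Hypothetical wrapper: takes lines, returns a flat (id, name, ...) list.
sub users_from_lines {
    return map {
        my $line = $_;                     # copy, so the caller's data is not chomped
        chomp $line;
        my @pair = split ' ', $line, 2;    # ' ' splits on runs of whitespace
        @pair == 2 ? @pair : ();           # drop lines without a second field
    } @_;
}

my %user = users_from_lines(
    "fredf Fred Flintstone\n", "bogus\n", "WF Wilma Flintstone\n",
);
print "'$_'\t=>\t'$user{$_}'\n" for sort keys %user;
```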



    John
    --
    use Perl;
    program
    fulfillment
     
    John W. Krahn, Mar 9, 2006
    #13
  15. DJ Stunks Guest

    Uri Guttman wrote:
    >
    > use File::Slurp ;
    >
    > my %user = read_file( \*DATA ) =~ /^(\w+)\s+(.*)$/mg ;


    this regex does not filter the bogus entry.

    try /^(\w+) +(.+)$/mg instead.
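
    The difference shows up once the whole input is one string: \s also matches the newline, so "bogus" gets paired with the line after it. A sketch comparing the two patterns on an in-memory string instead of read_file; parse_loose and parse_strict are hypothetical names.

```perl
use strict;
use warnings;

# Hypothetical helpers: each applies one pattern to a whole multi-line string.
sub parse_loose  { my ($text) = @_; return $text =~ /^(\w+)\s+(.*)$/mg }
sub parse_strict { my ($text) = @_; return $text =~ /^(\w+) +(.+)$/mg }

my $text = "bogus\nWF Wilma Flintstone\nbetty Betty Rubble\n";

my %loose  = parse_loose($text);     # "bogus" swallows the WF line
my %strict = parse_strict($text);    # "bogus" is skipped, WF survives

for my $set ([loose => \%loose], [strict => \%strict]) {
    my ($label, $h) = @$set;
    print "$label: ", join(', ', map { "'$_' => '$h->{$_}'" } sort keys %$h), "\n";
}
```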

    -jp
     
    DJ Stunks, Mar 9, 2006
    #15
  16. Gunnar Hjalmarsson Guest

    John W. Krahn wrote:
    > Gunnar Hjalmarsson wrote:
    >> wrote:
    >>>
    >>> my %user; # keys are userid's, values are names
    >>> while (my $line = <DATA>) {
    >>> $line =~ m{^(\w+) +(.*)$} and $user{$1} = $2;
    >>> }

    >>
    >>This is one idea:
    >>
    >> my %user = map
    >> { chomp; local @_; no warnings; split(' ', $_, 2) == 2 ? @_ : () }
    >> <DATA>;
    >>
    >>According perldoc -f split, use of split in scalar context is
    >>deprecated. "local @_" and "no warnings" take care of that, but the
    >>solution may still not be advisable.

    >
    > So why not use a lexically scoped array?
    >
    > my %user = map
    > { chomp; my @array; ( @array = split( ' ', $_, 2 ) ) == 2 ? @array : () }
    > <DATA>;


    Thanks, John. And that made me realize that assigning _explicitly_ to @_
    is enough to get rid of the warning:

    my %user = map
    { chomp; ( local @_ = split ' ', $_, 2 ) == 2 ? @_ : () } <DATA>;

    --
    Gunnar Hjalmarsson
    Email: http://www.gunnar.cc/cgi-bin/contact.pl
     
    Gunnar Hjalmarsson, Mar 9, 2006
    #16
  17. Dr.Ruud Guest

    [OT] good newsreader for winXP (was: Re: Populate a hash from a list elegantly)

    DJ Stunks schreef:

    > PPS: good newsreader for winXP suggestions?



    I assume you mean 'text articles' (did you just read 'testicles'?),
    since you didn't say 'binary'.

    Many are nice to work with:

    40tude Dialog
    (multi-server, multi-threaded, Unicode)

    (MicroPlanet) Gravity, Super Gravity

    Forte (Free) Agent

    slrn http://slrn.sourceforge.net/
    (use an NTFS compressed folder as spool)

    Hamster Playground + Outlook Express + OE QuoteFix

    Thunderbird

    http://www.newsreaders.com/win/clients.html

    --
    Affijn, Ruud

    "Gewoon is een tijger."
     
    Dr.Ruud, Mar 9, 2006
    #17
  18. Dr.Ruud Guest

    DJ Stunks schreef:

    > /^(\w+) +(.+)$/mg


    /^(\w+)[[:blank:]]+(.+)/mg

    (untested)
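
    A quick sketch to try that pattern out: [[:blank:]] matches spaces and tabs but not newlines, so tab-separated lines are accepted while "bogus" still cannot borrow the next line. The helper name parse_blank and the tab in the sample data are assumptions for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: apply the [[:blank:]] pattern to a multi-line string.
sub parse_blank {
    my ($text) = @_;
    return $text =~ /^(\w+)[[:blank:]]+(.+)/mg;
}

my %user = parse_blank("fredf\tFred Flintstone\nbogus\nWF  Wilma Flintstone\n");
print "'$_'\t=>\t'$user{$_}'\n" for sort keys %user;
```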

    --
    Affijn, Ruud

    "Gewoon is een tijger."
     
    Dr.Ruud, Mar 9, 2006
    #18
  19. John Bokma Guest

    Re: [OT] good newsreader for winXP (was: Re: Populate a hash from a list elegantly)

    "Dr.Ruud" <> wrote:

    > 40tude Dialog
    > (multi-server, multi-threaded, Unicode)


    I use Xnews, and it's probably not the most user-friendly program, and has
    some minor issues (or major, YMMV), but I still haven't switched to Dialog
    (which I have wanted to do for some time) :-D

    --
    John Experienced Perl programmer: http://castleamber.com/
     
    John Bokma, Mar 9, 2006
    #19
  20. Dr.Ruud Guest

    Re: [OT] good newsreader for winXP

    John Bokma:
    > Dr.Ruud:


    >> 40tude Dialog
    >> (multi-server, multi-threaded, Unicode)

    >
    > I use Xnews, and it's probably not the most user friendly program,
    > and has some minor issues (or major, YMMV), but I still haven't
    > switched to Dialog (which I want for some time) :-D


    Also very nice is Pimmy, because it has almost no dependencies.

    I use an old one, as a sort of watchdog, connected to many pop- and
    IMAP-boxes on many servers.
    http://www.geminisoft.com/en/pimmy/

    --
    Affijn, Ruud

    "Gewoon is een tijger."
     
    Dr.Ruud, Mar 9, 2006
    #20