Load file into a hash

Discussion in 'Perl Misc' started by Bill H, Oct 6, 2007.

  1. Bill H

    Bill H Guest

    Is there a "perl" way of loading a file directly into a hash instead
    of using something like this quick example:

    open(FILE,"test.txt");
    while(<FILE>)
    {
    $line = $_;
    chop $line;
    @dbf = split(/\t/,$line);
    $MYHASH{$dbf[0]} = $dbf[1];
    }
    close(FILE);

    where the text file contains entries like this:

    NAME0\tsome value
    NAME1\tanother value

    etc... ?

    Bill H
    Bill H, Oct 6, 2007
    #1

  2. Bill H wrote:
    > Is there a "perl" way of loading a file directly into a hash instead
    > of using something like this quick example:
    >
    > open(FILE,"test.txt");
    > while(<FILE>)
    > {
    > $line = $_;
    > chop $line;
    > @dbf = split(/\t/,$line);
    > $MYHASH{$dbf[0]} = $dbf[1];
    > }
    > close(FILE);
    >
    > where the text file contains entries like this:
    >
    > NAME0\tsome value
    > NAME1\tanother value
    >


    C:\TEMP>cat loadarray.pl
    #!perl

    use strict;
    use warnings;

    use Data::Dumper;

    my $filename = shift;
    open my $fh,"<",$filename or die $!;

    my %hash = map { chomp; split /\t/ } <$fh>;

    print Dumper(\%hash);

    close $fh or die $!;

    C:\TEMP>cat data.txt
    UK London
    France Paris
    Italy Rome
    USA Washington
    Germany Berlin

    C:\TEMP>loadarray.pl data.txt
    $VAR1 = {
    'France' => 'Paris',
    'UK' => 'London',
    'Italy' => 'Rome',
    'Germany' => 'Berlin',
    'USA' => 'Washington'
    };

    C:\TEMP>

    Mark
    Mark Clements, Oct 6, 2007
    #2

  3. Bill H

    Bill H Guest

    On Oct 6, 7:19 am, Mark Clements <> wrote:
    > my %hash = map { chomp; split /\t/ } <$fh>;

    I like that, Mark. You basically took everything I had in the while
    loop and put it on one line. Nice and neat.

    Bill H
    Bill H, Oct 6, 2007
    #3
  4. On Oct 6, 12:19 pm, Mark Clements <>
    wrote:
    > my %hash = map { chomp; split /\t/ } <$fh>;


    That works but is very fragile - one bad line can screw all your data
    from then on.

    I prefer (the canonical idiom)

    my %hash = map { /(.*?)\t(.*)/ } <$fh>;

    This will ignore lines with no "\t" in them, and it does something
    vaguely reasonable with lines containing more than one "\t": everything
    after the first tab becomes the value. Oh, and it's shorter too.
    Brian McCauley, Oct 6, 2007
    #4
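
    To make the fragility point concrete, here is a minimal sketch that runs both idioms over the same in-memory lines. The sample data (including the tab-less "oops" line) and the variable names are made up for illustration; they are not from the thread.

```perl
use strict;
use warnings;

# Hypothetical sample input: the second line has no tab (a "bad" line).
my @lines = ("UK\tLondon\n", "oops\n", "France\tParis\n", "Italy\tRome\n");

# split-based idiom: the bad line contributes one element instead of two,
# so every pair after it is shifted by one position.
my @flat = map { my $l = $_; chomp $l; split /\t/, $l } @lines;
my %by_split;
{
    no warnings 'misc';    # silence "Odd number of elements in hash assignment"
    %by_split = @flat;
}

# regex-based idiom: a failed match returns an empty list, so the bad
# line simply drops out and the remaining pairs stay aligned.
my %by_regex = map { /(.*?)\t(.*)/ } @lines;

for my $name (qw(by_split by_regex)) {
    my %h = $name eq 'by_split' ? %by_split : %by_regex;
    print "$name:\n";
    printf "  %s => %s\n", $_, $h{$_} // 'undef' for sort keys %h;
}
```

    In the split version "oops" ends up as a key with "France" as its value, and the last key has no value at all; the regex version keeps only the three well-formed pairs.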
  5. On Oct 6, 3:06 pm, Brian McCauley <> wrote:
    >
    > my %hash = map { /(.*?)\t(.*)/ } <$fh>;


    Oh, and since we're slurping the file anyhow we can save a few lines
    by using File::Slurp

    use File::Slurp;
    my %hash = map { /(.*?)\t(.*)/ } read_file('test.txt');
    Brian McCauley, Oct 6, 2007
    #5
  6. On Sat, 06 Oct 2007 14:06:18 +0000, Brian McCauley wrote:

    > On Oct 6, 12:19 pm, Mark Clements <>
    > wrote:
    >> my %hash = map { chomp; split /\t/ } <$fh>;

    >
    > That works but is very fragile - one bad line can screw all your data
    > from then on.
    >
    > I prefer (the canonical idiom)
    >
    > my %hash = map { /(.*?)\t(.*)/ } <$fh>;
    >
    > This will ignore lines with no "\t" in them. Do something vaguely
    > reasonable with lines containing more than one "\t". Oh, and it's
    > shorter too.


    Doesn't that include the newline on any line in the second field?

    M4
    Martijn Lievaart, Oct 6, 2007
    #6
  7. Martijn Lievaart <> wrote:
    > On Sat, 06 Oct 2007 14:06:18 +0000, Brian McCauley wrote:



    >> my %hash = map { /(.*?)\t(.*)/ } <$fh>;



    > Doesn't that include the newline on any line in the second field?



    Which part of the regex can match those newlines?


    --
    Tad McClellan
    email: perl -le "print scalar reverse qq/moc.noitatibaher\100cmdat/"
    Tad McClellan, Oct 6, 2007
    #7
  8. Bill H <> wrote:
    > Is there a "perl" way of loading a file directly into a hash instead
    > of using something like this quick example:
    >
    > open(FILE,"test.txt");



    You should always, yes *always*, check the return value from open().


    > while(<FILE>)
    > {
    > $line = $_;



    If you want it in $line, then put it there, rather than put it
    somewhere else and then move it there:

    while ( my $line = <FILE> )


    > chop $line;



    You should use chomp() to remove newlines.


    my %myhash = split /[\t\n]/, do{ local $/; <FILE>};


    --
    Tad McClellan
    email: perl -le "print scalar reverse qq/moc.noitatibaher\100cmdat/"
    Tad McClellan, Oct 6, 2007
    #8
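
    The advice above (check open(), read straight into $line, use chomp()) can be sketched as a complete loop, followed by the slurp-and-split one-liner for comparison. The temporary file, its contents, and the names %myhash and %slurped are assumptions made so the example is self-contained; File::Temp is a core module.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a small tab-separated file so the sketch is self-contained.
# The last line deliberately has no trailing newline.
my ($tmp_fh, $filename) = tempfile(UNLINK => 1);
print {$tmp_fh} "NAME0\tsome value\nNAME1\tanother value";
close $tmp_fh or die "Cannot close $filename: $!";

# Line-by-line version with the return value of open() checked.
my %myhash;
open my $fh, '<', $filename or die "Cannot open $filename: $!";
while ( my $line = <$fh> ) {
    chomp $line;    # removes a trailing newline only; chop would eat the
                    # final "e" of the last line, which has no newline
    my ($key, $value) = split /\t/, $line, 2;
    next unless defined $value;    # skip lines without a tab
    $myhash{$key} = $value;
}
close $fh or die "Cannot close $filename: $!";

# Slurp version: undefining $/ locally makes <$fh> return the whole file.
open $fh, '<', $filename or die "Cannot open $filename: $!";
my %slurped = split /[\t\n]/, do { local $/; <$fh> };
close $fh or die "Cannot close $filename: $!";

printf "%s => %s\n", $_, $myhash{$_} for sort keys %myhash;
```

    Both hashes end up with the same two entries for this input. The slurp version assumes every line is well formed, since a line without a tab would shift the pairs.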
  9. On Sat, 06 Oct 2007 21:36:40 +0000, Tad McClellan wrote:

    > Martijn Lievaart <> wrote:
    >> On Sat, 06 Oct 2007 14:06:18 +0000, Brian McCauley wrote:

    >
    >
    >>> my %hash = map { /(.*?)\t(.*)/ } <$fh>;

    >
    >
    >> Doesn't that include the newline on any line in the second field?

    >
    >
    > Which part of the regex can match those newlines?


    You're right. '.' does not match newlines. Stupid me.

    M4
    Martijn Lievaart, Oct 7, 2007
    #9
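
    A quick way to check this for yourself: without the /s modifier "." does not match "\n", so the second capture stops before the line ending. The sample line and variable names below are only illustrative.

```perl
use strict;
use warnings;

my $line = "NAME0\tsome value\n";

# Default behaviour: "." matches any character except newline,
# so the value is captured without the trailing "\n".
my ($key, $value) = $line =~ /(.*?)\t(.*)/;

# With /s, "." also matches "\n", and the newline ends up in the value.
my ($key_s, $value_s) = $line =~ /(.*?)\t(.*)/s;

print "without /s: [", $value, "]\n";
print "with /s:    [", $value_s, "]\n";
```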
  10. Uri Guttman

    Uri Guttman Guest

    >>>>> "BM" == Brian McCauley <> writes:

    BM> On Oct 6, 3:06 pm, Brian McCauley <> wrote:
    >>
    >> my %hash = map { /(.*?)\t(.*)/ } <$fh>;


    BM> Oh, and since we're slurping the file anyhow we can save a few lines
    BM> by using File::Slurp

    BM> use File::Slurp;
    BM> my %hash = map { /(.*?)\t(.*)/ } read_file('test.txt');

    i was going to chime in with slurp as well. :)

    i have a better and faster idiom for slurping files into hashes
    (untested for typos):

    my %hash = read_file('test.txt') =~ /^([^\t]+)\t(.*)$/gm ;

    and for most config type files slurping is fast since they are likely
    smaller than even the OS's I/O block size which can be 64k or even
    256k. there is no savings in using perl's i/o system to read such
    files line by line. that is a great teaching technique but it is slower
    than slurping in a whole file and parsing it in one regex call (as is
    possible in many cases).

    uri

    --
    Uri Guttman ------ -------- http://www.stemsystems.com
    --Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
    Search or Offer Perl Jobs ---------------------------- http://jobs.perl.org
    Uri Guttman, Oct 9, 2007
    #10
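
    The one-regex approach can be tried without File::Slurp by matching against a string that stands in for the slurped file. The sample text below is hypothetical. One caveat worth testing before relying on the idiom: a key class written as [^\t]+ also matches "\n", so a line without a tab gets glued onto the key of the next line; excluding "\n" from the class avoids that.

```perl
use strict;
use warnings;

# Hypothetical file contents held in a string (what a scalar-context
# slurp of the file would return). The second line has no tab.
my $text = "NAME0\tsome value\nno tab here\nNAME1\tanother\tvalue\n";

# /g returns every pair of captures; /m lets ^ and $ match at each line.
# [^\t]+ can run across the newline after "no tab here".
my %loose = $text =~ /^([^\t]+)\t(.*)$/gm;

# Excluding "\n" from the key keeps every match on a single line.
my %strict = $text =~ /^([^\t\n]+)\t(.*)$/gm;

for my $key (sort keys %loose) {
    (my $shown = $key) =~ s/\n/\\n/g;    # make embedded newlines visible
    print "loose key:  [$shown]\n";
}
print "strict key: [$_]\n" for sort keys %strict;
```

    With this input the loose pattern produces the key "no tab here\nNAME1", while the strict pattern yields just NAME0 and NAME1.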
