hash of arrays

Discussion in 'Perl Misc' started by removeps groups, Sep 13, 2012.

  1. My Perl script reads lines of a file, or rows of a database. The first column is the key (the id, an integer), and for each distinct key an entry should be added to the hash. The second column is a string (the name). The third and fourth columns go into the first and second arrays of the key's value, respectively. So if the file is (the 4 columns are tab-separated):

    1 First Id 2 3
    1 First Id 2 4
    2 Second Id 3 4

    the hash will contain two items. The first item has key 1; its value holds the string "First Id" and two arrays, the first being (2,2) and the second (3,4). The second item has key 2; its value holds the string "Second Id" and two arrays, (3) and (4). In Java the structure would be Map<Integer, Triplet<String, List<Integer>, List<Integer>>>.
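As a rough sketch of the structure described above, the same data can be written out by hand as a hash of array references. The name %expected is made up for this illustration and does not appear in the original script.

```perl
use strict;
use warnings;

# Hypothetical hand-written version of the target structure:
# id => [ name, [first ints], [second ints] ]
my %expected = (
    1 => [ 'First Id',  [ 2, 2 ], [ 3, 4 ] ],
    2 => [ 'Second Id', [ 3 ],    [ 4 ] ],
);

print scalar( keys %expected ), " entries\n";
print "name of id 1: $expected{1}->[0]\n";
print "first ints of id 1: @{$expected{1}->[1]}\n";
```

Each hash value is a reference to a three-element array, and elements 1 and 2 of that array are themselves references to arrays of integers.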

    To model this data structure in perl I had to do stuff like the following, which works on Strawberry Perl and Linux Perl:

    my %data; # map of id to [ Name, ArrayOfInt, ArrayOfInt ]
    my $value = $data{$id};
    my @tmpvalue = ($name, [], []);
    $value = \@tmpvalue;
    $data{$id} = $value;
    push(@{$$value[1]}, $firstInt);

    I don't know why the particular combination of $ @ {} \ works, but it does. My questions are:

    (1) Is this the most efficient way?
    (2) Why does it work?

    The full script is below


    #!/usr/bin/perl

    use strict;
    use warnings;

    my %data; # map of id to [ Name, ArrayOfInt, ArrayOfInt ]

    open sqlData, "hash-rows.txt" || die $!;

    while (<sqlData>)
    {
        chomp $_;
        my @row = split '\t', $_;

        my $id = $row[0];
        my $name = $row[1];
        my $firstInt = $row[2];
        my $secondInt = $row[3];

        my $value = $data{$id};
        if (not defined $value)
        {
            my @tmpvalue = ($name, [], []);
            $value = \@tmpvalue;
            $data{$id} = $value;
        }
        push(@{$$value[1]}, $firstInt);
        push(@{$$value[2]}, $secondInt);
    }

    foreach my $id (keys %data)
    {
        my $value = $data{$id};
        my $name = $$value[0];
        my @firstInts = @{$$value[1]};
        my @secondInts = @{$$value[2]};
        print "ENTRY\n id=$id\n name=$name\n firstInts=(@firstInts)\n secondInts=(@secondInts)\n";
    }


    ENTRY
    id=1
    name=First
    firstInts=(Id Id)
    secondInts=(2 2)
    ENTRY
    id=2
    name=Second
    firstInts=(Id)
    secondInts=(3)
     
    removeps groups, Sep 13, 2012
    #1

  2. Jim Gibson Guest

    removeps groups wrote:

    > My perl script reads lines of a file, or rows of a database...


    > To model this data structure in perl I had to do stuff like the following,
    > which works on Strawberry Perl and Linux Perl:
    >
    > my %data; # map of id to [ Name, ArrayOfInt, ArrayOfInt ]
    > my $value = $data{$id};
    > my @tmpvalue = ($name, [], []);
    > $value = \@tmpvalue;
    > $data{$id} = $value;
    > push(@{$$value[1]}, $firstInt);
    >
    > I don't know why the particular combination of $ @ {} \ works, but it does.
    > Questions are:
    >
    > (1) Is this the most efficient way?


    The data structure is efficient, but you have some unnecessary
    assignment operations in the code above. For example, you assign to the
    variable $value twice, without using the first value.

    > (2) Why does it work?


    Because of Perl's dereferencing and precedence rules. If you want a
    more detailed explanation, you need to specify more exactly what "it"
    is.

    I find the expression @{$$value[1]} somewhat ambiguous and difficult to
    parse.

    $value is a reference to an array. $value->[0] would be the first
    element of that array, as would ${$value}[0].

    $$value[1] could be interpreted as either ${$value[1]} or ${$value}[1],
    depending upon Perl's precedence rules. The former is dereferencing the
    second element of the @value array. The latter is the second element of
    the anonymous array referenced by the scalar $value. Since
    dereferencing a scalar has highest precedence, Perl will do the latter.
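A small sketch that may make the precedence point concrete. The sample data and the variable names $implicit, $braced and $arrow are invented for the illustration.

```perl
use strict;
use warnings;

my $value = [ 'First Id', [ 2, 2 ], [ 3, 4 ] ];   # sample array reference

# All three spellings select element 1 of the array that $value refers to.
my $implicit = $$value[1];      # the form used in the original post
my $braced   = ${$value}[1];    # the same thing with explicit braces
my $arrow    = $value->[1];     # arrow notation

# Comparing references numerically compares their addresses.
print "same reference\n" if $implicit == $braced && $braced == $arrow;

# ${$value[1]} would instead look at element 1 of an array named @value.
# No such array is declared here, so "use strict" would reject it at
# compile time.
```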

    If you don't want to learn Perl's precedence rules, then just use the
    arrow notation and explicit braces ({}). That is what I do.

    push( @{ $data{$id}->[1] }, $firstInt );

    From the inside out:

    1. %data is a hash
    2. $id is a key for that hash
    3. $data{$id} is the value associated with that key, a reference to an
    anonymous array.
    4. $data{$id}->[1] is the second element of that array, a reference to
    another anonymous array
    5. @{$data{$id}->[1]} is that anonymous array
    6. push( @{$data{$id}->[1]}, $firstInt ) is pushing the value from
    $firstInt onto the end of that array.
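The steps above can be sketched with one temporary variable per step. The temporaries $entry and $first_ref and the sample values are assumptions made for the illustration; the one-line push needs none of them.

```perl
use strict;
use warnings;

my %data;
my ( $id, $name, $firstInt ) = ( 1, 'First Id', 2 );   # sample values

$data{$id} = [ $name, [], [] ];

my $entry     = $data{$id};        # step 3: reference to the outer array
my $first_ref = $entry->[1];       # step 4: reference to the inner array
push( @{$first_ref}, $firstInt );  # steps 5 and 6: dereference, then push

# The one-line form does the same thing without the temporaries.
push( @{ $data{$id}->[1] }, $firstInt );

print "firstInts=(@{$data{$id}->[1]})\n";
```

Both pushes land in the same inner array, because $first_ref and $data{$id}->[1] hold the same reference.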


    >
    > The full script is below
    >
    >
    > #!/usr/bin/perl
    >
    > use strict;
    > use warnings;
    >
    > my %data; # map of id to [ Name, ArrayOfInt, ArrayOfInt ]
    >
    > open sqlData, "hash-rows.txt" || die $!;
    >
    > while (<sqlData>)
    > {
    > chomp $_;
    > my @row = split '\t', $_;
    >
    > my $id = $row[0];
    > my $name = $row[1];
    > my $firstInt = $row[2];
    > my $secondInt = $row[3];


    You can also use:

    my( $id, $name, $firstInt, $secondInt ) = split '\t', $_;

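    or, provided no field contains whitespace (a bare split splits $_ on
    whitespace, so a name such as "First Id" would be broken in two), just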

    my( $id, $name, $firstInt, $secondInt ) = split;
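A quick sketch of how the two forms differ on the sample data, assuming the rows really are tab-separated. The variable names are invented for the illustration. A bare split behaves like split ' ', $_ and breaks the line on any whitespace, which appears consistent with the name=First and firstInts=(Id Id) lines in the posted output.

```perl
use strict;
use warnings;

my $line = "1\tFirst Id\t2\t3";   # sample tab-separated row

my @by_tab   = split /\t/, $line;   # splits on tabs only
my @by_space = split ' ', $line;    # what a bare "split" does to $_

print scalar(@by_tab),   " fields, name='$by_tab[1]'\n";    # 4 fields, name='First Id'
print scalar(@by_space), " fields, name='$by_space[1]'\n";  # 5 fields, name='First'
```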

    >
    > my $value = $data{$id};
    > if (not defined $value)


    It is better to use the exists function to test whether a hash contains
    a given key; defined only tests whether the associated value is defined.
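A minimal sketch of the difference, using a made-up hash in which one key is present but holds undef:

```perl
use strict;
use warnings;

my %data = ( 1 => undef );   # key 1 is present, but its value is undef

print "exists:  ", ( exists  $data{1} ? "yes" : "no" ), "\n";   # yes
print "defined: ", ( defined $data{1} ? "yes" : "no" ), "\n";   # no
print "missing: ", ( exists  $data{2} ? "yes" : "no" ), "\n";   # no
```

In the script under discussion every stored value is an array reference, so the two tests happen to agree there; exists simply states the intent more directly.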

    > {
    > my @tmpvalue = ($name, [], []);
    > $value = \@tmpvalue;
    > $data{$id} = $value;


    You can replace the 3 lines above with:

    $data{$id} = [ $name, [], [] ];

    > }


    > push(@{$$value[1]}, $firstInt);


    push( @{$value->[1]}, $firstInt );

    > push(@{$$value[2]}, $secondInt);


    push( @{$value->[2]}, $secondInt );

    > }
    >
    > foreach my $id (keys %data)
    > {
    > my $value = $data{$id};
    > my $name = $$value[0];


    my $name = $value->[0];

    etc.

    > my @firstInts = @{$$value[1]};
    > my @secondInts = @{$$value[2]};
    > print "ENTRY\n id=$id\n name=$name\n firstInts=(@firstInts)\n
    > secondInts=(@secondInts)\n";
    > }
    >
    >
    > ENTRY
    > id=1
    > name=First
    > firstInts=(Id Id)
    > secondInts=(2 2)
    > ENTRY
    > id=2
    > name=Second
    > firstInts=(Id)
    > secondInts=(3)
    >
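Putting the suggestions together, the script might look something like the sketch below. To keep it self-contained it reads the three sample rows from an in-memory filehandle instead of hash-rows.txt. The names $rows, $fh and $line and the numeric sort of the keys are assumptions added for the illustration and are not part of the original script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample rows held in a string so the sketch runs without hash-rows.txt.
my $rows = "1\tFirst Id\t2\t3\n1\tFirst Id\t2\t4\n2\tSecond Id\t3\t4\n";
open( my $fh, '<', \$rows ) or die "Cannot open in-memory rows: $!";

my %data;   # id => [ name, [first ints], [second ints] ]

while ( my $line = <$fh> ) {
    chomp $line;
    my ( $id, $name, $firstInt, $secondInt ) = split /\t/, $line;

    # Create the entry the first time an id is seen.
    $data{$id} = [ $name, [], [] ] unless exists $data{$id};
    push( @{ $data{$id}->[1] }, $firstInt );
    push( @{ $data{$id}->[2] }, $secondInt );
}
close $fh;

foreach my $id ( sort { $a <=> $b } keys %data ) {
    my ( $name, $firstInts, $secondInts ) = @{ $data{$id} };
    print "ENTRY\n id=$id\n name=$name\n",
          " firstInts=(@{$firstInts})\n secondInts=(@{$secondInts})\n";
}
```

One further change is the use of "or die" with a parenthesized open. In open sqlData, "hash-rows.txt" || die $!; the || binds more tightly than the list operator, so it applies to the filename string and the die can never run.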


    --
    Jim Gibson
     
    Jim Gibson, Sep 13, 2012
    #2