rearrange "columns" of a multi-level hash?

Discussion in 'Perl Misc' started by hymie!, Jun 14, 2004.

  1. hymie!

    hymie! Guest

    Greetings. I don't have that special knack for properly forming a
    Google search that will give me the answer I seek, so I apologize if
    I'm asking an old question.

    I'm taking over a project from a co-worker.

    We are processing a file that has information in it:
    customer vendor transType productCode appNumber resultCode

    We have to prepare 2 reports from this data.

    Without too much detail, the first report is sorted by Customer, then by
    TransactionType, then by ProductCode, and then by resultCode, with a count
    of the number of lines that match each configuration.

    The second report is similar, but is sorted by Customer, then by Vendor,
    then by TransType, ProdCode, and resultCode.

    Co-worker wrote the script with (I don't know the correct term) a multi-
    level hash:

    unless( exists $list{ $customer } )
    {
        $list{ $customer } = () ;
    }
    unless( exists $list{ $customer }{ $type } )
    {
        $list{ $customer }{ $type } = () ;
    }
    and so on, down to
    unless( exists $list{$customer}{$type}{$productCode}{$appNo}{$vendor} )
    {
        $list{ $customer }{ $type }{ $productCode }{ $appNo }{ $vendor } =
            { 130 => 0,
              150 => 0,
              385 => 0 } ;
    }
    if( exists $list{$customer}{$type}{$productCode}{$appNo}
                    {$vendor}{$returnCode} )
    {
        $list{$customer}{$type}{$productCode}{$appNo}{$vendor}{$returnCode} = 1
    }

    Then he reads the data thusly:

    foreach $customer ( keys(%list) )
    {
        print "Customer: $customer\n\n" ;
        foreach $type ( keys(%{$list{ $customer }}) )
        {
            printf "\n Type: %s", $type ;
            foreach $productCode ( keys(%{$list{ $customer }{ $type }}) )
            {
                printf "\n Product: %s\n", $productCode ;
                foreach $appNo (keys(%{$list{ $customer }{ $type }{ $productCode }}))
                {
                    foreach $vendor (keys(%{$list{$customer}{$type}
                                                 {$productCode}{$appNo}}))
                    {
                        if( $list{ $customer }{ $type }{ $productCode }
                                 { $appNo }{ $vendor }{385})
                        {
                            $no385++ ;
                        }
                        if( $list{ $customer }{ $type }{ $productCode }
                                 { $appNo }{ $vendor }{150})
                        {
                            $no150++ ;
                        }
                    }
                }
            }
        }
    }

    This generates the first report that is sorted by Customer and Type.
    He wrote a second, almost identical script to re-parse all of the original
    data into a new hash with the variables in customer-vendor-type order.

    What I want to know is if there is
    (*) an easier way to re-arrange the first hash (visualize taking a column
    in a spreadsheet and moving it over, then resorting with the new
    column order)?
    (*) a better/easier way to start from scratch?

    Thanks.

    hymie! http://www.smart.net/~hymowitz
    ===============================================================================
     
    hymie!, Jun 14, 2004
    #1

  2. Ben Morrow

    Ben Morrow Guest

    Quoth (hymie!):
    >
    > We are processing a file that has information in it:
    > customer vendor transType productCode appNumber resultCode
    >
    > We have to prepare 2 reports from this data.
    >
    > Without too much detail, the first report is sorted by Customer, then by
    > TransactionType, then by ProductCode, and then by resultCode, with a count
    > of the number of lines that match each configuration.
    >
    > The second report is similar, but is sorted by Customer, then by Vendor,
    > then by TransType, ProdCode, and resultCode.
    >
    > Co-worker wrote the script with (I don't know the correct term) a multi-
    > level hash:


    That term'll do fine... some people here use HoH, for Hash-of-Hash.

    > unless( exists $list{ $customer } )
    > {
    > $list{ $customer } = () ;
    > }
    > unless( exists $list{ $customer }{ $type } )
    > {
    > $list{ $customer }{ $type } = () ;
    > }
    > and so on, down to
    > unless( exists $list{$customer}{$type}{$productCode}{$appNo}{$vendor} )
    > {
    > $list{ $customer }{ $type }{ $productCode }{ $appNo }{ $vendor } =
    > { 130 => 0,
    > 150 => 0,
    > 385 => 0 } ;
    > }


    None of this is necessary. What was *meant* here, I think, is to create
    a new anon hash for each level; in that case it should have been '= {}'
    not '= ()'. In any case, if you treat an undefined variable as though
    it's got a hashref in it, Perl will create a new anon hash and put a ref
    to it in there for you. Also, a hash key that doesn't exist will return
    a value of undef, which is zero in numeric context (with a warning you
    can turn off)...

    > if( exists $list{$customer}{$type}{$productCode}{$appNo}
    > {$vendor}{$returnCode} )
    > {
    > $list{$customer}{$type}{$productCode}{$appNo}{$vendor}{$returnCode} = 1
    > }


    ...so this whole lot can be replaced with the one line

    $list{$customer}{$type}{$productCode}{$appNo}{$vendor}{$returnCode} = 1
    if grep $_ == $returnCode, qw/130 150 385/;
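
A minimal sketch of the behaviour described above, assuming one invented sample record (the field values are made up for illustration). It shows the intermediate levels being created on assignment, and a key that was never set reading as false:

```perl
use strict;
use warnings;

my %list;    # starts out completely empty

# Hypothetical sample record; these values are not from the real data.
my ($customer, $type, $productCode, $appNo, $vendor, $returnCode)
    = ('acme', 'sale', 'P1', 'A100', 'v1', 150);

# No "unless exists" scaffolding: each missing level is created on assignment.
$list{$customer}{$type}{$productCode}{$appNo}{$vendor}{$returnCode} = 1
    if grep { $_ == $returnCode } qw(130 150 385);

# A key that was never set reads as undef, which is false.
my $seen150 = $list{acme}{sale}{P1}{A100}{v1}{150} ? 1 : 0;
my $seen385 = $list{acme}{sale}{P1}{A100}{v1}{385} ? 1 : 0;

print "150 seen: $seen150, 385 seen: $seen385\n";
```

This should print "150 seen: 1, 385 seen: 0".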

    If you need the keys of the last hash to be right, or you specifically
    need the zeros, you could do

    my @validCodes = qw/130 150 385/;

    for ($list{$customer}...{$vendor}) {
    @{$_}{@validCodes} = (0) x @validCodes;

    grep $_ == $returnCode, @validCodes
    and $_->{$returnCode} = 1;
    }

    The expression @{$_}{@validCodes} is perhaps a little confusing: the
    first {} are for disambiguation, the second are a hash slice. Compare
    with @hash{@validCodes}.
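
To make the comparison concrete, here is a small sketch of the same slice written against a plain hash and against a hash reference. The variable names `%hash`, `$ref` and `$summary` are only for illustration:

```perl
use strict;
use warnings;

my @validCodes = qw(130 150 385);

# Slice of a plain hash: sets all three keys at once.
my %hash;
@hash{@validCodes} = (0) x @validCodes;

# The same slice through a hash reference; the inner braces wrap the reference.
my $ref = {};
@{$ref}{@validCodes} = (0) x @validCodes;
$ref->{150} = 1;

my $summary = join ',', map { "$_=$ref->{$_}" } sort { $a <=> $b } keys %$ref;
print "$summary\n";
```

This should print "130=0,150=1,385=0".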

    Why is the data stored like this at all? Surely it would be better to
    store the return code straight in the hash, rather than have another
    level with only one (significant) value?

    $list{$customer}...{$vendor} = $returnCode
    if grep $_ == $returnCode, qw/130 150 385/;

    Also, what happens if the return code *isn't* in the list? Is the list
    supposed to be exhaustive (in which case you can simply strip the greps
    out of the above)?

    > Then he reads the data thusly:


    (unnecessary use of -ly: 'thus' means 'like this' all by itself)

    > foreach $customer ( keys(%list) )


    This is not sorted. Did you simply mean 'grouped by', or should it be

    for $customer (sort keys %list)
    ?

    > {
    > print "Customer: $customer\n\n" ;
    > foreach $type ( keys(%{$list{ $customer }}) )
    > {
    > printf "\n Type: %s", $type ;


    Don't use printf when interpolation will do.

    > foreach $productCode ( keys(%{$list{ $customer }{ $type }}) )
    > {
    > printf "\n Product: %s\n", $productCode ;
    > foreach $appNo (keys(%{$list{ $customer }{ $type }{ $productCode }}))
    > {
    > foreach $vendor (keys(%{$list{$customer}{$type}
    > {$productCode}{$appNo}}))
    > {
    > if( $list{ $customer }{ $type }{ $productCode }
    > { $appNo }{ $vendor }{385})
    > {
    > $no385++ ;


    This should be a hash. Variables with systematically similar names
    nearly always should be.

    $no{385}++;

    I would use a hash rather than an array even though the keys are
    numeric because they are sparse.

    > }
    > if( $list{ $customer }{ $type }{ $productCode }
    > { $appNo }{ $vendor }{150})
    > {
    > $no150++ ;
    > }
    > }
    > }
    > }
    > }
    > }


    This is a mess :). I would recast it with a dispatch table (untested):

    my %no;

    sub do_keys {
        my $value  = shift;    # current node: a hashref, or a leaf value
        my $action = shift;    # callback for this level (may be undef)

        if ('HASH' eq ref $value) {
            for (keys %$value) {
                $action and $action->($_);
                # the remaining callbacks in @_ apply one level down
                do_keys($value->{$_}, @_);
            }
        }
        else {
            $action and $action->($value);
        }
    }

    do_keys \%list,
        sub { print "Customer: $_[0]" },
        sub { print " Type: $_[0]" },
        sub { print " Product: $_[0]" },
        undef,
        undef,
        sub { $no{$_[0]}++ };
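
For anyone who wants to try the idea, here is a self-contained sketch of the same walk run over three invented records. It is a variation rather than the exact code above: the keys are sorted so the output is repeatable, and the headings are collected in a hypothetical `@seen` array instead of being printed directly.

```perl
use strict;
use warnings;

# Made-up sample data: customer, type, product, appNo, vendor, returnCode.
my %list;
for my $rec (
    [qw(acme sale P1 A1 v1 385)],
    [qw(acme sale P1 A2 v2 150)],
    [qw(acme sale P1 A3 v1 385)],
) {
    my ($cust, $type, $prod, $app, $vend, $code) = @$rec;
    $list{$cust}{$type}{$prod}{$app}{$vend}{$code} = 1;
}

# Walk a nested hash, calling one callback per level with that level's key.
sub do_keys {
    my ($value, $action, @rest) = @_;
    if (ref $value eq 'HASH') {
        for my $key (sort keys %$value) {
            $action->($key) if $action;
            do_keys($value->{$key}, @rest);
        }
    }
    elsif ($action) {
        $action->($value);
    }
}

my (%no, @seen);
do_keys(\%list,
    sub { push @seen, "Customer: $_[0]" },
    sub { push @seen, "Type: $_[0]" },
    sub { push @seen, "Product: $_[0]" },
    undef,                   # appNo: nothing to do at this level
    undef,                   # vendor: nothing to do at this level
    sub { $no{$_[0]}++ },    # returnCode: count each occurrence
);

print "$_\n" for @seen;
print "$_: $no{$_}\n" for sort { $a <=> $b } keys %no;
```

With this sample data the counts come out as 150: 1 and 385: 2. The table of callbacks lines up with the nesting levels, so changing what happens at a level means editing one entry rather than one nested loop.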


    > This generates the first report that is sorted by Customer and Type.
    > He wrote a second, almost identical script to re-parse all of the original
    > data into a new hash with the variables in customer-vendor-type order.
    >
    > What I want to know is if there is
    > (*) an easier way to re-arrange the first hash (visualize taking a column
    > in a spreadsheet and moving it over, then resorting with the new
    > column order)?


    The obvious, though maybe not the most efficient, way is to unwrap it
    into a big list of records and wrap it up again:

    # old order: customer type product appno vendor
    # new order: customer vendor type product appno

    my %new_list;

    # customer is still first, so we can leave that

    for my $cust (keys %list) {

        my @records;
        my %me;

        do_keys $list{$cust},
            sub { $me{type} = $_[0] },
            sub { $me{prod} = $_[0] },
            sub { $me{appn} = $_[0] },
            sub { $me{vend} = $_[0] },
            sub { push @records, { %me, code => $_[0] } };

        for (@records) {
            $new_list{ $cust }
                     { $_->{vend} }
                     { $_->{type} }
                     { $_->{prod} }
                     { $_->{appn} } = $_->{code};
        }
    }

    Ben

    --
    Razors pain you / Rivers are damp
    Acids stain you / And drugs cause cramp. [Dorothy Parker]
    Guns aren't lawful / Nooses give
    Gas smells awful / You might as well live.
     
    Ben Morrow, Jun 14, 2004
    #2

  3. Ben Morrow

    Ben Morrow Guest

    Quoth Ben Morrow:
    >
    > This is a mess :). I would recast it with a dispatch table (untested)...:
    >
    > my %no;
    >
    > sub do_keys {
    > my $ivalue = shift;

    ^
    ...but without a typo (yes, I use vim) :).

    > my $action = shift;
    >
    > if ('HASH' eq ref $value) {

    [...]
    >
    > Quoth (hymie!):
    > >
    > > What I want to know is if there is
    > > (*) an easier way to re-arrange the first hash (visualize taking a column
    > > in a spreadsheet and moving it over, then resorting with the new
    > > column order)?

    >
    > The obvious, though maybe not the most efficient, way is to unwrap it
    > into a big list of records and wrap it up again:


    ...except there's no need to actually build the list.

    > # old order: customer type product appno vendor
    > # new order: customer vendor type product appno
    >
    > my %new_list;
    >
    > # customer is still first, so we can leave that
    >
    > for my $cust (keys %list) {
    >
    > my @records;


    We don't need this temp array.

    > my %me;
    >
    > do_keys $list{$cust},
    > sub { $me{type} = $_[0] },
    > sub { $me{prod} = $_[0] },
    > sub { $me{appn} = $_[0] },
    > sub { $me{vend} = $_[0] },
    > sub { push @records, { %me, code => $_[0] } }


    Gah! another typo... clearly not concentrating :(.
    Anyway, this last should have been:

    sub {
    $new_list{ $cust }
    { $me{vend} }
    { $me{type} }
    { $me{prod} }
    { $me{appn} } = $_[0];
    };


    > for (@records) {


    ...and then we don't need to loop over the data a second time.
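
Putting the corrections together, a self-contained sketch of the rearrangement might look like the following. The sample data is invented, and it assumes the return code is stored directly under the vendor level (the flatter layout suggested earlier in the thread); `do_keys` is restated here so the block runs on its own.

```perl
use strict;
use warnings;

# Made-up data in the old order: customer, type, product, appNo, vendor => code.
my %list = (
    acme => { sale   => { P1 => { A1 => { v1 => 385, v2 => 150 } } } },
    zeta => { refund => { P2 => { A9 => { v1 => 130 } } } },
);

# Walk a nested hash, calling one callback per level with that level's key;
# a non-hash leaf is passed to the callback for its level.
sub do_keys {
    my ($value, $action, @rest) = @_;
    if (ref $value eq 'HASH') {
        for my $key (keys %$value) {
            $action->($key) if $action;
            do_keys($value->{$key}, @rest);
        }
    }
    elsif ($action) {
        $action->($value);
    }
}

# New order: customer, vendor, type, product, appNo => code.
my %new_list;
for my $cust (keys %list) {
    my %me;    # remembers the keys on the path to the current leaf
    do_keys($list{$cust},
        sub { $me{type} = $_[0] },
        sub { $me{prod} = $_[0] },
        sub { $me{appn} = $_[0] },
        sub { $me{vend} = $_[0] },
        sub {
            $new_list{$cust}{ $me{vend} }{ $me{type} }
                     { $me{prod} }{ $me{appn} } = $_[0];
        },
    );
}

print "$new_list{acme}{v2}{sale}{P1}{A1}\n";
```

With the sample data this should print 150: the same leaf value, now reached with the vendor as the second key.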

    Ben

    --
    perl -e'print map {/.(.)/s} sort unpack "a2"x26, pack "N"x13,
    qw/1632265075 1651865445 1685354798 1696626283 1752131169 1769237618
    1801808488 1830841936 1886550130 1914728293 1936225377 1969451372
    2047502190/' #
     
    Ben Morrow, Jun 14, 2004
    #3
  4. hymie!

    hymie! Guest

    In our last episode, the evil Dr. Lacto had captured our hero,
    Ben Morrow, who said:
    >
    >Quoth (hymie!):
    >>
    >> Co-worker wrote the script with (I don't know the correct term) a multi-
    >> level hash:

    >
    >That term'll do fine... some people here use HoH, for Hash-of-Hash.


    >Also, a hash key that doesn't exist will return
    >a value of undef, which is zero in numeric context (with a warning you
    >can turn off)...


    Excellent. That was one of my fears. :)

    >...so this whole lot can be replaced with the one line
    >
    >$list{$customer}{$type}{$productCode}{$appNo}{$vendor}{$returnCode} = 1
    > if grep $_ == $returnCode, qw/130 150 385/;
    >
    >If you need the keys of the last hash to be right, or you specifically
    >need the zeros, you could do


    I don't specifically need the zeros as long as (like you said) an
    undefined hash will return 0.

    >Why is the data stored like this at all? Surely it would be better to
    >store the return code straight in the hash, rather than have another
    >level with only one (significant) value?


    Because a single set of customer-type-productcode-appno-vendor may have
    more than one return code, and I need to track all of them that appear.

    But I'll probably switch it to something like
    $list{$customer}...{$vendor} .= "$returnCode:"
    if grep $_ == $returnCode, qw/130 150 385/;
    and then I can m// through the values later.

    >Also, what happens if the return code *isn't* in the list?


    Then I can ignore that entire line of data.

    >> foreach $customer ( keys(%list) )

    >
    >This is not sorted. Did you simply mean 'grouped by', or should it be


    Sorting isn't required.

    >> {
    >> print "Customer: $customer\n\n" ;
    >> foreach $type ( keys(%{$list{ $customer }}) )
    >> {
    >> printf "\n Type: %s", $type ;

    >
    >Don't use printf when interpolation will do.


    Oops. Oversight.

    >This should be a hash. Variables with systematically similar names
    >nearly always should be.
    >
    > $no{385}++;


    It's actually a little more complicated than that, but the hash is still
    a good idea.

    >This is a mess :). I would recast it with a dispatch table (untested):


    I appreciate the table, but this is probably beyond my ability to
    maintain and troubleshoot. But it's a great idea and, if nothing else,
    a learning exercise.

    >> What I want to know is if there is
    >> (*) an easier way to re-arrange the first hash (visualize taking a column
    >> in a spreadsheet and moving it over, then resorting with the new
    >> column order)?

    >
    >The obvious, though maybe not the most efficient, way is to unwrap it
    >into a big list of records and wrap it up again:


    No, but somewhere along the line, you gave me an idea -- in short, when
    I process
    $list{$cust}{$type}{$prod}{$appno}{$vend}
    I can create
    $list2{$cust}{$vend}{$type}{$prod}{$appno}
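
A rough sketch of that idea, filling both hashes in one pass over the input. It assumes whitespace-separated fields in the column order given at the top of the thread; the sample lines and the `%wanted` lookup are invented for illustration.

```perl
use strict;
use warnings;

# Made-up input lines in the file's column order:
# customer vendor transType productCode appNumber resultCode
my @lines = (
    'acme v1 sale P1 A1 385',
    'acme v2 sale P1 A1 150',
    'acme v1 sale P1 A2 999',    # code not of interest, so the line is skipped
);

my %wanted = map { $_ => 1 } qw(130 150 385);
my (%list, %list2);

for my $line (@lines) {
    my ($cust, $vend, $type, $prod, $appno, $code) = split ' ', $line;
    next unless $wanted{$code};

    # Same record, two key orders: one hash per report.
    $list{$cust}{$type}{$prod}{$appno}{$vend}{$code}  = 1;
    $list2{$cust}{$vend}{$type}{$prod}{$appno}{$code} = 1;
}

my @vendors = sort keys %{ $list2{acme} };
print "@vendors\n";
```

This should print "v1 v2". Both structures come from a single read of the data, so no second parsing script is needed.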


    > Razors pain you / Rivers are damp
    > Acids stain you / And drugs cause cramp.
    > Guns aren't lawful / Nooses give
    > Gas smells awful / You might as well live.


    Ooh! I'd been looking for that poem. Thank you.

    hymie! http://www.smart.net/~hymowitz
    ===============================================================================
     
    hymie!, Jun 14, 2004
    #4
  5. Ben Morrow

    Ben Morrow Guest

    Quoth (hymie!):
    > In our last episode, the evil Dr. Lacto had captured our hero,
    > Ben Morrow, who said:
    > >
    > >Why is the data stored like this at all? Surely it would be better to
    > >store the return code straight in the hash, rather than have another
    > >level with only one (significant) value?

    >
    > Because a single set of customer-type-productcode-appno-vendor may have
    > more than one return code, and I need to track all of them that appear.
    >
    > But I'll probably switch it to something like
    > $list{$customer}...{$vendor} .= "$returnCode:"
    > if grep $_ == $returnCode, qw/130 150 385/;
    > and then I can m// through the values later.


    Oooh, no, that's very shell :).
    Use an array:

    push @{ $list{...} }, $returnCode if ...;
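
Spelled out with invented keys, the array version could look like this minimal sketch. The key values and the loop over three codes are made up to stand in for three input lines that share one customer/type/product/appNo/vendor combination:

```perl
use strict;
use warnings;

my %list;
my @validCodes = qw(130 150 385);

# Three made-up records for the same combination of keys.
for my $returnCode (150, 385, 999) {
    # The array is created on first push; 999 is filtered out by the grep.
    push @{ $list{acme}{sale}{P1}{A1}{v1} }, $returnCode
        if grep { $_ == $returnCode } @validCodes;
}

my @codes = @{ $list{acme}{sale}{P1}{A1}{v1} };
print "codes: @codes\n";
```

This should print "codes: 150 385". Each code stays a separate list element, so there is nothing to pattern-match apart later.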

    Ben

    --
    perl -e'print map {/.(.)/s} sort unpack "a2"x26, pack "N"x13,
    qw/1632265075 1651865445 1685354798 1696626283 1752131169 1769237618
    1801808488 1830841936 1886550130 1914728293 1936225377 1969451372
    2047502190/' #
     
    Ben Morrow, Jun 14, 2004
    #5
  6. Peter Scott

    Peter Scott Guest

    In article <cal2ct$q9q$>,
    Ben Morrow writes:
    >
    >Quoth (hymie!):
    >> In our last episode, the evil Dr. Lacto had captured our hero,
    >> Ben Morrow, who said:
    >> >
    >> >Why is the data stored like this at all? Surely it would be better to
    >> >store the return code straight in the hash, rather than have another
    >> >level with only one (significant) value?

    >>
    >> Because a single set of customer-type-productcode-appno-vendor may have
    >> more than one return code, and I need to track all of them that appear.
    >>
    >> But I'll probably switch it to something like
    >> $list{$customer}...{$vendor} .= "$returnCode:"
    >> if grep $_ == $returnCode, qw/130 150 385/;
    >> and then I can m// through the values later.

    >
    >Oooh, no, that's very shell :).
    >Use an array:
    >
    >push @{ $list{...} }, $returnCode if ...;


    The condition is somewhat shell too :)

    ... if $returnCode =~ /^(130|150|385)\Z/;
    ... if {map {$_=>1} (130,150,385)}->{$returnCode};

    Although it would be more maintainable to put something like

    my %Ok_Return_Code = map { $_ => 1 } (130, 150, 385);

    in a configuration section and then just do

    ... if $Ok_Return_Code{$returnCode};
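
As a quick illustration of the lookup-hash version, here is a sketch that filters an invented list of codes. The `@kept` array and the test values are only for demonstration:

```perl
use strict;
use warnings;

# Configuration section: the return codes we care about.
my %Ok_Return_Code = map { $_ => 1 } (130, 150, 385);

# Made-up codes, including near misses such as 131 and 1500.
my @kept = grep { $Ok_Return_Code{$_} } (130, 131, 150, 1500, 385);

print "@kept\n";
```

This should print "130 150 385". Adding a new code of interest then means changing one line in the configuration section.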

    --
    Peter Scott
    http://www.perldebugged.com/
    *** NEW *** http://www.perlmedic.com/
     
    Peter Scott, Jun 15, 2004
    #6
