problem with hash & sort array

uNConVeNtiOnAL

Hi -

I am trying to read from a file and put the lines into a hash. Then I
put the hash into an array with the sort command. This sort will put
the array into such order that I can see if duplicate lines occur and
add their numeric total field together. I will add the combined data
as new hash entries and then remove the original lines that were
duplicates.

I don't seem to be putting any values in the hash - before you go off
on me, this code is very similar to code that is working. The twist is
I have to identify the duplicate data, create a new entry for it
(rename one of the elements so it is distinguishable from the
duplicates), and remove all duplicate lines.

Thanks a bunch for any help -

-T-

open (my_file, "$ARGV[0]") || die "ERROR: missing file";

#load up vmi hash to sort and combine duplicate records
while (<my_file>)
{
chomp;
$a_a = substr ($_, 0, 4);
$b_b = substr ($_, 12, 11);
c_c = substr ($_, 24, 2);
d_d = substr ($_, 54, 4);
e_e = substr ($_, 59, 2);
f_f = substr ($_, 62, 2);
g_g = substr ($_, 46, 7);
$forcombo{"$b_b$d_d$e_e$f_f"} =
$forcombo{"$b_b$d_d$e_e$f_f"}."^"."$a_a$b_b$c_c$g_g$d_d$e_e$f_f";
}

close my_file;

# how to delete from a hash ==> delete($HASH{$KEY});

@keys = split(/\^/,$forcombo{"$b_bd_de_ef_f"});

foreach $key (sort(@keys))
{
#printf nodupes_file "$key\n";
$lv_b_b = substr($key, 4 ,11);
$lv_a_a = substr($key, 0, 4);
$lv_c_c = substr($key, 15, 2);
$lv_d_d = substr($key, 24,4);
$lv_e_e = substr($key, 28,2);
$lv_f_f = substr($key, 30,2);
$lv_g_g = substr($key, 17, 7);
$lv_g_g=~s/ //g;

- - - more stuff

}
 
Ragnar Hafstað

[snipped problem and something that vaguely looked like parts of code, but
was not real]

a few bits of advice:
a) when you are having problems with a program, you should post real
code here, not something that may or may not have some resemblance to
your real code. this is because many of us here will spend time
reading through it, trying to figure it out, find errors and so on.
this time is wasted if this is not actually the problem code. in this
case, your code has obvious syntax errors, so it is obviously not real
code:
$b_b = substr ($_, 12, 11);
c_c = substr ($_, 24, 2);

b) try to simplify your program before posting it. we are not
interested in the details of your application, like exactly what parts
of an input string get assigned to what variable, unless that has a
direct relevance to your problem. a side effect of this is that when
you simplify your program like this, often the solution to your
problem will become clear to you before you even have to ask here (and
maybe make a fool out of yourself)

c) unless relevant to your problem, do not post code like:
while (<MYFILE>) { complicatedparsingofinputthatdoesnotwork($_)}
we should not have to guess what the input looks like, and deduce from
that what is happening. include your input in your code in DATA areas
or in plain variable assignments, so we can SEE it.

d) post a complete program. many of us just like to copy your posted
program, save it, and run it, and maybe fiddle with it and run it
again. this of course implies a) b) and c)

e) don't say it does not work. say what you expect to happen and show
us what does happen
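
to make c), d) and e) concrete, here is a rough sketch of what a postable test case might look like. the input lines, the variable names and the "sum the counts per name" task are all made up for illustration; the point is only that the input travels with the program and the expected result is stated.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample input, inlined so anyone can run the program as posted.
my $input = <<'END';
apple 3
pear 5
apple 4
END

my %total;
open my $fh, '<', \$input or die "Cannot open in-memory input: $!";
while ( my $line = <$fh> ) {
    chomp $line;
    my ( $name, $count ) = split ' ', $line;
    $total{$name} += $count;    # duplicates accumulate under one key
}
close $fh;

# Expected: apple 7, pear 5
print "$_ $total{$_}\n" for sort keys %total;
```

a __DATA__ section read through the DATA filehandle would do the same job as the in-memory filehandle used here.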


feel free to try again

gnari
 
Jay Tilton

[Please be aware of how your news client handles word-wrapping of long
lines. The quoted text below has been reformatted.]


: I am trying to read from a file and put the lines into a
: hash. Then I put the hash into an array with the sort
: command. This sort will put the array into such order that
: I can see if duplicate lines occur and add their numeric
: total field together. I will add the combined data as new
: hash entries and then remove the original lines that were
: duplicates.
:
: I don't seem to be putting any values in the hash

What sequence of debugging steps leads you to that conclusion?

: before
: you go off on me, this code is very similar to code that is
: working. The twist is I have to identify the duplicate
: data, create a new entry for it (rename one of the elements
: so it is distinguishable from the duplicates), and remove
: all duplicate lines.

Is that relevant? The portions of the program that do that much seem to
have been eliminated in your article.

: open (my_file, "$ARGV[0]") || die "ERROR: missing file";
:
: #load up vmi hash to sort and combine duplicate records
: while (<my_file>)
: {
: chomp;
: $a_a = substr ($_, 0, 4);
: $b_b = substr ($_, 12, 11);
: c_c = substr ($_, 24, 2);
: d_d = substr ($_, 54, 4);
: e_e = substr ($_, 59, 2);
: f_f = substr ($_, 62, 2);
: g_g = substr ($_, 46, 7);

I guess there are supposed to be a few more '$' sigils on the LHS of
those assignments.

Consider Perl's unpack() function as an alternative to substr() for
plucking fixed-width fields from a record. That might go like:

    my @fields =
        unpack 'A4 x8 A11 x1 A2 x20 A7 x1 A4 x1 A2 x1 A2', $_;
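
As a rough check of that template, the sketch below builds a made-up 64-character record with pack() and compares unpack() against the substr() offsets from the posted code. The record contents are invented for illustration; only the offsets and the template come from the thread.

```perl
use strict;
use warnings;

my $template = 'A4 x8 A11 x1 A2 x20 A7 x1 A4 x1 A2 x1 A2';

# Hypothetical record, built with pack so every field sits at a known offset.
# Empty strings become space-filled gaps between the fields.
my $record = pack 'A4 A8 A11 A1 A2 A20 A7 A1 A4 A1 A2 A1 A2',
    'AAAA', '', 'BBBBBBBBBBB', '', 'CC', '', '   1234', '',
    'DDDD', '', 'EE', '', 'FF';

# Order of results: a_a, b_b, c_c, g_g, d_d, e_e, f_f
my @fields = unpack $template, $record;

# The same fields, pulled with the offsets used in the original post.
my @by_substr = (
    substr( $record, 0,  4 ), substr( $record, 12, 11 ),
    substr( $record, 24, 2 ), substr( $record, 46, 7 ),
    substr( $record, 54, 4 ), substr( $record, 59, 2 ),
    substr( $record, 62, 2 ),
);

print join( '|', @fields ),    "\n";
print join( '|', @by_substr ), "\n";
```

One difference to keep in mind: the 'A' format strips trailing spaces from each field, which substr() does not, so the two only agree exactly when fields fill their full width as they do here.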

: $forcombo{"$b_b$d_d$e_e$f_f"} =
: $forcombo{"$b_b$d_d$e_e$f_f"}."^"."$a_a$b_b$c_c$g_g$d_d$e_e$f_f";
: }

You're using string concatenation in the hash value to mimic an array of
arrays. To get the original fields back later, the program has to burst
the string into records, then pluck the fields out of each record again.
This scheme is terribly fragile, not to mention repetitious.

Using a real array reference for the hash value, then pushing a
reference to the array containing the fields is a much saner approach.

    push @{ $forcombo{ join $;, @fields[1, 4, 5, 6] } }, \@fields;
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The underscored portion builds the key by hand in the same way as
Perl's largely ignored "multidimensional hash emulation" feature (see
the entry for ``$;'' in perlvar), which applies when the subscript is
written as a literal comma list. It is just like saying:

    $forcombo{join($; , $fields[1], $fields[4], $fields[5], $fields[6])}

That feature exists as a mechanism to mimic complex data structures. It
seems appropriate in this case, since you are concerned more with
collecting similar records together than with having an obsessively
organized data structure.

So in one place I recommend against using string concatenation to mimic
a real data structure, and in the next place I make the exact opposite
recommendation. I'm rather enjoying the apparent paradox.
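
A minimal sketch of that structure, using made-up records. It spells the key out with an explicit join on $; rather than relying on how Perl interprets a slice inside the subscript; the sample data and the name @records are assumptions for illustration.

```perl
use strict;
use warnings;

# Hypothetical records, already split into fields in the order
# (a_a, b_b, c_c, g_g, d_d, e_e, f_f) used in the thread.
my @records = (
    [ 'A001', 'CUST0000001', 'NY', '100', '2004', '01', '15' ],
    [ 'A002', 'CUST0000001', 'NJ', '250', '2004', '01', '15' ],
    [ 'A003', 'CUST0000002', 'NY', '75',  '2004', '02', '01' ],
);

my %forcombo;
for my $fields (@records) {
    # Explicit join on $; gives the same key that a literal comma list
    # in the subscript, $forcombo{$w, $x, $y, $z}, would produce.
    my $key = join( $;, @{$fields}[ 1, 4, 5, 6 ] );
    push @{ $forcombo{$key} }, $fields;
}

for my $key ( sort keys %forcombo ) {
    my $group = $forcombo{$key};

    # Recover the key fields from the first record in the group
    # instead of splitting the joined key string again.
    my @parts = @{ $group->[0] }[ 1, 4, 5, 6 ];
    printf "%s: %d record(s)\n", join( '/', @parts ), scalar @$group;
}
```

Records that share b_b, d_d, e_e and f_f land in the same array, and every field is still available without any further string surgery.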

: close my_file;
:
: @keys = split(/\^/,$forcombo{"$b_bd_de_ef_f"});
                                ^^^^^^^^^^^^^
I guess there are some more missing '$' sigils in there.

This part of the process should be about iterating over the hash values,
and is most probably where the program is going off its rails. Those
scalars were used in creating the %forcombo hash from the data file
contents, but that step is over, and the scalars' values are stale.
"use strict;" and proper variable scoping prevents this kind of mistake.

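Here is a small sketch of how strict catches that. The snippet inside the here-document is a made-up stand-in for the shape of the posted program, with the key declared as a lexical inside the loading loop; it is compiled with a string eval only so the error can be shown without killing the demonstration.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the original program's shape: the key is a
# lexical that only exists inside the loading loop.
my $code = <<'END';
use strict;
my %forcombo;
for my $line ( 'x1', 'x2' ) {
    my $key = substr $line, 0, 1;
    $forcombo{$key} .= "^$line";
}
my @keys = split /\^/, $forcombo{$key};   # $key is out of scope here
1;
END

# Compilation fails, so eval returns undef and the reason is left in $@.
my $ok = eval $code;
print $ok ? "compiled\n" : "rejected: $@";
```

Without strict, the same line would quietly look up a hash element under an empty or stale key instead of being refused at compile time.
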
: foreach $key (sort(@keys))
: {
: #printf nodupes_file "$key\n";
: $lv_b_b = substr($key, 4 ,11);
: $lv_a_a = substr($key, 0, 4);
: $lv_c_c = substr($key, 15, 2);
: $lv_d_d = substr($key, 24,4);
: $lv_e_e = substr($key, 28,2);
: $lv_f_f = substr($key, 30,2);
: $lv_g_g = substr($key, 17, 7);
: $lv_g_g=~s/ //g;
: - - - more stuff
: }

Scrap that. Iterate over the sorted keys, then iterate over the array
referenced in the value for each key. If, as recommended earlier, the
program has stored each record's fields as an array reference, they can
be immediately recovered by dereferencing the array instead of doing all
that substr() jazz.

    foreach my $key ( sort keys %forcombo ) {
        # Insert whatever initialization is needed to process each
        # set of similar records.
        foreach my $record ( @{ $forcombo{$key} } ) {
            my(
                $lv_a_a, $lv_b_b, $lv_c_c, $lv_g_g,
                $lv_d_d, $lv_e_e, $lv_f_f,
            ) = @$record;
            # - - - more stuff
        }
        # Insert whatever steps are performed after a set of
        # similar records has been processed.
    }
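
For the stated goal of adding up the totals of duplicate records, the two comment placeholders might be filled in roughly like this. It is only a sketch: the sample %forcombo is invented, its keys are shortened to 'K1' and 'K2', and treating the g_g field as the numeric total is an assumption based on the space-stripping in the posted code.

```perl
use strict;
use warnings;

# Hypothetical %forcombo: each value is an array of records, each record
# an array of (a_a, b_b, c_c, g_g, d_d, e_e, f_f).
my %forcombo = (
    'K1' => [
        [ 'A001', 'CUST0000001', 'NY', '    100', '2004', '01', '15' ],
        [ 'A002', 'CUST0000001', 'NJ', '    250', '2004', '01', '15' ],
    ],
    'K2' => [
        [ 'A003', 'CUST0000002', 'NY', '     75', '2004', '02', '01' ],
    ],
);

my %combined;
foreach my $key ( sort keys %forcombo ) {
    my $total = 0;    # initialization for each set of similar records
    foreach my $record ( @{ $forcombo{$key} } ) {
        # Assumption: field 3 (g_g) is the numeric total, space-padded.
        ( my $amount = $record->[3] ) =~ s/ //g;
        $total += $amount;
    }
    # After the set is processed, keep one combined figure per key.
    $combined{$key} = $total;
    printf "%s: %d\n", $key, $total;
}
```

Because the combined figures go into a separate hash, nothing has to be deleted from %forcombo to get rid of the duplicates.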
 
uNConVeNtiOnAL

Well, well, where to start.

This code does work. It's real code with the variable name changed for
brevity. If I include the foreach in the while loop it works,
if I don't it won't. I thought someone might be able to spot why.

I will figure it out on my own. Posting to this group is always a waste of
time.

You poor unhappy people. I picture you very overweight, pimply, living at
home with your mothers, locked away in your rooms writing smart-ass retorts
to the world - kind of like comic book guy on the Simpsons, only way meaner.

If there *happens to be* someone in this group not in this category,
please don't waste any time with my problem - I won't check back.

Have a happy 2004 - but I doubt it ;)
 
John J. Trammell

: You poor unhappy people. I picture you very overweight, pimply,
: living at home with your mothers, locked away in your rooms writing
: smart-ass retorts to the world - kind of like comic book guy on the
: Simpsons, only way meaner.

Worst. Insult. *Ever*. :)

If you haven't completely given up, you might want to try the
Minneapolis Perl Mongers group: http://minneapolis.pm.org/
 
Jay Tilton

: This code does work. It's real code with the variable name changed for
: brevity.

Were the essential characters eliminated for brevity too?

: If I include the foreach in the while loop it works,

Baloney.

: if I don't it won't. I thought someone might be able to spot why.

Well then, why didn't you say exactly that? The PSI::ESP module is
still in development, you know.

: I will figure it out on my own.

Bet you won't. You have no idea what you're doing.

: Posting to this group is always a waste of time.

A waste of everybody else's time, yes.

: You poor unhappy people. I picture you very overweight, pimply, living at
: home with your mothers, locked away in your rooms writing smart-ass retorts
: to the world - kind of like comic book guy on the Simpsons, only way meaner.

Pfah. Professor Frink is a closer portrait of the clpm audience.
Or Disco Stu.

: I won't check back.
: Have a happy 2004

Interesting juxtaposition.
 
