Hello all!
I am still a beginner, so please be patient with me.
I have a big file with numbers and dates, like this:
01.01.98
31
33
14
7
35
16
20
20
13
55
1
1
7
etc etc
I need a complicated hash that records the occurrences of each number within
a window of 15 numbers:
We skip the dates and count the remaining lines. The structure of my %hash
looks like this:
($number{$line, $line, ...}) => $how_many_times
In my example the 20 occurs on lines 7 and 8, so two times:
20{7,8} => 2
Then we iterate over the file, keep only 15 numbers in the hash, and count
the occurrences of each number again at every step.
Could somebody help me with this?
Thank you in advance
marek
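For reference, the structure described in the question can be read as a hash of arrays: each number maps to the list of line positions (dates not counted) where it occurs, and the count is just the length of that list. Below is a minimal sketch of that reading, using the sample from the question and ignoring the window of 15 for now. The sub name index_lines is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: map each number to the positions where it occurs.
# Lines that are not purely digits (the dates) are skipped and not counted.
sub index_lines {
    my @lines = @_;
    my %lines_for;
    my $pos = 0;
    for my $line (@lines) {
        next unless $line =~ /^\s*(\d+)\s*$/;   # skips "01.01.98"
        push @{ $lines_for{$1} }, ++$pos;
    }
    return \%lines_for;
}

my @sample = qw(01.01.98 31 33 14 7 35 16 20 20 13 55 1 1 7);
my $lines_for = index_lines(@sample);

# Print in the notation used in the question: number{lines} => count
for my $n (sort { $a <=> $b } keys %$lines_for) {
    my @at = @{ $lines_for->{$n} };
    printf "%d{%s} => %d\n", $n, join(',', @at), scalar @at;
}
```

Among other lines this prints 20{7,8} => 2, matching the example in the question.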
A rolling frame that tracks the lines of occurrences is not as easy as you think.
The concept is simple; the implementation is another thing altogether.
This would not be a problem to present in a beginner Perl class.
It's not actually Perl that would be the problem, it's the implementation of a rolling
frame and the tracking of line numbers for a given criterion.
The code below is just a rudimentary framework to demonstrate the constructs that
would be necessary. You might need a hardened programmer with large-application
experience to deal with rolling frames and data tracking.
Could this rough code be thinned out? Sure. It just demonstrates the concept; it's
not production quality.
Btw, the frame size is set to 5 for the example; change it to 15 or whatever it is
you're doing.
Well, good luck and have fun!
-sln
------------------------
# Frames.pl
# -------------------------
# Template:
# We assume a valid frame of 5 numbers (not based on line count). This could be 15 or any number.
# @Frame_Cache = (number, number, number, ...); ## 5 elements
# %Items = (number => [line,line,line], number => [line,line,line],...);
use strict;
use warnings;
my @Frame_Cache = ();
my %Items = ();
my ($cache_size, $lncount, $framesize) = (0, 0, 5);
while (<DATA>)
{
    ++$lncount;
    # Digits only, anything else is invalid.
    # Note: a line containing just 0 is skipped too, because !$1 is true for "0".
    /^\s*(\d+)\s*$/;
    next if (!$1);
    # Add item to frame cache
    push @Frame_Cache, $1;
    # Add line number onto item array stack (in hash)
    push @{$Items{$1}}, $lncount;
    print "\nAdding $1 (line $lncount)\n";
    # Continue until full frame
    ++$cache_size;
    next if ($cache_size < $framesize);
    # First full frame, the roll starts on next one
    # Show Frame, do something with %Items
    if ($cache_size == $framesize)
    {
        PrintItems();
        next;
    }
    # Frame is moving, take head off cache
    my $item_number = shift @Frame_Cache;
    # Adjust lines going out of frame (all arrays in hash).
    # Delete the item number line array if it is empty.
    print "Taking $item_number off (line ".${$Items{$item_number}}[0].")\n";
    my $line_going_out_of_frame = ${$Items{$item_number}}[0];
    for my $nbr (keys %Items)
    {
        shift @{$Items{$nbr}} if (${$Items{$nbr}}[0] <= $line_going_out_of_frame);
        delete $Items{$nbr} if (!@{$Items{$nbr}});
    }
    # Show Frame, do something with %Items
    PrintItems();
}
# You could print items down here if there is no full frame
# ...
# end of program ...
# This prints the items hash (could use Data::Dumper), but more importantly
# gives a template to access the data.
# When you're through with debug printing, just comment the print part out.
# Process the data here, refactor this sub when done.
# No sub should access global data imho.
# -----------------
sub PrintItems
{
    print "Frame ".($cache_size-$framesize+1)." - $cache_size\n";
    for my $nbr (sort {$a<=>$b} keys %Items) {
        print "number = $nbr, on lines = [ @{$Items{$nbr}} ]\n";
    }
}
__DATA__
01.01.98
99
31
33
14
7
35
16
20
20
13
55
1
1
7
0
2
3
0
2
3
0
2
3
0
2
3
0
2
3
0
---------------
Output:
c:\temp>perl frames.pl
Adding 99 (line 3)
Adding 31 (line 4)
Adding 33 (line 5)
Adding 14 (line 6)
Adding 7 (line 7)
Frame 1 - 5
number = 7, on lines = [ 7 ]
number = 14, on lines = [ 6 ]
number = 31, on lines = [ 4 ]
number = 33, on lines = [ 5 ]
number = 99, on lines = [ 3 ]
Adding 35 (line 8)
Taking 99 off (line 3)
Frame 2 - 6
number = 7, on lines = [ 7 ]
number = 14, on lines = [ 6 ]
number = 31, on lines = [ 4 ]
number = 33, on lines = [ 5 ]
number = 35, on lines = [ 8 ]
Adding 16 (line 9)
Taking 31 off (line 4)
Frame 3 - 7
number = 7, on lines = [ 7 ]
number = 14, on lines = [ 6 ]
number = 16, on lines = [ 9 ]
number = 33, on lines = [ 5 ]
number = 35, on lines = [ 8 ]
Adding 20 (line 10)
Taking 33 off (line 5)
Frame 4 - 8
number = 7, on lines = [ 7 ]
number = 14, on lines = [ 6 ]
number = 16, on lines = [ 9 ]
number = 20, on lines = [ 10 ]
number = 35, on lines = [ 8 ]
Adding 20 (line 11)
Taking 14 off (line 6)
Frame 5 - 9
number = 7, on lines = [ 7 ]
number = 16, on lines = [ 9 ]
number = 20, on lines = [ 10 11 ]
number = 35, on lines = [ 8 ]
Adding 13 (line 12)
Taking 7 off (line 7)
Frame 6 - 10
number = 13, on lines = [ 12 ]
number = 16, on lines = [ 9 ]
number = 20, on lines = [ 10 11 ]
number = 35, on lines = [ 8 ]
Adding 55 (line 13)
Taking 35 off (line 8)
Frame 7 - 11
number = 13, on lines = [ 12 ]
number = 16, on lines = [ 9 ]
number = 20, on lines = [ 10 11 ]
number = 55, on lines = [ 13 ]
Adding 1 (line 14)
Taking 16 off (line 9)
Frame 8 - 12
number = 1, on lines = [ 14 ]
number = 13, on lines = [ 12 ]
number = 20, on lines = [ 10 11 ]
number = 55, on lines = [ 13 ]
Adding 1 (line 15)
Taking 20 off (line 10)
Frame 9 - 13
number = 1, on lines = [ 14 15 ]
number = 13, on lines = [ 12 ]
number = 20, on lines = [ 11 ]
number = 55, on lines = [ 13 ]
Adding 7 (line 16)
Taking 20 off (line 11)
Frame 10 - 14
number = 1, on lines = [ 14 15 ]
number = 7, on lines = [ 16 ]
number = 13, on lines = [ 12 ]
number = 55, on lines = [ 13 ]
Adding 2 (line 18)
Taking 13 off (line 12)
Frame 11 - 15
number = 1, on lines = [ 14 15 ]
number = 2, on lines = [ 18 ]
number = 7, on lines = [ 16 ]
number = 55, on lines = [ 13 ]
Adding 3 (line 19)
Taking 55 off (line 13)
Frame 12 - 16
number = 1, on lines = [ 14 15 ]
number = 2, on lines = [ 18 ]
number = 3, on lines = [ 19 ]
number = 7, on lines = [ 16 ]
Adding 2 (line 21)
Taking 1 off (line 14)
Frame 13 - 17
number = 1, on lines = [ 15 ]
number = 2, on lines = [ 18 21 ]
number = 3, on lines = [ 19 ]
number = 7, on lines = [ 16 ]
Adding 3 (line 22)
Taking 1 off (line 15)
Frame 14 - 18
number = 2, on lines = [ 18 21 ]
number = 3, on lines = [ 19 22 ]
number = 7, on lines = [ 16 ]
Adding 2 (line 24)
Taking 7 off (line 16)
Frame 15 - 19
number = 2, on lines = [ 18 21 24 ]
number = 3, on lines = [ 19 22 ]
Adding 3 (line 25)
Taking 2 off (line 18)
Frame 16 - 20
number = 2, on lines = [ 21 24 ]
number = 3, on lines = [ 19 22 25 ]
Adding 2 (line 27)
Taking 3 off (line 19)
Frame 17 - 21
number = 2, on lines = [ 21 24 27 ]
number = 3, on lines = [ 22 25 ]
Adding 3 (line 28)
Taking 2 off (line 21)
Frame 18 - 22
number = 2, on lines = [ 24 27 ]
number = 3, on lines = [ 22 25 28 ]
Adding 2 (line 30)
Taking 3 off (line 22)
Frame 19 - 23
number = 2, on lines = [ 24 27 30 ]
number = 3, on lines = [ 25 28 ]
Adding 3 (line 31)
Taking 2 off (line 24)
Frame 20 - 24
number = 2, on lines = [ 27 30 ]
number = 3, on lines = [ 25 28 31 ]
c:\temp>
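
As a follow-up to the listing above, here is a more compact sketch of the same rolling frame, written so that no sub touches global data and so that zeros are kept. The name rolling_frames and its callback interface are assumptions made for illustration; they are not part of the original script. Line numbers count every input line, dates included, as in the listing above.

```perl
use strict;
use warnings;

# Hypothetical helper: slide a frame of $size numbers over @$lines and call
# $on_frame->(\%lines_for) once for every full frame. %lines_for maps each
# number currently in the frame to the input line numbers where it occurs.
sub rolling_frames {
    my ($lines, $size, $on_frame) = @_;
    my (@frame, %lines_for);      # @frame holds [number, line] pairs
    my $lineno = 0;
    for my $line (@$lines) {
        ++$lineno;
        next unless $line =~ /^\s*(\d+)\s*$/;   # dates are skipped, "0" is kept
        my $n = $1;
        push @frame, [$n, $lineno];
        push @{ $lines_for{$n} }, $lineno;
        if (@frame > $size) {
            # The oldest entry leaves the frame; its line is always the
            # first one recorded for its number.
            my ($old) = @{ shift @frame };
            shift @{ $lines_for{$old} };
            delete $lines_for{$old} unless @{ $lines_for{$old} };
        }
        $on_frame->(\%lines_for) if @frame == $size;
    }
    return;
}

# Usage: print "number xcount" for every full frame of 5 numbers.
my @input = qw(01.01.98 99 31 33 14 7 35 16 20 20 13);
rolling_frames(\@input, 5, sub {
    my ($lines_for) = @_;
    my @parts;
    for my $n (sort { $a <=> $b } keys %$lines_for) {
        push @parts, sprintf '%d x%d', $n, scalar @{ $lines_for->{$n} };
    }
    print join(', ', @parts), "\n";
});
```

Because entries leave the frame in the order they arrived, only the departing number's list needs a shift, so this sketch does not loop over all keys. To number the lines the way the question asks (dates not counted), move the ++$lineno below the pattern match.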