Hello all,
This is about grepping, regexps and parsing data.
I do have a solution, but I was wondering if anyone could direct me to
a more efficient one.
I have a log file in the following format, which contains information on a
series of files after a process.
===============================
File1: Info. on File1
File2: Info. on File2
File1: Info. on File1
File3: Info. on File3
File1: Info. on File1
and so on...
===============================
I want to display the output as...
============================
n1 lines of info on File1
n2 lines of info on File2
n3 lines of info on File3
============================
This is what I came up with, but when the input log file is gigantic,
the parsing takes a long time. Could anyone recommend a better
solution, please?
#!/usr/local/bin/perl
use strict;
use warnings;
#====================
# Foo.txt is the log
#--------------------
open(FDL, "Foo.txt") or die "Cannot open Foo.txt: $!";
chomp(my @arr = <FDL>);
close(FDL);
#===============================
# Get all the files in the log
#-------------------------------
my @files;
foreach my $line (@arr) {
    push(@files, (split(/:/, $line))[0]);
}
#==========================================
# Sort the files, find the unique files.
# Foreach such file, grep the original log
# for all occurrences and count.
#------------------------------------------
foreach my $file (uniq(sort @files)) {
    # grep in scalar context returns the number of matching lines;
    # \Q...\E protects any regex metacharacters in the filename
    my $info = grep { /^\Q$file\E:/ } @arr;
    print "$info lines of info on $file\n";
}
#=============================
# subroutine to do Unixy-uniq
#-----------------------------
sub uniq {
    my @uniq = @_;
    #=======================================================
    # Compare each element with its predecessor.  If they
    # are equal, it is a duplicate, so splice it out.
    #-------------------------------------------------------
    for (my $i = 1; $i < @uniq; $i++) {
        if ($uniq[$i] eq $uniq[$i-1]) {   # $uniq[$i], not the slice @uniq[$i]
            splice(@uniq, $i, 1);
            $i--;   # re-check this index after the splice shifts elements down
        }
    }
    return @uniq;
}
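For comparison: the per-file grep re-scans the whole log once per unique
file, so the script is O(lines × unique files). A single pass that tallies
counts in a hash touches each line exactly once. Here is a minimal sketch
of that idea (it reads sample records from the DATA handle purely for
illustration; in practice you would read from the log filehandle instead):

```perl
use strict;
use warnings;

# One pass over the log: hash lookup is O(1), so total work is O(lines).
my %count;
while (my $line = <DATA>) {                 # swap <DATA> for your log filehandle
    my ($file) = split /:/, $line, 2;       # field before the first colon
    $count{$file}++ if defined $file;
}

# Report the counts, sorted by file name.
print "$count{$_} lines of info on $_\n" for sort keys %count;

__DATA__
File1: Info. on File1
File2: Info. on File2
File1: Info. on File1
File3: Info. on File3
File1: Info. on File1
```

On the sample data this prints 3 lines of info on File1, and 1 each for
File2 and File3, without ever holding the log or a sorted file list in
memory.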
Thanks,
Prab