efficient way to write multiple loops code

Discussion in 'Perl Misc' started by friend.05@gmail.com, Oct 7, 2008.

  1. Guest

    Hi,

    I am trying to analyze some data. I have big data files.

    I have 3 different files in the following format ($file_1, $file_2,
    $file_3):

    ID | Time | IP | Code

    Following is the pseudocode I am writing. I would like to know a more
    efficient way to do the same thing.

    open(INFO_1,$file_1);
    open(INFO_2,$file_2);
    open(INFO_3,$file_3);

    @file1_lines = <INFO_1>;
    @file2_lines = <INFO_2>;
    @file3_lines = <INFO_3>;

    foreach $file1_line (@file1_lines)
    {
        @file1 = split('\|',$file1_line);

        #some code

        foreach $file2_line (@file2_lines)
        {
            @file2 = split('\|',$file2_line);

            #some code

            #if condition between File1 data and File2 data
            {

                #some code

                foreach $file3_line (@file3_lines)
                {
                    @file3 = split('\|',$file3_line);

                    #some code

                    #if condition

                }

            }

        }

    }



    So I am going through each record of file 1, and depending on whether the
    data is present in file 2, and again depending on some if condition, I look
    for that data in file 3.

    So each record of file 1 has to be compared with each record of file 2, and
    each record of file 2 has to be compared with each record of file 3.

    So this code is taking a lot of time. I would like some suggestions for
    more efficient code.

    Can I use a hash (by reading a file into a hash)?



    Thanks
    , Oct 7, 2008
    #1

  2. Guest

    "" <> wrote:

    > foreach $file1_line (@file1_lines)
    > {
    > @file1 = split('\|',$file1_line);
    > foreach $file2_line (@file2_lines)
    > {
    > @file2 = split('\|',$file2_line);
    > #if condition between File1 data and File2 data


    ....
    >
    > So this code is taking lot of time. I want some suggestion for
    > efficient code.
    >
    > Can I use Hash Array (by reading file in hash array)


    Whether you can use a hash to speed this up depends on whether
    "If condition between File1 data and File2 data" can be reduced
    to (or protected by) fast hash look ups. We can't answer this for you
    without knowing what the nature of that condition is.

    Xho

    --
    -------------------- http://NewsReader.Com/ --------------------
    The costs of publication of this article were defrayed in part by the
    payment of page charges. This article must therefore be hereby marked
    advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate
    this fact.
    , Oct 7, 2008
    #2

  3. Tim Greer Guest

    wrote:

    >
    > I have 3 different files in following format. ($file_1, $file_2,
    > $file_3)
    >
    > ID | Time | IP | Code
    >
    > Following is psuedo code which I am writing. I want to know another
    > efficient way to do same thing.
    >
    > open(INFO_1,$file_1);
    > open(INFO_2,$file_2);
    > open(INFO_3,$file_3);
    >
    > @file1_lines = <INFO_1>;
    > @file2_lines = <INFO_2>;
    > @file3_lines = <INFO_3>;


    >
    >
    > So I am going thorugh each data of file 1 and depending on if data is
    > present in file2 and again depending on some if condition I look for
    > that data in file3.
    >
    >
    > So each data of file1 will have to go through each data of file2 and
    > each data of file2 will have to go thorugh file3.
    >
    > So this code is taking lot of time. I want some suggestion for
    > efficient code.
    >
    > Can I use Hash Array (by reading file in hash array)
    >
    >


    The answer very much depends on what #some code is actually doing. Is
    the data fixed in the files? What specific checks are you doing? Could
    the data be anywhere in a file, inside of a line of data, or are you
    trying to match lines from ^ start to $ end of line per file, or are
    you doing some other type of processing?
    --
    Tim Greer, CEO/Founder/CTO, BurlyHost.com, Inc.
    Shared Hosting, Reseller Hosting, Dedicated & Semi-Dedicated servers
    and Custom Hosting. 24/7 support, 30 day guarantee, secure servers.
    Industry's most experienced staff! -- Web Hosting With Muscle!
    Tim Greer, Oct 7, 2008
    #3
  4. Guest

    On Oct 7, 4:48 pm, Tim Greer <> wrote:
    > The answer very much depends on what #some code is actually doing.  Is
    > the data fixed in the files, what specific checks are you doing?  Could
    > the data be anywhere in a file, inside of a line of data, or are you
    > trying to match lines from ^ start to $ end of line per file, or are
    > you doing some other type of processing?


    I am checking fields within a line, not the whole line.

    I want to check whether the IP and Code from file1 are present in file2,
    and if they are, check again whether they are also present in file3.

    I am doing all this processing to analyze some data.

    Let me know if it is still not clear.

    Thanks.
    , Oct 7, 2008
    #4
  5. Tim Greer Guest

    wrote:

    > I am checking fields within a line, not the whole line.
    >
    > I want to check whether the IP and Code from file1 are present in file2,
    > and if they are, check again whether they are also present in file3.
    >
    > I am doing all this processing to analyze some data.
    >
    > Let me know if it is still not clear.
    >
    > Thanks.


    If there's not a lot of data involved, I'd personally just create a hash
    keyed on the fields you care about, then open the next file and check
    each line against that hash in a while loop (for file 2, and file 3 if
    needed), instead of reading all three files into arrays. If the files are
    potentially large, you'll want to avoid slurping them, because that reads
    a lot of data into memory unnecessarily. I'd open the first file, split
    each line inside a while loop and build the hash, close it, and then open
    file 2 and loop over it, checking whether the hash key exists. If not,
    repeat for file 3. There is probably a better way than that, but it's a
    generally better idea, off the top of my head, than what you're
    attempting now.
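
    A minimal sketch of the approach described above, assuming the
    pipe-delimited layout from the original post (IP in field 2, Code in
    field 3, counting from 0). The helper name key_set and the sample records
    are made up for illustration, and in-memory strings stand in for the real
    files so the example runs on its own:

```perl
use strict;
use warnings;

# Hypothetical helper: read pipe-delimited records from a filehandle and
# return a hash ref whose keys are the "IP|Code" pairs that were seen.
sub key_set {
    my ($fh) = @_;
    my %seen;
    while (my $line = <$fh>) {
        chomp $line;
        my ($ip, $code) = (split /\|/, $line)[2, 3];
        next unless defined $ip && defined $code;    # skip malformed lines
        $seen{"$ip|$code"} = 1;
    }
    return \%seen;
}

# Made-up sample data; real code would open the actual file names instead.
my $file1_text = "1|10:00|10.0.0.1|A\n2|10:01|10.0.0.2|B\n";
my $file2_text = "7|11:00|10.0.0.2|B\n8|11:05|10.0.0.9|Z\n";

open my $fh1, '<', \$file1_text or die "cannot open file 1 data: $!";
my $in_file1 = key_set($fh1);
close $fh1;

# Stream file 2 line by line and keep only the keys also present in file 1.
open my $fh2, '<', \$file2_text or die "cannot open file 2 data: $!";
my @in_both;
while (my $line = <$fh2>) {
    chomp $line;
    my ($ip, $code) = (split /\|/, $line)[2, 3];
    next unless defined $ip && defined $code;
    push @in_both, "$ip|$code" if exists $in_file1->{"$ip|$code"};
}
close $fh2;

print "in file 1 and file 2: $_\n" for @in_both;
```

    The same exists test can be repeated while streaming file 3. Each file is
    read once, so the work grows with the total number of lines rather than
    with their product.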
    --
    Tim Greer, CEO/Founder/CTO, BurlyHost.com, Inc.
    Shared Hosting, Reseller Hosting, Dedicated & Semi-Dedicated servers
    and Custom Hosting. 24/7 support, 30 day guarantee, secure servers.
    Industry's most experienced staff! -- Web Hosting With Muscle!
    Tim Greer, Oct 7, 2008
    #5
  6. Grant Guest

    On Tue, 7 Oct 2008 13:31:50 -0700 (PDT), "" <> wrote:

    >Hi,
    >
    >I am trying to analyze some data. I have big data files.
    >
    >I have 3 different files in following format. ($file_1, $file_2,
    >$file_3)
    >
    >ID | Time | IP | Code
    >
    >Following is psuedo code which I am writing. I want to know another
    >efficient way to do same thing.


    Who knows, without seeing your data and requirements, but I'll offer
    this optimised database table loader as an example that follows all
    the speedup clues from the camel book. Loads a ~100k record data
    table followed by a ~250 record table on a slow 500MHz Celeron box
    in about 3 seconds:
    ....
        do_log("read: $indexfile");
        open FILE, "< $indexfile" or do_die("$indexfile $!");
        flock FILE, 1;
        $ip2c_cn = 0;
        while (<FILE>) {
            next if /^$/; next if /^#/; next if /^junkview/; chomp;
            ( $ip2c_lo[++$ip2c_cn],
              $ip2c_hi[$ip2c_cn],
              $ip2c_cc[$ip2c_cn]
            ) = split /\s+/, $_;
        }
        close FILE;

        do_log("read: $namesfile");
        open FILE, "< $namesfile" or do_die("$namesfile $!");
        flock FILE, 1;
        %cc_name = ();
        while (<FILE>) {
            next if /^$/; next if /^#/; next if /^junkview/; chomp;
            my ($cc, $name) = split /:/, $_;
            $cc_name{$cc} = $name;
        }
        close FILE;
    }

    You can see that as far as possible you avoid useless processing of
    irrelevant data, so plan on how to skip (with 'next') over sections
    of your loop code rather than using 'if ... processing', avoid complex
    regexps, don't chomp records that are about to be discarded.

    From log file:
    2008-10-07.21:28:17 - read: /etc/ip2cn-server.conf
    2008-10-07.21:28:17 - read: /usr/local/share/ip2cn/ip2c-data
    2008-10-07.21:28:20 - read: /usr/local/share/ip2cn/ip2c-names
    2008-10-07.21:28:20 - listen: localhost:4743

    Context: http://bugsplatter.id.au/ip2cn/ip2cn-server.txt

    Grant.
    --
    http://bugsplatter.id.au/
    Grant, Oct 7, 2008
    #6
  7. Guest

    On Tue, 7 Oct 2008 13:31:50 -0700 (PDT), "" <> wrote:

    >So this code is taking lot of time. I want some suggestion for
    >efficient code.
    >
    >Can I use Hash Array (by reading file in hash array)
    >


    Nobody can judge the performance of pseudocode without knowing what data
    it processes. There is no generalization to be sought.

    The best you can do, through trial and error, is benchmark
    it yourself:

    use Benchmark ':hireswallclock';
    my $t0 = Benchmark->new;

    # ... code block to be timed goes here ...

    my $t1 = Benchmark->new;
    my $tdif = timediff($t1, $t0);
    print STDERR "the code took: ", timestr($tdif), "\n";

    sln
    , Oct 7, 2008
    #7
  8. [A complimentary Cc of this posting was sent to

    <>], who wrote in article <>:

    Nobody else commented on that yet:

    > @file1_lines = <INFO_1>;
    > @file2_lines = <INFO_2>;
    > @file3_lines = <INFO_3>;
    >
    > foreach $file1_line (@file1_lines)
    > {
    > @file1 = split('\|',$file1_line);


    > foreach $file2_line (@file2_lines)
    > {
    > @file2 = split('\|',$file2_line);


    This split is done again and again, once per every line of INFO_1.
    The result is going to be the same anyway. Better move it outside of
    the loop

    @file2_fields = map [split '\|', $_], @file2_lines;

    if you have enough memory. Likewise for other stuff.
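
    As a rough illustration of that suggestion, the sketch below splits the
    file-2 lines once and then reuses the resulting array refs inside the
    nested loops. The sample lines and the $matching_pairs counter are made
    up; only the field positions (IP in 2, Code in 3) come from the original
    post:

```perl
use strict;
use warnings;

# Made-up sample lines standing in for the slurped files.
my @file1_lines = ("1|10:00|10.0.0.1|A\n", "2|10:01|10.0.0.2|B\n");
my @file2_lines = ("7|11:00|10.0.0.2|B\n", "8|11:05|10.0.0.9|Z\n");
chomp @file1_lines;
chomp @file2_lines;

# Split every file-2 line exactly once, before the nested loops start.
my @file2_fields = map { [ split /\|/, $_ ] } @file2_lines;

my $matching_pairs = 0;
for my $file1_line (@file1_lines) {
    my @file1 = split /\|/, $file1_line;
    for my $file2 (@file2_fields) {        # already split: just an array ref
        $matching_pairs++
            if $file1[2] eq $file2->[2] && $file1[3] eq $file2->[3];
    }
}

print "matching pairs: $matching_pairs\n";
```

    This removes the repeated splitting, but the loops are still nested, so
    the number of comparisons stays the same.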

    Hope this helps,
    Ilya
    Ilya Zakharevich, Oct 8, 2008
    #8
  9. Guest

    On Tue, 07 Oct 2008 22:20:06 GMT, wrote:

    >Nobody knows the impact of any pseudo code, or what data that
    >it process is. There is no generalization to be sought.
    >
    >The best you can do, through trial and error, is benchmark
    >it yourself:
    >
    >use Benchmark ':hireswallclock';
    >my $t0 = new Benchmark;
    >
    >{{{{ code block}}}
    >
    >my $t1 = new Benchmark;
    >my $tdif = timediff($t1, $t0);
    >print STDERR "the code took:",timestr($tdif),"\n";
    >
    >sln


    Well, if it were my code, I would know exactly how to do it without benchmarks.
    But you don't know yourself, it seems. Do you?
    Instead, you post phoney PSEUDO code as if you know something, which you don't.
    Yet you put the burden on the sucker who is stupid enough to respond to you.

    Outta here... ignant

    sln
    , Oct 8, 2008
    #9
  10. Guest


    Thanks to all for replying.

    Can I use a hash even if I don't have a unique key? In my data I
    need IP and Code, which are not necessarily unique.

    Below is my code:

    open(INFO_1,$file_1);
    open(INFO_2,$file_2);
    open(INFO_3,$file_3);


    @file1_lines = <INFO_1>;
    @file2_lines = <INFO_2>;
    @file3_lines = <INFO_3>;


    foreach $file1_line (@file1_lines)
    {
        @file1 = split('\|',$file1_line);
        $file1_ip = $file1[2];      # IP field of the file-1 record
        $file1_code = $file1[3];    # Code field of the file-1 record

        foreach $file2_line (@file2_lines)
        {
            @file2 = split('\|',$file2_line);
            $file2_ip = $file2[2];
            $file2_code = $file2[3];

            if($file1_ip eq $file2_ip)
            {
                $flag = 1;
                if($file1_code eq $file2_code)
                {
                    $r_flag = 0;

                    foreach $file3_line (@file3_lines)
                    {
                        @file3 = split('\|',$file3_line);
                        $file3_ip = $file3[2];
                        $file3_code = $file3[3];

                        if(($file1_ip eq $file3_ip) &&
                           ($file1_code eq $file3_code))
                        {
                            #some flag
                        }

                    }
                    #depending on flag I increment some counter
                }

            }
        }
        #depending on flag I increment some counter
    }
    , Oct 8, 2008
    #10
  11. Guest

    "" <> wrote:
    >
    > Thanks to all for replying.
    >
    > Can I use Hash even if I don't have unique key ?


    Yes and no. Hashes only have unique keys, but the hash value for that
    key can be an array or a hash, so effectively each key can have several
    values.
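
    A small sketch of that idea, assuming the IP and Code are fields 2 and 3
    of a pipe-delimited line as in the original post. The hash name
    %records_for and the sample records are made up for illustration:

```perl
use strict;
use warnings;

# Made-up records; note that "10.0.0.1|A" occurs twice.
my @lines = (
    "1|10:00|10.0.0.1|A",
    "2|10:01|10.0.0.2|B",
    "3|10:02|10.0.0.1|A",
);

# Hash of arrays: one key per IP|Code pair, all matching lines as the value.
my %records_for;
for my $line (@lines) {
    my ($ip, $code) = (split /\|/, $line)[2, 3];
    push @{ $records_for{"$ip|$code"} }, $line;
}

# Look up every record for one pair; fall back to an empty list if absent.
my @hits = @{ $records_for{'10.0.0.1|A'} || [] };
printf "%d records for 10.0.0.1|A\n", scalar @hits;
```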

    > Because in my data I
    > need IP and Code which are not necessary to be unique.


    In what data structure are they not necessarily unique?
    >
    > if($file1_ip eq $file2_ip)
    > {
    > $flag = 1;


    $flag is not used elsewhere

    > if($file1_code eq $file2_code)


    Since this if has no else, and has nothing done between its end and
    the end of the previous if, they can be combined into one line

    if($file1_ip eq $file2_ip and $file1_code eq $file2_code) {



    > {
    > $r_flag = 0;
    >
    > foreach $file3_line (@file3_lines)
    > {
    > @file3 = split('\|',$file3_line);
    > $file3_ip = $file[2];
    > $file3_code = $file[3];
    >
    > if(($file1_ip eq $file3_ip) &&
    > ($file1_code eq $file3_code))
    > {
    > #some flag


    Again with the pseudo-code? Let's say that that means "$r_flag=1;"
    Since any execution of that beyond the first is meaningless, that means
    it doesn't matter if the same ip|code shows up more than once in
    @file3_lines, so @file3_lines can be reduced to a simple hash instead.


    > }
    >
    > }
    > #depending on flag I increment some counter


    Incrementing a counter of course is count-sensitive, unlike setting a flag.
    So it might matter if the same ip|code shows up multiple times in
    @file2_lines. But that could probably be solved by storing the count
    in the hash.
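
    One way this could look, as a sketch only: the keys below are made-up
    "IP|Code" strings, and the rule that the counter goes up once per matching
    file-2 line is an assumption, since the posted code only says "depending
    on flag I increment some counter":

```perl
use strict;
use warnings;

# Made-up "IP|Code" keys as they might be extracted from each file.
my @file1_keys = ('10.0.0.1|A', '10.0.0.2|B');
my @file2_keys = ('10.0.0.1|A', '10.0.0.1|A', '10.0.0.9|Z');
my @file3_keys = ('10.0.0.1|A');

# File 2: remember how many times each key occurs.
my %count_in_file2;
$count_in_file2{$_}++ for @file2_keys;

# File 3: only presence matters, so a plain set is enough.
my %in_file3 = map { $_ => 1 } @file3_keys;

# Assumption: the counter goes up once per matching file-2 line.
my $counter = 0;
for my $key (@file1_keys) {
    next unless $count_in_file2{$key};     # not in file 2 at all
    $counter += $count_in_file2{$key} if $in_file3{$key};
}

print "counter: $counter\n";
```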

    --
    -------------------- http://NewsReader.Com/ --------------------
    The costs of publication of this article were defrayed in part by the
    payment of page charges. This article must therefore be hereby marked
    advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate
    this fact.
    , Oct 8, 2008
    #11
  12. <> wrote:

    > open(INFO_1,$file_1);



    You should always, yes *always*, check the return value from open():

    open(INFO_1, $file_1) or die "could not open '$file_1' $!";

    Even better, you should use the 3-arg form of open() and a lexical filehandle:

    open my $INFO_1, '<', $file_1 or die "could not open '$file_1' $!";


    > @file1_lines = <INFO_1>;


    @file1_lines = <$INFO_1>; # use the lexical filehandle



    > @file1 = split('\|',$file1_line);



    A pattern match should *look like* a pattern match:

    @file1 = split(/\|/,$file1_line);


    > $file1_ip = $file[2];
    > $file2_code = $file[3];



    You can replace those 3 lines of code with 1 line using a
    List Slice (see the "Slices" section in perldata.pod):

    my($file1_ip, $file1_code) = (split /\|/, $file1_line)[2,3];


    > $flag = 1;



    You should choose meaningful variable names.


    --
    Tad McClellan
    email: perl -le "print scalar reverse qq/moc.noitatibaher\100cmdat/"
    Tad J McClellan, Oct 8, 2008
    #12
  13. Guest


    Hey Tad,

    Thanks for your help.

    Can you also suggest a more efficient way?

    Since I am processing three files in nested loops, it is taking a lot of time.
    , Oct 8, 2008
    #13
  14. [A complimentary Cc of this posting was sent to
    Tad J McClellan
    <>], who wrote in article <>:

    > > @file1 = split('\|',$file1_line);


    > A pattern match should *look like* a pattern match:
    > @file1 = split(/\|/,$file1_line);


    In general, I do not agree. A split on a constant string WITHOUT
    METACHARS should better be written as a split on string. However, in
    this particular case, it is better to use something looking as a REx.

    However, do you really find /\|/ very esthetic? ;-) Can it be
    written better than m'\|'?

    Yours,
    Ilya
    Ilya Zakharevich, Oct 8, 2008
    #14
  15. Ilya Zakharevich <> wrote:
    > [A complimentary Cc of this posting was sent to
    > Tad J McClellan
    ><>], who wrote in article <>:
    >
    >> > @file1 = split('\|',$file1_line);

    >
    >> A pattern match should *look like* a pattern match:
    >> @file1 = split(/\|/,$file1_line);

    >
    > In general, I do not agree. A split on a constant string WITHOUT
    > METACHARS should better be written as a split on string.



    I like that idea enough that I may actually change my preference...


    > However, in
    > this particular case, it is better to use something looking as a REx.
    >
    > However, do you really find /\|/ very esthetic? ;-)



    No. In this case, the nature of the beast precludes anything esthetic. :-(


    > Can it be
    > written better than m'\|'?



    That is not too objectionable, though I kinda like /\Q|/


    --
    Tad McClellan
    email: perl -le "print scalar reverse qq/moc.noitatibaher\100cmdat/"
    Tad J McClellan, Oct 9, 2008
    #15
  16. Guest



    Thanks to all.

    I tried reading the file into an array using @file1_lines = <INFO_P>;

    But my results change.

    Below is the FULL code I am using.

    Please suggest something to make it run faster.


    #!/usr/local/bin/perl

    $p_file = "out_p.txt";
    $s_file = "out_s.txt";
    $r_file = "out_r.txt";

    open(INFO_P,$p_file);
    open(INFO_S,$s_file);
    open(INFO_R,$r_file);

    @p_lines = <INFO_P>;
    @s_lines = <INFO_S>;
    @r_lines = <INFO_R>;

    $fail_flag = 1;
    $p_slow = 0;
    $p_fail = 0;
    $r_robin = 0;
    $s_as_p = 0;

    foreach $s_line (@s_lines)
    {
        @sec = split('\|',$s_line);
        $s_cli_ip = $sec[4];
        $s_ser_ip = $sec[5];
        $s_id = $sec[7];

        $r_robin_flag = 1;
        $flag = 0;

        foreach $p_line (@p_lines)
        {

            @pri = split('\|',$p_line);
            $p_cli_ip = $pri[4];
            $p_ser_ip = $pri[5];
            $p_id = $pri[7];

            if($s_cli_ip eq $p_cli_ip)
            {
                $flag = 1;

                if($s_id eq $p_id)
                {
                    $r_robin_flag = 0;
                    $s_res_first = 0;
                    $p_res_first = 0;
                    foreach $r_line (@r_lines)
                    {
                        @res = split('\|',$r_line);
                        $r_cli_ip = $res[4];
                        $r_ser_ip = $res[5];
                        $r_id = $res[7];


                        if(($s_cli_ip eq $r_cli_ip) && ($s_id eq $r_id))
                        {
                            if($r_ser_ip eq $s_ser_ip)
                            {
                                #chk if pri_res_first
                                if($p_res_first eq '0'){
                                    $slow = 1;
                                    $s_res_first = 1;
                                }

                            }elsif($r_ser_ip eq $p_ser_ip){
                                $fail_flag = 0;
                                if($s_res_first){
                                    #$slow = 1;
                                }else{
                                    #$s_res_first = 0;
                                    $p_res_first = 1;
                                }
                            }

                        }
                        if($p_res_first){
                            $primary++ ;
                            last;
                        }
                    }
                    if($fail_flag){
                        $primary_fail++;

                    }elsif($slow){
                        $slow = 0;
                        $fail_flag = 1;
                        $primary_slow++;
                    }
                    last;
                }

            }

        }
        if($flag == 0){
            $s_as_p++;
        }elsif($r_robin_flag){
            $r_robin++;
        }
    }

    close(INFO_P);
    close(INFO_S);
    close(INFO_R);
    , Oct 9, 2008
    #16
  17. Guest

    "" <> wrote:
    >
    > Below is my FULL Code which I am using.


    It is better to post real but simplified code. Especially when
    your full code is so inscrutable.


    > open(INFO_P,$p_file);
    > open(INFO_S,$s_file);
    > open(INFO_R,$r_file);
    >
    > @p_lines = <INFO_P>;
    > @s_lines = <INFO_S>;


    As several people have said, you should use lexical file handles, you
    should check the status of the open, and you should use strict.


    > @r_lines = <INFO_R>;


    The innermost for loop doesn't seem to do anything except for when the
    if statement if(($s_cli_ip eq $r_cli_ip) && ($s_id eq $r_id))
    is satisfied. Thus, that loop can be reduced to loop over only
    those lines of INFO_R that will cause the above to be true. Build a hash
    of arrays that segregates lines according to $r_cli_ip and $r_id.

    # for reasons to be seen later:
    my %r_lines;
    while (<INFO_R>) {
        my @res = split('\|',$_);    # split the line just read, which is in $_
        my $r_cli_ip = $res[4];
        my $r_ser_ip = $res[5];
        my $r_id = $res[7];
        push @{$r_lines{"$r_cli_ip|$r_id"}}, $_;
    };


    Then replace

    foreach $r_line (@r_lines)

    with

    foreach $r_line (@{$r_lines{"$s_cli_ip|$s_id"}})

    The same strategy could perhaps be employed in the middle foreach loop
    as well. If I understood the motivation of your code, I might be able
    to make it much simpler, but since I don't, I'll stick to the "minimal
    possible changes" approach.
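
    Put together, the change might look like the following sketch. The rows
    are made up (eight pipe-delimited fields so that indices 4, 5 and 7 exist,
    as in the posted script), and @server_ips is a hypothetical stand-in for
    whatever the real inner loop does with each matching line:

```perl
use strict;
use warnings;

# Made-up rows: field 4 = client IP, field 5 = server IP, field 7 = ID.
my @r_rows = (
    "a|b|c|d|10.0.0.1|192.168.0.1|x|id1",
    "a|b|c|d|10.0.0.2|192.168.0.1|x|id2",
    "a|b|c|d|10.0.0.1|192.168.0.2|x|id1",
);

# Group the lines once, keyed on "client IP|ID".
my %r_lines;
for my $r_line (@r_rows) {
    my ($r_cli_ip, $r_id) = (split /\|/, $r_line)[4, 7];
    push @{ $r_lines{"$r_cli_ip|$r_id"} }, $r_line;
}

# The inner loop now visits only the lines that can satisfy the condition.
my ($s_cli_ip, $s_id) = ('10.0.0.1', 'id1');
my @server_ips;
for my $r_line (@{ $r_lines{"$s_cli_ip|$s_id"} || [] }) {
    push @server_ips, (split /\|/, $r_line)[5];
}

print "server IPs for $s_cli_ip|$s_id: @server_ips\n";
```

    The "|| []" guard keeps the lookup from creating an empty entry in
    %r_lines when a key has no matching lines.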

    Xho

    --
    -------------------- http://NewsReader.Com/ --------------------
    The costs of publication of this article were defrayed in part by the
    payment of page charges. This article must therefore be hereby marked
    advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate
    this fact.
    , Oct 9, 2008
    #17
  18. <> wrote:


    > open(INFO_P,$p_file);



    You should always, yes *always*, check the return value from open():

    open(INFO_P, $p_file) or die "could not open '$p_file' $!";

    Even better, you should use the 3-arg form of open() and a lexical filehandle:

    open my $INFO_P, '<', $p_file or die "could not open '$p_file' $!";


    > @p_lines = <INFO_P>;



    @p_lines = <$INFO_P>; # use the lexical filehandle


    > $flag = 1;



    You should choose meaningful variable names.


    --
    Tad McClellan
    email: perl -le "print scalar reverse qq/moc.noitatibaher\100cmdat/"
    Tad J McClellan, Oct 10, 2008
    #18
  19. Dr.Ruud Guest

    wrote:

    > I am trying to analyze some data. I have big data files.
    >
    > I have 3 different files in following format. ($file_1, $file_2,
    > $file_3)


    Numbered variable names are a red flag. Normally you are better off
    using a different data structure, like an array or a hash.

    use strict;
    use warnings;

    use Data::Dumper;
    $Data::Dumper::Sortkeys = $Data::Dumper::Indent =
    $Data::Dumper::Terse = 1;

    my %data;
    my @filenames = qw/a b x/;
    for my $fn (@filenames) {
        open my $fh, "<", $fn or die "open $fn: $!";
        while ( <$fh> ) {
            my ($ip, $code) = (split m'\|')[2,3];
            push @{$data{$ip}{$code}}, "$fn:$.";
        }
    }
    print Dumper( \%data );
    __END__

    (untested)

    --
    Affijn, Ruud

    "Gewoon is een tijger."
    Dr.Ruud, Oct 10, 2008
    #19