Question on grep and reading from file

Discussion in 'Perl Misc' started by googler, Aug 8, 2007.

  1. googler

    googler Guest

    Inside my Perl script, I had to check if a particular pattern appears
    in a certain file or not (only a yes/no answer). I did it as below:
    @matching_lines = grep { /$srchpat/ } <MYFILE>;
    print "Pattern found\n" if ($#matching_lines != -1);

    I was wondering if there is a more efficient way to do this. Is it
    possible to use the Unix "grep" command to do this inside my script?
    If so, how? Will that be more efficient (faster)?

    I have another question. Is there a way to read a particular line in a
    file when I know the line number (without using a loop and reading
    each line at a time)? I guess the below code would work.
    @lines = <MYFILE>;
    $myline = $lines[$linenum-1];
    But this will read the entire file into the array @lines and can take
    up a lot of memory if the file is huge. Is there a more efficient
    solution?

    Thanks.
     
    googler, Aug 8, 2007
    #1

  2. googler

    Mirco Wahab Guest

    googler wrote:
    > Inside my Perl script, I had to check if a particular pattern appears
    > in a certain file or not (only a yes/no answer). I did it as below:
    > @matching_lines = grep { /$srchpat/ } <MYFILE>;
    > print "Pattern found\n" if ($#matching_lines != -1);
    >
    > I was wondering if there is a more efficient way to do this. Is it
    > possible to use the Unix "grep" command to do this inside my script?
    > If so, how? Will that be more efficient (faster)?


    In almost all cases, a sequential approach will be *much*
    faster on *large* files (>= 100MB), like

    <pseudo>
    ...
    my @matching_lines;
    while ( <MYFILE> ) {
        push @matching_lines, $_
            if /$srchpat/;
    }
    print "Pattern found\n"
        if scalar @matching_lines;
    ...
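
Since only a yes/no answer is needed, the loop could also stop at the first match instead of collecting every matching line. A minimal sketch, assuming a hypothetical helper name `file_matches` and a lexical filehandle (neither appears in the thread):

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 as soon as one line matches, else 0.
sub file_matches {
    my ($filename, $srchpat) = @_;
    open my $fh, '<', $filename or die "Cannot open $filename: $!";
    my $found = 0;
    while ( my $line = <$fh> ) {
        if ( $line =~ /$srchpat/ ) {
            $found = 1;
            last;    # stop reading; the rest of the file is irrelevant
        }
    }
    close $fh;
    return $found;
}
```

It could be called as `print "Pattern found\n" if file_matches($filename, $srchpat);`. Only one line is held in memory at a time, and a match near the top of the file ends the scan early.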

    > I have another question. Is there a way to read a particular line in a
    > file when I know the line number (without using a loop and reading
    > each line at a time)? I guess the below code would work.
    > @lines = <MYFILE>;
    > $myline = $lines[$linenum-1];
    > But this will read the entire file into the array @lines and can take
    > up a lot of memory if the file is huge. Is there a more efficient
    > solution?


    No, not really. Besides the 'tie' approach (which is sometimes
    too slow), you can always read large files quickly 'record by record'
    (e.g. line by line) and check the line number via "$." ...
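
The record-by-record idea might look roughly like the following. This is only a sketch: the helper name `nth_line` is hypothetical, and it assumes line numbers count from 1, as `$.` does.

```perl
use strict;
use warnings;

# Hypothetical helper: returns line number $linenum (counting from 1),
# or undef if the file has fewer lines. Only one line is held in memory.
sub nth_line {
    my ($filename, $linenum) = @_;
    open my $fh, '<', $filename or die "Cannot open $filename: $!";
    my $wanted;
    while ( my $line = <$fh> ) {
        if ( $. == $linenum ) {    # $. is the current input line number
            $wanted = $line;
            last;                  # no need to read any further
        }
    }
    close $fh;
    return $wanted;
}
```

This still reads every line up to the wanted one, but it avoids loading the whole file into an array.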

    Regards

    M.
     
    Mirco Wahab, Aug 8, 2007
    #2

  3. googler

    Guest

    googler <> wrote:
    > Inside my Perl script, I had to check if a particular pattern appears
    > in a certain file or not (only a yes/no answer). I did it as below:
    > @matching_lines = grep { /$srchpat/ } <MYFILE>;


    This reads the entire file, even if the match is in the first line.
    (Potentially worse, it reads the entire file into memory at once, as perl
    is currently implemented.)

    > print "Pattern found\n" if ($#matching_lines != -1);
    >
    > I was wondering if there is a more efficient way to do this. Is it
    > possible to use the Unix "grep" command to do this inside my script?


    Sure. There are many ways. The simplest, if $srchpat and $filename don't
    require protecting from shell interpretation, and $srchpat either doesn't
    have special characters or only has ones that mean the same thing between
    Perl and grep, would be something like this:

    my $result=`grep -l $srchpat $filename`;
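
When the pattern or filename might contain characters the shell would interpret, one option is to bypass the shell by passing `system` a list of arguments. The sketch below assumes a POSIX `grep` is on the PATH; the helper name `grep_matches` is hypothetical. The pattern is still interpreted with grep's regular expression syntax, not Perl's.

```perl
use strict;
use warnings;

# Hypothetical helper: runs grep without a shell. Returns 1 on a match,
# 0 on no match, and dies if grep could not run or reported an error.
sub grep_matches {
    my ($srchpat, $filename) = @_;
    # -q: print nothing, exit 0 once a match is found
    # -e: the next argument is the pattern, even if it starts with '-'
    my $status = system('grep', '-q', '-e', $srchpat, '--', $filename);
    die "Could not run grep: $!" if $status == -1;
    die "grep was killed by a signal" if $status & 127;
    my $exit = $status >> 8;
    die "grep failed with exit code $exit" if $exit > 1;
    return $exit == 0 ? 1 : 0;
}
```

Usage would be along the lines of `print "Pattern found\n" if grep_matches($srchpat, $filename);`. Because only the exit status is inspected, no output has to be captured.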

    > If so, how? Will that be more efficient (faster)?


    It would have more startup overhead, but will probably run faster once
    it gets running (provided $srchpat is fairly simple).

    > I have another question. Is there a way to read a particular line in a
    > file when I know the line number (without using a loop and reading
    > each line at a time)? I guess the below code would work.
    > @lines = <MYFILE>;
    > $myline = $lines[$linenum-1];
    > But this will read the entire file into the array @lines and can take
    > up a lot of memory if the file is huge. Is there a more efficient
    > solution?


    Unless you know how long each line is, or have otherwise pre-computed some
    kind of index into the file, you need to read the entire file at least up
    to the desired line and count newlines, either implicitly or explicitly.
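
A pre-computed index could be as simple as an array of byte offsets, one per line start, built in a single pass. The following is a sketch under that assumption; the helper names `build_line_index` and `line_from_index` are hypothetical, and the index is only valid as long as the file does not change.

```perl
use strict;
use warnings;

# Hypothetical helper: one pass over the file, recording the byte offset
# at which every line starts. Element 0 belongs to line 1.
sub build_line_index {
    my ($fh) = @_;
    my @offsets;
    seek($fh, 0, 0) or die "seek failed: $!";    # whence 0 = start of file
    my $pos = 0;
    while ( defined( my $line = <$fh> ) ) {
        push @offsets, $pos;    # where the line just read began
        $pos = tell $fh;        # where the next line will begin
    }
    return \@offsets;
}

# Hypothetical helper: jump straight to line $linenum (counting from 1).
sub line_from_index {
    my ($fh, $offsets, $linenum) = @_;
    return if $linenum < 1 || $linenum > @$offsets;
    seek($fh, $offsets->[ $linenum - 1 ], 0) or die "seek failed: $!";
    return scalar <$fh>;
}
```

Building the index costs one full read, so it pays off mainly when many lines are fetched from the same file.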

    Xho

    --
    -------------------- http://NewsReader.Com/ --------------------
    Usenet Newsgroup Service $9.95/Month 30GB
     
    , Aug 9, 2007
    #3
  4. googler

    googler Guest

    > > I have another question. Is there a way to read a particular line in a
    > > file when I know the line number (without using a loop and reading
    > > each line at a time)? I guess the below code would work.
    > > @lines = <MYFILE>;
    > > $myline = $lines[$linenum-1];
    > > But this will read the entire file into the array @lines and can take
    > > up a lot of memory if the file is huge. Is there a more efficient
    > > solution?

    >
    > Unless you know how long each line is, or have otherwise pre-computed some
    > kind of index into the file, you need to read the entire file at least up
    > to the desired line and count newlines, either implicitly or explicitly.


    OK, say I know how long each line is. How can it help in reading the n-
    th line from the file directly? Can you please explain. Thanks.
     
    googler, Aug 11, 2007
    #4
  5. googler <> writes:

    >> Unless you know how long each line is, or have otherwise pre-computed some
    >> kind of index into the file, you need to read the entire file at least up
    >> to the desired line and count newlines, either implicitly or explicitly.

    >
    > OK, say I know how long each line is. How can it help in reading the n-
    > th line from the file directly? Can you please explain. Thanks.


    If you know that each line, including newline, is $x bytes long, you
    can read line $n by doing something like:

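    use Fcntl qw(:seek);

    # open for reading ('<'); '>' would truncate the file
    open FH, '<', $filename or die "Cannot open $filename: $!";
    # $n counts from 0 here; use ($n-1)*$x if the first line is line 1
    seek FH, $n*$x, SEEK_SET;
    $_ = <FH>;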

    Note that the length is in bytes, not characters. So doing this on a
    UTF-8 encoded file (or any other variable-length encoding) will not
    work as expected.
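
Wrapped up as a function, the fixed-length approach might look like this. It is a sketch only: the helper name `fixed_length_line` is hypothetical, line numbers are assumed to count from 1, and the file is opened with the `:raw` layer so that offsets are plain byte counts.

```perl
use strict;
use warnings;
use Fcntl qw(:seek);

# Hypothetical helper: every line, newline included, is assumed to be
# exactly $reclen bytes long. $linenum counts from 1.
sub fixed_length_line {
    my ($filename, $linenum, $reclen) = @_;
    open my $fh, '<:raw', $filename or die "Cannot open $filename: $!";
    seek($fh, ($linenum - 1) * $reclen, SEEK_SET) or die "seek failed: $!";
    my $line = <$fh>;    # undef if the offset is at or past end of file
    close $fh;
    return $line;
}
```

For example, with 5-byte records, `fixed_length_line($filename, 2, 5)` would seek to byte offset 5 and read the second line without touching the first.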

    //Makholm
     
    Peter Makholm, Aug 11, 2007
    #5
  6. googler

    Guest

    googler <> wrote:
    > > > I have another question. Is there a way to read a particular line in
    > > > a file when I know the line number (without using a loop and reading
    > > > each line at a time)? I guess the below code would work.
    > > > @lines = <MYFILE>;
    > > > $myline = $lines[$linenum-1];
    > > > But this will read the entire file into the array @lines and can take
    > > > up a lot of memory if the file is huge. Is there a more efficient
    > > > solution?

    > >
    > > Unless you know how long each line is, or have otherwise pre-computed
    > > some kind of index into the file, you need to read the entire file at
    > > least up to the desired line and count newlines, either implicitly or
    > > explicitly.

    >
    > OK, say I know how long each line is. How can it help in reading the n-
    > th line from the file directly? Can you please explain. Thanks.


    Compute where in the file the desired line starts, and use "seek" to
    jump to it. See perldoc -f seek.

    Xho

     
    , Aug 12, 2007
    #6