remove specific line from all text files in dir?

Discussion in 'Perl Misc' started by Snail, Jun 23, 2005.

  1. Snail

    Snail Guest

    I've been trying to find a way to remove a line containing a specific
    string.

    In my case, I have a dir full of log files and I want to remove all
    lines containing "127.0.0.1" or "localhost", because those lines take up
    more than 95% of the text in the logs, making them rather big. Removing
    those lines will make our logs much easier to read.

    I found the opendir and readdir functions, but what if I want to
    recurse? I suppose I could just while(<LOGFILE>) loop over each line and
    check, but I'm not sure how to erase the line.

    Or is there a better way to do this?

    Thank you for any advice and help.

    (PS: Running on a Linux platform, with Perl 5.6.1)
    Snail, Jun 23, 2005
    #1

  2. Snail

    Paul Lalli Guest

    Snail wrote:
    > I've been trying to find a way to remove a line containing a specific
    > string.


    Have you read
    perldoc -q "delete a line"
    ?

    > In my case, I have a dir full of log files and I want to remove all
    > lines containing "127.0.0.1" or "localhost", because those lines take up
    > more than 95% of the text in the logs making them rather big. Removing
    > those lines will make our logs much easier to read.
    >
    > I found the opendir and readdir functions


    I wouldn't even bother doing those explicitly. I would suggest just
    using a one-liner using the -n and -i switches...

    perl -ni -e'print unless /localhost|127\.0\.0\.1/' *.log

    (Read about those two switches in
    perldoc perlrun
    )


    > , but what if I want to recurse.


    When you think "recurse" in Perl, think "File::Find", which is a
    standard module. There is also the non-standard File::Find::Rule
    available on CPAN which offers a theoretically easier to understand
    syntax.

    perldoc File::Find

    > I suppose I could just while(<LOGFILE>) loop over each line and
    > check, but I'm not sure how to erase the line.


    You can't. At least, not like you're thinking. You'd need to save a
    copy of the original under a different name, open a new file for output
    under the original name, and print all the lines from the saved copy
    that you don't want to "delete". (This is precisely what -n and -i
    help you with, from above).

    > Or is there a better way to do this?


    Annoyingly, I'm not thinking of any particularly quick and easy ways of
    combining the -ni approach with the File::Find approach. I can think of
    several ways, just no "good" ones:
    * in File::Find's &wanted subroutine, spawn an external perl process
    that uses -ni
    * use BEGIN{} and END{} blocks in the "one-liner" to bring File::Find
    into the -ni method.
    * do the recursion external to perl, with the perl -ni process in the
    body of a
    for f in $(find . -name '*.log' -print); do ... done block
    * set @ARGV to the list of files found from File::Find::find, set $^I,
    and then start your while (<>) loop.

    Any of the gurus know of any better way of recursing using the -ni
    switches?

    Paul Lalli
    Paul Lalli, Jun 23, 2005
    #2

  3. Snail

    Paul Lalli Guest

    Paul Lalli wrote:
    > Snail wrote:
    > > I've been trying to find a way to remove a line containing a specific
    > > string.

    <snip>
    > > , but what if I want to recurse.

    <snip>
    > > Or is there a better way to do this?

    >
    > Annoyingly, I'm not thinking of any particularly quick and easy ways of
    > combining the -ni approach with the File::Find approach. I can think of
    > several ways, just no "good" ones:
    > * in File::Find's &wanted subroutine, spawn an external perl process
    > that uses -ni
    > * use BEGIN{} and END{} blocks in the "one-liner" to bring File::Find
    > into the -ni method.
    > * do the recursion external to perl, with the perl -ni process in the
    > body of a
    > for f in $(find . -name '*.log' -print); do ... done block
    > * set @ARGV to the list of files found from File::Find::find, set $^I,
    > and then start your while (<>) loop.
    >
    > Any of the gurus know of any better way of recursing using the -ni
    > switches?


    Thinking more about this, I think my last option there is probably the
    best I can come up with. (Although I was very unclear about how @ARGV
    is set: find doesn't return anything. You have to populate @ARGV
    within the &wanted function).

    Here's my attempt to do what I think you're looking for (replace
    'logdir' with whatever directory holds your logs):

    #!/usr/bin/perl
    use strict;
    use warnings;
    use File::Find;

    local @ARGV;

    sub wanted {
        push @ARGV, $File::Find::name if /\.log$/ and -f;
    }

    find(\&wanted, 'logdir');

    if (@ARGV) {
        print "Files to process: \n", join("\n", @ARGV), "\n";
        local $^I = '.bkp';
        while (<>) {
            print unless /localhost|127\.0\.0\.1/;
        }
    } else {
        print "No files to process!\n";
    }

    __END__
    Paul Lalli, Jun 23, 2005
    #3
  4. Snail <> wrote:
    > I've been trying to find a way to remove a line containing a specific
    > string.


    > more than 95% of the text in the logs making them rather big.

    ^^^^

    Are these "live" logs? That is, will logging writes be happening
    when you run your cleanup program?

    If so, then you better work file locking into the mix.


    --
    Tad McClellan SGML consulting
    Perl programming
    Fort Worth, Texas
    Tad McClellan, Jun 23, 2005
    #4
  5. Snail

    Snail Guest

    Tad McClellan wrote:
    > Snail <> wrote:
    >> I've been trying to find a way to remove a line containing a specific
    >> string.

    >
    >> more than 95% of the text in the logs making them rather big.

    > ^^^^
    >
    > Are these "live" logs? That is, will logging writes be happening
    > when you run your cleanup program?
    >
    > If so, then you better work file locking into the mix.


    Yes and no. They are live during the day, but at night, for 10 minutes,
    it's like a "maintenance" window. My plan was to set up a cron job to strip
    the unneeded lines from the logs. The server daemon provides no way (and
    even worse, no source code :< ) to customize its logs, that's why I
    wanted to take matters into my own hands.

    `perl -ni -e'print unless /localhost|127\.0\.0\.1/' *.log`

    Works almost perfectly. If I run it from my shell window I see annoying

    "Can't do inplace edit: backup is not a regular file."

    type messages when it encounters a directory, and I can find no way to suppress
    them, even with 2>&1 appended to the end.

    `perl -ni -e'print unless /localhost|127\.0\.0\.1/' *.log 2>&1`

    I suppose it doesn't matter as this will be running from a cron job, but
    I think it would run a little faster if there was a way to just skip dirs
    altogether. Maybe it would be better to just write a full Perl script
    using readdir, and checking each one with -f to see if it's a file.

    Unless there is a way to achieve this on the cmd line?

    Thanks again.
    Snail, Jun 23, 2005
    #5
  6. Snail

    Guest

    Perl is the wrong tool for this task.

    Try :

    grep -v localhost logfile.log | grep -v 127.0.0.1 > new logfile.log

    Note that grep has options to use a file so you can put
    localhost, 127.0.0.1 etc into a file instead of doing a bunch of pipes.
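
    For the record, that option is -f: grep reads one pattern per line
    from the named file, avoiding chained pipes (file names below are
    invented).

```shell
# patterns.txt holds the patterns to drop, one per line
printf 'localhost\n127\\.0\\.0\\.1\n' > patterns.txt
printf 'from localhost\nfrom 127.0.0.1\nfrom elsewhere\n' > sample.log
grep -v -f patterns.txt sample.log   # prints only "from elsewhere"
```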
    , Jun 23, 2005
    #6
  7. Snail <> wrote:

    > `perl -ni -e'print unless /localhost|127\.0\.0\.1/' *.log`
    > Works almost perfectly. If I run it from my shell window I see annoying
    >
    > "Can't do inplace edit: backup is not a regular file."
    >
    > type messages when it encounters a directory, and I can find no way to suppress
    > them, even with 2>&1 appended to the end.
    >
    > `perl -ni -e'print unless /localhost|127\.0\.0\.1/' *.log 2>&1`

    ^ ^
    ^ ^

    What's with the (shell) backticks?

    Is that really how you are calling it?

    Leave them off, or do the redirection outside of them. But you don't
    want the annoying messages in your log file either, so make it 2>/dev/null
    or some such.


    > I suppose it doesn't matter as this will be running from a cron job, but
    > I think it would run a little after if there was a way to just skip dirs
    > all together. Maybe it would be better to just write a full perl script
    > using readdir, and checking each one with -f to see if it's a file.
    >
    > Unless there is a way to achieve this on the cmd line?



    You can muck about with @ARGV before you let -n's while(<>)
    loop look at it (line wrapped for posting):

    perl -ni -e 'BEGIN{@ARGV = grep -f, @ARGV}
    print unless /localhost|\Q127.0.0.1/' *.log


    Or, even better, don't make subdirectories with silly names
    that match *.log. :)


    --
    Tad McClellan SGML consulting
    Perl programming
    Fort Worth, Texas
    Tad McClellan, Jun 24, 2005
    #7
  8. writes:

    > You could.


    I could what?

    > grep -v is what a sysadmin would use.


    Use for what?

    > A perl one liner is what a perl enthusiast would use.


    Use for what?

    Have you read the posting guidelines for this group, and the Google Groups
    guide to usenet netiquette? You've been asked many times now to quote the
    messages you're replying to as appropriate. Why aren't you doing it?

    sherm--
    Sherm Pendley, Jun 24, 2005
    #8
  9. wrote:
    > Perl is the wrong tool for this task.


    For what task?
    Please provide enough context such that people have a chance to know what
    you are talking about

    > Try :
    > grep -v localhost logfile.log | grep -v 127.0.0.1 > new logfile.log
    > Note that grep has options to use a file so you can put
    > localhost, 127.0.0.1 etc into a file instead of doing a bunch of
    > pipes.


    And why shouldn't you be able to do this with a Perl one-liner?

    jue
    Jürgen Exner, Jun 24, 2005
    #9
  10. Snail

    Guest

    You could.

    grep -v is what a sysadmin would use.

    A perl one liner is what a perl enthusiast would use.
    , Jun 24, 2005
    #10
  11. Snail

    Tintin Guest

    <> wrote in message
    news:...
    [Please quote appropriate context. Mind you, I know you are going to
    deliberately ignore this request]

    > Perl is the wrong tool for this task.


    Rubbish.

    >
    > Try :
    >
    > grep -v localhost logfile.log | grep -v 127.0.0.1 > new logfile.log
    >
    > Note that grep has options to use a file so you can put
    > localhost, 127.0.0.1 etc into a file instead of doing a bunch of pipes.


    Note that your "solution" doesn't do what the OP asked (although you didn't
    quote any context, so it's hard to see what the OP did actually ask).

    If you really did want a non-Perl solution, it would be:

    egrep -v "localhost|127\.0\.0\.1" logfile.log >/tmp/$$ && mv /tmp/$$ logfile.log
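
    That can be checked like so (sample data invented); the && makes sure
    the original is only replaced if egrep succeeded.

```shell
# Filter into a temp file, then move it over the original
printf 'from localhost\nfrom 127.0.0.1\nkeep me\n' > logfile.log
egrep -v "localhost|127\.0\.0\.1" logfile.log > /tmp/clean.$$ && mv /tmp/clean.$$ logfile.log
cat logfile.log   # only "keep me" remains
```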
    Tintin, Jun 24, 2005
    #11
  12. Snail

    Ian Wilson Guest

    wrote:
    > Perl is the wrong tool for this task.
    >
    > Try :
    >
    > grep -v localhost logfile.log | grep -v 127.0.0.1 > new logfile.log


    I usually need to quote filenames with spaces in them.

    >
    > Note that grep has options to use a file so you can put
    > localhost, 127.0.0.1 etc into a file instead of doing a bunch of pipes.
    >


    grep | grep is the wrong tool for this task.

    Try :

    egrep -v "(localhost|127.0.0.1)" logfile.log > newlogfile.log

    :)
    Ian Wilson, Jun 24, 2005
    #12