perl grep problem

Discussion in 'Perl Misc' started by demolitionz@gmail.com, May 25, 2005.

  1. Guest

    hey, wonder if anyone can help 'cause i'm fresh out of ideas why my
    perl script isn't working!

    basically the script reads all the data from files in a directory into
    an array. i then want the user to be able to search that array for
    keywords (in each line) and output the keywords to a file. i've got
    the script to work using the following line:

    @found = grep(/$ARGV[2]/i, @rf);

    where @rf is the array that's being searched, @found is the array the
    found words are stored to (i output it to a file later which also works
    fine) and $ARGV[2] is the user input word to search for. the problem
    with this script is that because the user inputs the search word as
    $ARGV[2] the program can only search for one word per run, which means
    when they want to search for another word they have to run the whole
    program again and this slows things down as the @rf array has to be
    created from scratch once more.

    what i want to do (and what i've tried endlessly to do in the 2nd
    remake of the script!) is to have a 2 step process, where the files
    are read into the @rf array as step one, and then in step 2 the user
    inputs the keyword to search for and we loop step 2 as many times as
    the user wants. what i'm currently doing with that then, is this:

    $keyword = <STDIN>;
    chop $keyword;
    @found = grep(/$keyword/i, @rf);

    now i've printed to screen everything so as to debug it, and if the
    user inputs "chickens" for example, then print $keyword; will return
    "chickens" correctly. the problem is, no matter what i try, i cannot
    get the grep(/$keyword/) bit to work and @found is *always* empty! i
    don't really understand why grep would work fine with $ARGV[2] but not
    with $keyword and it's drivin me crazy! i've tried @found =
    grep(/"$keyword"/i, @rf); and i've tried chomp $keyword; and i've even
    resorted to pushing $keyword into an array and calling the same value
    from the array as a scalar (i got very very desperate by this point and
    would try anything ;)) but nothing i do works.

    can anyone help?! :)

    cheers,
    d
     
    , May 25, 2005
    #1

  2. wrote:

    > hey, wonder if anyone can help 'cause i'm fresh out of ideas why my
    > perl script isn't working!
    >
    > basically the script reads all the data from files in a directory into
    > an array. i then want the user to be able to search that array for
    > keywords (in each line) and output the keywords to a file. i've got
    > the script to work using the following line:
    >
    > @found = grep(/$ARGV[2]/i, @rf);
    >
    > where @rf is the array that's being searched, @found is the array the
    > found words are stored to (i output it to a file later which also
    > works fine) and $ARGV[2] is the user input word to search for. the
    > problem with this script is that because the user inputs the search
    > word as $ARGV[2] the program can only search for one word per run,
    > which means when they want to search for another word they have to
    > run the whole program again and this slows things down as the @rf
    > array has to be created from scratch once more.
    >
    > what i want to do (and what i've tried endlessly to do in the 2nd
    > remake of the script!) is to have a 2 step process, where the files
    > are read into the @rf array as step one, and then in step 2 the user
    > inputs the keyword to search for and we loop step 2 as many times as
    > the user wants. what i'm currently doing with that then, is this:
    >
    > $keyword = <STDIN>;
    > chop $keyword;
    > @found = grep(/$keyword/i, @rf);
    >
    > now i've printed to screen everything so as to debug it, and if the
    > user inputs "chickens" for example, then print $keyword; will return
    > "chickens" correctly. the problem is, no matter what i try, i cannot
    > get the grep(/$keyword/) bit to work and @found is always empty! i
    > don't really understand why grep would work fine with $ARGV[2] but not
    > with $keyword and it's drivin me crazy! i've tried @found =
    > grep(/"$keyword"/i, @rf); and i've tried chomp $keyword; and i've even
    > resorted to pushing $keyword into an array and calling the same value
    > from the array as a scalar (i got very very desperate by this point
    > and would try anything ;)) but nothing i do works.
    >
    > can anyone help?! :)
    >

    You need to post a small, complete program that displays this
    behaviour, as well as sample data and output, copying and pasting
    rather than retyping. Check out the posting guidelines. I would have
    suggested "chomp" rather than "chop", but you've tried that. It's also
    possible that the data you are feeding to STDIN has something
    unexpected in it. Bear in mind that you'll need to escape special
    characters if you want to use them in a regex to match the, er, special
    characters.
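
To illustrate the point about special characters: a minimal sketch, assuming the user's input should be matched as literal text rather than as a pattern. `\Q...\E` (the inline form of `quotemeta`) escapes regex metacharacters; `find_literal` and the sample lines are made up for the example.

```perl
use strict;
use warnings;

# Return the lines that contain $term as literal text, ignoring case.
# \Q...\E escapes any regex metacharacters in the user's input.
sub find_literal {
    my ($term, @lines) = @_;
    return grep { /\Q$term\E/i } @lines;
}

my @lines = ("price: 3.50\n", "price: 3x50\n", "what now?\n");

# Unescaped, "." matches any character, so both price lines match.
my @loose = grep { /3.50/ } @lines;

# Escaped, only the line containing the literal "3.50" matches.
my @exact = find_literal('3.50', @lines);

print "loose: ", scalar(@loose), " exact: ", scalar(@exact), "\n";
# prints "loose: 2 exact: 1"
```

If the search term is meant to be a regex, leave the `\Q...\E` out; this only applies when the input is a plain keyword.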

    Mark
     
    Mark Clements, May 25, 2005
    #2

  3. Guest

    Okay, have read the posting guidelines and hopefully understood them,
    sorry about that :)

    Here's a scaled down version of the program that isn't working...

    #!usr/bin/perl
    $filenumber = 0;
    do {
    print "Processing file $filenumber of $#rd\n"; # nb: this is just to debug
    opendir(DH,$ARGV[0]);
    @rd = readdir(DH);
    open(FH,"$ARGV[0]/$rd[$filenumber]");
    @rf = <FH>;
    $filenumber++;
    }
    while ($filenumber <= $#rd);
    do {
    print "File to save to: ";
    $filename = <STDIN>;
    chomp $filename;
    print "Keyword to search for:";
    $searchterm = <STDIN>;
    chomp $searchterm;
    @found = grep(/$searchterm/i, @rf);
    open(SAVETOFILE,">>./new/$filename");
    print SAVETOFILE @found;
    print "rf array: @rf\n";
    print "keyword: $searchterm\n";
    print "found array: @found\n";
    print "Search again? y/n\n";
    $stop = <STDIN>;
    chop $stop;
    if ($stop eq "n") { exit; }
    }
    while ($filename ne "!exit");
    exit;

    Have also tried adding in $searchterm =~ s/[^A-Za-z0-9 .\\:-]*//g; but
    doesn't seem to make a difference. (oh and the directory 'New' does
    exist just in case you were wondering :)).

    And here's some sample data (directory contains 4 txt files. 1.txt
    contains word eggs, 2.txt contains word bacon, 3.txt contains word
    chickens, 4.txt contains word flower)...

    c:\scriptdir> new1.pl c:\prltest\
    Processing file 0 of -1 #this is just cos i put the debug print in
    weird place :)
    Processing file 1 of 5
    Processing file 2 of 5
    Processing file 3 of 5
    Processing file 4 of 5
    Processing file 5 of 5
    File to save to: new.txt
    Keyword to search for: chicken
    rf array: flower
    keyword: chicken
    found array:
    Search again? n

    And that's basically it. As i say, it works absolutely fine with
    $ARGV[2] as input so i'm stumped!

    cheers,
    d
    ps this is only the 3rd script i've ever written in perl, so pls go
    easy on me if i've done something obviously stupid ;)
     
    , May 25, 2005
    #3
  4. Guest

    oh and just to pre-empt anyone lol, i did actually copy and paste that
    script so i assume some of the misformats are due to google's
    newsreader - e.g. open(FH,"$ARGV[0]/$rd[$filenum ber]"); is not a
    mistake in the script (it's actually
    open(FH,"$ARGV[0]/$rd[$filenumber]"); in the script) :)
     
    , May 25, 2005
    #4
  5. John Bokma Guest

    wrote:

    > Okay, have read the posting guidelines and hopefully understood them,


    Probably not entirely, so I added some guidelines ;-)

    > #!usr/bin/perl


    use strict;
    use warnings;

    > opendir(DH,$ARGV[0]);


    check return value

    > open(FH,"$ARGV[0]/$rd[$filenumber]");


    check return value

    > open(SAVETOFILE,">>./new/$filename");


    check

    > chop $stop;


    use chomp if you want to chomp, see perldoc -f chomp

    > if ($stop eq "n") { exit; }
    > }
    > while ($filename ne "!exit");


    nicer:

    while ( 1 ) {

    :
    last if $stop eq 'n';
    }
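
A small sketch of the last two suggestions, under some assumptions: `strip_chomp`, `strip_chop` and `rounds_until_no` are hypothetical helpers, and an in-memory filehandle stands in for STDIN so the example runs without typing anything. The last input line deliberately has no trailing newline, which is where `chop` and `chomp` differ.

```perl
use strict;
use warnings;

# chomp removes a trailing newline if there is one;
# chop removes the last character, whatever it is.
sub strip_chomp { my ($s) = @_; chomp $s; return $s }
sub strip_chop  { my ($s) = @_; chop $s;  return $s }

# Count how many rounds run before the user answers "n".
sub rounds_until_no {
    my ($in) = @_;
    my $rounds = 0;
    while (1) {
        $rounds++;
        my $stop = <$in>;
        last unless defined $stop;    # input ended
        chomp $stop;
        last if $stop eq 'n';
    }
    return $rounds;
}

# In-memory filehandle standing in for STDIN (an assumption for the demo).
my $input = "y\ny\nn";
open my $in, '<', \$input or die "Can't open in-memory handle: $!";
print rounds_until_no($in), "\n";                      # prints 3
print "chop on 'n' gives '", strip_chop("n"), "'\n";   # prints chop on 'n' gives ''
```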


    --
    John Small Perl scripts: http://johnbokma.com/perl/
    Perl programmer available: http://castleamber.com/
    Happy Customers: http://castleamber.com/testimonials.html
     
    John Bokma, May 25, 2005
    #5
  6. Guest

    Ok, I've added the following debug in which will check to see if
    directory can be opened and files are being read properly...

    open(FH,"$ARGV[0]/$rd[$filenumber]");
    print "\nFH is: ";
    print <FH>;

    this returns...

    Processing file 1 of 5
    FH is:
    Processing file 2 of 5
    FH is: eggs
    Processing file 3 of 5
    FH is: bacon
    Processing file 4 of 5
    FH is: chickens
    Processing file 5 of 5
    FH is: flower

    So return value for open(FH,"$ARGV[0]/$rd[$filenumber]"); (and
    therefore opendir(DH,$ARGV[0]);) seem fine.

    Also, the new file is created ok (and it writes to the new file ok when
    using $ARGV[2]), so open(SAVETOFILE,">>./new/$filename"); seems fine
    too, it's just annoying that it's always bleeding empty lol!

    Have changed the ending of the script per your suggestion, and ty for
    that :)

    So I'm once again totally baffled, as all my debug checks seem to show
    everything is working ok. The files in the directory are read to the
    @rf array ok, the new file is created fine, the $keyword stdin works,
    but the script just refuses to grep using the $keyword. And to top it
    off google is doing its best to misformat these posts lol :) ty for
    ongoing help btw, appreciate it :)
     
    , May 25, 2005
    #6
  7. John Bokma Guest

    wrote:

    Learn how to quote, otherwise you will notice that no one is going to reply
    to your postings.

    > Ok, I've added the following debug


    wrong, try again.

    (hint: open( ... ) or die "Can't open '$filename': $!";)

    BTW: I am not saying that it's going to fix your problem, but it might trap
    errors now, or in your future work.
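
Spelled out, the hint might look like the sketch below. `slurp_lines` and the path are hypothetical, and the three-argument `open` with a lexical filehandle is a choice made for the example, not something the hint requires. The point is that printing `<FH>` only shows what was read, while `or die` reports why an open failed.

```perl
use strict;
use warnings;

# Read all lines of a file, dying with the OS error ($!) if the
# file cannot be opened or closed.
sub slurp_lines {
    my ($path) = @_;
    open my $fh, '<', $path or die "Can't open '$path': $!";
    my @lines = <$fh>;
    close $fh or die "Can't close '$path': $!";
    return @lines;
}

# eval traps the die so the demo can report the error and carry on.
my @lines = eval { slurp_lines('no/such/file.txt') };
print "error: $@" if $@;
```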

    --
    John Small Perl scripts: http://johnbokma.com/perl/
    Perl programmer available: http://castleamber.com/
    Happy Customers: http://castleamber.com/testimonials.html
     
    John Bokma, May 25, 2005
    #7
  8. Guest

    mmkay, well these will be manual quotes then as google doesn't have a
    quote feature that i can find, so hopefully they come out ok.

    > (hint open( ... ) or die "Can't open '$filename': $!";


    done on opendir(DH,$ARGV[0]); and open(FH,"$ARGV[0]/$rd[$filenumber]");
    and open(SAVETOFILE,">>./new/$filename"); and they all work fine, no
    errors...
     
    , May 25, 2005
    #8
  9. wrote:

    > Okay, have read the posting guidelines and hopefully understood them,
    > sorry about that :)
    >
    > Here's a scaled down version of the program that isn't working...
    >


    I've piped it through perltidy to make it semi-legible.


    > $filenumber = 0;
    > do {
    > print "Processing file $filenumber of $#rd\n"; # nb: this is just to debug
    > opendir( DH, $ARGV[0] );
    > @rd = readdir(DH);
    > open( FH, "$ARGV[0]/$rd[$filenumber]" );
    > @rf = <FH>;
    > $filenumber++;
    > } while ( $filenumber <= $#rd );


    You are doing opendir and reading the directory each time through the
    loop. You don't need to do this. You aren't checking the return value
    of your system calls. You aren't running with strict and warnings
    (already pointed out). You are probably trying to open "." and ".." as
    files. You are overwriting the value of @rf each time through the loop,
    so @rf will only contain the contents of the last file found in the
    directory, whatever that is.

    use strict;
    use warnings;
    use Data::Dumper;

    my $dirName = shift;
    opendir DIRTOREAD, $dirName or die "could not open dir $dirName: $!";

    my @filesToSearch = grep { -f "$dirName/$_" } readdir DIRTOREAD;
    closedir DIRTOREAD or die "error closing dir $dirName: $!";

    my %fileData = ();

    foreach my $fileName (@filesToSearch) {
        # join with "/" so the path matches the -f test above
        my $fileToSearch = "$dirName/$fileName";
        open IN, "<$fileToSearch"
            or die "could not open $fileToSearch: $!";
        my @lines = map { chomp , $_} <IN>;
        $fileData{$fileName} = \@lines;
    }

    # pass a reference so Dumper shows the hash as one structure
    warn Dumper( \%fileData );

    while ( my ( $fileName, $lines ) = each %fileData ) {
        print "enter search term for $fileName: ";
        my $searchTerm = <STDIN>;
        chomp $searchTerm;
        last unless $searchTerm;
        print "\n";

        my @foundLines = grep { /$searchTerm/ } @$lines;

        print "filename = $fileName searchTerm = $searchTerm\n";
        print "found " . Dumper( \@foundLines ) . "\n";
    }

    use Data::Dumper to make sure that your arrays contain what you think
    they contain....

    Note that doing this loads *all* of the files in the directory into
    memory; you may not want to do this.
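
If memory is a concern, one alternative is to read each file a line at a time and keep only the matches. This is a rough sketch, not a drop-in replacement: `search_files` is a hypothetical helper, the term is matched literally via `\Q...\E`, and the files are re-read on every search, which trades the memory saving for the repeated file reading the original poster wanted to avoid.

```perl
use strict;
use warnings;

# Return the lines containing $term (literal, case-insensitive) from
# the given files, holding only one line in memory at a time.
sub search_files {
    my ($term, @paths) = @_;
    my @found;
    for my $path (@paths) {
        open my $fh, '<', $path or die "Can't open '$path': $!";
        while ( my $line = <$fh> ) {
            push @found, $line if $line =~ /\Q$term\E/i;
        }
        close $fh or die "Can't close '$path': $!";
    }
    return @found;
}

# Usage: perl search.pl 1.txt 2.txt 3.txt   (prints nothing without arguments)
print search_files( 'chicken', @ARGV );
```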

    Mark
     
    Mark Clements, May 25, 2005
    #9
  10. John Bokma Guest

    Mark Clements wrote:

    > my $dirName = shift;
    > opendir DIRTOREAD, $dirName or die "could not open dir $dirName: $!";


    Isn't it more common to use:

    opendir my $dh, etc

    nowadays? (Also CamelCase is something I prefer not to use ;-) )

    > my @lines = map { chomp , $_} <IN>;


    chomp( my @lines = <IN> ); ?

    (Just curious, not nitpicking, ok a little).
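
For reference, the lexical-handle style might look roughly like this. `plain_files` is a hypothetical helper and the `-f` filter is borrowed from the quoted code. A lexical handle such as `$dh` is an ordinary scoped variable rather than a package-wide bareword like `DIRTOREAD`.

```perl
use strict;
use warnings;

# List the plain files in $dir using a lexical directory handle.
sub plain_files {
    my ($dir) = @_;
    opendir my $dh, $dir or die "Can't open dir '$dir': $!";
    my @files = grep { -f "$dir/$_" } readdir $dh;   # skips ".", ".." and subdirectories
    closedir $dh or die "Can't close dir '$dir': $!";
    @files = sort @files;
    return @files;
}

print "$_\n" for plain_files('.');
```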

    --
    John Small Perl scripts: http://johnbokma.com/perl/
    Perl programmer available: http://castleamber.com/
    Happy Customers: http://castleamber.com/testimonials.html
     
    John Bokma, May 25, 2005
    #10
  11. Guest

    Thanks for your reply. Haven't used your code primarily because the
    point of the exercise for me was just to try and learn some perl and
    see if i could make the thing work (i'll move on to elegance later ;)),
    but you did hit the nail on the head with this...

    > You are overwriting the value of @rf each time through the loop,
    > so @rf will only contain the contents of the last file found in the
    > directory, whatever that is.


    Have now changed the code from

    @rf = <FH>;

    to:

    push(@rf, <FH>);

    and it works fine :)
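
The difference between the two lines can be seen in isolation. In this sketch the file contents are faked with array references (an assumption, to keep it runnable without any files): assignment replaces the array on every pass, so only the last file survives, while `push` appends.

```perl
use strict;
use warnings;

# Stand-ins for the lines read from three files.
my @files = ( ["eggs\n"], ["bacon\n"], ["chickens\n"] );

my ( @overwritten, @accumulated );
for my $lines (@files) {
    @overwritten = @$lines;        # replaces the previous contents
    push @accumulated, @$lines;    # appends to what is already there
}

my @hits_overwritten = grep { /bacon/i } @overwritten;
my @hits_accumulated = grep { /bacon/i } @accumulated;

print scalar(@hits_overwritten), " vs ", scalar(@hits_accumulated), "\n";
# prints "0 vs 1"
```

This matches the earlier debug output, where the "rf array" line showed only "flower", the contents of the last file read.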

    have also moved opendir(DH,$ARGV[0]); @rd = readdir(DH); out of the
    first loop as you suggested.

    many thanks to you both for your help :)

    d
     
    , May 25, 2005
    #11
  12. John Bokma wrote:

    > Mark Clements wrote:
    >
    > > my $dirName = shift;
    > > opendir DIRTOREAD, $dirName or die "could not open dir $dirName:
    > > $!";

    >
    > Isn't it more common to use:
    >
    > opendir my $dh, etc


    Sure - I was just following on from the OP's style, or, er, perhaps it
    just didn't occur to me. On another point, I tend not to put the "my"
    there unless it is eg at the start of a foreach. I think it makes
    things clearer if the my is the first non-whitespace on the line.

    > nowadays? (Also CamelCase is something I prefer not to use ;-) )


    Yeah - I've been whistled on this one before :)

    >
    > > my @lines = map { chomp , $_} <IN>;

    >
    > chomp( my @lines = <IN> ); ?
    >


    Good point. I hadn't realised that chomp could be fed a list argument.
    You learn something new every day.
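
A quick comparison of the two forms, as a sketch with made-up data. The `map` block is written with parentheses here only to make it unambiguously a block; it builds the same list as `chomp , $_`. Since `chomp` returns the number of characters it removed, that count ends up in the result next to each line.

```perl
use strict;
use warnings;

my @raw = ( "eggs\n", "bacon\n" );

# chomp on a list assignment strips the newline from every element.
chomp( my @lines = @raw );

# In map, everything the block evaluates to is returned, so chomp's
# return value (characters removed) lands in the list as well.
my @copy  = @raw;
my @mixed = map { ( chomp, $_ ) } @copy;

print scalar(@lines), " vs ", scalar(@mixed), "\n";   # prints "2 vs 4"
print "@mixed\n";                                     # prints "1 eggs 1 bacon"
```

Writing the block as `map { chomp; $_ }` would also return just the lines.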

    regards,

    Mark
     
    Mark Clements, May 25, 2005
    #12
  13. John Bokma Guest

    John Bokma, May 25, 2005
    #13