Search.cgi followup

Ken Saunders

First, thanks to all the folks who gave me pointers on the Perl
script assignment I was doing. I managed to cobble it together. Now
that I've gotten it together, I'm facing another challenge: the script
is working but won't return any search results. Here is the code. Can
someone tell me why I get no results page? I'm new at this, and
I've already learned that Perl is really fun and amazingly frustrating
at the same time. Thanks.

bnbliss


#!/usr/bin/perl
#this perl script performs a keyword search and displays the results
use strict;
use CGI qw(:standard);
use File::Find;

# Root directory of my website
my $filePath = '/home/classes/ksaund01/public_html';

print header;
print start_html(-bgcolor=>'lightblue');

if( param('criteria') ) {
find(\&search_file, $filePath);
}
else {
display_menu();
}

print end_html;

# Subroutines

sub search_file {

my $query = param('query');

if( $_ !~ /html|txt$/o ) {
return();
}

open(IN, "$_") or warn "Can't open $_: $!\n";

while ( my $line = <IN> ) {
chomp($line);

# Remove HTML from the line
$line =~ s/\<.*?\>//g;

# Clean up the filename and turn it into
# a valid relative URL so that it can be used
# as a link
my $uri = $File::Find::name;
$uri =~ s/^$filePath//;
$uri = "/$uri";

if( $line =~ /$query/o ) {
print "<A HREF=$uri>$_</A><BR>";
}

}
close(IN);

}

sub display_menu {

print start_form,
b('Search this site for:'),
br,
textfield(-name=>'criteria'),
br,
submit(-name=>'Search'),
end_form;

}
 
Tad McClellan

Ken Saunders said:
the script
is working


Don't fix it if it isn't broken.

but won't return any search results.


Errr, I guess you assign a rather strange definition to the
word "working" then...

Here is the code can
someone tell me why I get no results page?
^^^^^^^^^^^^^^^^^^^


What, exactly, does that mean?

The browser "hangs" and you never get *anything* back?

You get a web page, but it doesn't report finding any "hits"?

Have you looked for messages in your server log?

(Have you already looked for Perl FAQs that mention the CGI?

perldoc -q CGI
)

use strict;


Good, but you should also add:

use warnings; # ask for all the help you can get!

find(\&search_file, $filePath);
sub search_file {

my $query = param('query');

if( $_ !~ /html|txt$/o ) {
return();
}


I'd switch the order of those 2 operations. There is no point in
fetching a param only to return() without using it.

The m//o does not do anything for the pattern you are using, so
it should not be there. Don't throw options on the end willy-nilly.
Either understand what they do for you, or don't use them yet.


Your pattern will match 'foo.html.bar' you know.

I _was_ just going to ask you to search for "precedence" in perlre.pod,
but that doesn't find docs that explain why it will match. The
right place is harder to find than it should be, but you can find
it by searching for "minimize confusion".

Your pattern says:
match "html" anywhere or "txt" at the end of string

as if you had written /html|(txt$)/.

I expect those are meant to be filename extensions, so you should
also require the dot before the extension.
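
To see the difference between the two patterns, here is a small sketch that runs both against a handful of made-up filenames. The names in @names are hypothetical and chosen only to exercise the alternation; they are not taken from the original site.

```perl
use strict;
use warnings;

# Hypothetical filenames, picked to show where the two patterns disagree.
my @names = ('index.html', 'notes.txt', 'foo.html.bar', 'htmlhelp.pl', 'readme.txt.bak');

# Original pattern: "html" anywhere in the name, OR "txt" at the end.
my @loose = grep { /html|txt$/ } @names;

# Anchored pattern: a literal dot, then either extension, then end of string.
my @anchored = grep { /\.(html|txt)$/ } @names;

print "loose:    @loose\n";
print "anchored: @anchored\n";
```

The loose pattern also accepts 'foo.html.bar' and 'htmlhelp.pl', while the anchored one keeps only 'index.html' and 'notes.txt'.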


You can say "unless" instead of "if not", which seems preferable
here (to me at least).


Phew! That's a lot of comments for only 4 lines of code. :)

So, you can replace those four lines with these two:

return unless /\.(html|txt)$/;
my $query = param('query');

If the query might contain regex metacharacters that you want to
match literally, then you'll want this instead:

my $query = quotemeta param('query');
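
A minimal sketch of what quotemeta changes, assuming a search term that happens to contain parentheses. The strings below are invented for illustration, and the param() call is replaced by a plain variable so the snippet runs outside a web server.

```perl
use strict;
use warnings;

my $query = '(each)';   # hypothetical user input containing regex metacharacters

# Unescaped, the parentheses are a capture group, so "each" matches inside "teach".
my $raw_hit = 'teach yourself Perl' =~ /$query/ ? 1 : 0;

# Escaped, the parentheses have to appear literally in the text.
my $safe        = quotemeta $query;
my $literal_hit = 'widgets cost $5.00 (each)' =~ /$safe/ ? 1 : 0;
my $false_hit   = 'teach yourself Perl'       =~ /$safe/ ? 1 : 0;

print "raw=$raw_hit literal=$literal_hit false=$false_hit\n";
```

Writing /\Q$query\E/ inside the match has the same effect as calling quotemeta first.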

open(IN, "$_") or warn "Can't open $_: $!\n";


perldoc -q vars

What's wrong with always quoting "$vars"?

then:

open(IN, $_) or warn "Can't open $_: $!\n";
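
For a filename in $_ the extra quotes are harmless, just unnecessary. The habit bites when the variable holds something other than a plain string. A small sketch, using a made-up array reference, of how quoting silently turns a reference into text:

```perl
use strict;
use warnings;

my @hits = ('index.html', 'notes.txt');   # hypothetical data
my $ref  = \@hits;

my $quoted = "$ref";   # stringified: now text that looks like ARRAY(0x...)
my $kept   = $ref;     # still a usable array reference

my $quoted_is_ref = ref($quoted) ? 1 : 0;
my $kept_is_ref   = ref($kept)   ? 1 : 0;

print "quoted is a reference: $quoted_is_ref\n";
print "kept is a reference:   $kept_is_ref\n";
```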

while ( my $line = <IN> ) {
chomp($line);

# Remove HTML from the line
$line =~ s/\<.*?\>//g;


perldoc -q HTML

How do I remove HTML from a string?

Which gives several examples of HTML that will mess things
up for the pattern you are using.

HTML tags may span across more than one line too.

Since these are your own files, you might be able to guarantee
that none of that "tricky stuff" will be present, but in general
you would need to do a Real Parse of the HTML to do it correctly.
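
To illustrate the multi-line problem, here is a sketch with an invented two-line snippet in which one tag is split across lines. It compares stripping line by line, as the posted script does, with stripping after joining the text into one string. Neither is a real parse; both would still be fooled by a ">" inside an attribute value or a comment.

```perl
use strict;
use warnings;

# Hypothetical input: the opening <a ...> tag spans two lines.
my @lines = (
    qq(<a href="page.html"\n),
    qq(   title="search tips">Tips</a> on searching\n),
);

# Line-by-line stripping: the split tag is never seen whole.
my @per_line = map { (my $copy = $_) =~ s/<.*?>//g; $copy } @lines;

# Stripping the joined text; /s lets . match the newline inside the tag.
(my $joined = join '', @lines) =~ s/<.*?>//gs;

print "per line: $_" for @per_line;
print "joined:   $joined";
```

Line by line, the first line is left untouched and the second keeps the stray attribute text; on the joined string only "Tips on searching" remains.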

Angle brackets are not meta in regular expressions, so there is
no need to backslash them.

$uri =~ s/^$filePath//;
$uri = "/$uri";


You can combine those 2 into a single substitution:

$uri =~ s/^$filePath/\//;

or, since you now have a slash character in your replacement string,
choose to use an alternate delimiter so that you won't need
any backslashing:

$uri =~ s#^$filePath#/#;
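
A quick sketch of the substitutions side by side. $found is a hypothetical stand-in for $File::Find::name. The last variant is my own assumption rather than something from the original script: since File::Find joins directory and filename with a slash, it matches that slash as well, and it wraps $filePath in \Q...\E in case the path ever contains regex metacharacters.

```perl
use strict;
use warnings;

my $filePath = '/home/classes/ksaund01/public_html';
my $found    = "$filePath/docs/index.html";   # hypothetical File::Find name

# Two-step version from the original script.
(my $two_step = $found) =~ s/^$filePath//;
$two_step = "/$two_step";

# Single substitution with an alternate delimiter.
(my $combined = $found) =~ s#^$filePath#/#;

# Assumed variant: also consume the slash that follows the root directory.
(my $one_slash = $found) =~ s#^\Q$filePath\E/#/#;

print "$two_step\n$combined\n$one_slash\n";
```

The first two produce the same string, with a doubled leading slash; the third yields a single one.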

if( $line =~ /$query/o ) {


We need 2 pieces of information to analyse why a pattern match
is not working correctly (the pattern and the string it is to
be matched against).

We have zero of those pieces of information, so we cannot help
explain why it is, or is not, matching...
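
One way to collect both pieces of information is to write them to STDERR, which normally ends up in the server log. The sketch below uses a plain hash (%form, hypothetical) in place of CGI's param() so it runs anywhere. It also mirrors something visible in the posted code: the form field is named 'criteria', while search_file() asks for 'query'. Whether that mismatch is the actual cause is not confirmed anywhere in this thread.

```perl
use strict;
use warnings;

# Stand-in for the submitted form; the posted form names its field 'criteria'.
my %form = (criteria => 'perl');

# The posted search_file() looks up 'query' instead.
my $query = $form{query};

# Hypothetical line of text to match against.
my $line = 'Perl is fun';

# Report the pattern and the string together, marking an undefined pattern.
my $report = sprintf 'pattern=[%s] string=[%s]',
    defined $query ? $query : 'undef', $line;

print STDERR "$report\n";
```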

print "<A HREF=$uri>$_</A><BR>";


You really should put quotes around your attribute values.

Using an alternate form of double quoting helps to avoid
yet more backslashing:

print qq(<A HREF="$uri">$_</A><BR>);
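
A short check that the two spellings build the same string, using made-up values for the link:

```perl
use strict;
use warnings;

my ($uri, $name) = ('/docs/index.html', 'index.html');   # hypothetical values

# Ordinary double quotes: every inner quote needs a backslash.
my $escaped = "<A HREF=\"$uri\">$name</A><BR>";

# qq() interpolates the same way but leaves inner double quotes alone.
my $with_qq = qq(<A HREF="$uri">$name</A><BR>);

print "$with_qq\n";
```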
 
