Jason
In the past I've written a simple search engine and have been using it
for a while, but now I'm trying to make it a little more intuitive.
Originally, the script was simple. Take a keyword entered by a form,
compare it to each value of an array (@data), and return any value
containing the entry:
$var = param('keyword');
foreach $key (@data) {
    if ($key =~ /$var/i) { push (@founddata, $key); }
}
But now I'm trying to allow for multi-word phrases, which is a bit more
complex. I couldn't find how others are doing it, so I'm winging it on
my own. I started by splitting $var on whitespace into an array,
counting how many times each term appears in each $key, then
appending the results to a hash value keyed by the term:
my (%founddata, @keywords);
my $var = param('keyword');
$var =~ s/[,'.]//g; # Remove commas, apostrophes, and periods
@keywords = split(' ', $var); # Split on runs of whitespace
foreach my $key (@data) {
    foreach my $term (@keywords) {
        # Count case-insensitive matches of the term in this entry
        my @matches = $key =~ /(\Q$term\E)/ig;
        my $size = @matches;
        if ($size > 0) { $founddata{$term} .= "${size}::$key|"; }
    }
}
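The same per-term counts could also be kept in a nested hash instead of a packed string, which might save splitting on | and :: later. This is only a minimal sketch: the search string is hard-coded as a stand-in for param('keyword'), %count is a hypothetical name, and it assumes terms should match literally (hence \Q...\E).

```perl
use strict;
use warnings;

my @data     = ("Men", "Women", "Children", "Pets", "Monsters");
my @keywords = split ' ', "m n";   # stand-in for param('keyword')

# Hypothetical layout: $count{$term}{$entry} = number of matches
my %count;
foreach my $entry (@data) {
    foreach my $term (@keywords) {
        # Assigning the match list to () in scalar context gives its length
        my $n = () = $entry =~ /\Q$term\E/ig;
        $count{$term}{$entry} = $n if $n > 0;
    }
}
```

With this layout, exists $count{$term}{$entry} tells whether a term matched an entry at all.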
Let's say that @data = ("Men", "Women", "Children", "Pets",
"Monsters"), and someone searches for "m n" (without the quotes, of
course, because so far I'm not worrying about strict phrases). The
results would be:
$keywords[0] = m;
$keywords[1] = n;
$founddata{'m'} = 1::Men|1::Women|1::Monsters;
$founddata{'n'} = 1::Men|1::Women|1::Children|2::Monsters;
Now, how do I compare the two keys "m" and "n" so that I both remove
any value that isn't in both (i.e., Children) and add the rest together
to create something like 2::Men|2::Women|3::Monsters?
In my mind, I would take the result of this, split it by | into an
array, sort it so that Monsters is first, then split it by :: to remove
the numbers (leaving the final values sorted by the number of
instances), unless you guys know an easier route.
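For what it's worth, here is a rough sketch of the intersect-and-sum step described above, not necessarily the best route. It assumes the counts are held in a nested hash (a hypothetical %count, filled in by hand with the numbers from the example) rather than the packed strings, and it breaks ties alphabetically, which is my own choice.

```perl
use strict;
use warnings;

# Counts copied from the example above; %count is a hypothetical layout
# ($count{$term}{$entry}) standing in for the "1::Men|..." strings.
my %count = (
    m => { Men => 1, Women => 1, Monsters => 1 },
    n => { Men => 1, Women => 1, Children => 1, Monsters => 2 },
);
my @keywords = sort keys %count;

# Sum the counts, keeping only entries that every term matched.
my %total;
ENTRY: foreach my $entry (keys %{ $count{ $keywords[0] } }) {
    my $sum = 0;
    foreach my $term (@keywords) {
        next ENTRY unless exists $count{$term}{$entry};
        $sum += $count{$term}{$entry};
    }
    $total{$entry} = $sum;
}

# Highest total first; ties are broken alphabetically.
my @ranked = sort { $total{$b} <=> $total{$a} or $a cmp $b } keys %total;

print join("|", map { sprintf "%d::%s", $total{$_}, $_ } @ranked), "\n";
# prints 3::Monsters|2::Men|2::Women
```

Since @ranked already holds the bare names in ranked order, there would be no need to split on :: afterwards to strip the numbers.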
TIA,
Jason