...
What is the best method to remove a lot of words such as "a", "and", "I",
"so", "that", "this" ... etc. from the search string, leaving only the
keywords? Each page/field would then be searched for occurrences of those
keywords, which come from the text the user types into an input field.
The idea is to return only the suitable records without a lot of
rubbish...
You want to create yourself an IGNORE WORDS list. I put one of these
together before; it contains about 390 words now, I think. You can search
Google and find lists like this pretty easily. The pain is sometimes
getting them from the web into a format you can use, perhaps from a web
page to Excel to XML or something...
With the list in place, you just need a little function that takes your
search criteria and compares each word in it against the ignore words. If
you find a match, don't keep the word as good criteria; if there's no
match, keep it...
Example:
Dim aIgnoreWords(3) ' upper bound 3 = four elements, indexes 0 to 3
aIgnoreWords(0) = "a"
aIgnoreWords(1) = "i"
aIgnoreWords(2) = "them"
aIgnoreWords(3) = "must"
bMatchFound = False
sNewSearchCriteria = ""
sSearchCriteria = "I must find them and a donkey"
aSearchCriteriaWords = Split(sSearchCriteria, " ")
' iterate through search criteria words (UBound is the last valid index)
For x = 0 To UBound(aSearchCriteriaWords)
    ' iterate through our ignore words
    For y = 0 To UBound(aIgnoreWords)
        ' do we have a match?
        If LCase(aSearchCriteriaWords(x)) = LCase(aIgnoreWords(y)) Then
            ' match found!
            bMatchFound = True
            Exit For
        End If
    Next
    ' if we didn't match our criteria word to any ignore words then it's a
    ' good criteria word, add it to our new search criteria
    If bMatchFound = False Then
        ' add a space to separate it from any words already kept
        If Len(sNewSearchCriteria) > 0 Then
            sNewSearchCriteria = sNewSearchCriteria & " "
        End If
        sNewSearchCriteria = sNewSearchCriteria & aSearchCriteriaWords(x)
    Else
        ' reset flag ready for next search criteria word
        bMatchFound = False
    End If
Next
By the end of this, your sNewSearchCriteria string should contain: "find and
donkey" ("and" survives only because it isn't in the four-word sample list)
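The same idea can be wrapped in the "little function" mentioned above. This is only a rough sketch, not a tested implementation: the names BuildIgnoreLookup and RemoveIgnoreWords are made up for illustration, and it swaps the inner loop for a Scripting.Dictionary lookup so each word is checked once rather than compared against every ignore word. WScript.Echo assumes you run it with cscript; in an ASP page you would use Response.Write instead.

```vbscript
' Builds a case-insensitive lookup from an array of ignore words.
' (BuildIgnoreLookup is a hypothetical helper name.)
Function BuildIgnoreLookup(aWords)
    Dim oLookup, i
    Set oLookup = CreateObject("Scripting.Dictionary")
    oLookup.CompareMode = 1 ' vbTextCompare: "I" and "i" are the same key
    For i = 0 To UBound(aWords)
        If Not oLookup.Exists(aWords(i)) Then oLookup.Add aWords(i), True
    Next
    Set BuildIgnoreLookup = oLookup
End Function

' Returns sCriteria with every ignore word removed.
' (RemoveIgnoreWords is a hypothetical function name.)
Function RemoveIgnoreWords(sCriteria, oLookup)
    Dim aWords, i, sResult
    sResult = ""
    aWords = Split(Trim(sCriteria), " ")
    For i = 0 To UBound(aWords)
        ' the Len check skips empty entries caused by double spaces
        If Len(aWords(i)) > 0 And Not oLookup.Exists(aWords(i)) Then
            If Len(sResult) > 0 Then sResult = sResult & " "
            sResult = sResult & aWords(i)
        End If
    Next
    RemoveIgnoreWords = sResult
End Function

Dim oIgnore
Set oIgnore = BuildIgnoreLookup(Array("a", "i", "them", "must"))
WScript.Echo RemoveIgnoreWords("I must find them and a donkey", oIgnore)
' prints: find and donkey
```

Building the lookup once and reusing it for every search should matter more as the list grows towards the full 390 words.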
This is a little example only and it's untested, so it might error. You
should also consider doing some checks:
- Is Len(sSearchCriteria) > 0 (do we have anything to search for at
all?!)
- Have your ignore words loaded successfully (perhaps from a
database/XML)?
- Do you want to store the words that were omitted, so you can do what
Google used to do... "The following common words were excluded: a, i,
them, must" etc.
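Those checks could look something like the sketch below. Again this is an untested illustration under assumptions: SplitCriteria and its ByRef parameters sKept and sExcluded are invented names, the ignore list is a small hard-coded Scripting.Dictionary standing in for the real one, and WScript.Echo stands in for Response.Write.

```vbscript
' Splits sCriteria into kept keywords and excluded ignore words.
' sKept and sExcluded are ByRef outputs; the function returns True only
' when at least one keyword is left to search for.
' (SplitCriteria is a hypothetical function name.)
Function SplitCriteria(sCriteria, oLookup, ByRef sKept, ByRef sExcluded)
    Dim aWords, i
    sKept = ""
    sExcluded = ""
    aWords = Split(Trim(sCriteria), " ")
    For i = 0 To UBound(aWords)
        If Len(aWords(i)) > 0 Then
            If oLookup.Exists(aWords(i)) Then
                ' remember the ignored word so it can be reported back
                If Len(sExcluded) > 0 Then sExcluded = sExcluded & ", "
                sExcluded = sExcluded & LCase(aWords(i))
            Else
                If Len(sKept) > 0 Then sKept = sKept & " "
                sKept = sKept & aWords(i)
            End If
        End If
    Next
    SplitCriteria = (Len(sKept) > 0)
End Function

Dim oIgnore, w, sKept, sExcluded
Set oIgnore = CreateObject("Scripting.Dictionary")
oIgnore.CompareMode = 1 ' vbTextCompare, so matching ignores case
For Each w In Array("a", "i", "them", "must")
    oIgnore.Add w, True
Next

If oIgnore.Count = 0 Then
    WScript.Echo "Ignore list failed to load"
ElseIf SplitCriteria("I must find them and a donkey", oIgnore, sKept, sExcluded) Then
    WScript.Echo "Searching for: " & sKept
    If Len(sExcluded) > 0 Then
        WScript.Echo "The following common words were excluded: " & sExcluded
    End If
Else
    WScript.Echo "Nothing left to search for"
End If
```

With the sample input this should report "find and donkey" as the search and "i, must, them, a" as the excluded words, in the order they were typed.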
My list of ignore words is below. It's in XML format, but you could easily
do a REPLACE ALL in Word or something to remove the tags. Hope it's of use.
Regards
Rob
<?xml version="1.0" encoding="utf-8" ?>
<IgnoreWords>
<Word>a</Word>
<Word>about</Word>
<Word>above</Word>
<Word>according</Word>
<Word>across</Word>
<Word>actually</Word>
<Word>adj</Word>
<Word>after</Word>
<Word>afterwards</Word>
<Word>again</Word>
<Word>against</Word>
<Word>all</Word>
<Word>almost</Word>
<Word>alone</Word>
<Word>along</Word>
<Word>already</Word>
<Word>also</Word>
<Word>although</Word>
<Word>always</Word>
<Word>among</Word>
<Word>amongst</Word>
<Word>an</Word>
<Word>and</Word>
<Word>another</Word>
<Word>any</Word>
<Word>anyhow</Word>
<Word>anyone</Word>
<Word>anything</Word>
<Word>anywhere</Word>
<Word>are</Word>
<Word>aren't</Word>
<Word>around</Word>
<Word>as</Word>
<Word>at</Word>
<Word>b</Word>
<Word>be</Word>
<Word>became</Word>
<Word>because</Word>
<Word>become</Word>
<Word>becomes</Word>
<Word>becoming</Word>
<Word>been</Word>
<Word>before</Word>
<Word>beforehand</Word>
<Word>begin</Word>
<Word>beginning</Word>
<Word>behind</Word>
<Word>being</Word>
<Word>below</Word>
<Word>beside</Word>
<Word>besides</Word>
<Word>between</Word>
<Word>beyond</Word>
<Word>billion</Word>
<Word>both</Word>
<Word>but</Word>
<Word>by</Word>
<Word>c</Word>
<Word>can</Word>
<Word>can't</Word>
<Word>cannot</Word>
<Word>caption</Word>
<Word>co</Word>
<Word>co.</Word>
<Word>could</Word>
<Word>couldn't</Word>
<Word>d</Word>
<Word>did</Word>
<Word>didn't</Word>
<Word>do</Word>
<Word>does</Word>
<Word>doesn't</Word>
<Word>don't</Word>
<Word>down</Word>
<Word>during</Word>
<Word>e</Word>
<Word>each</Word>
<Word>eg</Word>
<Word>eight</Word>
<Word>eighty</Word>
<Word>either</Word>
<Word>else</Word>
<Word>elsewhere</Word>
<Word>end</Word>
<Word>ending</Word>
<Word>enough</Word>
<Word>etc</Word>
<Word>even</Word>
<Word>ever</Word>
<Word>every</Word>
<Word>everyone</Word>
<Word>everything</Word>
<Word>everywhere</Word>
<Word>except</Word>
<Word>f</Word>
<Word>few</Word>
<Word>fifty</Word>
<Word>first</Word>
<Word>five</Word>
<Word>for</Word>
<Word>former</Word>
<Word>formerly</Word>
<Word>forty</Word>
<Word>found</Word>
<Word>four</Word>
<Word>from</Word>
<Word>further</Word>
<Word>g</Word>
<Word>h</Word>
<Word>had</Word>
<Word>has</Word>
<Word>hasn't</Word>
<Word>have</Word>
<Word>haven't</Word>
<Word>he</Word>
<Word>he'd</Word>
<Word>he'll</Word>
<Word>he's</Word>
<Word>hence</Word>
<Word>her</Word>
<Word>here</Word>
<Word>here's</Word>
<Word>hereafter</Word>
<Word>hereby</Word>
<Word>herein</Word>
<Word>hereupon</Word>
<Word>hers</Word>
<Word>herself</Word>
<Word>him</Word>
<Word>himself</Word>
<Word>his</Word>
<Word>how</Word>
<Word>however</Word>
<Word>hundred</Word>
<Word>i</Word>
<Word>i'd</Word>
<Word>i'll</Word>
<Word>i'm</Word>
<Word>i've</Word>
<Word>ie</Word>
<Word>if</Word>
<Word>in</Word>
<Word>inc.</Word>
<Word>indeed</Word>
<Word>instead</Word>
<Word>into</Word>
<Word>is</Word>
<Word>isn't</Word>
<Word>it</Word>
<Word>it's</Word>
<Word>its</Word>
<Word>itself</Word>
<Word>j</Word>
<Word>k</Word>
<Word>l</Word>
<Word>last</Word>
<Word>later</Word>
<Word>latter</Word>
<Word>latterly</Word>
<Word>least</Word>
<Word>less</Word>
<Word>let</Word>
<Word>let's</Word>
<Word>like</Word>
<Word>likely</Word>
<Word>ltd</Word>
<Word>m</Word>
<Word>made</Word>
<Word>make</Word>
<Word>makes</Word>
<Word>many</Word>
<Word>maybe</Word>
<Word>me</Word>
<Word>meantime</Word>
<Word>meanwhile</Word>
<Word>might</Word>
<Word>million</Word>
<Word>miss</Word>
<Word>more</Word>
<Word>moreover</Word>
<Word>most</Word>
<Word>mostly</Word>
<Word>mr</Word>
<Word>mrs</Word>
<Word>much</Word>
<Word>must</Word>
<Word>my</Word>
<Word>myself</Word>
<Word>n</Word>
<Word>namely</Word>
<Word>neither</Word>
<Word>never</Word>
<Word>nevertheless</Word>
<Word>next</Word>
<Word>nine</Word>
<Word>ninety</Word>
<Word>no</Word>
<Word>nobody</Word>
<Word>none</Word>
<Word>nonetheless</Word>
<Word>noone</Word>
<Word>nor</Word>
<Word>not</Word>
<Word>nothing</Word>
<Word>now</Word>
<Word>nowhere</Word>
<Word>o</Word>
<Word>of</Word>
<Word>off</Word>
<Word>often</Word>
<Word>on</Word>
<Word>once</Word>
<Word>one</Word>
<Word>one's</Word>
<Word>only</Word>
<Word>onto</Word>
<Word>or</Word>
<Word>other</Word>
<Word>others</Word>
<Word>otherwise</Word>
<Word>our</Word>
<Word>ours</Word>
<Word>ourselves</Word>
<Word>out</Word>
<Word>over</Word>
<Word>overall</Word>
<Word>own</Word>
<Word>p</Word>
<Word>per</Word>
<Word>perhaps</Word>
<Word>q</Word>
<Word>r</Word>
<Word>rather</Word>
<Word>recent</Word>
<Word>recently</Word>
<Word>s</Word>
<Word>same</Word>
<Word>seem</Word>
<Word>seemed</Word>
<Word>seeming</Word>
<Word>seems</Word>
<Word>seven</Word>
<Word>seventy</Word>
<Word>several</Word>
<Word>she</Word>
<Word>she'd</Word>
<Word>she'll</Word>
<Word>she's</Word>
<Word>should</Word>
<Word>shouldn't</Word>
<Word>since</Word>
<Word>six</Word>
<Word>sixty</Word>
<Word>so</Word>
<Word>some</Word>
<Word>somehow</Word>
<Word>someone</Word>
<Word>something</Word>
<Word>sometime</Word>
<Word>sometimes</Word>
<Word>somewhere</Word>
<Word>still</Word>
<Word>stop</Word>
<Word>stoplist</Word>
<Word>such</Word>
<Word>t</Word>
<Word>taking</Word>
<Word>ten</Word>
<Word>than</Word>
<Word>that</Word>
<Word>that'll</Word>
<Word>that's</Word>
<Word>that've</Word>
<Word>the</Word>
<Word>their</Word>
<Word>them</Word>
<Word>themselves</Word>
<Word>then</Word>
<Word>thence</Word>
<Word>there</Word>
<Word>there'd</Word>
<Word>there'll</Word>
<Word>there're</Word>
<Word>there's</Word>
<Word>there've</Word>
<Word>thereafter</Word>
<Word>thereby</Word>
<Word>therefore</Word>
<Word>therein</Word>
<Word>thereupon</Word>
<Word>these</Word>
<Word>they</Word>
<Word>they'd</Word>
<Word>they'll</Word>
<Word>they're</Word>
<Word>they've</Word>
<Word>thirty</Word>
<Word>this</Word>
<Word>those</Word>
<Word>though</Word>
<Word>thousand</Word>
<Word>three</Word>
<Word>through</Word>
<Word>throughout</Word>
<Word>thru</Word>
<Word>thus</Word>
<Word>to</Word>
<Word>together</Word>
<Word>too</Word>
<Word>toward</Word>
<Word>towards</Word>
<Word>trillion</Word>
<Word>twenty</Word>
<Word>two</Word>
<Word>u</Word>
<Word>under</Word>
<Word>unless</Word>
<Word>unlike</Word>
<Word>unlikely</Word>
<Word>until</Word>
<Word>up</Word>
<Word>upon</Word>
<Word>us</Word>
<Word>used</Word>
<Word>using</Word>
<Word>v</Word>
<Word>very</Word>
<Word>via</Word>
<Word>w</Word>
<Word>was</Word>
<Word>wasn't</Word>
<Word>we</Word>
<Word>we'd</Word>
<Word>we'll</Word>
<Word>we're</Word>
<Word>we've</Word>
<Word>well</Word>
<Word>were</Word>
<Word>weren't</Word>
<Word>what</Word>
<Word>what'll</Word>
<Word>what's</Word>
<Word>what've</Word>
<Word>whatever</Word>
<Word>when</Word>
<Word>whence</Word>
<Word>whenever</Word>
<Word>where</Word>
<Word>where's</Word>
<Word>whereafter</Word>
<Word>whereas</Word>
<Word>whereby</Word>
<Word>wherein</Word>
<Word>whereupon</Word>
<Word>wherever</Word>
<Word>whether</Word>
<Word>which</Word>
<Word>while</Word>
<Word>whither</Word>
<Word>who</Word>
<Word>who'd</Word>
<Word>who'll</Word>
<Word>who's</Word>
<Word>whoever</Word>
<Word>whole</Word>
<Word>whom</Word>
<Word>whomever</Word>
<Word>whose</Word>
<Word>why</Word>
<Word>will</Word>
<Word>with</Word>
<Word>within</Word>
<Word>without</Word>
<Word>won't</Word>
<Word>would</Word>
<Word>wouldn't</Word>
<Word>x</Word>
<Word>y</Word>
<Word>yes</Word>
<Word>yet</Word>
<Word>you</Word>
<Word>you'd</Word>
<Word>you'll</Word>
<Word>you're</Word>
<Word>you've</Word>
<Word>your</Word>
<Word>yours</Word>
<Word>yourself</Word>
<Word>yourselves</Word>
<Word>z</Word>
</IgnoreWords>
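If you keep the list as XML rather than stripping the tags, something along these lines could load it. Treat it as a sketch under assumptions: LoadIgnoreWords and the file name ignorewords.xml are made up for illustration, it assumes MSXML is installed (it normally is on a Windows/IIS box), and in an ASP page you would likely use Server.CreateObject and Server.MapPath rather than a bare file name.

```vbscript
' Loads every <Word> element from the XML file into a case-insensitive
' Scripting.Dictionary. Returns an empty Dictionary if the file is missing
' or is not well-formed XML.
' (LoadIgnoreWords is a hypothetical function name.)
Function LoadIgnoreWords(sPath)
    Dim oXml, oNode, oLookup, sWord
    Set oLookup = CreateObject("Scripting.Dictionary")
    oLookup.CompareMode = 1 ' vbTextCompare
    Set oXml = CreateObject("MSXML2.DOMDocument")
    oXml.async = False ' wait until the whole file has been parsed
    If oXml.load(sPath) Then
        For Each oNode In oXml.selectNodes("/IgnoreWords/Word")
            sWord = Trim(oNode.text)
            If Len(sWord) > 0 And Not oLookup.Exists(sWord) Then
                oLookup.Add sWord, True
            End If
        Next
    End If
    Set LoadIgnoreWords = oLookup
End Function

Dim oIgnore
' "ignorewords.xml" is a hypothetical file name for the list above
Set oIgnore = LoadIgnoreWords("ignorewords.xml")
WScript.Echo oIgnore.Count & " ignore words loaded"
```

A count of zero is a cheap way to spot that the list did not load before you start filtering searches with it.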