Search - stop words - array/database/text file?

Discussion in 'ASP General' started by Rob Meade, Feb 7, 2004.

  1. Rob Meade

    Lo all,

    Ok - I'm adding site search functionality to a database driven website.

    I have a list of 390 stop/ignore words. Having looked at ASPFAQ already, I
    see that the example uses an array; what I was wondering was whether this
    would still be the best practice for this quantity of stop words?

    Defining the array carries more overhead for me up front, as I will have to
    hard-code all the words. Alternatively, I thought I could import them into
    a SQL Server table from the Excel file they are currently in and then query
    that, but I believe the ASPFAQ article gave a good reason for not doing
    that. My last thought was to read them in from a text file...

    Anyone got any thoughts? Would an array be as efficient for 390 stop words
    as it is for 10-20? Is it better to hard-code them rather than grab them
    from a database?

    Any help / advice would be appreciated.

    Regards

    Rob
    Rob Meade, Feb 7, 2004
    #1

  2. Bob Barrows


    If the list will be static, I would store it in an Application variable,
    making the decision as to whether to store it in a database or a textfile
    superfluous. If the list has no relationship to any of your database data,
    then a text file on your web server seems to be indicated.

    Moreover, I would suggest storing it as an XML DOMDocument, allowing you to
    use the XML Parser DOM methods to easily search for values in the list.
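
    A minimal sketch of the text-file-plus-Application-variable idea described
    above, assuming one stop word per line in the file. The function name
    LoadStopWords, the Application key "StopWords" and the path /stopwords.txt
    are hypothetical and do not come from this thread.

```vbscript
' Read one stop word per line from a text file and return them as an array.
Function LoadStopWords(strPath)
    Dim fso, ts, strAll
    Set fso = CreateObject("Scripting.FileSystemObject")
    Set ts = fso.OpenTextFile(strPath, 1)      ' 1 = ForReading
    strAll = ""
    ' ReadAll raises an error on an empty file, so check first
    If Not ts.AtEndOfStream Then strAll = ts.ReadAll
    ts.Close

    ' Normalize line endings and drop one trailing newline, if present
    strAll = Replace(strAll, vbCrLf, vbLf)
    If Right(strAll, 1) = vbLf Then strAll = Left(strAll, Len(strAll) - 1)

    LoadStopWords = Split(strAll, vbLf)
End Function

' Possible use in global.asa (file name and key are assumptions):
' Sub Application_OnStart
'     Application.Lock
'     Application("StopWords") = LoadStopWords(Server.MapPath("/stopwords.txt"))
'     Application.UnLock
' End Sub
```

    A page would then read the list with something like
    aIgnoreWords = Application("StopWords"), so the file is parsed once per
    application start rather than on every request.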

    Bob Barrows

    --
    Microsoft MVP - ASP/ASP.NET
    Please reply to the newsgroup. This email account is my spam trap so I
    don't check it very often. If you must reply off-line, then remove the
    "NO SPAM"
    Bob Barrows, Feb 7, 2004
    #2

  3. Rob Meade

    "Bob Barrows" wrote ...

    > If the list will be static, I would store it in an Application variable,
    > making the decision as to whether to store it in a database or a textfile
    > superfluous.


    Hi Bob,

    Yes, initially this list will definitely be static. I do not plan to add to
    the list dynamically at this stage.

    > If the list has no relationship to any of your database data,
    > then a text file on your web server seems to be indicated.


    ok

    > Moreover, I would suggest storing it as an XML DOMDocument, allowing you
    > to use the XML Parser DOM methods to easily search for values in the list.


    hmmm...hadn't thought of XML for this..

    Wouldn't this be quite a bit of extra code, considering all I want to do is
    iterate through the list and chop those words out of the original search
    string? Maybe not, I'm not sure... I can't see the advantages of this method.

    Any further info appreciated..

    Regards

    Rob
    Rob Meade, Feb 7, 2004
    #3
  4. Bob Barrows

    Rob Meade wrote:
    > Wouldn't this be quite a bit of extra code, considering all I want to
    > do is iterate through the list and chop those words out of the
    > original search string? Maybe not, I'm not sure... I can't see the
    > advantages of this method.
    >


    Ah! I see. I was thinking you would need to do the opposite: find specific
    words in the list.

    To find a word in an array:

        for i = 0 to ubound(ar)
            if ar(i) = <something> then
                exit for
            end if
        next

    To find a word in an XML Document:

        xmldoc.selectsinglenode("/root/node[value='<something>']")

    There is no extra code involved in looping through a DOM Document:

        for each oNode in xmldoc.documentelement.childnodes
            'do something with oNode.Text
        next

    Given the comparative sizes of the array and xml document, if I did not need
    search capabilities, I would go with the array.
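
    For reference, a small sketch of the XML lookup above, assuming a document
    shaped the way that XPath expression implies (a root element holding node
    elements, each with a value child). The helper names LoadStopWordXml and
    IsStopWord are hypothetical. The match is case-sensitive, and a word that
    contains an apostrophe would break the expression as written.

```vbscript
' Parse an XML string into a DOM document set up for XPath queries.
Function LoadStopWordXml(strXml)
    Dim doc
    Set doc = CreateObject("MSXML2.DOMDocument")
    doc.async = False
    doc.setProperty "SelectionLanguage", "XPath"
    doc.loadXML strXml
    Set LoadStopWordXml = doc
End Function

' True when some <node> has a <value> child equal to strWord.
Function IsStopWord(doc, strWord)
    Dim oNode
    Set oNode = doc.selectSingleNode("/root/node[value='" & strWord & "']")
    IsStopWord = Not (oNode Is Nothing)
End Function

Dim xmldoc
Set xmldoc = LoadStopWordXml( _
    "<root>" & _
    "<node><value>a</value></node>" & _
    "<node><value>the</value></node>" & _
    "</root>")
' IsStopWord(xmldoc, "the") is True; IsStopWord(xmldoc, "cat") is False
```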

    Bob Barrows

    Bob Barrows, Feb 7, 2004
    #4
  5. Rob Meade

    "Bob Barrows" wrote ...

    > Given the comparative sizes of the array and xml document, if I did not
    > need search capabilities, I would go with the array.


    Hi Bob,

    Many thanks for the reply and the examples. I will use the array method for
    now. If you have time, I've another question: see Search (part 2) :oD

    Cheers

    Rob
    Rob Meade, Feb 7, 2004
    #5
  6. Roland Hall

    "Bob Barrows" wrote:
    : To find a word in an array:
    : for i = 0 to ubound(ar)
    : if ar(i) = <something> then
    : exit for
    : end if
    : next

    Or you could use Filter and cut the For...Next loop down to the few
    candidates it returns:

    <%@ Language=VBScript %>
    <%
    Option Explicit
    Response.Buffer = True

    ' Write a line followed by an HTML line break.
    sub lPrt(strMsg)
        Response.Write(strMsg & "<br />" & vbCrLf)
    end sub

    ' Write text with no line break.
    sub Prt(strMsg)
        Response.Write(strMsg)
    end sub

    sub findWord(arr, fWord)
        if isFound(arr, fWord) then
            lPrt(fWord & " found in array.")
        else
            lPrt(fWord & " not found in array.")
        end if
    end sub

    ' Filter matches substrings, so it can return several candidates
    ' (searching for "a" would also return "and"). Check each candidate
    ' for an exact match instead of assuming exactly one result.
    function isFound(arr, fWord)
        dim f, i
        isFound = False
        f = Filter(arr, fWord)
        for i = 0 to ubound(f)
            if f(i) = fWord then
                isFound = True
                exit for
            end if
        next
    end function

    dim str, myarray
    str = "one two three four five six seven eight nine ten"
    myarray = Split(str)

    lPrt("Using Filter to find words in an array")
    lPrt("Array elements: " & str)
    Prt("Testing eleven: ")
    findWord myarray, "eleven"
    Prt("Testing five: ")
    findWord myarray, "five"
    %>

    http://kiddanger.com/lab/filter.asp

    --
    Roland Hall
    /* This information is distributed in the hope that it will be useful, but
    without any warranty; without even the implied warranty of merchantability
    or fitness for a particular purpose. */
    Technet Script Center - http://www.microsoft.com/technet/scriptcenter/
    WSH 5.6 Documentation - http://msdn.microsoft.com/downloads/list/webdev.asp
    MSDN Library - http://msdn.microsoft.com/library/default.asp
    Roland Hall, Feb 7, 2004
    #6
  7. Bob Barrows

    Roland Hall wrote:
    > Or you could use Filter and eliminate the For...Next loop:

    Hah! I had forgotten about that. Thanks for the heads-up.

    Bob Barrows


    Bob Barrows, Feb 8, 2004
    #7
  8. Rob Meade

    "Roland Hall" wrote ...

    > No problem. I was workin' on it couple of days ago so it was fresh in my
    > mind.


    Thanks for that Roland,

    Tell me, is using the filter method more efficient perhaps than what I
    bashed out with two pencils stuck to my head yesterday (I'm guessing so but
    figured would ask)...

    <!--INSERT VERY LARGE ARRAY UP HERE-->

    intMatch = 0

    For intLoop = 0 To UBound(aSearchCriteria)
        ' Compare this search word against every ignore word
        For intLoop2 = 0 To UBound(aIgnoreWords)
            If UCase(aSearchCriteria(intLoop)) = UCase(aIgnoreWords(intLoop2)) Then
                intMatch = 1
                strIgnoredWords = strIgnoredWords & aIgnoreWords(intLoop2) & ", "
                Exit For
            End If
        Next

        ' Keep the word only if it matched no ignore word
        If intMatch = 0 Then
            strTempSearchCriteria = strTempSearchCriteria & aSearchCriteria(intLoop) & " "
        End If

        intMatch = 0
    Next

    strSearchCriteria = Trim(strTempSearchCriteria)

    In this I'm obviously iterating through the array of ignore words for each
    word in the search criteria, stopping at the first match. I then build a
    new string from the words that are not found, which eventually gets used as
    the criteria, and I also build a string of 'ignored' words which gets
    dumped on the page to make it look really clevaaarrr :oD

    Your example has fewer lines of code, so I suspect it's far more efficient
    and probably the preferred way; mine was bashed out whilst drinking
    Stella :o)
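
    On the efficiency question, line count alone says little about speed. One
    alternative not raised in this thread is a Scripting.Dictionary, which
    looks a word up by key instead of scanning the list. The sketch below
    assumes that object is available on the server; the names
    BuildStopWordIndex and FilterSearch are hypothetical.

```vbscript
' Build a case-insensitive lookup table from an array of stop words.
Function BuildStopWordIndex(aWords)
    Dim dict, i
    Set dict = CreateObject("Scripting.Dictionary")
    dict.CompareMode = 1                ' 1 = vbTextCompare (ignore case)
    For i = 0 To UBound(aWords)
        If Len(aWords(i)) > 0 And Not dict.Exists(aWords(i)) Then
            dict.Add aWords(i), True
        End If
    Next
    Set BuildStopWordIndex = dict
End Function

' Split strSearch on spaces. Words found in dictStop go to strIgnored
' (comma separated); the rest go to strKept (space separated).
' strKept and strIgnored are output parameters (ByRef is the default).
Sub FilterSearch(strSearch, dictStop, strKept, strIgnored)
    Dim aTerms, i
    strKept = ""
    strIgnored = ""
    aTerms = Split(Trim(strSearch), " ")
    For i = 0 To UBound(aTerms)
        If Len(aTerms(i)) > 0 Then
            If dictStop.Exists(aTerms(i)) Then
                strIgnored = strIgnored & aTerms(i) & ", "
            Else
                strKept = strKept & aTerms(i) & " "
            End If
        End If
    Next
    strKept = Trim(strKept)
    ' Drop the trailing ", " left by the loop
    If Len(strIgnored) > 0 Then strIgnored = Left(strIgnored, Len(strIgnored) - 2)
End Sub

Dim dictStop, strKept, strIgnored
Set dictStop = BuildStopWordIndex(Array("a", "and", "the"))
FilterSearch "The cat and a hat", dictStop, strKept, strIgnored
' strKept is "cat hat"; strIgnored is "The, and, a"
```

    With 390 stop words and a handful of search terms, either approach is
    likely to be fast enough; the dictionary mainly saves the inner loop and
    the repeated UCase calls.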

    Regards

    Rob
    Rob Meade, Feb 8, 2004
    #8
  9. Roland Hall

    "Bob Barrows" wrote:
    : Hah! I had forgotten about that. Thanks for the heads-up.

    No problem. I was workin' on it couple of days ago so it was fresh in my
    mind.

    Roland
    Roland Hall, Feb 8, 2004
    #9
