Keyword Parsing with ASP

Discussion in 'ASP General' started by ARK, Aug 29, 2003.

  1. ARK

    ARK Guest

    I am writing a search program in ASP (VBScript). The user can enter keywords
    and press submit.
    The user can separate the keywords with spaces and/or commas, and keywords may
    be plain words, single-quoted strings (phrases), or double-quoted strings
    (phrases).
    For example:

    Keywords:

    Jack, Jill, Jim, "Timothy Brown", Mary
    or
    Jack Jill Jim 'Timothy Brown' Mary
    or
    Jack, Jill Jim, 'Timothy Brown' "Mary"

    When I parse the input I store the keywords in an array. The result must be:

    Jack
    Jill
    Jim
    Timothy Brown
    Mary

    I have tried doing this using Split but am unable to extract the phrases. Any
    suggestions, code examples or links would help.

    Thanks in advance

    ARK.
     
    ARK, Aug 29, 2003
    #1

  2. You might want to replace the spaces the user puts in with commas and
    then use the split command.

    strVariable = Replace(strVariable, " ", ",")
    arrWords = Split(strVariable, ",")  ' keep the returned array (arrWords is an illustrative name)

    Then you should have your array of items.

    hth,
    Andrew

    * * * Sent via DevBuilder http://www.devbuilder.org * * *
    Developer Resources for High End Developers.
     
    Andrew J Durstewitz, Aug 29, 2003
    #2

  3. ARK

    TomB Guest

    Unfortunately, that would put a comma in "Timothy Brown" as well.
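
    To see the problem concretely, here is a minimal sketch of the
    replace-then-split idea. It is written in JavaScript rather than VBScript,
    and naiveSplit is a hypothetical name; the filter step that drops empty
    entries is an addition, not part of the original suggestion.

```javascript
// Naive approach: turn every space into a comma, then split on commas.
// naiveSplit is a hypothetical helper name used only for this illustration.
function naiveSplit(input) {
  return input
    .replace(/ /g, ",")   // every space becomes a comma, even inside quotes
    .split(",")
    .filter(function (word) { return word.length > 0; }); // drop empty entries
}

console.log(naiveSplit('Jack, Jill Jim, "Timothy Brown"'));
// The quoted phrase comes back as two fragments: '"Timothy' and 'Brown"'
```

    The phrase is cut at its inner space, and each half keeps one stray quote
    mark.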

    My suggestion would be to work your way through the string a character at a
    time. If the character is a space, and not within quotes (" or '), then add
    a comma; otherwise move along.
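
    A rough sketch of that character-at-a-time idea, in JavaScript rather than
    VBScript. Instead of inserting commas it collects the keywords directly.
    The name parseKeywords and the handling of an unterminated quote are
    assumptions made for the example, not something specified in this thread.

```javascript
// Hypothetical helper: scan the input one character at a time and collect
// keywords, treating text between matching quote characters as one phrase.
function parseKeywords(input) {
  var keywords = [];
  var current = "";
  var quote = null; // the quote character we are currently inside, or null

  // Store the word or phrase collected so far, if it is not empty.
  function flush() {
    var word = current.trim();
    if (word.length > 0) keywords.push(word);
    current = "";
  }

  for (var i = 0; i < input.length; i++) {
    var ch = input.charAt(i);
    if (quote !== null) {
      if (ch === quote) {        // the matching quote closes the phrase
        flush();
        quote = null;
      } else {
        current += ch;           // spaces and commas are literal inside quotes
      }
    } else if (ch === '"' || ch === "'") {
      flush();                   // a quote also ends any word in progress
      quote = ch;
    } else if (ch === " " || ch === ",") {
      flush();                   // separators end the current word
    } else {
      current += ch;
    }
  }
  flush(); // assumption: an unterminated quote simply runs to the end of input
  return keywords;
}

console.log(parseKeywords("Jack, Jill Jim, 'Timothy Brown' \"Mary\""));
// [ 'Jack', 'Jill', 'Jim', 'Timothy Brown', 'Mary' ]
```

    One limitation of this sketch: an apostrophe in the middle of an unquoted
    word (O'Malley) is treated as the start of a phrase.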
     
    TomB, Aug 29, 2003
    #3
  4. > My suggestion would be to work your way through the string a character at
    > a time. If the character is a space, and not within quotes (" or ') then
    > add a comma, otherwise move along


    This can get infinitely complex, e.g.

    Bob, Mary, "Timothy, Brown" 'franke, "tom, hula hoop" mea, culpa"

    You never know what a user is going to enter, and it's hard to write code to
    understand exactly what they mean.

    I'd be really interested to see how Google's parsing algorithm works. I
    wasn't brave enough to do that: www.aspfaq.com supports all words, any
    words, or exact phrase... but no combination of the three.
     
    Aaron Bertrand - MVP, Aug 29, 2003
    #4
  5. I agree, keep it as simple as possible. Supporting delimiter characters
    such as " will require that you analyze the string one character at a
    time.

    Andrew

     
    Andrew Durstewitz, Aug 29, 2003
    #5
  6. ARK

    Ray at Guest

    Google is a non-stop source of awe. This is why I buy Google t-shirts. The
    calculator can also do some math in addition to unit conversion (okay,
    that's also math, but fine), e.g.
    5 percent of 343

    Ray at work

    "TomB" <> wrote in message
    news:...
    > Yes you are right, that would be complex.
    > Speaking of Google's parsing have you seen the calculator? Try
    > "searching" for
    >
    > 100 kilometers in miles
    >
    > Very cool.
     
    Ray at, Aug 29, 2003
    #6
  7. ARK

    Jon Mundsack Guest

    SWEET! Thanks for the tip.

    "TomB" <> wrote in message
    news:...
    > Yes you are right, that would be complex.
    > Speaking of Google's parsing have you seen the calculator? Try
    > "searching" for
    >
    > 100 kilometers in miles
    >
    > Very cool.
     
    Jon Mundsack, Aug 29, 2003
    #7
  8. ARK

    TomB Guest

    Yeah, I think that's why they call it a calculator ;)
    I thought the fact that it was able to determine that I wanted a calculation
    rather than a search for the words was the cool part.

    "Ray at <%=sLocation%>" <myfirstname at lane34 dot com> wrote in message
    news:...
    > Google is a non-stop source of awe. This is why I buy Google t-shirts. The
    > calculator can also do some math too in addition to unit conversion (okay,
    > that's also math, but fine), i.e.
    > 5 percent of 343
    >
    > Ray at work
     
    TomB, Aug 29, 2003
    #8
  9. It even works with slightly more complex phrases, like 100 degrees
    fahrenheit in celsius

    "TomB" <> wrote in message
    news:...
    > Yeah, I think that's why they call it a calculator ;)
    > I thought the fact that it was able to determine that I wanted a
    > calculation rather than a search for the words was the cool part.
     
    Aaron Bertrand - MVP, Aug 29, 2003
    #9
  10. ARK

    Bob Barrows Guest

    OK, I've come up with the following function that returns an array
    containing the keywords. However, in order for this to work, you need to set
    some ground rules:
    1. Don't mix delimiters for a phrase. This will work correctly:
    Jack, Jill Jim, "Timothy Brown", 'Mary'
    but this will not:
    Jack, Jill Jim, "Timothy Brown', 'Mary'

    2. If literal delimiter characters are used, then they must not match the
    delimiters used. For example, this will work:
    "O'Malley"
    but this will not:
    'O'Malley'
    Also, if literal delimiter characters are used, all delimiters in the entire
    list must be the same. This will work:
    Jim, "Tom Brown", "Pat O'Malley"
    This won't:
    Jim, 'Tom Brown', "Pat O'Malley"
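
    The "paired delimiters" part of these rules can be checked up front by
    counting each quote character: an odd count means at least one phrase is
    unbalanced or mixes delimiters. A minimal sketch in JavaScript (not
    VBScript), using the same length-difference counting trick as the function
    in this post; countChar and isPaired are hypothetical names.

```javascript
// Count occurrences of one character by comparing the string length
// before and after removing that character.
function countChar(text, ch) {
  return text.length - text.split(ch).join("").length;
}

// A delimiter is "paired" when it occurs an even number of times.
function isPaired(text, ch) {
  return countChar(text, ch) % 2 === 0;
}

var good = "Jack, Jill Jim, \"Timothy Brown\", 'Mary'";
var mixed = "Jack, Jill Jim, \"Timothy Brown', 'Mary'";

console.log(isPaired(good, '"'), isPaired(good, "'"));   // true true
console.log(isPaired(mixed, '"'), isPaired(mixed, "'")); // false false
```

    An even count is necessary but not sufficient: it cannot tell a literal
    apostrophe from a delimiter, which is why the second rule is still needed.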

    Anyways, the function appears below my signature. You can use this code to
    test it:
    Dim iCount, arResult, sWords
    sWords="Jack, Jill Jim, ""Timothy Brown"", 'Mary'"
    Response.Write sWords & "<BR>"
    arResult= ParseKeywords(sWords)
    if IsArray(arResult) then
        for iCount = 0 to UBound(arResult)
            Response.Write arResult(iCount) & "<BR>"
        next
    end if

    HTH,
    Bob Barrows

    Function ParseKeywords(pKeywords)
        Dim sKeywords,iQuotes, arQuoted(), i, j, k, sTmp, bQfound, bSQFound
        dim iCommas, arCommas, arSpaces, bArrayDefined, arKeywords()
        bArrayDefined = false
        sKeywords = pKeywords
        'first see if sKeywords contains quoted sections - if so, make
        'sure they are paired, ie, there is an even number of quotes
        iQuotes = len(sKeywords) - len(Replace(sKeywords,"""",""))
        bQfound = false
        if iQuotes > 0 then
            if iQuotes mod 2 = 0 then
                bQfound = true
                redim arQuoted(iQuotes/2 - 1)
                i=instr(sKeywords,"""")
                k = 0
                'pull each quoted section out of sKeywords and store it
                Do Until i = 0
                    j = instr(i+1,sKeywords,"""")
                    sTmp = mid(sKeywords,i,j+1-i)
                    arQuoted(k) = sTmp
                    k=k+1
                    sKeywords = replace(sKeywords,sTmp,"")
                    i=instr(sKeywords,"""")
                Loop
                for i = 0 to ubound(arQuoted)
                    arQuoted(i) = replace(arQuoted(i),"""","")
                next
            end if
        end if

        'now find single-quoted sections
        iQuotes = len(sKeywords) - len(Replace(sKeywords,"'",""))
        bSQFound = false
        if iQuotes > 0 then
            if iQuotes mod 2 = 0 then
                bSQFound = true
                if bQfound = false then
                    redim arQuoted(iQuotes/2 - 1)
                    k = 0
                else
                    k = ubound(arQuoted) + 1
                    Redim preserve arQuoted(UBound(arQuoted) + iQuotes/2)
                end if
                i=instr(sKeywords,"'")
                Do Until i = 0
                    j = instr(i+1,sKeywords,"'")
                    sTmp = mid(sKeywords,i,j+1-i)
                    arQuoted(k) = sTmp
                    k=k+1
                    sKeywords = replace(sKeywords,sTmp,"")
                    i=instr(sKeywords,"'")
                Loop
                for i = 0 to ubound(arQuoted)
                    arQuoted(i) = replace(arQuoted(i),"'","")
                next
            end if
        end if
        'strip trailing commas left behind by the removed sections
        sKeywords = RTrim(sKeywords)
        do until right(sKeywords,1) <> ","
            sKeywords = rtrim(left(sKeywords,len(sKeywords)-1))
        loop

        'add quoted sections to result array
        if bQfound or bSQFound then
            redim arKeywords(UBound(arQuoted))
            for i = 0 to ubound(arQuoted)
                arKeywords(i) = arQuoted(i)
            next
            bArrayDefined = true
        end if

        'now process commas and spaces

        iCommas = len(sKeywords) - len(Replace(sKeywords,",",""))
        arCommas=split(sKeywords,",")
        for i = 0 to ubound(arCommas)
            arCommas(i) = RTrim(LTrim(arCommas(i)))
            if len(arCommas(i)) > 0 then
                if instr(arCommas(i)," ") = 0 then
                    if bArrayDefined then
                        redim preserve arKeywords(UBound(arKeywords) + 1)
                    else
                        redim arKeywords(0)
                        'mark the array as defined so later keywords are appended
                        'instead of wiping it with another plain redim
                        bArrayDefined = true
                    end if
                    arKeywords(ubound(arKeywords)) = arCommas(i)
                else
                    arSpaces = split(arCommas(i)," ")
                    for j = 0 to ubound(arSpaces)
                        arSpaces(j) = RTrim(LTrim(arSpaces(j)))
                        if len(arSpaces(j)) > 0 then
                            if bArrayDefined then
                                redim preserve arKeywords(UBound(arKeywords) + 1)
                            else
                                redim arKeywords(0)
                                bArrayDefined = true
                            end if
                            arKeywords(ubound(arKeywords)) = arSpaces(j)
                        end if
                    next
                end if
            end if
        next
        ParseKeywords=arKeywords
    end function


     
    Bob Barrows, Aug 29, 2003
    #10
  11. ARK

    Bob Barrows Guest

    Chris Hohmann wrote:
    > Here's a regular expression alternative:
    > <%
    > Dim s,oRE,oMatches,oMatch
    > s = "Jack, Jill, Jim, 'Timothy Brown', Mary"
    > Set oRE = New RegExp
    > oRE.Global=True
    > oRE.Pattern = "\w+|('|"")([^\1]|\1{2})+\1"
    > Set oMatches = oRE.Execute(s)
    > For Each oMatch In oMatches
    > Response.Write oMatch.Value & "<br>"
    > Next
    > %>
    >

    Showoff! ;-)

    Actually, I have to dive into this regexp stuff. I've been meaning to but I
    just haven't had the time.

    If you have a few min. could you break down that pattern you used and
    explain each element?

    I'm assuming the same ground rules I laid out still apply to your solution
    here ... ?

    Bob Barrows
     
    Bob Barrows, Aug 29, 2003
    #11
  12. "Bob Barrows" <> wrote in message
    news:...
    > If you have a few min. could you break down that pattern you used and
    > explain each element?
    >
    > I'm assuming the same ground rules I laid out still apply to your
    > solution here ... ?
    >
    > Bob Barrows


    Sure...

    \w+ = a series of one (1) or more word characters, i.e.
    [a-zA-Z0-9_]

    | = OR

    ('|") = a quote (") OR an apostrophe ('); let's call this submatch
    QUALIFIER

    ([^\1]|\1{2})+ = one (1) or more characters that are either not the
    QUALIFIER OR a double occurrence of the QUALIFIER (escaping quotes)

    \1 = a closing instance of the QUALIFIER
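
    One caveat worth testing before relying on this breakdown. In
    ECMAScript-style regex engines a backreference is not available inside a
    character class: [^\1] is read as "any character except character code 1",
    not "any character except the QUALIFIER". The sketch below shows the effect
    in JavaScript. That VBScript's RegExp object behaves the same way is an
    assumption here, so it should be checked on the target server.

```javascript
// The pattern from the post, written as a JavaScript regex literal.
var original = /\w+|('|")([^\1]|\1{2})+\1/g;

// With a single quoted phrase the result looks right (quotes are included).
var one = "Jack, Jill, Jim, 'Timothy Brown', Mary";
console.log(one.match(original));
// [ 'Jack', 'Jill', 'Jim', "'Timothy Brown'", 'Mary' ]

// With two quoted phrases the greedy class runs on to the LAST apostrophe,
// so both phrases and the comma between them come back as one match.
var two = "Jack, 'Timothy Brown', 'Mary Smith'";
console.log(two.match(original));
// [ 'Jack', "'Timothy Brown', 'Mary Smith'" ]
```

    Note also that each match value still carries its surrounding quote marks,
    so they need to be stripped before the keywords are used.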

    A perennial favorite for those interested in regular expressions is
    O'Reilly's "Mastering Regular Expressions" (ISBN:0596002890)

    "HTH".replace(/HTH/g,"Hope that helps,");
    -Chris
     
    Chris Hohmann, Aug 29, 2003
    #12
  13. "Bob Barrows" <> wrote in message
    news:...
    > I'm assuming the same ground rules I laid out still apply to your
    > solution here ... ?


    Sorry, I forgot to answer this in my previous post. Your first rule
    about balanced (matched) text qualifiers applies to my solution as well.
    However, your second rule does not apply. The value list can contain a
    mixture of quote-qualified phrases and apostrophe-qualified phrases.
    Also, a qualifier can be embedded in a phrase by doubling it up
    (escaping). Finally, regular expressions are greedy by default
    (although you can override this behavior), so the expression will match
    as much of the string as possible. Having said all that, the following
    should be a valid value list:

    Bob, Barrows, "Bob 'The Man' Barrows", 'Bob "The Man" Barrows', "Bob
    ""The Man"" Barrows", 'Bob ''The Man'' Barrows'

    HTH
    -Chris
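
    The behavior described in this post (mixed qualifiers, doubled qualifiers
    as an escape) can be sketched with one alternative per qualifier, so that
    each phrase stops at its own closing mark. This is a JavaScript
    illustration, not the pattern posted in this thread; TOKEN and tokenize are
    hypothetical names.

```javascript
// Group 1: body of a "double-quoted" phrase ("" inside stands for one ").
// Group 2: body of an 'apostrophe-quoted' phrase ('' inside stands for one ').
// Group 3: a bare word.
var TOKEN = /"((?:[^"]|"")*)"|'((?:[^']|'')*)'|(\w+)/g;

function tokenize(input) {
  var result = [];
  var m;
  TOKEN.lastIndex = 0; // reset, because the regex object is shared and global
  while ((m = TOKEN.exec(input)) !== null) {
    if (m[1] !== undefined) {
      result.push(m[1].replace(/""/g, '"'));   // unescape doubled quotes
    } else if (m[2] !== undefined) {
      result.push(m[2].replace(/''/g, "'"));   // unescape doubled apostrophes
    } else {
      result.push(m[3]);
    }
  }
  return result;
}

var valueList = 'Bob, Barrows, "Bob \'The Man\' Barrows", ' +
  '\'Bob "The Man" Barrows\', "Bob ""The Man"" Barrows", ' +
  '\'Bob \'\'The Man\'\' Barrows\'';
console.log(tokenize(valueList));
```

    For the value list above this yields six entries: Bob, Barrows, and the
    four phrases with their outer qualifiers removed and doubled qualifiers
    collapsed to single ones.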
     
    Chris Hohmann, Aug 29, 2003
    #13
  14. ARK

    ARK Guest

    Hi everyone,

    Thanks for the replies. I will try out the code and post my findings. From
    which version onwards does VBScript support regular expressions?

    Thanks again!
    ARK.
     
    ARK, Aug 30, 2003
    #14
  15. ARK

    ARK Guest

    I tried the function that uses RegExp, but the following does not work
    on my server (Windows 2000 Professional / IIS 5.0):

    Set re = new RegExp

    This object is supposed to be supported by VBScript 5.0, which comes with
    Windows 2000 / IE 5.0 and later. Why does it not work on my server?
     
    ARK, Sep 1, 2003
    #15
  16. ARK

    Bob Barrows Guest

    ARK wrote:
    > I tried the Function that uses RegExp but the following does not work
    > on my server (windows 2000 Prof./IIS 5.0) -
    >
    > Set re = new RegExp
    >
    > This Object is supposed to be supported by VBScript 5.0 which comes in
    > Windows 2000 / IE 5.0 upwards, how come it does not work on my server?


    It's very difficult to troubleshoot when all we are told is that something
    "does not work." If a user called you and said one of your programs did not
    work, what would be your first response?

    Bob Barrows
     
    Bob Barrows, Sep 1, 2003
    #16
  17. ARK

    Bob Barrows Guest

    William Tasso wrote:
    > Bob Barrows wrote:
    >> ARK wrote:
    >>> I tried the Function that uses RegExp but the following does not
    >>> work on my server (windows 2000 Prof./IIS 5.0) -
    >>>
    >>> Set re = new RegExp
    >>>
    >>> This Object is supposed to be supported by VBScript 5.0 which comes
    >>> in Windows 2000 / IE 5.0 upwards, how come it does not work on my
    >>> server?

    >>
    >> It's very difficult to troubleshoot when all we are told is that
    >> something "does not work." If a user called you and said one of your
    >> programs did not work, what would be your first response?

    >
    > Can I have your card # and expiry date ???


    LOL
    OK - I meant "after that"
     
    Bob Barrows, Sep 1, 2003
    #17
  18. "Bob Barrows" <> wrote in message
    news:...
    > So the users will have to be trained to escape their quotes, eh? I don't
    > know ... it's hard enough to train some of the programmers to do this
    > .... ;-)
    >
    > Bob


    Only if they want to embed quotes in the phrase they're looking for.
    Most users (and programmers) can remain ignorantly blissful about the
    concept. Perhaps you should teach your users about "stored procedure as
    method", then they wouldn't have to worry about quotes/apostrophes in
    their parameters. :)
     
    Chris Hohmann, Sep 1, 2003
    #18
  19. ARK

    ARK Guest

    Well, the error shown is:

    Technical Information (for support personnel)

    a. Error Type:
       (0x8002801D)
       Library not registered.
       /regexp.asp, line 5
    b. Browser Type:
       Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)
    c. Page:
       GET /regexp.asp

    and line 5 happens to have the following:

    Set re = new RegExp

    I guess the DLL is there somewhere but did not get registered during the
    Windows 2000 install?
     
    ARK, Sep 3, 2003
    #19