Pattern matching with string and list

Discussion in 'Python' started by olaufr@gmail.com, Dec 12, 2005.

  1. Guest

    Hi,

    I need to perform simple pattern matching within a string using a
    list of possible patterns. For example, I want to know whether the
    substring starting at position n matches any of the strings in a
    list, as below:

    sentence = "the color is $red"
    patterns = ["blue","red","yellow"]
    pos = sentence.find($)
    # here I need to find whether what's after 'pos' matches any of the
    # strings of my 'patterns' list
    bmatch = ismatching( sentence[pos:], patterns)

    Does an equivalent of this ismatching() function exist in some Python
    lib?

    Thanks,

    Olivier.
    , Dec 12, 2005
    #1

  2. wrote:
    > Hi,
    >
    > I need to perform simple pattern matching within a string using a
    > list of possible patterns. For example, I want to know whether the
    > substring starting at position n matches any of the strings in a
    > list, as below:
    >
    > sentence = "the color is $red"
    > patterns = ["blue","red","yellow"]
    > pos = sentence.find($)
    > # here I need to find whether what's after 'pos' matches any of the
    > # strings of my 'patterns' list
    > bmatch = ismatching( sentence[pos:], patterns)
    >
    > Does an equivalent of this ismatching() function exist in some Python
    > lib?
    >
    > Thanks,
    >
    > Olivier.
    >

    As I think you define it, ismatching can be written as:

    >>> import re
    >>> def ismatching(sentence, patterns):
    ...     re_pattern = re.compile(r"(%s)\Z" % "|".join(patterns))
    ...     return bool(re_pattern.match(sentence))
    ...
    >>> ismatching(sentence[pos+1:], patterns)
    True
    >>> ismatching(sentence[pos+1:], ["green", "blue"])
    False
    >>>

    (For help with regular expressions, see: http://www.amk.ca/python/howto/regex/)


    Or, you can ask the regexp engine to start looking at a point you specify:

    >>> def ismatching(sentence, patterns, startingpos=0):
    ...     re_pattern = re.compile(r"(%s)\Z" % "|".join(patterns))
    ...     return bool(re_pattern.match(sentence, startingpos))
    ...
    >>> ismatching(sentence, patterns, pos+1)
    True
    >>>



    but, you may be able to save the separate step of determining pos, by including
    it in the regexp, e.g.,

    >>> def matching(patterns, sentence):
    ...     re_pattern = re.compile(r"\$(%s)" % "|".join(patterns))
    ...     return bool(re_pattern.search(sentence))
    ...
    >>> matching(patterns, sentence)
    True
    >>> matching(["green", "blue"], sentence)
    False
    >>>


    Then, it might be more generally useful to return the match rather than the
    boolean value. You can still use it in truth testing, since a no-match
    (None) evaluates to False:

    >>> def matching(patterns, sentence):
    ...     re_pattern = re.compile(r"\$(%s)" % "|".join(patterns))
    ...     return re_pattern.search(sentence)
    ...
    >>> if matching(patterns, sentence): print("Match")
    ...
    Match
    >>>



    Finally, if you are going to be doing a lot of these, it would be faster to take
    the pattern compilation out of the function and simply use the pre-compiled
    regexp or, as below, its bound search method:

    >>> matching = re.compile(r"\$(%s)\Z" % "|".join(patterns)).search
    >>> matching(sentence)
    <_sre.SRE_Match object at 0x01847E60>
    >>> bool(_)
    True
    >>> bool(matching("the color is $red but there is more"))
    False
    >>> bool(matching("the color is $pink"))
    False
    >>> bool(matching("the $color is $red"))
    True
    >>>


    HTH

    Michael
    Michael Spencer, Dec 13, 2005
    #2

  3. Tom Anderson Guest

    On Mon, 12 Dec 2005 wrote:

    > I need to perform simple pattern matching within a string using a list
    > of possible patterns. For example, I want to know whether the substring
    > starting at position n matches any of the strings in a list, as
    > below:
    >
    > sentence = "the color is $red"
    > patterns = ["blue","red","yellow"]
    > pos = sentence.find($)


    I assume that's a typo for "sentence.find('$')", rather than some new
    syntax i've not learned yet!

    > # here I need to find whether what's after 'pos' matches any of the
    > # strings of my 'patterns' list
    > bmatch = ismatching( sentence[pos:], patterns)
    >
    > Does an equivalent of this ismatching() function exist in some Python
    > lib?


    I don't think so, but it's not hard to write:

    def ismatching(target, patterns):
        for pattern in patterns:
            if target.startswith(pattern):
                return True
        return False

    You don't say what bmatch should be at the end of this, so i'm going with
    a boolean; it would be straightforward to return the pattern which
    matched, or the index of the pattern which matched in the pattern list, if
    that's what you want.
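
    As a minimal sketch of that variation (the name first_matching is
    hypothetical, and it assumes the caller slices past the "$" marker
    before calling), the loop can return both the index and the pattern:

```python
def first_matching(target, patterns):
    """Return (index, pattern) for the first pattern target starts with, else None."""
    for index, pattern in enumerate(patterns):
        if target.startswith(pattern):
            return index, pattern
    return None


sentence = "the color is $red"
patterns = ["blue", "red", "yellow"]
pos = sentence.find("$")

# Slice from pos + 1 so the "$" itself is not part of the target.
result = first_matching(sentence[pos + 1:], patterns)
print(result)
```

    The result is still usable in a truth test, since None is falsy and a
    non-empty tuple is truthy.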

    The tough guy way to do this would be with regular expressions (in the re
    module); you could do the find-the-$ and the match-a-pattern bit in one
    go:

    import re
    # the outer group makes \$ apply to every alternative, not just "blue"
    patternsRe = re.compile(r"\$((blue)|(red)|(yellow))")
    bmatch = patternsRe.search(sentence)

    At the end, bmatch is None if it didn't match, or an instance of re.Match
    (from which you can get details of the match) if it did.
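
    A short sketch of what "details of the match" can look like, using the
    standard match-object methods group() and start(). The variable names
    and the simplified pattern here are illustrative assumptions, not
    anything from the original post:

```python
import re

sentence = "the color is $red"
# The alternatives are grouped so that "\$" applies to all of them.
patterns_re = re.compile(r"\$(blue|red|yellow)")

bmatch = patterns_re.search(sentence)
if bmatch is not None:
    print(bmatch.group(0))  # the whole match, including the "$"
    print(bmatch.group(1))  # only the colour
    print(bmatch.start())   # index in sentence where the match begins

# search() returns None when nothing matches.
no_match = patterns_re.search("the color is $pink")
print(no_match)
```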

    If i was doing this myself, i'd be a bit cleaner and use non-capturing
    groups:

    patternsRe = re.compile(r"\$(?:(?:blue)|(?:red)|(?:yellow))")

    And if i did want to capture the colour string, i'd do it like this:

    patternsRe = re.compile(r"\$((?:blue)|(?:red)|(?:yellow))")
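
    To make the difference concrete, here is a small sketch (variable names
    are assumptions for illustration) comparing what groups() reports for a
    non-capturing and a capturing version. It also shows why the
    alternatives need to be grouped at all: "|" binds more loosely than
    everything else, so an ungrouped "\$blue|red|yellow" accepts a bare
    "red" with no "$" in front of it.

```python
import re

sentence = "the color is $red"

non_capturing = re.compile(r"\$(?:blue|red|yellow)")
capturing = re.compile(r"\$(blue|red|yellow)")

# Non-capturing groups record nothing; capturing groups record the colour.
print(non_capturing.search(sentence).groups())
print(capturing.search(sentence).groups())

# Without grouping, "\$" only belongs to the first alternative.
loose = re.compile(r"\$blue|red|yellow")
print(loose.search("the fabric is red") is not None)
print(capturing.search("the fabric is red") is not None)
```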

    If this all looks like utter gibberish, DON'T PANIC! Regular expressions
    are quite scary to begin with (and certainly not very regular-looking!),
    but they're actually quite simple, and often a very powerful tool for text
    processing (don't get carried away, though; regular expressions are a bit
    like absinthe, in that a little helps your creativity, but overindulgence
    makes you use perl).

    In fact, we can tame the regular expressions quite neatly by writing a
    function which generates them:

    def regularly_express_patterns(patterns):
        pattern_regexps = map(
            lambda pattern: "(?:%s)" % re.escape(pattern),
            patterns)
        regexp = r"\$(" + "|".join(pattern_regexps) + ")"
        return re.compile(regexp)

    patternsRe = regularly_express_patterns(patterns)
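
    The re.escape() call matters as soon as a pattern contains a regexp
    metacharacter. A sketch under assumed inputs (the helper name
    build_marker_regex and the price-like patterns are made up for
    illustration):

```python
import re


def build_marker_regex(patterns):
    """Compile a regexp matching "$" followed by any one of the literal patterns."""
    alternatives = "|".join(re.escape(p) for p in patterns)
    return re.compile(r"\$(" + alternatives + ")")


prices_re = build_marker_regex(["1.5", "2.0"])

# Escaped: the "." only matches a literal dot.
print(prices_re.search("cost: $1.5").group(1))
print(prices_re.search("cost: $125"))

# Unescaped, for comparison: "." matches any character, so "$125" slips through.
unescaped_re = re.compile(r"\$(1.5|2.0)")
print(unescaped_re.search("cost: $125") is not None)
```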

    tom

    --
    limited to concepts that are meta, generic, abstract and philosophical --
    IEEE SUO WG
    Tom Anderson, Dec 13, 2005
    #3
  4. Taking you literally, I'm not sure you need regex. If you know or can
    find position n, then can't you just:

    sentence = "the color is $red"
    patterns = ["blue","red","yellow"]
    pos = sentence.find("$")
    for x in patterns:
        if x == sentence[pos+1:]:
            print(x, pos+1)

    But maybe I'm oversimplifying.

    rpd
    BartlebyScrivener, Dec 13, 2005
    #4
  5. Even without the marker, can't you do:

    sentence = "the fabric is red"
    colors = ["red", "white", "blue"]

    for color in colors:
        # find() returns -1 when absent and 0 for a match at the very start
        if sentence.find(color) >= 0:
            print(color, sentence.find(color))
    BartlebyScrivener, Dec 13, 2005
    #5
  6. BartlebyScrivener wrote:
    > Even without the marker, can't you do:
    >
    > sentence = "the fabric is red"
    > colors = ["red", "white", "blue"]
    >
    > for color in colors:
    >     if (sentence.find(color) >= 0):
    >         print(color, sentence.find(color))
    >

    That depends on whether you're only looking for whole words:

    >>> colors = ['red', 'green', 'blue']
    >>> def findIt(sentence):
    ...     for color in colors:
    ...         if sentence.find(color) >= 0:
    ...             print(color, sentence.find(color))
    ...
    >>> findIt("This is red")
    red 8
    >>> findIt("Fredrik Lundh")
    red 1
    >>>


    It's easy to see all the cases that this approach will fail for...
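
    One possible way to restrict the search to whole words is the regexp
    word-boundary assertion \b. This is only a sketch: the names
    whole_word_re and find_whole_word are hypothetical, and \b treats
    letters, digits and underscores as word characters, which may or may
    not be the definition of "word" you need.

```python
import re

colors = ["red", "green", "blue"]

# \b matches at a word boundary, so "red" inside "Fredrik" is not a hit.
whole_word_re = re.compile(
    r"\b(?:" + "|".join(re.escape(c) for c in colors) + r")\b"
)


def find_whole_word(sentence):
    """Return (color, index) for the first whole-word colour, else None."""
    match = whole_word_re.search(sentence)
    if match is None:
        return None
    return match.group(0), match.start()


print(find_whole_word("This is red"))
print(find_whole_word("Fredrik Lundh"))
```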
    Brett g Porter, Dec 13, 2005
    #6