How do I find possible matches using regular expressions?

Andy

Hi there,

I'm trying to do some prediction on user input. Here's my
question:

For the pattern r'match me', the string 'no' will definitely fail to match,
but 'ma' still has a chance if the user keeps typing characters after
'ma'. So how do I mark 'ma' as a possible match?

Thanks a lot,

Andy
 
Paul McGuire

Andy said:
Hi there,

I'm trying to do some prediction on user input. Here's my
question:

For the pattern r'match me', the string 'no' will definitely fail to match,
but 'ma' still has a chance if the user keeps typing characters after
'ma'. So how do I mark 'ma' as a possible match?

Thanks a lot,

Andy

Maybe .startswith might be more useful than re.match, since you are
predicting user input based on characters that have been typed so far.

-- Paul
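
A minimal sketch of the startswith idea for a single literal target. The function name classify and the three status strings are made up for illustration; they are not from any library.

```python
def classify(target, typed):
    """Return 'match', 'possible', or 'fail' for the text typed so far."""
    if typed == target:
        return "match"
    if target.startswith(typed):
        # The user can still reach the target by typing more characters.
        return "possible"
    return "fail"


print(classify("match me", "ma"))        # possible
print(classify("match me", "no"))        # fail
print(classify("match me", "match me"))  # match
```

Note that the empty string counts as "possible" here, since every target starts with it.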
 
Andy

The problem is that the input will be much more complex than the
example. It could be something like "30 minutes later", where any string
starting with a number is a possible match.


 
Fredrik Lundh

Andy said:
The problem is that the input will be much more complex than the
example. It could be something like "30 minutes later", where any string
starting with a number is a possible match.

so if I type "1", are you going to suggest all possible numbers
that start with that digit? doesn't strike me as very practical.

maybe you could post a more detailed example, where you clearly
explain what a pattern is and how it is defined, what prediction
means, and what you want to happen as new input arrives, so we
don't have to guess?

</F>
 
John Machin

Andy said:
Hi there,

I'm trying to do some prediction on user input. Here's my
question:

For the pattern r'match me', the string 'no' will definitely fail to match,
but 'ma' still has a chance if the user keeps typing characters after
'ma'. So how do I mark 'ma' as a possible match?

The answer is: Using regular expressions doesn't seem like a good idea.
If you want to match against only one target, then
target.startswith(user_input) is, as already mentioned, just fine.
However if you have multiple targets, like a list of computer-language
keywords, or the similar problem of an IME for a language like Chinese,
then you can set up a prefix-tree dictionary so that you can search the
multiple target keywords in parallel. All you need to do is keep a
"finger" pointed at the node you have reached along the path; after
each input character, either the finger gets pointed at the next
relevant node (if the input character is valid) or you return/raise a
failure indication.

HTH,
John
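
The prefix tree with a "finger" described above could be sketched roughly as follows. This is only one way to do it: the names build_tree, Finger, feed and completions are invented for this example, and the tree is a plain nested dict rather than a dedicated node class.

```python
END = None  # sentinel key marking the end of a complete target


def build_tree(targets):
    """Build a nested-dict prefix tree from an iterable of strings."""
    root = {}
    for target in targets:
        node = root
        for ch in target:
            node = node.setdefault(ch, {})
        node[END] = True
    return root


class Finger:
    """Points at the node reached so far while the user types."""

    def __init__(self, root):
        self.node = root

    def feed(self, ch):
        """Advance by one character; KeyError is the failure indication."""
        self.node = self.node[ch]

    def is_complete(self):
        """True if the characters fed so far spell a whole target."""
        return END in self.node

    def completions(self):
        """Return every suffix that would finish some target from here."""
        results = []
        stack = [(self.node, "")]
        while stack:
            node, suffix = stack.pop()
            for key, child in node.items():
                if key is END:
                    results.append(suffix)
                else:
                    stack.append((child, suffix + key))
        return sorted(results)


finger = Finger(build_tree(["for", "from", "finally", "if"]))
finger.feed("f")
print(finger.completions())  # ['inally', 'or', 'rom']
```

Because the finger only ever moves one node per character, the cost per keystroke does not depend on how many targets are in the tree. If feed raises, the finger stays where it was, so the caller can reject the character and carry on.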
 
Peter Otten

Andy said:
I'm trying to do some prediction on user input. Here's my
question:

For the pattern r'match me', the string 'no' will definitely fail to match,
but 'ma' still has a chance if the user keeps typing characters after
'ma'. So how do I mark 'ma' as a possible match?

The following may or may not work in the real world:

    import re
    import tkinter as tk  # this module is called Tkinter in Python 2

    def parts(regex, flags=0):
        # Compile every prefix of the pattern that is a valid regex by
        # itself, anchored with "$" so the whole input has to match it.
        candidates = []
        for stop in range(1, len(regex) + 1):
            partial = regex[:stop]
            try:
                r = re.compile(partial + "$", flags)
            except re.error:
                pass
            else:
                candidates.append(r)
        return candidates

    if __name__ == "__main__":
        candidates = parts(r"[a-z]+\s*=\s*\d+", re.IGNORECASE)

        def check(*args):
            # Green while the text matches some prefix pattern, else red.
            s = var.get()
            for c in candidates:
                if c.match(s):
                    entry.configure(foreground="#008000")
                    break
            else:
                entry.configure(foreground="red")

        root = tk.Tk()
        var = tk.StringVar()
        var.trace_add("write", check)  # older code: trace_variable("w", check)
        entry = tk.Entry(textvariable=var)
        entry.pack()
        root.mainloop()

The example lets you write an assignment of a numerical value, e.g.

meaning = 42

and colours the text in green or red for legal/illegal entries.

Peter
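
For trying the same prefix-compiling idea without a GUI, here is a rough sketch that reports three states instead of two colours. The names prefix_patterns and status are invented for this example. It uses fullmatch instead of appending "$", which is a small deviation from the code above. It also shares that code's main limitation: a prefix that cuts through a group, such as "(abc" taken from "(abc|def)", does not compile, so input typed part-way into a group is reported as invalid.

```python
import re


def prefix_patterns(regex, flags=0):
    """Compile every prefix of `regex` that is a valid pattern by itself."""
    compiled = []
    for stop in range(1, len(regex) + 1):
        try:
            compiled.append(re.compile(regex[:stop], flags))
        except re.error:
            pass  # e.g. an unclosed "[" or a trailing backslash
    return compiled


def status(regex, text, flags=0):
    """Return 'complete', 'partial', or 'invalid' for the text so far."""
    if re.fullmatch(regex, text, flags):
        return "complete"
    # Empty input is always treated as a possible start.
    if text == "" or any(
        p.fullmatch(text) for p in prefix_patterns(regex, flags)
    ):
        return "partial"
    return "invalid"


pattern = r"[a-z]+\s*=\s*\d+"
for typed in ("meaning", "meaning =", "meaning = 42", "42"):
    print(repr(typed), status(pattern, typed, re.IGNORECASE))
```

As far as I know, the third-party regex module on PyPI offers partial matching directly (a partial=True argument on its match functions), which may be a sturdier route than slicing the pattern string; check its documentation before relying on that.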
 
Andy

OK, here's what I want...

I'm building an auto-tasking tool in which the user can specify the
execution rule by typing English instead of using a complex GUI
(normally a combination of many controls). This makes for much better
user interaction.

So the problem comes down to "understanding" user input and
"suggesting" possible inputs while the user is writing a rule.

Rules will be like "30 minutes later", "Next Monday", "Every 3 hours",
"3pm"... Sure, this is an infinite collection, but it doesn't have to be
perfect; it may make mistakes given inputs like "10 minutes after
Every Monday noon".

The "suggesting" feature is even harder; I'm still investigating
possibilities.

I tried NLTK_Lite. I'm sure it can understand well-formed user input,
but it does not cope well with bad inputs ("2 hours later here"):
bad input makes the toolkit fail to parse. NLTK also does not
help with the suggesting part.

Now I'm thinking of using regular expressions. I think it's
possible to come up with a collection of REs that understand basic
execution rules. The question is, again, how to do suggestions with
the RE collection.

Any thoughts on this subject?

I'm not a native English speaker, so please be tolerant of mistakes in
my post here :)
 
Andy

That seems good to me, I'll try it out, thanks for the posting.
 
Andy

This works well as a checking strategy, but what I want is a suggestion
list...

Maybe what I want is not practical at all?

Thanks anyway Peter.

Andy Wu
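
One possible way to get a suggestion list for rules like "30 minutes later" is to describe each rule as a sequence of word tokens rather than as one big regular expression, and to let only the last typed word be incomplete. The sketch below assumes that simplification; the rule table, the hint strings and the names NUMBER, token_ok and suggest are all hypothetical.

```python
import re

NUMBER = re.compile(r"\d+")  # a "slot" token: any run of digits

# Hypothetical rule table: a display hint plus the expected tokens.
RULES = [
    ("<N> minutes later", [NUMBER, "minutes", "later"]),
    ("every <N> hours", ["every", NUMBER, "hours"]),
    ("next monday", ["next", "monday"]),
]


def token_ok(expected, typed, partial):
    """Check one typed word against one expected token."""
    if isinstance(expected, str):
        # A literal word may still be half-typed if it is the last one.
        return expected.startswith(typed) if partial else expected == typed
    return expected.fullmatch(typed) is not None


def suggest(text):
    """Return the hints of all rules the typed text could still become."""
    words = text.lower().split()
    # The last word counts as unfinished unless a space follows it.
    last_unfinished = not text.endswith(" ")
    hints = []
    for hint, tokens in RULES:
        if len(words) > len(tokens):
            continue  # more words typed than the rule allows
        if all(
            token_ok(tokens[i], word, last_unfinished and i == len(words) - 1)
            for i, word in enumerate(words)
        ):
            hints.append(hint)
    return hints


print(suggest("30 min"))              # ['<N> minutes later']
print(suggest("2 hours later here"))  # []
```

With this layout a bad input such as "2 hours later here" simply produces no suggestions instead of a parse failure, and adding a rule means adding one entry to the table.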


 
