Ask for help about Regexp

Eric Luo

Hi,
I need to match my input against a Regexp incrementally. For example,
I want to match against this regexp: mask = /\d{10}tt\d{0,5}/
I then supply my input byte by byte, and I want to be notified the first
time it becomes impossible for my input to match the mask.
So if I input 34789d, I should be notified as soon as I enter the character 'd'.

I wonder if it's possible or not. If it is, how could I do that?
 
Nikolai Weibull

Eric said:
I need to match my input against a Regexp incrementally. For example,
I want to match against this regexp: mask = /\d{10}tt\d{0,5}/
I then supply my input byte by byte, and I want to be notified the first
time it becomes impossible for my input to match the mask.
So if I input 34789d, I should be notified as soon as I enter the character 'd'.

I wonder if it's possible or not. If it is, how could I do that?

Well, you can’t easily do that. You could write a regex that matches
valid prefixes of the strings of the language of the final regex, but
that can become quite complicated. You should probably rethink your
algorithm,
nikolai
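
A minimal sketch of the prefix-regex idea for this particular mask, assuming the mask is meant to cover the whole input (hence the added \A and \z anchors). PREFIX_MASK and first_bad_index are hypothetical names, and the prefix pattern was derived by hand for this one mask; it is not a general construction.

```ruby
# Full mask from the question, anchored so the entire input must match.
FULL_MASK = /\A\d{10}tt\d{0,5}\z/

# Hand-derived prefix pattern: every string that can still be extended
# into something FULL_MASK accepts (assumption: derived for this mask only).
PREFIX_MASK = /\A(?:\d{0,10}|\d{10}t|\d{10}tt\d{0,5})\z/

# Feeds the input one character at a time and returns the index of the
# first character that makes a match impossible, or nil if none does.
def first_bad_index(input)
  typed = String.new
  input.each_char.with_index do |ch, i|
    typed << ch
    return i unless PREFIX_MASK.match?(typed)
  end
  nil
end

first_bad_index("34789d")          # => 5, the 'd'
first_bad_index("1234567890tt123") # => nil, still a valid prefix
```

Even for this small mask the prefix pattern needs one alternative per "stage" of the original, which is the kind of growth that makes the approach complicated for larger expressions.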
 
Eric Luo

I want a GUI interface. As I type my input one character at a time, I should
get an error message if what I have entered so far is not compatible with the
expected mask.
I tried to google a solution for this, but found nothing.
Could you please give me a solution or a clue?
Thanks

On 6/28/05, Nikolai Weibull <[email protected]> wrote:

Eric Luo wrote:

I need to match my input against a Regexp incrementally. For example,
I want to match against this regexp: mask = /\d{10}tt\d{0,5}/
I then supply my input byte by byte, and I want to be notified the first
time it becomes impossible for my input to match the mask.
So if I input 34789d, I should be notified as soon as I enter the character 'd'.

I wonder if it's possible or not. If it is, how could I do that?

Well, you can't easily do that. You could write a regex that matches
valid prefixes of the strings of the language of the final regex, but
that can become quite complicated. You should probably rethink your
algorithm,
nikolai

--
Nikolai Weibull: now available free of charge at http://bitwi.se/!
Born in Chicago, IL USA; currently residing in Gothenburg, Sweden.
main(){printf(&linux["\021%six\012\0"],(linux)["have"]+"fun"-97);}

 
Daniel Brockman

Nikolai Weibull said:
Well, you can’t easily do that. You could write a regex that matches
valid prefixes of the strings of the language of the final regex, but
that can become quite complicated. You should probably rethink your
algorithm,

Nikolai is right: you can't easily do that. But I think it would be
easy to modify the regular expression engine to make it possible.
Unless I am mistaken, the only information you need is whether or not
the engine ever wanted to look past the end of the input string.

If the engine ever managed to consume all characters and still be
hungry for more, then your string is a valid prefix. Conversely, if
the engine did not do this, then your string is not a valid prefix.
Note that merely consuming all characters is not sufficient; the match
has to fail due to lack of additional input.
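
Ruby's Regexp does not expose this information, so the idea can only be illustrated with a toy matcher. The following is a rough sketch, not a patch to regex.c: a tiny hand-rolled backtracking matcher for the mask from the original question that records whether it ever tried to read past the end of the input. The names MASK, classify, and try_match are hypothetical, and the token format ([test, min, max]) is an assumption made for this example.

```ruby
# Each token is [character_test, min_repeats, max_repeats].
# Hand-written stand-in for /\A\d{10}tt\d{0,5}\z/ (assumed anchoring).
DIGIT    = ->(c) { c >= "0" && c <= "9" }
LETTER_T = ->(c) { c == "t" }
MASK = [[DIGIT, 10, 10], [LETTER_T, 2, 2], [DIGIT, 0, 5]].freeze

# Returns :match, :partial (valid prefix that needs more input), or :dead.
def classify(input, tokens = MASK)
  state = { hit_end: false }
  return :match if try_match(tokens, 0, 0, input, 0, state)
  state[:hit_end] ? :partial : :dead
end

# Backtracking matcher; count is how often tokens[ti] has matched so far.
def try_match(tokens, ti, count, input, pos, state)
  return pos == input.length if ti == tokens.length
  test, min, max = tokens[ti]
  # Option 1: try to consume one more character for the current token.
  if count < max
    if pos == input.length
      state[:hit_end] = true # the matcher wanted input that is not there yet
    elsif test.call(input[pos])
      return true if try_match(tokens, ti, count + 1, input, pos + 1, state)
    end
  end
  # Option 2: move on to the next token once the minimum is satisfied.
  count >= min && try_match(tokens, ti + 1, 0, input, pos, state)
end

classify("34789")           # => :partial
classify("34789d")          # => :dead
classify("1234567890tt123") # => :match
```

The :partial result corresponds to "consumed all characters and still hungry for more"; :dead means the failure happened on a character that was actually present, so no further input can help.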

I just posted about this on Perl Monks,

<http://perlmonks.org/index.pl?node_id=471541>

so consider following up there if you have anything to add.

I don't feel like hacking this into regex.c right now, and I'm not
sure what the API should be like.

But I do think it would be a useful feature. The fact that no one
seems to have wanted it before baffles me. Perhaps up until now,
no one even considered the possibility.

--
Daniel Brockman <[email protected]>
 
Eric Luo

Yes, that's really what I want.
It seems a little difficult to solve the problem properly.
I had to work around my business requirement because I couldn't manage
to do it.

Thank you for your help.

 
