On Tue, 07 Jun 2011 21:18:54 -0700, Roedy Green wrote:
[snip]
The problem with regexes is all it takes is one char off and the whole
thing does not work. You have no clue where the problem is. You
rarely find errors with syntax checking. There is no trace.
The other problem is a regex will work 90% of the time. It may be
quietly rejecting a small percentage of the strings, and you might not
notice.
There are more problems than that.
I assume that you are familiar with this quote:
Some people, when confronted with a problem, think "I know, I'll use
regular expressions." Now they have two problems.
I find regexes to be less than totally useful. I sometimes have
to define a format string with substitution parameters. Here is an
example:
Per client's instruction, the total of all invoices for the current
month will be charged against the supplied credit card number on %D
unless we hear otherwise prior to that date.
The date gets substituted for the %D. There are a few rules.
There must be one and only one "%D" string. "%" is an escape character
and is doubled for the literal "%".
I could write a regex for this, BUT I also have to have a routine
for executing the string substitution, and regexes do not help with
this. I do not want two rather different versions of the code. (As
it is, I have two versions of code that are somewhat similar.) More
importantly, if one routine gets changed, so should the other, and it
should be obvious how to do it.
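As a rough illustration of keeping the checking and the substituting in one place, here is a minimal Python sketch. It is not my actual code; the function name scan and the error wording are made up for the example. One loop enforces the rules, and the same loop does the substitution when a date is passed in, so a rule change only has to be made once.

```python
def scan(template, date=None):
    """Walk the template once, enforcing the rules from the post.

    With date=None the call only validates and returns None.
    With a date it also returns the substituted text.
    (Hypothetical helper, written for illustration only.)
    """
    out = []
    seen_date = False
    i = 0
    while i < len(template):
        ch = template[i]
        if ch != "%":
            out.append(ch)
            i += 1
            continue
        code = template[i + 1:i + 2]  # "" when "%" is the last character
        if code == "%":
            out.append("%")           # "%%" is a literal "%"
        elif code == "D":
            if seen_date:
                raise ValueError(f"second %D at offset {i}")
            seen_date = True
            if date is not None:
                out.append(date)
        else:
            bad = "%" + code
            raise ValueError(f"bad escape {bad!r} at offset {i}")
        i += 2
    if not seen_date:
        raise ValueError("no %D in template")
    return "".join(out) if date is not None else None
```

Calling scan(text) checks a template when it is defined, and scan(text, "30 June") produces the final wording. Unlike a failed regex match, each error names the offset where the template went wrong.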
If I wanted to add a second variable to the example above, say a
contact name, and wanted the constraint of appearing once and only
once, using a regex would get even uglier.
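For comparison, a hand-written scanner extends to a second variable without much fuss. The sketch below is Python and purely illustrative: the name expand and the "%N" code for the contact name are assumptions made up for this example. Each code in the table must appear exactly once.

```python
def expand(template, values):
    """Substitute each "%X" code found in values; "%%" is a literal "%".

    Every code in values must appear exactly once in the template.
    (Hypothetical helper; "%N" for a contact name is an assumed code.)
    """
    counts = dict.fromkeys(values, 0)
    out = []
    i = 0
    while i < len(template):
        ch = template[i]
        if ch != "%":
            out.append(ch)
            i += 1
            continue
        code = template[i + 1:i + 2]  # "" when "%" is the last character
        if code == "%":
            out.append("%")
        elif code in counts:
            counts[code] += 1
            if counts[code] > 1:
                raise ValueError(f"second %{code} at offset {i}")
            out.append(values[code])
        else:
            bad = "%" + code
            raise ValueError(f"bad escape {bad!r} at offset {i}")
        i += 2
    missing = sorted(c for c, n in counts.items() if n == 0)
    if missing:
        raise ValueError("missing codes: " + ", ".join("%" + c for c in missing))
    return "".join(out)
```

Adding a third variable would mean one more entry in the values table, with no change to the loop.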
I could use regexes for such things as validation with no
interpretation, but the data that I have to validate usually has
trivial formatting. For example, a Canadian Postal Code is "A9A 9A9"
with some limitations on the alphabetic characters. A regex would be
overkill.
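To show how little code the plain approach takes, here is a Python sketch of such a check. The function name is made up, and the letter limitations coded here (no D, F, I, O, Q, or U anywhere, and no W or Z in the first position) are my understanding of the rules, so treat them as an assumption to verify against Canada Post.

```python
# Letters allowed in any alphabetic position (assumed: D, F, I, O, Q, U excluded).
LETTERS = set("ABCEGHJKLMNPRSTVWXYZ")
# Letters allowed in the first position (assumed: W and Z also excluded).
FIRST_LETTERS = LETTERS - {"W", "Z"}
DIGITS = set("0123456789")


def is_postal_code(text):
    """Return True if text has the form "A9A 9A9" with permitted letters."""
    if len(text) != 7 or text[3] != " ":
        return False
    a1, d1, a2, _, d2, a3, d3 = text
    return (
        a1 in FIRST_LETTERS
        and a2 in LETTERS
        and a3 in LETTERS
        and {d1, d2, d3} <= DIGITS
    )
```

The check expects upper case and exactly one space; normalizing the input first would be a separate step.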
Sincerely,
Gene Wirchenko