Tony said:
You will never achieve perfection with a regex, raising the question of "why
bother?".
Careful with that word "never". I could create a regex to recognize
every legal decimal representation of a Java int (32-bit, two's
complement, signed) fairly easily. After all, there are only a
finite number of possible values to check for.
Not that I'd do it by enumerating all of them, as a regex based on that
would have at least 2^32 + 1 nodes in its internal automaton. But there
are simpler approaches that would work in a decent amount of memory. I
think I can get the version for the byte data type to at least be
reasonable:
^((0*(1(2([0-7])|[01]\d)|\d{1,2}))|(-0*(1(2([0-8])|[01]\d)|\d{1,2})))$
Note that for efficiency, all those () groups should really be
non-capturing (?:) groups, but that makes it much harder to read. This
was actually tested, and it works correctly.
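
For anyone who wants to repeat that kind of test, here is a minimal sketch of a harness. The class and method names are made up for illustration, and the -1000..1000 sweep is an arbitrary choice. It uses the same alternation without the explicit anchors, since Matcher.matches() already requires the whole input to match, and compares the result against a plain range check:

```java
import java.util.regex.Pattern;

// Hypothetical harness; names are illustrative only.
class ByteRegexCheck {
    // The byte alternation from the post, without ^ and $ because
    // matches() only succeeds when the entire input is consumed.
    static final Pattern BYTE = Pattern.compile(
        "(0*(1(2([0-7])|[01]\\d)|\\d{1,2}))|(-0*(1(2([0-8])|[01]\\d)|\\d{1,2}))");

    static boolean isByte(String s) {
        return BYTE.matcher(s).matches();
    }

    // Counts the values in -1000..1000 where the regex disagrees
    // with a simple numeric range check for the byte type.
    static int mismatches() {
        int bad = 0;
        for (int i = -1000; i <= 1000; i++) {
            boolean expected = i >= Byte.MIN_VALUE && i <= Byte.MAX_VALUE;
            if (isByte(Integer.toString(i)) != expected) {
                bad++;
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        System.out.println("mismatches: " + mismatches());
        // Leading zeros are accepted by the pattern; out-of-range values are not.
        System.out.println(isByte("000127") + " " + isByte("-0128") + " " + isByte("128"));
    }
}
```

The sweep only covers canonical representations as produced by Integer.toString; inputs with leading zeros or other odd forms need separate spot checks like the ones in main.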
Anyway, clearly, doing the same thing for short, int, or long isn't
harder, though the amount of typing does increase significantly. Doing
something like that for floating-point types would be more difficult,
but it's still obviously possible theoretically.
Of course, all that's just my reaction to the word "never". I
actually agree that regexes shouldn't be used to solve this kind of
problem. To paraphrase the advice: "The programmer had a problem.
Then he thought, 'Aha! I'll use a regex!' Then he had two problems."
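
For comparison, the non-regex route is usually just to let the library do the range check. A rough sketch, with a hypothetical helper name, assuming that whatever Integer.parseInt accepts is what you want to call "legal" (it also accepts a leading "+" on current Java versions, which the regex above does not):

```java
// Hypothetical helper; the name is illustrative only.
class IntParseCheck {
    // True if Integer.parseInt can parse s as a decimal int,
    // which includes the range check for the 32-bit signed type.
    static boolean isInt(String s) {
        try {
            Integer.parseInt(s);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isInt("2147483647"));   // Integer.MAX_VALUE
        System.out.println(isInt("2147483648"));   // one past the maximum
        System.out.println(isInt("-2147483648"));  // Integer.MIN_VALUE
    }
}
```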