pr said:
howa said:
var s = "12345d";
document.write("s="+s+ ", " + s.replace(/[0-9]*/g,'x'));
It shows:
s=12345d, xxdx
while I expect
xd
[...] In theory, globally replacing a zero-length string should be an
infinite task. In practice (fortunately), the regular expression engine
avoids consecutive zero-length matches. Therefore you have one 5-digit
match and two 0-digit matches, one each side of the 'd'.
Not at all. In theory, there is an ε (epsilon) production; please read
about Regular Grammars:
http://en.wikipedia.org/wiki/Formal_grammar#Regular_grammars
In practice, Regular Expressions match *non-overlapping* occurrences of the
pattern in the string, which means that even with global matching no position
is visited twice by the matcher. Please read ECMA-262 Ed. 3 Final, section
15.5.4.11:
http://www.ecmascript.org/docs.php
Here is what happens, in a nutshell (I used `^' to indicate the next
possible match, and `ε' for the empty word/string to be matched):
0. Input string: "12345d"
Regular Expression: /[0-9]*/g --> lastIndex=0
Replacement string: "x"
1. Find matches for the Regular Expression.
position 0 1 2 3 4 5
ε1ε2ε3ε4ε5εdε
^ ^ ^ ^ ^
(/[0-9]*/, lastIndex=0) --> ("12345", index=0, lastIndex=5)
Greedy matching, so the longest match wins.
The global flag is set, continue.
2. Find more matches for the Regular Expression.
position 0 1 2 3 4 5
ε1ε2ε3ε4ε5εdε
^
(/[0-9]*/, lastIndex=5) --> (ε, index=5, lastIndex=5)
The longest and only possible match that remains is the empty string;
next possible match after position 4.
The global flag is set, continue.
3. Find more matches for the Regular Expression.
position 0 1 2 3 4 5 6
ε1ε2ε3ε4ε5εdε
^
(/[0-9]*/, lastIndex=5) --> (ε, index=6, lastIndex=6)
The longest and only possible match that remains is the empty string;
next possible match after position 5.
The global flag is set, continue.
4. Find more matches for the Regular Expression.
position 0 1 2 3 4 5 6
ε1ε2ε3ε4ε5εdε
^
(/[0-9]*/, lastIndex=6) --> (null, index=6, lastIndex=0)
End of string, no further matches possible.
5. Found matches:
("12345", index=0, lastIndex=5),
(ε, index=5, lastIndex=5),
(ε, index=6, lastIndex=6),
6. Replace each match with the replacement string.
position 0 1 2 3 4 5 6
ε1ε2ε3ε4ε5εdε
Result: x xdx
7. Result: "xxdx"
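The steps above can be approximated with a small loop around exec(). This is only a sketch, not the algorithm text from the Specification: replaceAllSketch is a hypothetical helper name, and the explicit lastIndex increment after an empty match is my way of modelling "no position is visited twice".

```javascript
// Sketch of the global-replace loop traced above, driven by RegExp.prototype.exec.
// replaceAllSketch is a hypothetical helper, not a built-in.
function replaceAllSketch(input, pattern, replacement) {
  var re = new RegExp(pattern.source, "g"); // fresh copy, so lastIndex starts at 0
  var matches = [];
  var result = "";
  var copiedUpTo = 0; // end of the part of input already copied into result
  var m;
  while ((m = re.exec(input)) !== null) {
    matches.push({ text: m[0], index: m.index });
    // Copy the unmatched text before this match, then the replacement.
    result += input.slice(copiedUpTo, m.index) + replacement;
    copiedUpTo = m.index + m[0].length;
    if (m[0] === "") {
      // Step past an empty match so the same position is not matched again.
      re.lastIndex++;
    }
  }
  result += input.slice(copiedUpTo); // copy whatever follows the last match
  return { matches: matches, result: result };
}

var traced = replaceAllSketch("12345d", /[0-9]*/, "x");
// traced.matches holds "12345" at index 0, "" at index 5 and "" at index 6
// traced.result is "xxdx"
```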
You can confirm this by evaluating the return value of
"12345d".match(/[0-9]*/g) -- as defined in the Specification -- which is
["12345", "", ""], where each "" match can be understood as literally
matching ε, the empty word/string.
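As a quick check, something like the following could be run. The variable names are only for illustration, and the second half assumes that the original poster wanted runs of one or more digits replaced, which `+' expresses because it cannot match ε.

```javascript
var input = "12345d";

// The same three matches as in step 5: one 5-digit match and two empty ones.
var starMatches = input.match(/[0-9]*/g);       // ["12345", "", ""]
var starResult = input.replace(/[0-9]*/g, "x"); // "xxdx"

// With + (one or more digits), ε is no longer a possible match.
var plusMatches = input.match(/[0-9]+/g);       // ["12345"]
var plusResult = input.replace(/[0-9]+/g, "x"); // "xd"
```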
HTH
PointedEars