VK
Honestly, regular expressions - beyond everyday trivia - have always been
rather mysterious to me, but this one really drove me nuts
with false positives.
Basically I want to detect any strings containing Unicode
characters from \u0100 up to \uFFFF. I thought that
new RegExp('[\u0100-\uFFFF]+','i')
would do it, but it kept giving false positives for Basic Latin
strings, and I found out that
window.alert( (new RegExp('[\u0100-\uFFFF]+','i')).test('a') ) // false
....
window.alert( (new RegExp('[\u0100-\uFFFF]+','i')).test('h') ) // false
window.alert( (new RegExp('[\u0100-\uFFFF]+','i')).test('i') ) // ! TRUE
window.alert( (new RegExp('[\u0100-\uFFFF]+','i')).test('j') ) // false
....
WTF? AFAIK "Latin Small Letter I" is Unicode \u0069, and browsers seem
to agree on that overall:
window.alert( ('i'.charCodeAt(0)) ) // 105 (dec 105 = hex 69)
Yet in a regexp context both FF3.5 and IE8, the two browsers I tested on,
attribute this char to some unknown, much higher code range. Any insights
(even if they come with insults)?
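
For reference, a minimal sketch of the check being asked about, written two ways that do not involve case-insensitive matching at all. That the 'i' flag is what pulls 'i' into the range is only an assumption here, not something established above; the function names hasHighChars and hasHighCharsByCode are hypothetical and used purely for illustration.

```javascript
// Hypothetical helper: true if the string contains any UTF-16 code unit
// in the range \u0100-\uFFFF.
function hasHighChars(str) {
  // No 'i' flag: a pure code-range check has no use for case folding.
  return /[\u0100-\uFFFF]/.test(str);
}

// Regexp-free variant that compares code units directly.
function hasHighCharsByCode(str) {
  for (var k = 0; k < str.length; k++) {
    // charCodeAt never returns more than 0xFFFF, so only the lower
    // bound needs checking.
    if (str.charCodeAt(k) >= 0x0100) {
      return true;
    }
  }
  return false;
}

console.log(hasHighChars('i'));               // false
console.log(hasHighChars('\u0131'));          // true (dotless i is in range)
console.log(hasHighCharsByCode('hij'));       // false
console.log(hasHighCharsByCode('h\u0130j'));  // true
```

Both variants treat the string as a sequence of 16-bit code units, which matches the \u0100-\uFFFF range stated in the question.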