Regexp help required please

Mike Harrison

I'm trying to write a regexp that will return an array of matches
which follow a certain prefix pattern.

For example, running the regexp on this string:

"PREFIX1 other text PREFIX3765 more text12345 PREFIX999 blah blah
blah"

should return an array of ["1", "3765", "999"], ie. every number that
follows PREFIX, but not any other numbers in the string.

/PREFIX\d+/g will give ["PREFIX1", "PREFIX3765", "PREFIX999"], but I
want PREFIX to be matched but not returned in the results.

Is this possible?
 
Thomas 'PointedEars' Lahn

Mike said:
[...] running the regexp on this string:

"PREFIX1 other text PREFIX3765 more text12345 PREFIX999 blah blah
blah"

should return an array of ["1", "3765", "999"], ie. every number that
follows PREFIX, but not any other numbers in the string.

/PREFIX\d+/g will give ["PREFIX1", "PREFIX3765", "PREFIX999"], but I
want PREFIX to be matched but not returned in the results.

Is this possible?

Yes. Obviously there are two possibilities: Either use capturing
parentheses to mark only the parts that you want to use in a backreference,
or use (less compatible) non-capturing parentheses to mark the parts that
you do not want to use in the result. If you use String.prototype.match()
instead of RegExp.prototype.exec() you would probably not want to use the
g(lobal) modifier, then.
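
A minimal sketch of the difference, using the string from the question (the variable names are illustrative only): without the g modifier, match() returns the whole match followed by each captured group, whereas with g it returns the whole matches only.

```javascript
var s = "PREFIX1 other text PREFIX3765 more text12345 PREFIX999 blah blah blah";

// No g modifier: index 0 is the whole match, index 1 is the first capture.
var first = s.match(/PREFIX(\d+)/);

// g modifier: only the whole matches come back; the captures are dropped.
var all = s.match(/PREFIX(\d+)/g);

console.log(first[0], first[1]); // PREFIX1 1
console.log(all.join());         // PREFIX1,PREFIX3765,PREFIX999
```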

<http://jibbering.com/faq/#posting>


PointedEars
 
Mike Harrison

Yes. Obviously there are two possibilities: Either use capturing
parentheses to mark only the parts that you want to use in a backreference,
or use (less compatible) non-capturing parentheses to mark the parts that
you do not want to use in the result.

Hi, thanks for your reply. I've had no luck using either of those
methods though.

The only way I can do it is like this, but I'm sure there must be a
better/easier/faster way...

<script type="text/javascript">

var s = "PREFIX1 other text PREFIX3765 more text12345 PREFIX999 blah blah blah";

var re_match = /PREFIX(\d+)/g;
var re_replace = /[^,\d]+/g;

var result = s.match(re_match).join().replace(re_replace, "").split(",");

alert (result);

</script>
 
Thomas 'PointedEars' Lahn

Mike said:
var s = "PREFIX1 other text PREFIX3765 more text12345 PREFIX999 blah blah blah";

var re_match = /PREFIX(\d+)/g;
var re_replace = /[^,\d]+/g;

var result = s.match(re_match).join().replace(re_replace, "").split(",");

Eeek. As I said before, if you want to use String.prototype.match(), you
would want to lose the `g' modifier so that for each call you get this array
(provided there is a match):

[match, sub_match]

However, you can and need to keep it with RegExp.prototype.exec():

var m, result = [];
while ((m = re_match.exec(s)))
{
  /* or result[result.length] = m[1]; */
  result.push(m[1]);
}
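
The reason the `g' modifier matters here, as a rough sketch: with it, exec() records where the last match ended in the regexp's lastIndex property and resumes from there, so the loop advances and eventually gets null. Without it, every call would start from index 0 again and the loop would never end. The variable names below are illustrative.

```javascript
var s = "PREFIX1 other text PREFIX3765";
var re = /PREFIX(\d+)/g;

var m1 = re.exec(s);       // finds "PREFIX1"
var after1 = re.lastIndex; // 7, just past "PREFIX1"

var m2 = re.exec(s);       // resumes at index 7, finds "PREFIX3765"
var m3 = re.exec(s);       // no further match: null, and lastIndex resets to 0

console.log(m1[1], after1, m2[1], m3, re.lastIndex);
```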

BTW, Array.prototype.join() can take a string argument as the separator, so
you could lose the `,'. In fact, String(array) or array.toString() would have
done the same as array.join().
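
A quick check of that last point, assuming nothing beyond standard arrays: with no argument, join() uses a comma as the separator, which is also what the string conversions do.

```javascript
var parts = ["PREFIX1", "PREFIX3765", "PREFIX999"];

var joined = parts.join();          // default separator is ","
var converted = String(parts);      // same result
var viaToString = parts.toString(); // same result again

console.log(joined === converted && converted === viaToString); // true
console.log(parts.join(""));        // PREFIX1PREFIX3765PREFIX999
```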


PointedEars
 
Lasse Reichstein Nielsen

Mike Harrison said:
The only way I can do it is like this, but I'm sure there must be a
better/easier/faster way...

<script type="text/javascript">

var s = "PREFIX1 other text PREFIX3765 more text12345 PREFIX999 blah blah blah";

var re_match = /PREFIX(\d+)/g;
var re_replace = /[^,\d]+/g;

var result = s.match(re_match).join().replace(re_replace, "").split(",");

That is a lot of string manipulation. It kind of ruins the idea of using
string.match to begin with. You might as well do the regexp loop
yourself:

var re = /PREFIX(\d+)/g;
var result = [];
for (var m; (m = re.exec(s)); ) {
  result.push(m[1]);
}

Alternatively, you can use the built-in loop in string.replace:

var re = /PREFIX(\d+)/g;
var result = [];
s.replace(re, function(m,m1) { result.push(m1); });

I prefer the former, though.
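
If the exec() loop is needed in more than one place, it could be wrapped up roughly as follows. The function name collectCaptures and its parameters are made up for this sketch; it assumes the regexp has the g modifier and at least one capturing group.

```javascript
// Hypothetical helper: returns the first captured group of every match.
function collectCaptures(re, s) {
  if (!re.global) {
    // Without g, exec() would keep returning the first match forever.
    throw new Error("collectCaptures needs a regexp with the g modifier");
  }
  re.lastIndex = 0; // start from the beginning even if re was used before
  var result = [];
  for (var m; (m = re.exec(s)); ) {
    result.push(m[1]);
    // Guard against patterns that can match the empty string.
    if (m[0] === "") {
      re.lastIndex++;
    }
  }
  return result;
}

var s = "PREFIX1 other text PREFIX3765 more text12345 PREFIX999 blah blah blah";
var numbers = collectCaptures(/PREFIX(\d+)/g, s);
console.log(numbers.join()); // 1,3765,999
```

Resetting lastIndex at the top is a design choice for this sketch: it lets the same regexp object be passed in repeatedly without a leftover position from an earlier call skipping matches.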

/L
 
Bart Van der Donck

Mike said:
I'm trying to write a regexp that will return an array of matches
which follow a certain prefix pattern.

For example, running the regexp on this string:

"PREFIX1 other text PREFIX3765 more text12345  PREFIX999 blah blah
blah"

should return an array of ["1", "3765", "999"], ie. every number that
follows PREFIX, but not any other numbers in the string.

/PREFIX\d+/g will give ["PREFIX1", "PREFIX3765", "PREFIX999"], but I
want PREFIX to be matched but not returned in the results.

Is this possible?

var s = 'PREFIX5 abc PREFIX444 xyz 8PREFIX9 1 def';
var r = s.replace(/(.*?)(PREFIX)(\d+)/g, '$3\cM').split('\cM');
--r.length; // drop the trailing remainder after the last match

Hope this helps,
 
Bart Van der Donck

kangax said:
Why '\cM'? Isn't '\x00' a better candidate?

Because Control-M would be very unlikely to appear inside the string
(especially when compared to the comma of the original poster).
Perhaps \x00 is even more unlikely, yes.
 
Bart Van der Donck

kangax said:
Why '\cM'? Isn't '\x00' a better candidate?

It appears you are absolutely right:

alert('acMb cM 123'.split('\cM'));

appears to also match a straight 'cM' in the original string. Better
to use some low \x indeed.
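
A short sketch of what is going on, as far as I can tell: \cM is a control escape only inside a regexp; in a string literal the backslash is simply dropped, leaving the two characters 'cM'. Using '\x00' as the separator, as kangax suggested, avoids that, assuming the input never contains a NUL character.

```javascript
// In a string literal, '\cM' is just the two characters "cM".
var literalIsPlain = ('\cM' === 'cM');
// In a regexp, \cM is Control-M, i.e. carriage return.
var regexpIsCR = /^\cM$/.test('\r');

var s = 'PREFIX5 abc PREFIX444 xyz 8PREFIX9 1 def';
var r = s.replace(/(.*?)(PREFIX)(\d+)/g, '$3\x00').split('\x00');
r.pop(); // drop whatever follows the last match

console.log(literalIsPlain, regexpIsCR, r.join()); // true true 5,444,9
```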
 
