RegExp question: Find unmatched right bracket

R

rkleiner

Is there a regular expression to find the first unmatched right bracket
in a string, if there is one? For example,

"(1*(2+3))))".search(/regexp/) == 9

Thanks in advance.
 
D

Dietmar Meier

Is there a regular expression to find the first unmatched right
bracket in a string, if there is one?

I don't think so. I would replace every matching pair of brackets
with two other characters and then use indexOf(")"):

var sText = "(1*(2+3))))";
while (sText != (sText = sText.replace(/\(([^()]*)\)/, "_$1_")));
alert(sText.indexOf(")"));

ciao, dhgm
 
D

Dietmar Meier

Dietmar said:
while (sText != (sText = sText.replace(/\(([^()]*)\)/, "_$1_")));

Adding a "g" flag to the RegExp of cause would speed that up in
some cases:
 
D

Dietmar Meier

Dietmar said:
while (sText != (sText = sText.replace(/\(([^()]*)\)/, "_$1_")));

Adding a "g" flag to the Regular Expression of cause would speed
that up in some cases:

while (sText != (sText = sText.replace(/\(([^()]*)\)/g, "_$1_")));

ciao, dhgm
 
L

Lasse Reichstein Nielsen

Is there a regular expression to find the first unmatched right bracket
in a string, if there is one? For example,

No. It's a fundamental limit to the power of regular expressions that
they cannot decide whether brackets are matched or not. They can't
even decide whether a string contains the same number of opening
and closing brackets.


/L
 
L

Lasse Reichstein Nielsen

Is there a regular expression to find the first unmatched right bracket
in a string, if there is one?

Btw, a simple way of solving the problem would be:
---
function firstNonMatched(string) {
for(var i = 0,cnt=0; i < string.length; i++) {
switch(string.charAt(i)) {
case '(': cnt++; break;
case ')': cnt--; if (cnt < 0) {return i;}; break;
}
return -1;
}
 
D

Dr John Stockton

JRS: In article <[email protected]>
, dated Fri, 22 Apr 2005 09:41:57, seen in (e-mail address removed) posted :
Is there a regular expression to find the first unmatched right bracket
in a string, if there is one? For example,

"(1*(2+3))))".search(/regexp/) == 9

Don't know.

The simple method is to read it character by character, counting :

function Scan(S) { var C, J, N = 0
for (J=0 ; J < S.length ; J++) { C = S.charAt(J)
if (C == "(") N++
if (C == ")") N--
if (N < 0) return J
}
}

Scan("(1*(2+3))))")

It can easily be modified to seek [ ] and { } pairs in the same scan;
and, with a little more effort, /* */ ones.
 
M

Mark Szlazak

Is there a regular expression to find the first unmatched right bracket
in a string, if there is one? For example,

"(1*(2+3))))".search(/regexp/) == 9

Thanks in advance.

With JavaScripts regular expression capabilities you can't do it in
general like in later versions of Perl but there is a limited way to
check nested parenthesis balance without explicit counting in script
code. This pattern is hardcoded for nestings of up to four levels and
works for sample strings like yours but some QA maybe in order to
evaluated its limitations.

var rx = /\((?:[^()]|\((?:[^()]|\((?:[^()]|\([^()]*\))*\))*\))*\)/g;
var str = "(1*(2+3))))"
if ( (rx.exec( str ))[0].length == str.length ) alert ("balanced");
else alert("unbalanced");
 
C

Csaba Gabor

Lasse said:
No. It's a fundamental limit to the power of regular expressions that
they cannot decide whether brackets are matched or not. They can't
even decide whether a string contains the same number of opening
and closing brackets.

But with PHP it is possible (using recursive matching described near the
bottom of the page from the Pattern Syntax link at http://php.net/pcre).
The code below uses a single regular expression to determine whether
parentheses match, or if not where the first unmatched right paren
is, or failing that the position of the last unmatched left paren
(this latter is the only reason for the preg_match_all instead of
just preg_match. If not needed set $leftToo=false).

Csaba Gabor from Vienna

<?php
// set $leftToo=true to report position of an unmatched '('
$leftToo = true;

$re = '/\(((?>[^()]+)|(?R))*\)/'; // recursive regular expression
$expR = "x()(()((1*(2+))3))a))(k)"; // some test cases
$expOK = "x()(()((1*(2+)3)()a))";
$expL = "x(y)(z)((k)(()((1*(2+)3()a))";
$expRL = "bill(wanted(what(we)did)a))(not(have sad)";

// Actual test case: we surround it with parens
$exp = "($expR)";

$matchFunc = "preg_match" . ($leftToo ? "_all" : "");
// do test here
call_user_func ($matchFunc, $re, $exp, &$aMatch, PREG_OFFSET_CAPTURE);
if ($leftToo) $aMatch = $aMatch[0]; // preg_match_all => extra level

// show results
print "Expression: " . substr($exp,1,-1) . "\n<br>";
if (($s0=&$aMatch[0][0])==$exp) print "all parens match";
else if (!$aMatch[0][1])
print "unmatched right paren at pos " . (strlen($s0)-2);
else print "unmatched left paren" .
(!$leftToo ? "" : " at pos " .
($aMatch[sizeof($aMatch)-1][1]-1));
?>
 
L

Lasse Reichstein Nielsen

[recognizing matched parentheses]
But with PHP it is possible (using recursive matching described near the
bottom of the page from the Pattern Syntax link at http://php.net/pcre).

Fascinating. However, the feature is now so powerful that maybe it
shouldn't be called "regular expressions". It is capable of recognizing
non-regular languages, and probably all context free languages.

Alas, it's probably a lost cause for a computer science purist to have
them rename it "Context Free Expression" :) And it would not be
enough, since backwards matching even allows some languages that are
not context free.

Ah, back in the days when Regular Expresseions were really regular,
they could be compiled into finite automata, and you didn't need to
worry about stack depth. Those days are obviously long gone. Regular
expressions have become Swiss Army chainsaws of text matching :)

/L
 
D

Dr John Stockton

JRS: In article <[email protected]>, dated Fri, 22 Apr 2005
19:51:35, seen in Lasse Reichstein Nielsen
No. It's a fundamental limit to the power of regular expressions that
they cannot decide whether brackets are matched or not. They can't
even decide whether a string contains the same number of opening
and closing brackets.


I take it that you mean that one unaided application of a single RegExp
cannot do it.


For equal numbers of each, by removing each type,
S = "(()))"
Ans = S.replace(/\(/g, "").length == S.replace(/\)/g, "").length

And
while (S != (S=S.replace(/\([^\(\)]*\)/g, ""))) {}
Balanced = !S.replace( /[^\(\)]/g, "")
checks that each opener has a following closer, with no spare closers.

IIRC, though, that is not the most efficient way to tell whether a
..replace() has made a difference.
 
M

Mark Szlazak

Dr said:
JRS: In article <[email protected]>, dated Fri, 22 Apr 2005
19:51:35, seen in Lasse Reichstein Nielsen
No. It's a fundamental limit to the power of regular expressions that
they cannot decide whether brackets are matched or not. They can't
even decide whether a string contains the same number of opening
and closing brackets.


I take it that you mean that one unaided application of a single RegExp
cannot do it.


For equal numbers of each, by removing each type,
S = "(()))"
Ans = S.replace(/\(/g, "").length == S.replace(/\)/g, "").length

And
while (S != (S=S.replace(/\([^\(\)]*\)/g, ""))) {}
Balanced = !S.replace( /[^\(\)]/g, "")
checks that each opener has a following closer, with no spare closers.

IIRC, though, that is not the most efficient way to tell whether a
.replace() has made a difference.
items, links.

Following your lead John, one maybe able to check for balance by
replace all non () characters with blanks and then seeing if the
remaining string length is even or odd. Do you know of a quick even/odd
test?
 
R

Richard Cornford

Mark Szlazak wrote:
Following your lead John, one maybe able to check for balance by
replace all non () characters with blanks and then seeing if the
remaining string length is even or odd.

')))('.replace(/[^()]/, '').length ; //even

- but:-

a = testString.replace(/[^(]/g, '');
b = testString.replace(/[^)]/g, '');
if(a.length != b.lenght){
// Parenthesise cannot match.
}else{
// There are the same numbers of each type of parenthesise.
// but that does not mean that nesting or sequence of
// parenthesise are "correct". var testString = '))((';
}
Do you know of a quick even/odd test?

if(n % 2){
// odd number (or non-integer)
}else{
// even number
}

Richard.
 
R

Randy Webb

Richard said:
Mark Szlazak wrote:
Following your lead John, one maybe able to check for balance by
replace all non () characters with blanks and then seeing if the
remaining string length is even or odd.


')))('.replace(/[^()]/, '').length ; //even

- but:-

a = testString.replace(/[^(]/g, '');
b = testString.replace(/[^)]/g, '');
if(a.length != b.lenght){

b.length??

:)
 
M

Mick White

Mark Szlazak wrote:
[snip]
Following your lead John, one maybe able to check for balance by
replace all non () characters with blanks and then seeing if the
remaining string length is even or odd. Do you know of a quick even/odd
test?

isEven=num & 1;

Mick
 
F

fox

Mick said:
Mark Szlazak wrote:
[snip]
Following your lead John, one maybe able to check for balance by
replace all non () characters with blanks and then seeing if the
remaining string length is even or odd. Do you know of a quick even/odd
test?

isEven=num & 1;

ahem... isEven = ~num & 1;
isOdd = num & 1;

(i know it was a minor oversight...)
 
F

fox

Richard said:
Mark Szlazak wrote:
Following your lead John, one maybe able to check for balance by
replace all non () characters with blanks and then seeing if the
remaining string length is even or odd.


')))('.replace(/[^()]/, '').length ; //even

- but:-

a = testString.replace(/[^(]/g, '');
b = testString.replace(/[^)]/g, '');
if(a.length != b.lenght){
// Parenthesise cannot match.
}else{
// There are the same numbers of each type of parenthesise.
// but that does not mean that nesting or sequence of
// parenthesise are "correct". var testString = '))((';
}

[THANK YOU for pointing that out!!!]

no matter how you slice it -- this is a two pass test... one pass *must*
remove all legitimately balanced parens:

var hold = origStr;

/*this is destructive -- so in a real setting, origStr needs 2 copies:
copy and hold*/

while(hold !=(origStr = origStr.replace(/(\([^\(\)]*\))/g,"")))
hold = origStr;



then test the remainder for orphan parens:

var left = (origStr.match(/\(/g) || []).length;
var right = (origStr.match(/\)/g) || []).length;

/* match returns 'null' if no matches -- so || with empty array for
length 0*/

var balance = left - right;

if balance is zero -- the string actually matches
less than zero -- stray left parens
greater than zero -- stray right parens

in the case of )( -- left and right are 1 but balance is zero, so the
test of "true" balance becomes:

var truebalance = !balance && !left && !right; (or any variation on this
test...like: !(balance | left | right) ... it has to be bitwise OR
because +/- anding cancel)

the var balance can be used to locate the first unbalanced paren --
if < 0 then iterate through an unaltered copy of the original string
using indexOf("(", startindex) and if > 0, iterate using lastIndexOf().


here's my offering: http://fxmahoney.com/demo/balancedParensObj.htm --
it seemed logical to prototype the method to the String object...it
returns an array of information about the string: balanced, index into
the string of first "offending" paren (or -1), left/right context if
unbalanced, a "display" string (red text shows "error range"), and the
results of the successive reduction regex, i.e., the leftovers (could be
useful...) [as I write, I'm thinking it should return the
leftContext/rightContext from RegExp which could be used to more quickly
isolate the offending parens...i'll work on it later]

source code on page is filtered through a php colorizing script -- view
source for the actual code (in case of discrepencies.)

the default string is unbalanced -- also try out:
this is an )absurd( use of parens... plus any other combination you can
think of... i don't claim this an authoritative solution by any stretch,
but so far, it seems to work in every case (i've tried)...

fox
 
D

Dr John Stockton

JRS: In article <[email protected]>
, dated Sun, 24 Apr 2005 07:24:19, seen in
Mark Szlazak said:
Following your lead John, one maybe able to check for balance by
replace all non () characters with blanks and then seeing if the
remaining string length is even or odd. Do you know of a quick even/odd
test?

Please read the newsgroup FAQ on formatting of responses.

What is a blank? It can only be a space, in which case the method is
useless. If they are replaced with nothing, then the evenness condition
is necessary but not sufficient. !S.length&1 determines evenness;
oddness is slightly easier. However, the basic approach is only capable
of detecting some cases of imbalance : the essence is to match pairs.
 
F

fox

fox said:
Richard said:
Mark Szlazak wrote:
Following your lead John, one maybe able to check for balance by
replace all non () characters with blanks and then seeing if the
remaining string length is even or odd.



')))('.replace(/[^()]/, '').length ; //even

- but:-

a = testString.replace(/[^(]/g, '');
b = testString.replace(/[^)]/g, '');
if(a.length != b.lenght){
// Parenthesise cannot match.
}else{
// There are the same numbers of each type of parenthesise.
// but that does not mean that nesting or sequence of
// parenthesise are "correct". var testString = '))((';
}

[THANK YOU for pointing that out!!!]

no matter how you slice it -- this is a two pass test... one pass *must*
remove all legitimately balanced parens:

var hold = origStr;

/*this is destructive -- so in a real setting, origStr needs 2 copies:
copy and hold*/

while(hold !=(origStr = origStr.replace(/(\([^\(\)]*\))/g,"")))
hold = origStr;



then test the remainder for orphan parens:

var left = (origStr.match(/\(/g) || []).length;
var right = (origStr.match(/\)/g) || []).length;

/* match returns 'null' if no matches -- so || with empty array for
length 0*/

var balance = left - right;

Sorry -- that's right - left...
if balance is zero -- the string actually matches
less than zero -- stray left parens
greater than zero -- stray right parens

in the case of )( -- left and right are 1 but balance is zero, so the
test of "true" balance becomes:

var truebalance = !balance && !left && !right; (or any variation on this
test...like: !(balance | left | right) ... it has to be bitwise OR
because +/- anding cancel)

the var balance can be used to locate the first unbalanced paren --
if < 0 then iterate through an unaltered copy of the original string
using indexOf("(", startindex) and if > 0, iterate using lastIndexOf().


here's my offering: http://fxmahoney.com/demo/balancedParensObj.htm --
it seemed logical to prototype the method to the String object...it
returns an array of information about the string: balanced, index into
the string of first "offending" paren (or -1), left/right context if
unbalanced, a "display" string (red text shows "error range"), and the
results of the successive reduction regex, i.e., the leftovers (could be
useful...) [as I write, I'm thinking it should return the
leftContext/rightContext from RegExp which could be used to more quickly
isolate the offending parens...i'll work on it later]

source code on page is filtered through a php colorizing script -- view
source for the actual code (in case of discrepencies.)

the default string is unbalanced -- also try out:
this is an )absurd( use of parens... plus any other combination you can
think of... i don't claim this an authoritative solution by any stretch,
but so far, it seems to work in every case (i've tried)...

fox

if(n % 2){
// odd number (or non-integer)
}else{
// even number
}

Richard.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
474,432
Messages
2,571,682
Members
48,796
Latest member
Greg L.

Latest Threads

Top