RegExp - Match specific words, but not if they're inside parentheses (with or without other words within)

A/B

Joined
Jan 29, 2023
Messages
7
Reaction score
1
I need to match specific words in a text, but not if they're inside parentheses (with or without other words within).

For example, in the text

La première édition du code civil français est l'aboutissement d'une double réflexion menée afin d'améliorer les échanges juridiques ... (édition du code civil français) est ...

I need to match the first "code civil", but not the second one (inside the parentheses).

I'm using the following regular expression

/[^(]code (civil|pénal|de procédure civile|de procédure pénale)[^)]/g

The issue is that - referring to the example - it also matches the spaces before and after "code civil" (i.e.: " code civil ").

How can I remove those spaces?

Thanks in advance.
 
Joined
Jan 30, 2023
Messages
107
Reaction score
13
You can use the following regex to match the "code civil" words outside of parentheses, while excluding the spaces before and after:

Code:
/\bcode\s+(civil|pénal|de procédure civile|de procédure pénale)\b/g
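A quick way to see the difference is to run both patterns over the sample sentence from the question. This is only a sketch; the variable names are made up for illustration.

```javascript
// Sample sentence from the question (the "..." is part of the quoted text).
var text = "La première édition du code civil français est ... (édition du code civil français) est ...";

// Original pattern: each negated class consumes one neighbouring character.
var withNeighbours = /[^(]code (civil|pénal|de procédure civile|de procédure pénale)[^)]/g;

// Word-boundary pattern: \b is zero-width, so nothing extra is consumed.
var withBoundaries = /\bcode\s+(civil|pénal|de procédure civile|de procédure pénale)\b/g;

// With the g flag, match() returns an array of the full matches.
var originalMatches = text.match(withNeighbours);
var boundaryMatches = text.match(withBoundaries);

console.log(originalMatches);
console.log(boundaryMatches);
```

Both patterns still report the occurrence inside the parentheses, because neither looks further than the characters directly next to the keyword.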
 

A/B

Joined
Jan 29, 2023
Messages
7
Reaction score
1
Thanks for your answer. But that regex doesn't work when the words "code civil" are inside parentheses among other words.

E.g.:

La première édition du code civil français est l'aboutissement d'une double réflexion menée afin d'améliorer les échanges juridiques ... (édition du code civil français) est ...
 
Joined
Jan 30, 2023
Messages
107
Reaction score
13
To handle the case where the words "code civil" are inside parentheses among other words, you can use a lookaround-based approach. The modified expression would be:

/(?<!\([^)]*)code (civil|pénal|de procédure civile|de procédure pénale)(?![^(]*\))/g

The negative lookbehind (?<!\([^)]*) asserts that the pattern is not preceded by an opening parenthesis that has not yet been closed, and the negative lookahead (?![^(]*\)) asserts that it is not followed by a closing parenthesis with no opening parenthesis in between.
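As a rough check of the lookaround idea, the sketch below runs a pattern of this shape over the sample sentence. It assumes an engine that supports lookbehind (ES2018 or later) and parentheses that are balanced and not nested; the \b anchors and the variable names are additions for illustration.

```javascript
// Sample sentence from the question (the "..." is part of the quoted text).
const text = "La première édition du code civil français est ... (édition du code civil français) est ...";

// Negative lookbehind: no "(" followed only by non-")" characters before the keyword.
// Negative lookahead: no ")" reachable after the keyword without crossing a "(".
const pattern = /(?<!\([^)]*)\bcode (civil|pénal|de procédure civile|de procédure pénale)\b(?![^(]*\))/g;

// Collect each match together with its position in the text.
const matches = Array.from(text.matchAll(pattern), (m) => ({
  match: m[0],
  index: m.index,
}));

console.log(matches);
```

Only the occurrence outside the parentheses should be reported, and the match contains no surrounding spaces because both lookarounds are zero-width.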
 

A/B

Joined
Jan 29, 2023
Messages
7
Reaction score
1
I cannot use lookbehind or lookahead, because I need to support older browsers.
 
Joined
Jan 30, 2023
Messages
107
Reaction score
13
In that case, you can use the following alternative regex:

Code:
/([^(]|^)\bcode (civil|pénal|de procédure civile|de procédure pénale)\b([^)]|$)/g

This regex uses alternation (|) to match either a character that is not an opening parenthesis or the start of the string (^), before the word boundary (\b) at the beginning of the keyword. After the closing word boundary, ([^)]|$) matches either a character that is not a closing parenthesis or the end of the string ($). Because the surrounding groups each consume one character, the match includes those characters, and only the characters immediately adjacent to the keyword are tested, so a keyword that sits among other words inside parentheses is still matched.
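A different lookaround-free technique, sketched here as an assumption rather than as part of the answer above, is to let the regex match whole parenthesised groups first and then discard those matches in code. The helper name findOutsideParens is hypothetical, and the sketch assumes parentheses are balanced and not nested.

```javascript
// Hypothetical helper (not from the thread): returns keyword matches that lie
// outside "( ... )" groups, without using lookbehind or lookahead.
function findOutsideParens(text) {
  // First alternative swallows an entire parenthesised group;
  // second alternative matches the keyword itself.
  var pattern = /\([^)]*\)|\bcode (civil|pénal|de procédure civile|de procédure pénale)\b/g;
  var results = [];
  var m;
  while ((m = pattern.exec(text)) !== null) {
    // Keep only keyword matches; a swallowed group always starts with "(".
    if (m[0].charAt(0) !== "(") {
      results.push({ match: m[0], index: m.index });
    }
  }
  return results;
}

var sample = "La première édition du code civil français est ... (édition du code civil français) est ...";
console.log(findOutsideParens(sample));
```

Since the keyword alternative uses only zero-width word boundaries, each kept match contains no surrounding spaces, and m.index gives its position if the text needs to be replaced or highlighted.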
 
