Regular Expression Problem

  • Thread starter Kenneth Baltrinic

Kenneth Baltrinic

I am trying to parse a relatively simple SQL query with a regular
expression. All is going well except for one issue I can't seem to find a
solution for: handling optional parentheses and brackets. This seems like a
backreference problem to me, but I am not sure. Let me give a simple
example.

A table name might be enclosed in brackets or it might not. For the sake of
simplification, let's assume that the pattern [A-Za-z_]+ is what we are
looking for in a table name. A really simple solution to find table names,
bracket-enclosed or not, would be as follows:

/(\[[A-Za-z_]+\])|([A-Za-z_]+)/

The problem is that the real pattern for a table name is not as simple as
[A-Za-z_]+ . In fact it is quite long, and repeating it twice in the
expression seems inefficient, not to mention prone to bugs if I need to
tweak it and don't get each side exactly the same. What I want to do is
something like the following:

/(\[?)[A-Za-z_]+\1/

This almost works, except of course that my backreference is looking for an
opening bracket [ when I need it to be looking for a closing bracket ]. So
here is the crux of my question: is there any way to do something like
this, to have a backreference that does some sort of fuzzy match? I have a
similar issue with parentheses.

Thanks for the help,
Ken Baltrinic
 
Sam Holden

Kenneth Baltrinic wrote:
> I am trying to parse a relatively simple SQL query with a regular
> expression. All is going well except for one issue I can't seem to find a
> solution for: handling optional parentheses and brackets. This seems like
> a backreference problem to me, but I am not sure. Let me give a simple
> example.
>
> A table name might be enclosed in brackets or it might not. For the sake
> of simplification, let's assume that the pattern [A-Za-z_]+ is what we are
> looking for in a table name. A really simple solution to find table names,
> bracket-enclosed or not, would be as follows:
>
> /(\[[A-Za-z_]+\])|([A-Za-z_]+)/
>
> The problem is that the real pattern for a table name is not as simple as
> [A-Za-z_]+ . In fact it is quite long, and repeating it twice in the
> expression seems inefficient, not to mention prone to bugs if I need to
> tweak it and don't get each side exactly the same. What I want to do is
> something like the following:

Why not something like:

(\[$table_name_pattern])|($table_name_pattern)

Of course if nested parentheses are needed things get more complicated, and
moving to a non-regex solution is often easier. Something like
Parse::RecDescent, for example.

> /(\[?)[A-Za-z_]+\1/
>
> This almost works, except of course that my backreference is looking for
> an opening bracket [ when I need it to be looking for a closing bracket ].
> So here is the crux of my question: is there any way to do something like
> this, to have a backreference that does some sort of fuzzy match? I have a
> similar issue with parentheses.

Can't help with that, sorry...
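
For anyone reading later, here is a minimal sketch of the "write the pattern once and interpolate it" suggestion above. It is written in Python rather than Perl purely for illustration. TABLE_NAME is a hypothetical stand-in for the real, longer table-name pattern, and table_name() is a helper invented for this example.

```python
import re

# Hypothetical stand-in for the real, much longer table-name pattern.
TABLE_NAME = r"[A-Za-z_]+"

# The sub-pattern is written once and interpolated into both alternatives,
# so a later tweak to TABLE_NAME changes both sides together.
TABLE_REF = re.compile(rf"\[({TABLE_NAME})\]|({TABLE_NAME})")


def table_name(text):
    """Return the bare name from 'name' or '[name]', or None if neither fits."""
    m = TABLE_REF.fullmatch(text)
    if m is None:
        return None
    # Exactly one of the two capture groups takes part in a successful match.
    return m.group(1) or m.group(2)
```

The pattern text is still matched in two places at run time, but it is only maintained in one place, which is the bug risk the original post was worried about.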
 
Chauncey Williams

Kenneth Baltrinic wrote:
> What I want to do is something like the following:
> /(\[?)[A-Za-z_]+\1/
>
> This almost works, except of course that my backreference is looking
> for an opening bracket [ when I need it to be looking for a closing
> bracket ]. So here is the crux of my question: is there any way to
> do something like this, to have a backreference that does some sort
> of fuzzy match? I have a similar issue with parentheses.
>
> Thanks for the help,
> Ken Baltrinic

Maybe a (?<= [ )] positive lookbehind is what you are looking for...

I'm pretty new to this though, and if what you are showing is just a small
part of the regex, it may not work.
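
Another option, not raised so far in the thread, is a conditional group, which is about as close as regex syntax gets to the "fuzzy backreference" asked about. The sketch below assumes an engine that supports the (?(1)...) construct (Perl and Python's re module both do). TABLE_NAME and table_name() are again hypothetical names for illustration. Note that the optional bracket has to be written (\[)? and not (\[?): the latter always takes part in the match, capturing an empty string, so the condition would always be true.

```python
import re

# Hypothetical stand-in for the real, much longer table-name pattern.
TABLE_NAME = r"[A-Za-z_]+"

# (\[)?      optional opening bracket, captured as group 1
# (?(1)\])   conditional: require "]" only if group 1 took part in the match
BRACKETED = re.compile(rf"(\[)?({TABLE_NAME})(?(1)\])")


def table_name(text):
    """Return the bare name from 'name' or '[name]', or None if unbalanced."""
    m = BRACKETED.fullmatch(text)
    return m.group(2) if m else None
```

Here the table-name pattern appears only once in the expression. The example uses fullmatch so that a stray bracket makes the whole string fail; with an unanchored search, "orders]" would still match the "orders" part.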

XC
 
