Match CASE/END SQL Construct

Perry Aynum

I am working on a SQL parser. I have a routine that recursively removes
enclosing parentheses and it works fine. Below is the regex that I use.

However, I want to use the same routine, but instead of looking for
enclosing parens, I want to look for a string enclosed by CASE and END. Can
someone help me translate the regex below so that it will match a CASE/END
construct?

Thanks very much.

Parens
----------
(?:\s+)?\([^\(\)]*\)



This is what I've managed so far with the CASE/END

(?:\s+)?case(?!case|end)\s+end
 
sln

Perry Aynum said:
This is what I've managed so far with the CASE/END

(?:\s+)?case(?!case|end)\s+end

It's probably not this simple.

sln

-------------------------

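use strict;
use warnings;

my $txt = "(this (is a) test)";

# Strip one innermost pair of parentheses per pass until none remain.
while ($txt =~ s/\(([^()]*?)\)/$1/) {}

print $txt,"\n";

$txt = "case this case is a end test end";

# Same loop for case/end; the lazy .*? stops at the first "end" it can reach.
while ($txt =~ s/case\s+(.*?)\s+end/$1/) {}

print $txt,"\n";

__END__

this is a test
this is a test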
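
To see which keywords the lazy .*? version actually pairs up, one can capture the whole match on every pass. This is only an illustrative sketch; trace_lazy_strip is a hypothetical helper name, not something from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): run the lazy case/end
# substitution and record the full text that each pass matched.
sub trace_lazy_strip {
    my ($text) = @_;
    my @matched;
    # Group 1 is the whole match, group 2 is the part that is kept.
    while ($text =~ s/(case\s+(.*?)\s+end)/$2/) {
        push @matched, $1;
    }
    return ($text, @matched);
}

my ($result, @matched) = trace_lazy_strip("case this case is a end test end");
print "matched: '$_'\n" for @matched;
print "result:  '$result'\n";
```

The first pass pairs the outer case with the inner end. The final text comes out the same here only because both keywords are simply deleted; a routine that needs the true innermost pair has to stop the match from running across a second case.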
 
Jim Gibson

Perry Aynum said:
This is what I've managed so far with the CASE/END

(?:\s+)?case(?!case|end)\s+end

Have you tried this?

m{ case \s* (.*?) \s* end }ix
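
A small check of that pattern, as a sketch: because it has no word boundaries, "case" and "end" can also match inside longer words. The \b variant below is an assumption about what a SQL parser would want, and loose_body and bounded_body are hypothetical names.

```perl
use strict;
use warnings;

# The pattern as suggested: "case" and "end" may match inside longer words.
sub loose_body {
    my ($line) = @_;
    return $line =~ m{ case \s* (.*?) \s* end }ix ? $1 : undef;
}

# Variant with \b word boundaries; this is an assumption about what a
# SQL parser wants, not part of the original suggestion.
sub bounded_body {
    my ($line) = @_;
    return $line =~ m{ \b case \b \s* (.*?) \s* \b end \b }ix ? $1 : undef;
}

for my $line ("case x end", "fricases can erupt even among friends") {
    my $loose   = loose_body($line);
    my $bounded = bounded_body($line);
    printf "%-42s loose=%s bounded=%s\n", "'$line'",
        defined $loose   ? "'$loose'"   : "no match",
        defined $bounded ? "'$bounded'" : "no match";
}
```

Neither variant handles nesting; both pair a case with the first end that can be reached.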
 
Tad J McClellan

You have just not yet encountered a test case where it does not work fine...



Parentheses in character classes are not "special" and therefore
do not need to be backslashed...

s/Test/Text/


Why would one need a module for something so apparently simple?


Because appearances can be deceiving.

$_ = "(an opening parenthesis ('(') starts a 'memory' in a Perl regex.)\n";
print "$&\n" if /(?:\s+)?\(([^()]*)\)/;
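
One test case of the kind hinted at here, as a sketch: a parenthesis inside a quoted string literal. The helper name strip_parens and the sample SQL fragment are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: repeatedly strip the innermost pair of parentheses.
# Inside the character class the parentheses need no backslashes.
sub strip_parens {
    my ($text) = @_;
    1 while $text =~ s/\(([^()]*)\)/$1/;
    return $text;
}

print strip_parens("(this (is a) test)"), "\n";
# The "(" inside the quoted literal is taken for a real opening paren.
print strip_parens("select f('(') from t"), "\n";
```

The second call returns select f('' from t: the regex pairs the quoted "(" with the closing paren of the function call, so the literal is destroyed and the call is left unbalanced.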
 
sln

Perry Aynum said:
This is what I've managed so far with the CASE/END

(?:\s+)?case(?!case|end)\s+end

I've revisited this and became intrigued with the zero-width assertion
constructs of extended regexes. These constructs don't get enough air-time
here. Since you appear to be leaning in that direction, I thought I would
flesh out a look-ahead regex for your example, partly to glean some insight
into the regex engine. It's fascinating to me. I'm not a big book
reader since I am dyslexic, so I try to discover things on my own.

The code below tackles your problem from the perspective of a file slurped
into a variable and then processed. All relevant delimiters are taken into account;
my other penchant is for parsing. It would also be possible to buffer the file line by line
until there is just enough to parse. I didn't do that here, but it is fairly easy once the
master regex is known, and it would avoid sucking up huge amounts of memory.

I've learned some things about the regex engine's extended operations, which I won't go into.
I decided to include the progression of guesses that led to the final form.
This form takes several delimiting factors into account, as well as look-ahead.
It's not fully tested, of course, but it has passed my own initial tests and could be
handed to testers.

As it is now, CASE/END are the targets, but any pair of keywords can be substituted.
Should you like to employ me for extended projects, set up a contact arrangement.

Note that the code is at the bottom and the output is at the top, in true dyslexic fashion.
Note in particular how the matching in the output proceeds from the innermost pair outward. This is key.

sln


__OUTPUT__

c:\temp>perl misc9.pl

<<<<<<<<<<< Phase1 >>>>>>>>>>>
$1= --------
' case'
$txt= --------
'
case
1 case end
2 case case end end
fricases can erupt even among friends
end'

<<<<<<<<<<< Phase2 >>>>>>>>>>>
$1= --------
''
$txt= --------
'
case
1
2 case case end end
fricases can erupt even among friends
end'

<<<<<<<<<<< Phase3 >>>>>>>>>>>
$1= --------
' case'
$txt= --------
'
case
1
2 case end
fricases can erupt even among friends
end'

<<<<<<<<<<< Phase4 >>>>>>>>>>>
$1= --------
''
$txt= --------
'
case
1
2
fricases can erupt even among friends
end'

<<<<<<<<<<< Phase5 >>>>>>>>>>>
$1= --------
'
1
2
fricases can erupt even among friends'
$txt= --------
'
1
2
fricases can erupt even among friends'


************************
FINAL:
'
1
2
fricases can erupt even among friends'

c:\temp>

__CODE__



use strict;
use warnings;

my $txt = join '', <DATA>;

{
# while ($txt =~ s/(?:\s+|^)case(?=\s)(.*)(?!case)(?<=\s)end(?:\s+|$)/$1/is) {} <- sick

# while ($txt =~ s/(?:\s+)case(?=\s)(.*)(?!case)(?<=\s)end(?:\s+)/$1/is) { print "--------\n'$1'\n"} <- disgusting

# while ($txt =~ s/(?:\s+)case(?=\s)(.(?!case)*?)(?<=\s)end(?:\s+)/$1/is) { print "--------\n'$1'\n"} <- putrid

# while ($txt =~ s/(?:\s+)case(?=\s)((?<!case).*?)(?<=\s)end(?:\s+)/$1/is) { print "--------\n'$1'\n"} <- DOA

# while ($txt =~ s/\s+case\s+(.*(?!case))\s+end\s+/ $1 /is) <- what's this?

# while ($txt =~ s/\s+case\s+((.(?!case))*?)end\s+/ $1 /is) <- almost

# while ($txt =~ s/\s+case\s+((.(?!\scase\s))*?)\s+end\s+/ $1 /is) <- better

# while ($txt =~ s/\s+case((.(?!\scase\s))*?)\s+end\s+/ $1 /is) <- more better

# while ($txt =~ s/\s+case((.(?!\scase\s))*?)\s+end\s+/ $1/is) <- hmmm

# while ($txt =~ s/\s+case((.(?!\scase\s))*?)\s+end(\s+)/ $1 /is) <- confused

# while ($txt =~ s/\s+case((?:.(?!\scase\s))*?)\s+end(\s+)/$1$2/is) <- approaching excellence

# while ($txt =~ s/\s+case((?:.(?!\scase\s))*?)\s+end(\s+)/$1$2/is) <- excellence

# while ($txt =~ s/(?:\s+|^)case((?:.(?!\scase\s))*?)\s+end(\s+|$)/$1$2/is) <- PRIMO !!!!

my $cntr = 1;

# Each pass strips one innermost case ... end pair; loop until none is left.
while ($txt =~ s/(?:\s+|^)case((?:.(?!\scase\s))*?)\s+end(\s+|$)/$1$2/is) # <- Production Regex, Ship to QA
{
    print "\n<<<<<<<<<<< Phase".$cntr++." >>>>>>>>>>>\n";
    print "\$1= --------\n'$1'\n";
    print "\$txt= --------\n'$txt'\n";
}
print "\n\n************************\n FINAL:\n'$txt'\n";
}

__DATA__

case
1 case case end end
2 case case end end
fricases can erupt even among friends
end
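
The inner-to-outer behaviour can be isolated in a few lines. This is a sketch of the same idea rather than the regex above: it uses \b word boundaries instead of whitespace checks (an assumption), and unwrap_case is a hypothetical helper name.

```perl
use strict;
use warnings;

# Hypothetical helper: unwrap case ... end pairs from the inside out.
# (?:(?!\bcase\b).)*? consumes a character only when no new "case"
# keyword starts at that position, so a match can never span a nested
# case and every successful pass removes an innermost pair.
sub unwrap_case {
    my ($text) = @_;
    my $passes = 0;
    $passes++
        while $text =~ s/\bcase\b((?:(?!\bcase\b).)*?)\bend\b/$1/is;
    return ($text, $passes);
}

my ($text, $passes) = unwrap_case("case a case b end c end");
print "'$text' after $passes passes\n";

($text, $passes) = unwrap_case("fricases can erupt even among friends");
print "'$text' after $passes passes\n";
```

In the first call the attempt that starts at the outer case fails, because the body would have to cross the inner case, so the inner pair is removed first and the outer pair on the next pass. The second string is left alone since neither keyword appears as a whole word.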
 
sln

sln said:
[snip explanation]
while ($txt =~ s/(?:\s+|^)case((?:.(?!\scase\s))*?)\s+end(\s+|$)/$1$2/is) # <- Production Regex, Ship to QA
[snip rest of code]

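The regex needed a look-ahead for '\s' right after 'case'; without it there is a bug:
a word that merely begins with 'case' (such as 'cases') is taken for the keyword.
/g was added to reduce the number of passes, which now equals the depth of nesting.
No more posts for a while. See ya later.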

sln

-------------------------------------------------


use strict;
use warnings;

my $txt = join '', <DATA>;
my $cntr = 1;

# With /g, one pass can strip several innermost case ... end pairs.
while ($txt =~ s/(\s|^)case(?=\s)((?:(?!\scase\s).)*?\s)end(\s|$)/$1$2$3/isg)
{
    print "\n<<<<<<<<<<< Phase".$cntr++." >>>>>>>>>>>\n";
    print "\$txt= --------\n'$txt'\n";
}
print "\n\n************************\n FINAL:\n'$txt'\n";

__DATA__
case First Line
1 case line case end spacing end
2 case case end end
3 case case END end
fricases can erupt even among friends
end

case can erupt even among end
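
A side-by-side sketch of the two regexes from this thread, to show what the (?=\s) look-ahead changes and how the passes are counted. The helper names and the sample strings are made up for illustration.

```perl
use strict;
use warnings;

# Earlier pattern from the thread: nothing checks what follows "case".
sub strip_without_lookahead {
    my ($text) = @_;
    1 while $text =~ s/(?:\s+|^)case((?:.(?!\scase\s))*?)\s+end(\s+|$)/$1$2/is;
    return $text;
}

# Final pattern from the thread, with (?=\s) and /g; also counts the
# passes that changed something.
sub strip_with_lookahead {
    my ($text) = @_;
    my $passes = 0;
    $passes++
        while $text =~ s/(\s|^)case(?=\s)((?:(?!\scase\s).)*?\s)end(\s|$)/$1$2$3/isg;
    return ($text, $passes);
}

print "'", strip_without_lookahead("cases of beer end"), "'\n";

my ($text, $passes) = strip_with_lookahead("cases of beer end");
print "'$text' after $passes passes\n";

($text, $passes) = strip_with_lookahead("case a case b end c end");
print "'$text' after $passes passes\n";
```

The earlier pattern turns "cases of beer end" into "s of beer", while the final one leaves it untouched. For the two-level input the final pattern needs two passes, one per nesting level.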
 
