How can I backreference a (?:) group in a Perl regular expression?

kun niu

Dear all,
My question is in the title.
I want to use backreferences like \1, \2, etc. to refer to the text a
group matched earlier.
But the group is non-capturing, as indicated by ?:.
Can I do this in Perl?
If so, how?
Thanks in advance for any hints or advice.
 
kun niu

Ben said:
No. That, in fact, is the whole point of (?:) groups.

Why do you think you need to do this?

Ben
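
To illustrate the point made above, here is a minimal sketch (the sample strings and variable names are made up for illustration, not taken from the thread). It shows that a capturing group can be backreferenced with \1, while a (?:) group receives no number at all:

```perl
use strict;
use warnings;

# A capturing group gets a number, so \1 can refer back to it.
my $repeated = ("ab-ab" =~ /^(\w+)-\1$/) ? 1 : 0;

# A (?:...) group only groups; it gets no number.
# The sole numbered group here is (\d+), which is therefore $1.
my @groups = ("x42" =~ /^(?:[a-z])(\d+)$/);

print "repeated: $repeated\n";   # repeated: 1
print "groups: @groups\n";       # groups: 42
```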

Thank you for your attention to my question and your precious time.
In my application, I have to reformat the telephone numbers in my
company.
Since they come from all over the world, we have various telephone
formats.
Here are some examples:
000-000-000
(00)00-000-0000
!00!00-000-0000
0000000
0000-0000
0000 0000 000
you dont care
My job is to find the valid telephone numbers and output them.
If I meet a ')' character, I have to check that there is a '('
before it.
'!' is also a valid delimiter here.
But none of these characters should be captured.
So I have to backreference a previous group.
An expression like "((?:[!|]*)\d+\1\d+)" gives a bad result.
Any further hints or advice here?
Thanks again for your reply.
 
John W. Krahn

kun said:
Expression like "((?:[!|]*)\d+\1\d+)" gives a bad result.
Any further hints or advice here?

"(([!|]*)\d+\2\d+)"



John
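
As a rough sketch of what the suggested pattern does (the sample input is one of the numbers from the question; the variable names are assumptions), the inner group becomes group 2, so \2 can refer back to it, and the outer group 1 still holds the whole match:

```perl
use strict;
use warnings;

# Outer group is 1, the delimiter group is 2.
my $re = qr/(([!|]*)\d+\2\d+)/;

my ($whole, $delim) = ("!00!00-000-0000" =~ $re);

print "whole: $whole\n";   # whole: !00!00
print "delim: $delim\n";   # delim: !
```

Note that the pattern is unanchored and stops at the first '-', so only the leading part of the number is matched here.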
 
kun niu

"(([!|]*)\d+\2\d+)"

John
--
Those people who think they know everything are a great
annoyance to those of us who do. -- Isaac Asimov

Thank you for your attention.
It's true that this is a way out.
But in this case the '!' delimiters are captured too, and I don't want
those characters captured.
I want only !00!-000-0000 to be captured, not both ! and !00!-000-0000.
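
One possible way around this, assuming Perl 5.10 or later, is to keep the delimiter group only for the backreference and ignore it when reading the results. The sketch below uses a named group; the name `delim` and the helper `extract_number` are hypothetical, not something from this thread:

```perl
use strict;
use warnings;
use 5.010;   # named captures need Perl 5.10+

# Hypothetical helper: returns the matched text, or nothing on failure.
sub extract_number {
    my ($line) = @_;
    # (?<delim>...) is captured only so that \k<delim> can refer to it.
    if ($line =~ /((?<delim>[!|]*)\d+\k<delim>\d+)/) {
        return $1;   # use group 1 only; the delimiter group is ignored
    }
    return;
}

print extract_number("!00!00-000-0000"), "\n";   # !00!00
```

The delimiter is still captured internally, but the caller never sees it, which may be enough for this use case.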
 
derykus

...
Expression like "((?:[!|]*)\d+\1\d+)" gives a bad result.

You could use lookahead to ensure a
matching paren or bang, eg,

/ ^ (?:
        \( (?=\d+\))   # alternative #1
      |                # or
                       # etc..
    )
/x
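
A fuller sketch of the lookahead idea might look like the following. The '!' alternative, the empty alternative, and the trailing `(\d+)` capture are assumptions added for illustration; they are not part of the fragment above:

```perl
use strict;
use warnings;

my $re = qr/
    ^
    (?:
        \( (?=\d+\))     # opening paren, only if digits and a closing paren follow
      | !  (?=\d+!)      # opening bang, only if digits and a closing bang follow
      |                  # or no opening delimiter at all
    )
    (\d+)                # capture the digits, without the delimiter
/x;

for my $line ("(00)00-000-0000", "!40!00-000-0000", "0000000", "!80)-00-000-0000") {
    if ($line =~ $re) {
        print "$line: first digits '$1'\n";
    }
    else {
        print "$line: rejected\n";
    }
}
```

Because the delimiter sits in a (?:) group and the closing delimiter is only checked by the lookahead, neither ends up in a capture.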
 
sln

Expression like "((?:[!|]*)\d+\1\d+)" gives a bad result.
Any further hints or advice here?
Thanks again for your reply.

Although you're not too clear about your intent,
it is obvious you *think* you can capture individual numbers
in groups, shed surrounding delimiters, validate special delimiters
(one of which is parenthetical closure), then wrap it all up and
stamp it "Special Delivery" for an array.

I dunno, maybe, but if not, an approach like this might get you started.
(This goes into my bin for future reference, if any.)

-sln

---------------------------
## ph_capt.pl
##
use strict;
use warnings;

my $str = <<TELNUMS;

phone numbers:

100-junk-(00)
200-000-000-!99!
(30)-00-000-0000
!40!00-000-0000-
5000000
6000-0000
7000 0000 000
!80)-00-000-0000
TELNUMS

print $str;

my @ar = "!40!00-000-0000" =~
    /^                      # line start
        (                   # begin capture group 1
            !\d+!           # '!' + 1 or more digits + '!'
          |                 # or
            \(\d+\)         # '(' + 1 or more digits + ')'
          |                 # or
                            # defined-empty string
        )                   # end group 1

        [ \t-]*             # 0 or more of these chars

        (                   # begin capture group 2
            (?:             # grouping
                \d+[ \t-]*  # 1 or more digits + 0 or more of these chars
            )+              # end group, do 1 or more times
        )                   # end group 2

    $                       # line end
    /x;

print "\n-------------\n\n";
print "'$_'\n" for (@ar);
print "\n-------------\n\n";

for (split /\n/,$str)
{
    next if (!length());

    print "Target: '$_'\n";

    if ( @ar = /^(!\d+!|\(\d+\)|)[ \t-]*((?:\d+[ \t-]*)+)$/ )
    {
        my @ngroups = split /[ \t-]/, $ar[1];
        $ar[0] =~ s/[!()]//g;
        print " ++ matched ". join (' , ',@ngroups) ."\n";
        print " special prefix $ar[0]\n" if length $ar[0];
    }
    else {
        print " -- invalid phone #\n";
    }
    print "\n";
}

__END__


Output:

phone numbers:

100-junk-(00)
200-000-000-!99!
(30)-00-000-0000
!40!00-000-0000-
5000000
6000-0000
7000 0000 000
!80)-00-000-0000

-------------

'!40!'
'00-000-0000'

-------------

Target: 'phone numbers:'
-- invalid phone #

Target: '100-junk-(00)'
-- invalid phone #

Target: '200-000-000-!99!'
-- invalid phone #

Target: '(30)-00-000-0000'
++ matched 00 , 000 , 0000
special prefix 30

Target: '!40!00-000-0000-'
++ matched 00 , 000 , 0000
special prefix 40

Target: '5000000'
++ matched 5000000

Target: '6000-0000'
++ matched 6000 , 0000

Target: '7000 0000 000'
++ matched 7000 , 0000 , 000

Target: '!80)-00-000-0000'
-- invalid phone #
 
