rename captures in regex

Todd W

A factory function we have makes some stupid assumptions about the data it
is parsing. I give it content and a regex, and it gives me back an array.

Is there any way, for example, to tell capture 1 of a regex to store its
value in $2?

Here is the output of the program below.

[trwww@waveright misc]$ perl cap.pl
One:
title: bar
link: foo
descr: bazz
Two:
title: bazz
link: bar
descr: foo

Is there any way to make the output of "One:" identical to the output of
"Two:" by changing ONLY the string stored in $reg2?

use warnings;
use strict;

my $str1 = '<a href="foo">bar</a><div>bazz</div>';
my $reg1 = '<a href="([^"]+)">([^<]+)</a><div>([^<]+)<';

$str1 =~ m|$reg1|;

print("One:
title: $2
link: $1
descr: $3
");

my $str2 = '<div>bar</div><div>bazz</div><a href="foo">readmore</a>';

### modify only this regex
my $reg2 = '<div>([^<]+)</div><div>([^<]+)</div><a href="([^"]+)"';


$str2 =~ m|$reg2|;

print("Two:
title: $2
link: $1
descr: $3
");

Thanks in advance,

Todd W.
 
Gunnar Hjalmarsson

Todd said:
A factory function we have makes some stupid assumptions about the data it
is parsing. I give it content and a regex, and it gives me back an array.

Is there any way, for example, to tell capture 1 of a regex to store its
value in $2?

Not that I know of.

But if the function returns an array, and you want to print it in some
other order, can't you just do:

my @array = qw/bar foo bazz/;
printf "One:\n title: %s\n link: %s\n descr: %s\n",
@array[2,0,1];
 
Todd W

Gunnar Hjalmarsson said:
Todd said:
A factory function we have makes some stupid assumptions about the data it
is parsing. I give it content and a regex, and it gives me back an array.

Is there any way, for example, to tell capture 1 of a regex to store its
value in $2?

Not that I know of.

But if the function returns an array, and you want to print it in some
other order, can't you just do:

my @array = qw/bar foo bazz/;
printf "One:\n title: %s\n link: %s\n descr: %s\n",
@array[2,0,1];
The function sticks the data in a db before it gets returned. Imagine the =~
and print in a function. Hence the requirement in my post that only the value
of the $reg2 var could be changed to make the program work for me.

The developer provided a hook so you could bypass the default mechanism and
write your own, which is what I used, but I was just wondering if anyone
knew offhand how to do what I asked in the original post.

Thanks anyway,

Todd W.
 
Steven Kuo

A factory function we have makes some stupid assumptions about the data it
is parsing. I give it content and a regex, and it gives me back an array.

Is there any way, for example, to tell capture 1 of a regex to store its
value in $2?

Here is the output of the program below.

[trwww@waveright misc]$ perl cap.pl
One:
title: bar
link: foo
descr: bazz
Two:
title: bazz
link: bar
descr: foo

Is there any way to make the output of "One:" identical to the output of
"Two:" by changing ONLY the string stored in $reg2?



In general, no.

use warnings;
use strict;

my $str1 = '<a href="foo">bar</a><div>bazz</div>';
my $reg1 = '<a href="([^"]+)">([^<]+)</a><div>([^<]+)<';

$str1 =~ m|$reg1|;

print("One:
title: $2
link: $1
descr: $3
");



You really should check whether the match succeeded before printing
$1, etc.
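
A minimal sketch of that check, using made-up input strings rather than the data from the original post. When a match fails, $1 and friends are not reset, so they may still hold values from an earlier successful match; testing the result of the match avoids printing stale captures.

```perl
use strict;
use warnings;

# Illustrative inputs only: the second string contains no link.
my @results;
for my $input ('<a href="foo">', '<p>no link here</p>') {
    if ($input =~ m|<a href="([^"]+)"|) {
        # Only read $1 inside the branch where the match succeeded.
        push @results, "link: $1";
    }
    else {
        push @results, "no match";
    }
}

print "$_\n" for @results;
```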

my $str2 = '<div>bar</div><div>bazz</div><a href="foo">readmore</a>';

### modify only this regex
my $reg2 = '<div>([^<]+)</div><div>([^<]+)</div><a href="([^"]+)"';


$str2 =~ m|$reg2|;

print("Two:
title: $2
link: $1
descr: $3
");



In this specific case, you could try this (the lookahead pattern):

my $str2 = '<div>bar</div><div>bazz</div><a href="foo">readmore</a>';
my $reg2 = qr!(?=.*?</div><a href="([^"]+)")<div>([^<]+)</div><div>([^<]+)</div>!;

if ($str2 =~ /$reg2/) {
print <<""
Two
title: $2
link: $1
descr: $3

}
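
Why this works: capture groups are numbered by the position of their opening parenthesis in the pattern, not by where the captured text sits in the string. A lookahead consumes no characters, so a group placed inside it at the front of the pattern becomes $1 even though the href appears last in the input. The sketch below puts both cases through a hypothetical extract() helper, which only stands in for the unmodifiable factory function (its name and return order are assumptions, not the real code).

```perl
use strict;
use warnings;

# Hypothetical stand-in for the factory function: it always treats
# $2 as the title, $1 as the link and $3 as the description.
sub extract {
    my ($content, $regex) = @_;
    return unless $content =~ m|$regex|;
    return ($2, $1, $3);
}

my $str1 = '<a href="foo">bar</a><div>bazz</div>';
my $reg1 = '<a href="([^"]+)">([^<]+)</a><div>([^<]+)<';

my $str2 = '<div>bar</div><div>bazz</div><a href="foo">readmore</a>';
# The group inside the lookahead opens first, so it is numbered 1.
my $reg2 = '(?=.*?</div><a href="([^"]+)")<div>([^<]+)</div><div>([^<]+)</div>';

my @one = extract($str1, $reg1);
my @two = extract($str2, $reg2);

print "One: @one\n";
print "Two: @two\n";
```

Here $reg2 is kept as a plain single-quoted string, as in the original post; a qr!! object interpolates into m|...| the same way.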
 
ioneabu

Todd said:
A factory function we have makes some stupid assumptions about the data it
is parsing. I give it content and a regex, and it gives me back an array.

Is there any way, for example, to tell capture 1 of a regex to store its
value in $2?

Here is the output of the program below.

[trwww@waveright misc]$ perl cap.pl
One:
title: bar
link: foo
descr: bazz
Two:
title: bazz
link: bar
descr: foo

Is there any way to make the output of "One:" identical to the output of
"Two:" by changing ONLY the string stored in $reg2?


use warnings;
use strict;

my $str1 = '<a href="foo">bar</a><div>bazz</div>';
my $reg1 = '<a href="([^"]+)">([^<]+)</a><div>([^<]+)<';

$str1 =~ m|$reg1|;

print("One:
title: $2
link: $1
descr: $3
");

my $str2 = '<div>bar</div><div>bazz</div><a href="foo">readmore</a>';

### modify only this regex
my $reg2 = '<div>([^<]+)</div><div>([^<]+)</div><a href="([^"]+)"';


$str2 =~ m|$reg2|;

Sorry if I'm not getting the problem, but it seems that this would do
it:

my ($title, $link, $descr) = ($1, $3, $2);

print("Two:
title: $title
link: $link
descr: $descr
");


wana
 
Todd W

Steven Kuo said:
A factory function we have makes some stupid assumptions about the data it
is parsing. I give it content and a regex, and it gives me back an array.

Is there any way, for example, to tell capture 1 of a regex to store its
value in $2?

Here is the output of the program below.

[trwww@waveright misc]$ perl cap.pl
One:
title: bar
link: foo
descr: bazz
Two:
title: bazz
link: bar
descr: foo

Is there any way to make the output of "One:" identical to the output of
"Two:" by changing ONLY the string stored in $reg2?

In general, no.

That's okay, considering what you provided below.
use warnings;
use strict;

my $str1 = '<a href="foo">bar</a><div>bazz</div>';
my $reg1 = '<a href="([^"]+)">([^<]+)</a><div>([^<]+)<';

$str1 =~ m|$reg1|;

print("One:
title: $2
link: $1
descr: $3
");

You really should check whether the match succeeded before printing
$1, etc.

It happens in the actual function. What I posted was just for demonstration.
my $str2 = '<div>bar</div><div>bazz</div><a href="foo">readmore</a>';

### modify only this regex
my $reg2 = '<div>([^<]+)</div><div>([^<]+)</div><a href="([^"]+)"';


$str2 =~ m|$reg2|;

print("Two:
title: $2
link: $1
descr: $3
");

In this specific case, you could try this (the lookahead pattern):

my $str2 = '<div>bar</div><div>bazz</div><a href="foo">readmore</a>';
my $reg2 = qr!(?=.*?</div><a href="([^"]+)")<div>([^<]+)</div><div>([^<]+)</div>!;

if ($str2 =~ /$reg2/) {
print <<""
Two
title: $2
link: $1
descr: $3

}

Yes it does. I've used lookahead to, for example, extract only links to the
/bin/ directory:

$string =~ m|<a href="(?=/bin/)([^"]+)">([^<]+)<|;

but I never thought to capture data inside the lookahead.
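
For comparison, a small sketch of the lookahead-as-filter use described above, run over two made-up links (the sample strings are assumptions for illustration). The (?=/bin/) assertion only checks what follows the opening quote; the capture group after it still grabs the whole href.

```perl
use strict;
use warnings;

# Illustrative inputs only.
my @links = (
    '<a href="/bin/ls">ls</a>',
    '<a href="/etc/passwd">passwd</a>',
);

my @kept;
for my $string (@links) {
    # The lookahead rejects any href that does not start with /bin/.
    if ($string =~ m|<a href="(?=/bin/)([^"]+)">([^<]+)<|) {
        push @kept, "$2 => $1";
    }
}

print "$_\n" for @kept;
```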

Thank you,

Todd W.
 
Todd W

Todd said:
A factory function we have makes some stupid assumptions about the data it
is parsing. I give it content and a regex, and it gives me back an array.

Is there any way, for example, to tell capture 1 of a regex to store its
value in $2?

Here is the output of the program below.

[trwww@waveright misc]$ perl cap.pl
One:
title: bar
link: foo
descr: bazz
Two:
title: bazz
link: bar
descr: foo

Is there any way to make the output of "One:" identical to the output of
"Two:" by changing ONLY the string stored in $reg2?


use warnings;
use strict;

my $str1 = '<a href="foo">bar</a><div>bazz</div>';
my $reg1 = '<a href="([^"]+)">([^<]+)</a><div>([^<]+)<';

$str1 =~ m|$reg1|;

print("One:
title: $2
link: $1
descr: $3
");

my $str2 = '<div>bar</div><div>bazz</div><a href="foo">readmore</a>';

### modify only this regex
my $reg2 = '<div>([^<]+)</div><div>([^<]+)</div><a href="([^"]+)"';


$str2 =~ m|$reg2|;

Sorry if I'm not getting the problem, but it seems that this would do
it:

my ($title, $link, $descr) = ($1, $3, $2);

print("Two:
title: $title
link: $link
descr: $descr
");


wana

The only solutions that will work for me are ones that involve changing ONLY
the $reg2 variable. Any other changes would require modification of a
function that I can not modify.

Someone else has posted a solution in another branch.

Thanks for replying, though =0)

Todd W.
 
