Using $1, $2 ... but don't know in which order

Tore Aursand

Hi!

I have a number of old Perl scripts that all do much the same job: they
connect to a (local) web server and retrieve something from it (plain
text, that is).

As I said, each script does the same job: parsing the text and
writing some "meaningful" data to a MySQL database.

Everything works just great, but I want to put everything these scripts do
into one script, as most of what they do is identical. I have created a
list of hashes which describe each resource I try to parse:

my %sources = (
    {
        'title'  => 'Server #1',
        'href'   => 'http://.../1/',
        'regexp' => '...',
    },
    {
        'title'  => 'Server #2',
        'href'   => 'http://.../1/',
        'regexp' => '...',
    },
    # etc...
);

Iterating through these resources;

foreach ( @sources ) {
    my $text = get( $_->{'href'} ); # LWP::Simple
    if ( defined $text && length $text ) {
        while ( $text =~ m,$_->{regexp},sig ) {
            my $foo = $1;
            my $bar = $2;
            # etc...
        }
    }
}

This works as expected for the majority of the files I download, but for
some I need to - hmm - match in a different order. Example: For most of
the sites it is suitable to set $foo = $1, but for some $foo should be $2
instead (or $3, whatever).

How should I deal with this in a sexy way? :)


--
Tore Aursand <[email protected]>
"Writing is a lot like sex. At first you do it because you like it.
Then you find yourself doing it for a few close friends and people you
like. But if you're any good at all, you end up doing it for money."
(Unknown)
 
J. Gleixner

Tore said:
Hi!

I have a number of old Perl scripts doing fairly the same job; They're
connecting to a (local) web server and retrieves something from it (plain
text, that is).

As I said, each script is doing the same job; Parsing the text, and
writing some "meaningful" data to a MySQL database.

Everything works just great, but I want to put everything these scripts do
into one script, as most of what they do is identical. I have created a
list of hashes which describe each resource I try to parse;

my %sources = (

Ahhhhh.. might want to cut & paste next time: that should be @sources,
not %sources.
{
'title' => 'Server #1',
'href' => 'http://.../1/',
'regexp' => '...',
},
{
'title' => 'Server #2',
'href' => 'http://.../1/',
'regexp' => '...',
},
# etc...
);

Iterating through these resources;

foreach ( @sources ) {
my $text = get( $_->{'href'} ); # LWP::Simple
if ( defined $text && length $text ) {
while ( $text =~ m,$_->{regexp},sig ) {
my $foo = $1;
my $bar = $2;
# etc...
}
}
}

This works as expected for the majority of the files I download, but for
some I need to - hmm - match in a different order. Example: For most of
the sites it is suitable to set $foo = $1, but for some $foo should be $2
instead (or $3, whatever).

How should I deal with this in a sexy way? :)

First, you blow in its ear... You haven't given us the criteria for
when $foo=$2. If it's based on the URL, then add an if:

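my ( $first, $second ) = ( $1, $2 );  # save the captures: the next successful match resets $1, $2
my $foo = ( $_->{href} =~ /criteria/ ) ? $second : $first;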

Whatever the reason, you have to have something make the decision.. if,
switch, m//, whatever.
 
Walter Roberson

:Everything works just great, but I want to put everything these scripts do
:into one script,

: {
: 'title' => 'Server #1',
: 'href' => 'http://.../1/',
: 'regexp' => '...',

It might make sense to use 'regexp' => qr/.../
so as to avoid recompiling the regexp on each step
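
A minimal sketch of that suggestion, for illustration only. The pattern, the
sample text and the 'text' key (standing in for what get() would return) are
made up here; the real patterns are elided in the original post.

```perl
use strict;
use warnings;

my @sources = (
    {
        title  => 'Server #1',
        text   => 'alpha=1 BETA=2',        # hypothetical stand-in for get( $href )
        regexp => qr/([a-z]+)=(\d+)/si,    # compiled once, right here
    },
);

my @rows;
foreach my $source (@sources) {
    my $text = $source->{text};

    # /g has to go on the match itself; /s and /i travel with the qr// object.
    while ( $text =~ /$source->{regexp}/g ) {
        push @rows, "$1 => $2";
    }
}
print "$_\n" for @rows;
```

Note that flags such as /s and /i are baked into the qr// object, so they need
to be given where the pattern is defined rather than on the m// in the loop.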


:Iterating through these resources;

: while ( $text =~ m,$_->{regexp},sig ) {
: my $foo = $1;
: my $bar = $2;

:This works as expected for the majority of the files I download, but for
:some I need to - hmm - match in a different order.

:How should I deal with this in a sexy way? :)

Instead of a regexp in the structure, put in an anonymous sub
that does the match and returns the elements in a consistent order.

while ( my ($foo, $bar) = $_->{extractsub} ) {
# etc
}
 
Walter Roberson

: while ( my ($foo, $bar) = $_->{extractsub} ) {
: # etc
: }

Oops, typo. I meant

my $sref = $_->{extractsub};

while ( my ($foo, $bar) = &$sref($text) ) {
# etc
}
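
What such an extractsub could look like is sketched below. The patterns, the
sample text and the 'text' key (standing in for get()) are assumptions for
illustration. One detail matters: the sub matches against $_[0] directly,
because @_ aliases the caller's variable and the /g position is stored on that
variable. If the sub copied its argument first, every call would restart from
the beginning and the loop would never end.

```perl
use strict;
use warnings;

# Hypothetical sources. Each extractsub returns ($foo, $bar) for the next
# match, or an empty list once the text is exhausted.
my @sources = (
    {
        title      => 'Server #1',
        text       => 'alpha=1 beta=2',     # stand-in for get( $href )
        extractsub => sub { $_[0] =~ /(\w+)=(\d+)/g ? ( $1, $2 ) : () },
    },
    {
        title      => 'Server #2',
        text       => '3:gamma 4:delta',
        # Same data, but this source lists the fields in the other order.
        extractsub => sub { $_[0] =~ /(\d+):(\w+)/g ? ( $2, $1 ) : () },
    },
);

my @rows;
foreach my $source (@sources) {
    my $text = $source->{text};
    my $sref = $source->{extractsub};

    # The list assignment yields the number of returned elements, so an
    # empty list ends the loop.
    while ( my ( $foo, $bar ) = $sref->($text) ) {
        push @rows, "$foo/$bar";
    }
}
print "$_\n" for @rows;
```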
 
Mark Clements

Tore said:
Hi!
Iterating through these resources;

foreach ( @sources ) {
my $text = get( $_->{'href'} ); # LWP::Simple
if ( defined $text && length $text ) {
while ( $text =~ m,$_->{regexp},sig ) {
my $foo = $1;
my $bar = $2;
# etc...
}
}
}

This works as expected for the majority of the files I download, but for
some I need to - hmm - match in a different order. Example: For most of
the sites it is suitable to set $foo = $1, but for some $foo should be $2
instead (or $3, whatever).

How should I deal with this in a sexy way? :)

From man perlop:

If the "/g" option is not used, "m//" in list
context returns a list consisting of the
subexpressions matched by the parentheses in the
pattern, i.e., ($1, $2, $3...).

so how about defining the match positions in the config data:

my @sources = (
    {
        'title'  => 'Server #1',
        'href'   => 'http://.../1/',
        'regexp' => '...',
        matchPositions => {
            foo => 1,
            bar => 2
        }
    },
    {
        'title'  => 'Server #2',
        'href'   => 'http://.../2/',
        'regexp' => '...',
        matchPositions => {
            foo => 2,
            bar => 4
        }
    },
);

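foreach (@sources) {
    my $text = get( $_->{'href'} ); # LWP::Simple
    my $matchPositions = $_->{matchPositions};
    if ( defined $text && length $text ) {
        # Match in scalar context so that /g steps through one match per
        # iteration; in list context it would return every match each time.
        while ( $text =~ m,$_->{regexp},sig ) {
            # Slot 0 is a placeholder so that positions line up with $1, $2, ...
            my @matches = ( undef, $1, $2, $3, $4 );
            my $foo = $matches[ $matchPositions->{foo} ];
            my $bar = $matches[ $matchPositions->{bar} ];

            # etc...
        }
    }
}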
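
If the number of capture groups differs between sources, the captures can also
be fetched by number through the @- and @+ offset arrays described in perlvar,
instead of listing $1, $2, ... by hand. A sketch under assumptions: the
patterns, the sample text and the 'text' key (standing in for get()) are made
up, and every group is assumed to take part in each match.

```perl
use strict;
use warnings;

my @sources = (
    {
        title          => 'Server #1',
        text           => 'alpha=1 beta=2',     # stand-in for get( $href )
        regexp         => qr/(\w+)=(\d+)/,
        matchPositions => { foo => 1, bar => 2 },
    },
    {
        title          => 'Server #2',
        text           => '3:gamma 4:delta',
        regexp         => qr/(\d+):(\w+)/,
        matchPositions => { foo => 2, bar => 1 },
    },
);

my @rows;
foreach my $source (@sources) {
    my $text      = $source->{text};
    my $positions = $source->{matchPositions};

    while ( $text =~ /$source->{regexp}/g ) {
        my %field;
        for my $name ( keys %$positions ) {
            my $n = $positions->{$name};

            # Capture group $n spans offsets $-[$n] up to $+[$n] in $text.
            $field{$name} = substr( $text, $-[$n], $+[$n] - $-[$n] );
        }
        push @rows, "$field{foo}/$field{bar}";
    }
}
print "$_\n" for @rows;
```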

regards,

Mark
 
Tore Aursand

from man perlop:

If the "/g" option is not used, "m//" in list
context returns a list consisting of the
subexpressions matched by the parentheses in the
pattern, i.e., ($1, $2, $3...).

Doh! I read this one, but I never saw that "returns a list". Of course,
that's the solution; "Telling" each source where in that list the data is.

Thanks!
 
Anno Siegel

Tore Aursand said:
Hi!

I have a number of old Perl scripts doing fairly the same job; They're
connecting to a (local) web server and retrieves something from it (plain
text, that is).

As I said, each script is doing the same job; Parsing the text, and
writing some "meaningful" data to a MySQL database.

Everything works just great, but I want to put everything these scripts do
into one script, as most of what they do is identical. I have created a
list of hashes which describe each resource I try to parse;

my %sources = (
{
'title' => 'Server #1',
'href' => 'http://.../1/',
'regexp' => '...',
},
{
'title' => 'Server #2',
'href' => 'http://.../1/',
'regexp' => '...',
},
# etc...
);

Iterating through these resources;

foreach ( @sources ) {
my $text = get( $_->{'href'} ); # LWP::Simple
if ( defined $text && length $text ) {
^^ (the &&)
Precedence problem here.
while ( $text =~ m,$_->{regexp},sig ) {
my $foo = $1;
my $bar = $2;
# etc...
}
}
}

This works as expected for the majority of the files I download, but for
some I need to - hmm - match in a different order. Example: For most of
the sites it is suitable to set $foo = $1, but for some $foo should be $2
instead (or $3, whatever).

How should I deal with this in a sexy way? :)

Include a permutation vector in the resource description that says what goes
where. Then use an array slice to assign the matches to your variables,
re-ordering as necessary. The pseudo-code below shows a modified while-loop
that deals with this situation. If no permutation is given, the matches
are assigned in natural order:

foreach ( @sources ) {
    my $text = get( $_->{'href'} ); # LWP::Simple
    if ( defined $text and length $text ) {
        while ( my @matches = $text =~ m,$_->{regexp},sig ) {
            my @perm = @{ $_->{ perm} || [ 0 .. $#matches]};
            my( $foo, $bar, ...) = @matches[ @perm];
        }
    }
}

Anno
 
Anno Siegel

[following up to myself]

while ( my @matches = $text =~ m,$_->{regexp},sig ) {

Ugh. It's not a good idea to put the match in list context here: with /g
it returns all the matches at once, every time round, so that would loop
forever. That detracts from the sexiness.

my @perm = @{ $_->{ perm} || [ 0 .. $#matches]};
my( $foo, $bar, ...) = @matches[ @perm];
}

Since the number of variables $foo, $bar, ... is known, a fix could be

while ( $text =~ m,$_->{regexp},sig ) {
my @matches = ( $1, $2, ...);
# (otherwise unchanged)

That detracts a bit from the sexiness too, but not as badly as not working.
You may think of a better fix in the context of your actual code, or
adapt one of the other suggestions.
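
Put together, the corrected permutation idea could be sketched as follows,
assuming two captures per match. The patterns, the sample text and the 'text'
key (standing in for get()) are invented for illustration.

```perl
use strict;
use warnings;

my @sources = (
    {
        title  => 'Server #1',
        text   => 'alpha=1 beta=2',     # stand-in for get( $href )
        regexp => qr/(\w+)=(\d+)/,
        # no perm: captures are used in natural order
    },
    {
        title  => 'Server #2',
        text   => '3:gamma 4:delta',
        regexp => qr/(\d+):(\w+)/,
        perm   => [ 1, 0 ],             # $foo comes from $2, $bar from $1
    },
);

my @rows;
foreach my $source (@sources) {
    my $text = $source->{text};

    # Scalar-context match, so /g advances one match per iteration.
    while ( $text =~ /$source->{regexp}/g ) {
        my @matches = ( $1, $2 );
        my @perm    = @{ $source->{perm} || [ 0 .. $#matches ] };
        my ( $foo, $bar ) = @matches[@perm];
        push @rows, "$foo/$bar";
    }
}
print "$_\n" for @rows;
```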

Anno
 
Tassilo v. Parseval

Also sprach Anno Siegel:
^^
Precedence problem here.

No, that's alright. Terms have the highest precedence in Perl and thus
'&&' and 'and' are interchangeable in this case.

Tassilo
 
Paul Lalli

^^
Precedence problem here.

Funny, I was going to say there was a redundancy problem here. After all,
if $text passes the length() test, won't it by definition also pass the
defined() test? That is, no undef value can ever have a non-zero length,
can it?

Paul Lalli
 
Richard Morse

Paul Lalli said:
Funny, I was going to say there was a redundancy problem here. After all,
if $text passes the length() test, won't it by definition also pass the
defined() test? That is, no undef value can ever have a non-zero length,
can it?

Yes, but if the code is running under warnings, trying to get length of
$text will print a warning about using an uninitialized value. Although
it's just a warning, and doesn't actually hurt the program, it is ugly.
So that's why you first test the definedness...
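
A small sketch of the guarded test, with a __WARN__ handler collecting anything
perl complains about. The undef $text stands for a failed get(), which returns
undef in LWP::Simple. As far as I know the unguarded form only warns on older
perls; from 5.12 on, length(undef) returns undef without a warning, so the
guard then mainly documents intent.

```perl
use strict;
use warnings;

# Collect warnings instead of printing them, so they can be counted.
my @warnings;
$SIG{__WARN__} = sub { push @warnings, @_ };

my $text;    # undef, as after a failed get()

# defined() is checked first, so length() never sees the undef value.
my $usable = ( defined $text && length $text ) ? 1 : 0;

print "usable: $usable, warnings: ", scalar(@warnings), "\n";
```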

Ricy
 
Tassilo v. Parseval

Also sprach Paul Lalli:
Funny, I was going to say there was a redundancy problem here. After all,
if $text passes the length() test, won't it by definition also pass the
defined() test? That is, no undef value can ever have a non-zero length,
can it?

The preliminary test for definedness might be there in order to avoid a
"Use of uninitialized value" warning. Functionally, it is not necessary.

Tassilo
 
Paul Lalli

Yes, but if the code is running under warnings, trying to get length of
$text will print a warning about using an uninitialized value. Although
it's just a warning, and doesn't actually hurt the program, it is ugly.
So that's why you first test the definedness...


Ahh, yes, I see that now. I hadn't thought of the warning issue. Thanks
for clarifying for me.

Paul Lalli
 
Ben Morrow

Quoth Walter Roberson:
: while ( my ($foo, $bar) = $_->{extractsub} ) {
: # etc
: }

Oops, typo. I meant

my $sref = $_->{extractsub};

while ( my ($foo, $bar) = &$sref($text) ) {
# etc
}

Or simply

while ( my ($foo, $bar) = $_->{extractsub}($text) ) {

Ben
 
Walter Roberson

:> Feep if you love VT-52's.

:But VT-52's don't have feepers. (sound of a '52 Chevy stripping its gears)

The Jargon dictionary says,

"The feeper on a VT-52 has been compared to the sound of a '52 Chevy
stripping its gears."

which is only possible if the VT-52 is considered to -have-
a feeper.
 
