backreferences in search pattern

Marek Stepanek

Hello all,


I am trying to set up a Perl filter (for BBEdit on the Macintosh). I want to
match and print all the GIFs needed for a rollover in an HTML file. The filter
should match

onmouseover="document.podium.src='../../pix/grafix/hili_podium.gif';

but also beasts like the following :

onmouseover="document.podium.src='../../pix/logos/theembassy1.gif';
document.bios.src='../../pix/logos/theembassy2.gif';
document.addresses.src='../../pix/logos/theembassy3.gif';
document.events.src='../../pix/logos/theembassy4.gif';
document.priwat.src='../../pix/logos/theembassy5.gif';
document.yrpodium.src='../../pix/logos/theembassy6.gif';
document.logo.src='../../pix/logos/hili_pilogo.gif'"


For the examples above, the output should be:

'../../pix/grafix/hili_podium.gif',
'../../pix/logos/theembassy1.gif',
'../../pix/logos/theembassy2.gif',
'../../pix/logos/theembassy3.gif',
'../../pix/logos/theembassy4.gif',
'../../pix/logos/theembassy5.gif',
'../../pix/logos/theembassy6.gif',
'../../pix/logos/hili_pilogo.gif',


I have been labouring over this filter for a long while already. Could
somebody help me out? I am a beginner and very much want to learn Perl.

This filter produces the following error:


Modification of a read-only value attempted <> line 1.


(mind line breaks from my email client)


#!/usr/bin/perl -w


while (<>) {
($1, $2, $3) =
m!onmouseover="[^']+'([^']+)'(?:(?:(?:;\s+[^']+)'([^']+)')+)?(?:(?:[^']+)'([
^']+)')?"!g;
my @results = ($1, $2, $3);
foreach $i (0..$#results) {
print "'", $i, "',\n";
}
}

My question is not about the search pattern itself, but about how to put the
backreferences into an array so that I can print them out.

Another attempt gives:

Use of uninitialized value in print <> line 2.


#!/usr/bin/perl -w

while (<>) {
while (
s!onmouseover="[^']+'([^']+)'(?:(?:(?:;\s+[^']+)'([^']+)')+)?(?:(?:[^']+)'(
[^']+)')?"!!g ) {print "'", $1, "',\n"; print "'", $2, "',\n"; print "'",
$3, "',\n";}
}

thank you



marek



--
______________________________________________________________________
___PODIUM_INTERNATIONAL_//_the_embassy_for_talented_young_musicians___
_______Marek_Stepanek__mstep_[at]_PodiumInternational_[dot]_org_______
__________________http://www.PodiumInternational.org__________________
______________________________________________________________________
 
Ben Morrow

Quoth (e-mail address removed):
The uninitialized value might have come from a capturing parentheses
pair that matched an empty string because it was inside the scope of a
"?" metacharacter and the "no occurrences" branch was taken. Prior to
the match, all the $1, $2 etc variables are set to undef, and are only
modified upon their corresponding set of parens actually being used in
the final match.

This is not true. The $n variables are only modified if the match
succeeded: if the match failed, they will still have the values from the
last *successful* match.
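
This is easy to check with a small sketch. The helper names and the `key=` pattern below are made up purely for illustration; the only claim is that a failed match leaves $1 as the previous successful match set it, which is why $1 should only be read after testing that the match succeeded.

```perl
use strict;
use warnings;

# Hypothetical helper: run two matches in the same scope, then read $1.
sub capture_after_second_match {
    my ($first, $second) = @_;
    $first  =~ /key=(\w+)/;   # expected to succeed and set $1
    $second =~ /key=(\w+)/;   # if this one fails, $1 is left untouched
    my $captured = $1;        # copy before leaving the scope
    return $captured;
}

# Safer habit: only read $1 when the match itself returned true.
sub capture_if_matched {
    my ($text) = @_;
    if ($text =~ /key=(\w+)/) {
        my $captured = $1;
        return $captured;
    }
    return;                   # undef in scalar context
}

print capture_after_second_match('key=alpha', 'nothing here'), "\n";   # alpha
print defined(scalar capture_if_matched('nothing here'))
    ? "matched\n" : "no match\n";                                      # no match
```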

Ben
 
Marek Stepanek

Thank you, Bob, for this long, patient and pedagogic reply :)


Marek Stepanek wrote:
...


Unless you control the layout of the HTML, you would be much better off
parsing your HTML with an HTML parser module (HTML::Parser, perhaps).
Correctly parsing HTML is much harder than it appears at first glance.

I just installed HTML::Parser, but before learning to use it I would like
to learn Perl itself first. And this module would not work as a filter (in
BBEdit), would it? And I think my regex was right for extracting the rollover
GIFs I was searching for, or am I wrong?


You are missing:

use strict;
use warnings;

Let Perl help you all it can. I see you did use the -w switch -- but
these days it is better to

use warnings;

because of the additional control over the warnings it affords. See
below for what I am talking about.

Thank you! I thought the -w switch did the same as "use warnings". And I
should use "use strict" systematically, especially as a beginner! (bad
conscience :)
($1, $2, $3) =
-^^--^^--^^

Those are your readonly variables that you are attempting to modify. See

perldoc perlvar

particularly the section about $<digits> . Note that the $1, $2, etc
variables are assigned by simply placing capturing sets of parentheses
in a regular expression. Judging from your next couple of lines of
code, you probably want:

my @results =

on this line, and then remove the "my @results=($1,$2,$3);" line.
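
To illustrate what a match returns in list context, here is a minimal sketch. The tiny `(a)(b)?` pattern and the helper `show_captures` are made up for the example and are not the pattern from this thread. The list holds one element per capturing group, and a group that took no part in the match shows up as undef, which is one way an "uninitialized value" warning can arise when the list is printed.

```perl
use strict;
use warnings;

# One match, two capture groups; the second group is optional.
my @with_both = 'ab' =~ /(a)(b)?/;   # ('a', 'b')
my @only_one  = 'a'  =~ /(a)(b)?/;   # ('a', undef)

# Hypothetical helper: render a capture list, showing undef explicitly.
sub show_captures {
    return join ', ', map { defined($_) ? "'$_'" : 'undef' } @_;
}

print show_captures(@with_both), "\n";   # 'a', 'b'
print show_captures(@only_one),  "\n";   # 'a', undef
```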

Like this?

#!/usr/bin/perl -w


use strict;
use warnings;


while (<>) {
my @results =
m!onmouseover="[^']+'([^']+)'(?:(?:(?:;\s+[^']+)'([^']+)')+)?(?:(?:[^']+)'([
^']+)')?"!g;
foreach $i (0..$#results) {
print "'", $i, "',\n";
}
}

but this gives me a funny result :

'0',
'1',
'2',
'3',
'4',
'5',
'6',
'7',
'8',
'9',
'10',
'11', etc etc


I suppose there is something wrong with "scalar context" and the array
@results ... I would be very grateful if somebody could correct the
version above!


greetings from Munich


marek




 
Eric Amick

Marek Stepanek, Fri20041112@20:42:43(CET):
while (<>) {
my @results =
m!onmouseover="[^']+'([^']+)'(?:(?:(?:;\s+[^']+)'([^']+)')+)?(?:(?:[^']+)'([
^']+)')?"!g;
foreach $i (0..$#results) {
print "'", $i, "',\n";
}
}

but this gives me a funny result :

'0',
'1',
'2',

I think you want to print $results[$i] instead.

Better still

foreach $i (@results) {
print "'$i'\n";
}

or even

print "'$_'\n" for @results;
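
Putting the pieces together, a rough sketch of the extraction might look like the following. It deliberately swaps in a simpler two-step pattern instead of the regex from the thread, and it assumes the whole HTML text is in one string, the attribute is closed with a double quote, and every image path is single-quoted and ends in .gif. The name `rollover_gifs` is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: return every single-quoted .gif path found inside
# onmouseover="..." attributes of the given HTML text.
sub rollover_gifs {
    my ($html) = @_;
    my @gifs;
    # [^"]* also matches newlines, so an attribute that spans several
    # lines is handled as long as the text is passed in as one string.
    while ($html =~ /onmouseover="([^"]*)"/g) {
        my $handler = $1;
        # In list context, /g returns all the captured paths at once.
        push @gifs, $handler =~ /'([^']+\.gif)'/g;
    }
    return @gifs;
}

my $html = <<'HTML';
<a onmouseover="document.podium.src='../../pix/logos/theembassy1.gif';
document.bios.src='../../pix/logos/theembassy2.gif'">
HTML

print "'$_',\n" for rollover_gifs($html);
```

Because the loop in the thread reads one line at a time, a multi-line attribute would first have to be read in as a single string for this approach to see it.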
 
Marek Stepanek

Thanks a lot, I am blushing; that was too silly a question :)


May I ask one more silly question? I would like to remove the duplicates from
this search result and put it in alphabetical order. Is it possible to combine
two scripts into one, with the second working on the result of the first? I
was experimenting with something like the following:


#!/usr/bin/perl -w

use strict;
use warnings;


while (<>) {
no warnings qw(uninitialized);
my @results =
m!onmouseover="[^']+'([^']+)'(?:(?:(?:;\s+[^']+)'([^']+)')+)?(?:(?:[^']+)'([
^']+)')?"!g;
foreach my $i (0..$#results) {
print "'$results[$i]',\n";
my %seen;
while (<>) {
$seen{$_}++;
}
print sort keys %seen;
}
}

This attempt does not work either:


#!/usr/bin/perl -w

use strict;
use warnings;

while (<>) {
no warnings qw(uninitialized);
my @results =
m!onmouseover="[^']+'([^']+)'(?:(?:(?:;\s+[^']+)'([^']+)')+)?(?:(?:[^']+)'([
^']+)')?"!g;
foreach my $i (0..$#results) {
print "'$results[$i]',\n";
}
}


while (<>) {
my %seen;
while (<>) {
$seen{$_}++;
}
print sort keys %seen;
}


Thank you for all your help



marek



 
Brian McCauley

You are doing random things without understanding what you are doing.
You can't just cut and paste fragments of code together and expect them
to figure out how you want them to interact. You need to step back and
learn a few basics of computer programming.

The Perl <> operator reads from the process's input and print() sends
stuff to the process's output. If you have two Perl programs written to
process from input stream to output stream and you want to combine them,
you either have to do some advanced trickery to redirect input and
output, or you need to actually combine them so that you read from the
input, do two things, then write to the output.
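
The second option (read the input once, do both things, then write) could be sketched roughly as follows. The helper name `unique_sorted_gifs` and the simplified pattern are assumptions for illustration only, and an in-memory filehandle stands in for the real input.

```perl
use strict;
use warnings;

# Hypothetical helper: read every line from a filehandle, collect the
# quoted .gif paths, and return them de-duplicated and sorted.
sub unique_sorted_gifs {
    my ($fh) = @_;
    my %seen;
    while (my $line = <$fh>) {
        # Thing 1: extract (a simplified pattern, not the one from the thread).
        # Thing 2: remember each path once by using it as a hash key.
        $seen{$_}++ for $line =~ /'([^']+\.gif)'/g;
    }
    # Only after all the input has been read: sort and hand back the result.
    return sort keys %seen;
}

# An in-memory filehandle stands in for the real input here.
my $sample = "a.src='pix/b.gif';\nb.src='pix/a.gif';\nc.src='pix/b.gif'\"\n";
open my $in, '<', \$sample or die "cannot open in-memory file: $!";
print "'$_',\n" for unique_sorted_gifs($in);
close $in;
```

In a real filter the same sub could be handed \*STDIN instead of the in-memory handle.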
 
Marek Stepanek

I see I have abused your patience. Is there perhaps a beginners' Perl group
where I can get some help and nobody shouts at my silly questions?

If somebody could point me in the right direction for the "advanced
trickery", I will find the rest myself. Is this the right direction?


#!/usr/bin/perl -w

use strict;
use warnings;

my (@results02, %seen);

while (<>) {
no warnings qw(uninitialized);
my @results =
m!onmouseover="[^']+'([^']+)'(?:(?:(?:;\s+[^']+)'([^']+)')+)?(?:(?:[^']+)'([
^']+)')?"!g;

foreach my $i (0..$#results) {
@results02 = "'$results[$i]',\n";
}
for (@results02) {
$seen{$_}++;
}
print sort keys %seen;
}
 
Tad McClellan

use warnings;

Good!


no warnings qw(uninitialized);


Bad!

Do you turn the radio volume up to "fix" the funny noise that
your car is making?

You should try and understand why you are getting a warning rather
than ignore it. The warning is trying to tell you something...
listen to it.

my @results =
m!onmouseover="[^']+'([^']+)'(?:(?:(?:;\s+[^']+)'([^']+)')+)?(?:(?:[^']+)'([
^']+)')?"!g;


That regex is too horrid for me to look at, you need a m!!gx instead...
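
As a rough idea of what the /x modifier buys you, here is a deliberately simplified pattern (not a rewrite of the one above) spread over several lines with comments. The sub name `quoted_paths` is made up, and the sketch assumes each path follows a `.src=` assignment and is single-quoted.

```perl
use strict;
use warnings;

# Hypothetical helper using a simplified pattern laid out with /x.
sub quoted_paths {
    my ($text) = @_;
    return $text =~ m{
        \. src \s* = \s*    # the ".src =" part of the assignment
        '                   # opening single quote
        ( [^']+ )           # capture: everything up to the next quote
        '                   # closing single quote
    }gx;
}

print "$_\n" for quoted_paths("document.a.src='x.gif'; document.b.src = 'y.gif'");
```

Under /x, whitespace in the pattern is ignored and # starts a comment, so a literal space or # has to be escaped or put inside a character class.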

foreach my $i (0..$#results) {
@results02 = "'$results[$i]',\n";
}


When you find yourself doing explicit indexing in Perl, you should
pause and rethink, because you don't often need to do that.

You can loop over all of the elements without any explicit indexing,
leaving that part to Perl (and a machine is much less likely to
make an indexing mistake than a human is).

foreach my $result ( @results ) {


Ask yourself these questions:

What will @results contain if the match *fails* ? (A: the empty list)

What will $#results return in that case? (A: -1)

How many iterations will your foreach() above go thru? (A: 0)


ie. The body of the loop is never entered, and the for() below
is never executed.
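
Those three answers can be verified with a short sketch. The sub name `match_report` and the simplified pattern are only for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: report what a list-context match leaves behind.
# Returns (element count, last index, number of loop iterations).
sub match_report {
    my ($text) = @_;
    my @results = $text =~ /'([^']+\.gif)'/g;
    my $iterations = 0;
    $iterations++ for 0 .. $#results;   # 0 .. -1 is an empty range
    return (scalar @results, $#results, $iterations);
}

print join(', ', match_report('no images on this line')), "\n";     # 0, -1, 0
print join(', ', match_report("src='a.gif'; src='b.gif'")), "\n";   # 2, 1, 2
```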


Did you try executing this code, or did you just type it in?

for (@results02) {

@results02 can never have more than a single element in it, because
you stomp over it each time thru the outer loop.

This loop (would) iterate no more than 1 time, so there is no
need for a loop at all...
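
The difference between assigning to the array and pushing onto it can be seen in a small sketch. Both sub names are hypothetical, and the formatting string is copied from the code above.

```perl
use strict;
use warnings;

# Assigning to the array replaces its whole contents on every pass,
# so only the last formatted item survives.
sub overwrite_each_time {
    my @formatted;
    foreach my $item (@_) {
        @formatted = "'$item',\n";       # replaces everything stored so far
    }
    return @formatted;
}

# push adds one element per item, so nothing is lost.
sub append_each_time {
    my @formatted;
    foreach my $item (@_) {
        push @formatted, "'$item',\n";
    }
    return @formatted;
}

my @paths    = ('one.gif', 'two.gif', 'three.gif');
my @replaced = overwrite_each_time(@paths);
my @appended = append_each_time(@paths);
print scalar(@replaced), "\n";   # 1
print scalar(@appended), "\n";   # 3
```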



If you post a short and complete program *that we can run* that
illustrates your problem, then we could surely help you fix it.

Have you seen the Posting Guidelines that are posted here frequently?
 
Brian McCauley

Tad said:

Yes in this case. I don't think that you should give the impression
that switching off warnings is always bad. Particularly in the case of
'uninitialized' one can often produce code that is both more efficient
and more readable by disabling this particular warning:

my $record = do {
no warnings qw(uninitialized);
join ':', @fields;
};

my $record = join ':', map { defined() ? $_ : '' } @fields;

You should try and understand why you are getting a warning rather
than ignore it.

That is true. But you may give some people the impression that you
think understanding the cause and choosing to ignore a warning are
mutually exclusive. Indeed understanding the cause should be considered
a prerequisite of making the decision to ignore a warning.

The warning is trying to tell you something... listen to it.

I agree completely.
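
For anyone who wants to see how narrowly "no warnings" can be scoped, here is a minimal sketch. The sub name `join_counting_warnings` is made up, and a $SIG{__WARN__} handler is used only so that the warnings can be counted instead of printed.

```perl
use strict;
use warnings;

# Hypothetical helper: join the fields and count the warnings raised.
sub join_counting_warnings {
    my ($quiet, @fields) = @_;
    my $warnings = 0;
    local $SIG{__WARN__} = sub { $warnings++ };   # count instead of printing
    my $record;
    if ($quiet) {
        no warnings qw(uninitialized);            # applies to this block only
        $record = join ':', @fields;
    }
    else {
        $record = join ':', @fields;              # warnings still enabled here
    }
    return ($record, $warnings);
}

my ($loud_record,  $loud_count)  = join_counting_warnings(0, 'a', undef, 'c');
my ($quiet_record, $quiet_count) = join_counting_warnings(1, 'a', undef, 'c');
print "$loud_record ($loud_count warnings)\n";
print "$quiet_record ($quiet_count warnings)\n";   # a::c (0 warnings)
```

Both calls build the same record; the only difference is whether the undefined field is reported, and the silence is limited to the one block where it was a deliberate choice.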
 
