malformatted substitution attempt

OllyP

_____________________________________________________________
Tue Jun 06 00:43:59 CDT 2006

I am not a programmer or an IT guy, but have an interest in learning
Perl. I am attempting to take a list of filepaths and rework them into
an html index so as to click them into a browser easily.
Perl, v5.8.3 Linux

The list format is:
/home/olly/aasys/perl/sandtr.html
(and on etc, for 80/90 files)

I ran:


#!/usr/bin/perl
#use warnings;
#use strict;
#use diagnostics;

use Tie::File;

tie @hot, 'Tie::File', "dead.html" or die;


for (@hot) {
s/^/<a href="file:/g;
}

for (@hot) {
s/$/">X<\/a><br>/g;
}

# At this point each line in the whole list appears similar to this:
# <a href="file:/home/olly/aasys/perl/sandtr.html">X</a><br>
# This is exactly what I want, and the X is clickable in the browser.

# However I want the filename to display instead of the X.

for (@hot =~ m/perl\/.+html/){
s/\X/\1/g;
}

# It does not displace the X with the filename.

print \1, "\n";
untie @hot, "\n";
----cut----

I have tried $&, $1 etc., but can't find the filename.
The best I can do is: SCALAR(0x8186eec), however it does not replace X.

The filename will appear with the match above if I do:

-----------
my $string="<a href=\"file:/home/olly/aasys/perl/sandtr.html\">X</a><br>";
if($string =~ m/perl\/.+html/)
{s/\X/$&/};
print $&, "\n";
print $string. "\n";
#Ollynotes: this works to grab $&, but the string is unchanged?
return:

perl/sandtr.html <-- This is $& as I wanted.
<a href="file:/home/olly/aasys/perl/sandtr.html">X</a><br>
But no change .................................. ^^ here.
--------
Obviously I am violating some basic principles but can't find it in the
reams of documentation I have. I do find simple substitution aplenty
but nothing about picking up a portion and tacking it back.

Where would I look for this type of substitution?
Is there a better procedure to approach my project?
Should I throw in the towel and become a street musician?

All replies will be thoroughly read and saved.

Thank you
 
John W. Krahn

OllyP said:
I am not a programmer or an IT guy, but have an interest in learning
Perl. I am attempting to take a list of filepaths and rework them into
an html index so as to click them into a browser easily.
Perl, v5.8.3 Linux

The list format is:
/home/olly/aasys/perl/sandtr.html
(and on etc, for 80/90 files)

I ran:


#!/usr/bin/perl
#use warnings;
#use strict;

You should uncomment the previous two lines.
#use diagnostics;

use Tie::File;

tie @hot, 'Tie::File', "dead.html" or die;


for (@hot) {
s/^/<a href="file:/g;
}

for (@hot) {
s/$/">X<\/a><br>/g;
}

# At this point each line in the whole list appears similar to this:
# <a href="file:/home/olly/aasys/perl/sandtr.html">X</a><br>
# This is exactly what I want, and the X is clickable in the browser.

# However I want the filename to display instead of the X.

Something like this should work (UNTESTED):

#!/usr/bin/perl
use warnings;
use strict;

use Tie::File;
use File::Basename;

tie my @hot, 'Tie::File', 'dead.html' or die "Cannot open 'dead.html' $!";

for my $path ( @hot ) {
next if $path =~ /^<a href=/; # skip if already converted.
my $file = basename $path;
$path = qq[<a href="file:$path">$file</a><br>];
}

__END__
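
The loop above can be tried without touching a real file. What follows is only a minimal sketch: the array @paths and its two sample lines are hypothetical stand-ins for the tied @hot and the contents of dead.html. It tests $path explicitly in the skip check, because a loop written as "for my $path" does not set $_.

```perl
use strict;
use warnings;
use File::Basename qw(basename);

# Hypothetical sample data standing in for the lines of dead.html.
my @paths = (
    '/home/olly/aasys/perl/sandtr.html',
    '<a href="file:/home/olly/aasys/perl/perlreg.html">perlreg.html</a><br>',
);

for my $path (@paths) {
    next if $path =~ /^<a href=/;    # already converted, leave it alone
    my $file = basename($path);      # last component of the path
    # $path is an alias for the array element, so this rewrites @paths in place.
    $path = qq[<a href="file:$path">$file</a><br>];
}

print "$_\n" for @paths;
```

With Tie::File the same in-place assignment is what writes the change back to the file, which is why the real script needs no explicit write step.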



John
 
Dr.Ruud

OllyP schreef:

I [...] have an interest in learning Perl.
[...]
#use warnings;
#use strict;
#use diagnostics;

Start by removing the first character of each of those 3 lines.
 
Mumia W.

OllyP said:
_____________________________________________________________
Tue Jun 06 00:43:59 CDT 2006

I am not a programmer or an IT guy, but have an interest in learning
Perl. I am attempting to take a list of filepaths and rework them into
an html index so as to click them into a browser easily.
Perl, v5.8.3 Linux

The list format is:
/home/olly/aasys/perl/sandtr.html
(and on etc, for 80/90 files)

I ran:


#!/usr/bin/perl
#use warnings;
#use strict;
#use diagnostics;

use Tie::File;
[...]

IMO, Tie::File complicates matters. Try something like this:

use strict;
use warnings;
use File::Basename qw(basename);
use File::Slurp qw(read_file write_file);
use English;

$ARG = read_file 'file-list.txt';
s/^.*$/<a href="$MATCH">@{[ basename $MATCH ]}<\/a><br>/mg;
write_file 'file-list.html', $ARG;


__END__

The 'm' option to the s/// operator allows it to match multiple lines in
a single string. See "man perlop."
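
To see what the 'm' option changes, here is a small sketch on a made-up three-line string (the variable names are only for illustration). The same substitution is run with and without 'm'.

```perl
use strict;
use warnings;

my $text = "one\ntwo\nthree\n";

# Without /m, ^ matches only at the very start of the string.
(my $without_m = $text) =~ s/^/> /g;

# With /m, ^ matches at the start of every line inside the string.
(my $with_m = $text) =~ s/^/> /mg;

print $without_m;
print "--\n";
print $with_m;
```

Only the first line is prefixed in the first result. Every line is prefixed in the second.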

HTH
 
Mumia W.

Mumia said:
OllyP said:
_____________________________________________________________
Tue Jun 06 00:43:59 CDT 2006

I am not a programmer or an IT guy, but have an interest in learning
Perl. I am attempting to take a list of filepaths and rework them into
an html index so as to click them into a browser easily.

$ARG = read_file 'file-list.txt';
s/^.*$/<a href="$MATCH">@{[ basename $MATCH ]}<\/a><br>/mg;
write_file 'file-list.html', $ARG;

Oops, I forgot that the source file was HTML with some stuff in it that
he wanted to keep. That substitution should be changed to this:

s/^[\/.\w\d-]+$/<a href="$MATCH">@{[ basename $MATCH ]}<\/a><br>/mg;

I was using English and File::Slurp in the code above.
 
OllyP

* Christian Winter said:
[..msg snipped..]
Just do it all in one iteration.
_____________________________________________________________
Tue Jun 06 05:47:03 CDT 2006

Thank you Chris, John, and Dr Ruud for looking at my post and
commenting.
for( @hot )
{
s/(.*)(perl.*)/<a href="file:$1$2">$2</a>/;
}

This should produce the required output. Note the $1 and $2
variables, which refer to the capturing parentheses inside
the match (see "perldoc perlvar" and "perldoc perlre" for
details).
I can understand the second part easy enough Chris, assuming you omitted
the <br> unintentionally but the first part I will have to hit those doc
pages on, as it looks like it would match the whole line. Keeping in mind
that I know very little in this area :cool: .
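
On whether the first part matches the whole line: it does, and that is the point. The pattern consumes the entire line, and the two pairs of parentheses remember which piece went where so the replacement can put them back. A minimal sketch on one sample path follows; the names $before, $from_perl and $link are made up for illustration, and the <br> is added on the assumption that it was wanted.

```perl
use strict;
use warnings;

my $line = '/home/olly/aasys/perl/sandtr.html';

# In list context a successful match returns the captured groups.
my ($before, $from_perl) = $line =~ /(.*)(perl.*)/;
print "group 1: $before\n";
print "group 2: $from_perl\n";

# The same pattern inside s///: the whole line is matched and replaced,
# and $1 / $2 put the captured pieces back where they are wanted.
(my $link = $line) =~ s/(.*)(perl.*)/<a href="file:$1$2">$2<\/a><br>/;
print "$link\n";
```

Because the first .* is greedy, group 2 starts at the last "perl" in the line. For a path such as /home/olly/aasys/perl/perlreg.html, that means group 2 would be just perlreg.html rather than perl/perlreg.html.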
If you want to do the insertion of the filename in a different
step, the following may help:

if( m#(perl/.+html)# )
{
my $substitute = $1; # Just for readability
s/X/$substitute/;
}

Note that I have replaced the pattern separator "/" in the above
example with "#", thus ridding me of the need to escape the
forward slash. This comes in handy when doing replacements on paths
or matching on closing html tags.
This "if" will likely cure my problem, and I learn in the bargain.
I had read about the "cure the toothpick syndrome".
[...]
if($string =~ m/perl\/.+html/)
{s/\X/$&/};

That won't work. In your replacement pattern $& is already filled
with the current match (this being "X"). And no need to escape the
X character either.
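
Two further details explain why $string stayed unchanged in that attempt. First, a bare s/// with no =~ in front of it operates on $_, not on $string. Second, \X is a regex escape of its own (it matches roughly "one character", an extended grapheme cluster) rather than a literal X. The following is a sketch of the two-step version with those points addressed. The name $name is made up, and matching ">X<" instead of a bare X is an assumption made here so that only the link text is replaced.

```perl
use strict;
use warnings;

my $string = '<a href="file:/home/olly/aasys/perl/sandtr.html">X</a><br>';

if ( $string =~ m#(perl/.+html)# ) {
    my $name = $1;               # save the capture before the next match overwrites it
    $string =~ s/>X</>$name</;   # bind to $string explicitly; X here is the link text
}

print "$string\n";
```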
I was beating a dead horse here, Chris, and of course could not make it
print a substitution. The escape on the X was an act of desperation that I
added in the throes of agony while trying to understand what was going
wrong. I fully intended to remove that before posting where people would
see it, arrgh. The same goes for the commented-out warnings/strict that
both John and Dr. Ruud admonished me about: I do understand the
importance of those, but I let them swoosh right by when proofreading.
Sometimes blindly stabbing at what you think it might be can work, though
of course never with munitions.

John wrote in part:
#!/usr/bin/perl
use warnings;
use strict;
use Tie::File;
use File::Basename;
tie my @hot, 'Tie::File', 'dead.html' or die "Cannot open 'dead.html' $!";
for my $path ( @hot ) {
next if $path =~ /^<a href=/; # skip if already converted.
my $file = basename $path;
$path = qq[<a href="file:$path">$file</a><br>];
}
I am unfamiliar with File::Basename, John, so I will do some reading there. I
had run out of ideas on places to look, as there is so much documentation.
Thank you for pointing me toward it.

Thanks to all of you
 
OllyP

* Mumia W. said:
IMO, Tie::File complicates matters. Try something like this:

use strict;
use warnings;
use File::Basename qw(basename);
use File::Slurp qw(read_file write_file);
use English;

$ARG = read_file 'file-list.txt';
s/^.*$/<a href="$MATCH">@{[ basename $MATCH ]}<\/a><br>/mg;
write_file 'file-list.html', $ARG;


__END__

The 'm' option to the s/// operator allows it to match multiple lines in
a single string. See "man perlop."
_____________________________________________________________
Tue Jun 06 08:11:41 CDT 2006

I am amazed Mumia, shock and awe.

I ran your code with only changing it to my filename 'dead.html' and
using your second post line:
s/^[\/.\w\d-]+$/<a href="$MATCH">@{[ basename $MATCH ]}<\/a><br>/mg;

and it produced:
<a href="/home/olly/aasys/perl/sandtr.html">sandtr.html</a><br>
<a href="/home/olly/aasys/perl/perlreg.html">perlreg.html</a><br>
<a href="/home/olly/aasys/perl/ch07_05.html">ch07_05.html</a><br>
and on etc.

This is exactly what you said, and is what I was trying to get perfectly.

I was convinced I was only going to get the prefix and suffix as I
can't see what matches the embedded filename even after I ran the code
and observed the results? It is a complete mystery to me how it finds
that filename. Unless you are some sort of wizard? I am going to study
the basename module, perhaps the key is there? At the moment I am
expecting my LED screen to burst into blue flames at any second.

Thank you
 
Mumia W.

OllyP said:
I am amazed Mumia, shock and awe.

Thanks.

I ran your code with only changing it to my filename 'dead.html' and
using your second post line:
s/^[\/.\w\d-]+$/<a href="$MATCH">@{[ basename $MATCH ]}<\/a><br>/mg;

and it produced:
<a href="/home/olly/aasys/perl/sandtr.html">sandtr.html</a><br>
<a href="/home/olly/aasys/perl/perlreg.html">perlreg.html</a><br>
<a href="/home/olly/aasys/perl/ch07_05.html">ch07_05.html</a><br>
and on etc.

This is exactly what you said, and is what I was trying to get perfectly.

I was convinced I was only going to get the prefix and suffix as I
can't see what matches the embedded filename even after I ran the code
and observed the results? It is a complete mystery to me how it finds
that filename.

I needed a regex that would only match file names--not HTML, and since
HTML tags tend to have angle brackets (<>) and quotes ('") in them, I
needed to choose a regex that didn't allow for those.

[\/.\w\d-]+ matches only strings made up entirely of slashes, periods, word
characters, digits and dashes. Putting that regex between ^ and $
requires it to match the entire line. That should exclude HTML from the
substitution.
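
The filtering effect can be seen on a small made-up document that mixes markup and bare paths. This is a sketch only: the sample text is hypothetical, and it uses plain captures with the 'e' option instead of English and File::Slurp so that it runs with core modules alone. \d is left out of the class because \w already covers digits.

```perl
use strict;
use warnings;
use File::Basename qw(basename);

# Hypothetical file contents: HTML lines mixed with bare paths.
my $text = join "\n",
    '<html><body>',
    '/home/olly/aasys/perl/sandtr.html',
    '<p>some existing markup</p>',
    '/home/olly/aasys/perl/ch07_05.html',
    '</body></html>',
    '';

# Only lines made up entirely of path characters are rewritten.
# /e evaluates the replacement as Perl code; /m makes ^ and $ work per line.
$text =~ s{^([/.\w-]+)$}{ my $p = $1; '<a href="' . $p . '">' . basename($p) . '</a><br>' }mge;

print $text;
```

The lines containing angle brackets fail the character class and pass through untouched, while the two path lines become links.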
Unless you are some sort of wizard? I am going to study
the basename module, perhaps the key is there? At the moment I am
expecting my LED screen to burst into blue flames at any second.

Thank you

You're welcome.

File::Basename::basename only extracts the basename from a fully
qualified filename, e.g, /usr/local/pixmaps/firefox.png -> firefox.png
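
A quick way to get a feel for the module is to run it on one path. This is a small sketch using the example path above; dirname and the optional suffix argument are also part of File::Basename.

```perl
use strict;
use warnings;
use File::Basename qw(basename dirname);

my $path = '/usr/local/pixmaps/firefox.png';

my $file = basename($path);            # last component of the path
my $dir  = dirname($path);             # everything before the last component
my $stem = basename($path, '.png');    # a given suffix is stripped from the result

print "$file\n$dir\n$stem\n";
```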
 
