simple web site mapper

Dan Jacobson

Say, does this working simple web site mapper (vs. Eric Raymond's
extra large version) have any stuffing hanging out that perl novice me
ought to fix? Say, how does one do '/bin/sh -e' in perl? Would
rewriting it in python be as easy? Does perl have an internal "ls"
that could be called as easily? File::Find couldn't give me the ls -R
order I prefer, I suppose. Goal: to use even fewer lines of code.

use strict;
require HTML::HeadParser;
my $dir=<~/mywebsite>; #where files are on my computer
my $name="Bob Blobkowski";
my $url='http://blobkowski.org/';
chdir $dir||die;
print <<EOF;
<!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd"><HTML><TITLE>$name\'s site map</TITLE>
</HEAD><BODY><H1>$name\'s site map</H1>\n<P><A href="$url">$url</A> as of
EOF
system "date"; print '</P><HR><PRE>';
my $p = HTML::HeadParser->new; my $d;
open LS, "ls -R|"||die;
while(<LS>){ #order nicer than find(1)
chomp;
if(s@:$@@){$d=$_;next}
if(/\.(txt|html)$/){s@^@$d/@;s/..//;print "<A href=\"$_\">$_</A>\n";
if(/\.html$/){$p->parse_file($_);print "\t",$p->header('Title'),"\n";}}}
print "</PRE></BODY></HTML>";
 
David K. Wall

Dan Jacobson said:
> Say, does this working simple web site mapper (vs. Eric Raymond's
> extra large version) have any stuffing hanging out that perl
> novice me ought to fix?

My comments below are about style, because that's what really got my
attention.

> Say, how does one do '/bin/sh -e' in
> perl? Would rewriting it in python be as easy?

I don't know much about Python; I've only played with it a bit. But
it would *force* you to use whitespace, which might not be a bad
thing. :) If you're interested, grab a copy and try it:
http://www.python.org

> Does perl have an
> internal "ls" that could be called as easily? File::Find couldn't
> give me the ls -R order I prefer, I suppose. Goal: to use even
> fewer lines of code.

Forget using fewer lines of code unless it makes the program more
reliable and/or readable. If I had to revise the code below, the very
first thing I would do would be to format it in a way that aids
comprehension instead of hindering it.

Ok, so it's not a long, complicated program. But even short programs
can benefit from a readable style. See 'perldoc perlstyle' for
suggestions.

> use strict;
> require HTML::HeadParser;
> my $dir=<~/mywebsite>; #where files are on my computer
> my $name="Bob Blobkowski";
> my $url='http://blobkowski.org/';
> chdir $dir||die;

There's a low-precedence 'or' that you can use, too.

chdir $dir or die "Cannot chdir to $dir: $!";

> print <<EOF;
> <!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.01//EN"
> "http://www.w3.org/TR/html4/strict.dtd"><HTML><TITLE>$name\'s site map</TITLE>

It's not necessary to escape that single-quote after $name (the \'
in $name\'s).

> </HEAD><BODY><H1>$name\'s site map</H1>\n<P><A href="$url">$url</A> as of
> EOF
> system "date"; print '</P><HR><PRE>';
> my $p = HTML::HeadParser->new; my $d;

$d? What's $d? I can figure it out from the code, but a more
descriptive name would be useful. If someone else needs to edit the
code they'll appreciate a longer name -- and you might, too, when you
come back to this program after a while and don't remember all the
details.

Newlines and spaces are not a scarce resource.

> open LS, "ls -R|"||die;
> while(<LS>){ #order nicer than find(1)
> chomp;
> if(s@:$@@){$d=$_;next}

Why use a non-standard delimiter for the substitution operator when
you don't need to? s/:$// is pretty clear; s@:$@@ is slightly
obfuscated, IMHO.

> if(/\.(txt|html)$/){s@^@$d/@;s/..//;print "<A href=\"$_\">$_</A>\n";

qq() would make that easier to read; you wouldn't need to escape the
double-quote each time. See 'perldoc perlop' for quote and
quote-like operators.
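
A minimal sketch of the qq() suggestion, assuming a made-up path held in a variable called $path (not a name from the original program). With qq() as the delimiter, the double quotes inside the HTML attribute need no backslashes, while $path is still interpolated:

```perl
use strict;
use warnings;

# Hypothetical file path, for illustration only.
my $path = "docs/index.html";

# qq(...) behaves like a double-quoted string, so $path and \n are
# interpolated, but the embedded double quotes can stay unescaped.
my $link = qq(<A href="$path">$path</A>\n);

print $link;
```

The parentheses are just one choice of delimiter; any pair that does not clash with the string's contents would do.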

> if(/\.html$/){$p->parse_file($_);print "\t",$p->header('Title'),"\n";}}}
> print "</PRE></BODY></HTML>";

Just because you *can* eliminate whitespace doesn't mean you
*should*.

That may be why no-one else has responded. (No-one had when I posted
this, anyway)

As for bugs, I haven't looked -- I was distracted too much by the
style.
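
For illustration, here is one way the inner loop might look with the style points above applied: a descriptive name instead of $d, the standard delimiter for s/:$//, qq() for the link, and some whitespace. This is only a sketch, not the original poster's code. It runs over a hard-coded sample shaped like `ls -R` output (the file names are invented) instead of calling ls, and the names @ls_lines, $current_dir and @links are assumptions made for the example.

```perl
use strict;
use warnings;

# Sample lines in the shape `ls -R` prints; the file names are made up.
my @ls_lines = (
    ".:", "index.html", "notes.txt", "pics", "",
    "./pics:", "cat.jpg", "about.html",
);

my $current_dir = ".";
my @links;

for my $line (@ls_lines) {
    # A line ending in ":" is a directory header; remember it and move on.
    if ($line =~ s/:$//) {
        $current_dir = $line;
        next;
    }

    # Only .txt and .html files go into the site map.
    next unless $line =~ /\.(?:txt|html)$/;

    # Build the relative path and strip the leading "./".
    # (Braces are used as the delimiter here because the pattern contains a slash.)
    (my $path = "$current_dir/$line") =~ s{^\./}{};

    push @links, qq(<A href="$path">$path</A>);
}

print "$_\n" for @links;
```

It is longer than the original, but each step can be read on its own line.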
 
Steve Grazzini

Have a look at File::Find's "preprocess" option.

% perldoc File::Find
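
A rough sketch of what the "preprocess" option can do, under some assumptions: it builds a small throwaway tree with File::Temp (the file names are invented) so that it can run anywhere, and it sorts each directory's entries by name before File::Find visits them. Whether the resulting order matches `ls -R` closely enough for the site map is something to check against the real site; the sketch only shows the mechanism.

```perl
use strict;
use warnings;
use File::Find;
use File::Spec;
use File::Temp qw(tempdir);

# Build a tiny temporary tree so the example is self-contained.
my $root = tempdir(CLEANUP => 1);
mkdir(File::Spec->catdir($root, "sub")) or die "mkdir failed: $!";
for my $rel ("b.txt", "a.html", File::Spec->catfile("sub", "c.txt")) {
    open(my $fh, '>', File::Spec->catfile($root, $rel))
        or die "Cannot create $rel: $!";
    close $fh;
}

my @seen;
find(
    {
        # preprocess gets the entries of the current directory and
        # returns the list (here: sorted) that find() should process.
        preprocess => sub { sort @_ },

        # wanted is called for each entry; keep only .txt/.html files.
        wanted => sub {
            return unless -f $_ && /\.(?:txt|html)$/;
            push @seen, File::Spec->abs2rel($File::Find::name, $root);
        },
    },
    $root
);

print "$_\n" for @seen;
```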

> There's a low-precedence 'or' that you can use, too.
>
> chdir $dir or die "Cannot chdir to $dir: $!";

He'll *need* to use the low-precedence version or add some
parentheses.

Same thing here:

% perl -MO=Deparse -e 'open LS, "ls -R|" || die'
open LS, 'ls -R|';
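
The same point can be seen at run time. The following is a small sketch, assuming a directory path that does not exist (the variable $missing and the helper died() are made up for the demonstration). With ||, the die is attached to the string argument, which is always true, so a failed chdir goes unnoticed; with 'or', the die applies to the result of chdir itself.

```perl
use strict;
use warnings;

# Hypothetical path that should not exist; used only to force a failure.
my $missing = "/nonexistent/dir/for/demo";

# Helper for the demo: run a code block and report whether it died.
sub died {
    my ($code) = @_;
    my $ok = eval { $code->(); 1 };
    return $ok ? 0 : 1;
}

# Parsed as chdir($missing || die ...): $missing is a true string,
# so die is never reached and the failed chdir is silently ignored.
my $died_high = died(sub { chdir $missing || die "never reached\n" });

# Parsed as chdir($missing) or die ...: the failure is caught.
my $died_low = died(sub { chdir $missing or die "Cannot chdir to $missing: $!\n" });

print "with ||: ", ($died_high ? "died" : "failure ignored"), "\n";
print "with or: ", ($died_low  ? "died" : "failure ignored"), "\n";
```

Adding parentheses, as in chdir($missing) || die ..., would behave like the 'or' form.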
 
Jeff 'japhy' Pinyan

> It's not necessary to escape that single-quote after $name.

Yes it is. $name's is pre-Perl 5 syntax for $name::s. It's something
that bites many Perl programmers from time to time.
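
A short sketch of two spellings that keep the apostrophe from being read as part of the variable name. It reuses $name from the original program; the variables $escaped and $braced are made up for the example. The unescaped form is deliberately left out, since it is the one being warned about.

```perl
use strict;
use warnings;

my $name = "Bob Blobkowski";

# Backslash before the apostrophe, as in the original here-doc.
my $escaped = "$name\'s site map";

# Braces around the variable name mark where the name ends.
my $braced = "${name}'s site map";

print "$escaped\n";
print "$braced\n";
```

Both forms work the same way inside a here-doc introduced with <<EOF, since that interpolates like a double-quoted string.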
 
David K. Wall

Jeff 'japhy' Pinyan said:
> Yes it is. $name's is pre-Perl 5 syntax for $name::s. It's
> something that bites many Perl programmers from time to time.

Damn. I knew about the pre-Perl 5 syntax, but a quick
copy/paste/edit/run of code from the OP convinced me that it didn't
matter any more. But there are *two* places in the here-doc that use
an escaped quote. I edited one and saw the output from the other.
 
