simple web site mapper

Discussion in 'Perl Misc' started by Dan Jacobson, Jul 24, 2003.

  1. Dan Jacobson

    Dan Jacobson Guest

    Say, does this working simple web site mapper (vs. Eric Raymond's
    extra large version) have any stuffing hanging out that perl novice me
    ought to fix? Say, how does one do '/bin/sh -e' in perl? Would
    rewriting it in python be as easy? Does perl have an internal "ls"
    that could be called as easy? File::Find couldn't give me the ls -R
    order I prefer I suppose. Goal: to use even less lines of code.

    use strict;
    require HTML::HeadParser;
    my $dir=<~/mywebsite>; #where files are on my computer
    my $name="Bob Blobkowski";
    my $url='http://blobkowski.org/';
    chdir $dir||die;
    print <<EOF;
    <!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.01//EN"
    "http://www.w3.org/TR/html4/strict.dtd"><HTML><TITLE>$name\'s site map</TITLE>
    </HEAD><BODY><H1>$name\'s site map</H1>\n<P><A href="$url">$url</A> as of
    EOF
    system "date"; print '</P><HR><PRE>';
    my $p = HTML::HeadParser->new; my $d;
    open LS, "ls -R|"||die;
    while(<LS>){ #order nicer than find(1)
    chomp;
    if(s@:$@@){$d=$_;next}
    if(/\.(txt|html)$/){s@^@$d/@;s/..//;print "<A href=\"$_\">$_</A>\n";
    if(/\.html$/){$p->parse_file($_);print "\t",$p->header('Title'),"\n";}}}
    print "</PRE></BODY></HTML>";
     
    Dan Jacobson, Jul 24, 2003
    #1

  2. Dan Jacobson wrote:

    > Say, does this working simple web site mapper (vs. Eric Raymond's
    > extra large version) have any stuffing hanging out that perl
    > novice me ought to fix?


    My comments below are about style, because that's what really got my
    attention.

    > Say, how does one do '/bin/sh -e' in
    > perl? Would rewriting it in python be as easy?


    I don't know much about Python, I've only played with it a bit. But
    it would *force* you to use whitespace, which might not be a bad
    thing. :) If you're interested, grab a copy and try it:
    http://www.python.org

    > Does perl have an
    > internal "ls" that could be called as easy? File::Find couldn't
    > give me the ls -R order I prefer I suppose. Goal: to use even
    > less lines of code.


    Forget using fewer lines of code unless it makes the program more
    reliable and/or readable. If I had to revise the code below, the very
    first thing I would do is format it in a way that aids comprehension
    instead of hindering it.

    Ok, so it's not a long, complicated program. But even short programs
    can benefit from a readable style. See 'perldoc perlstyle' for
    suggestions.

    > use strict;
    > require HTML::HeadParser;
    > my $dir=<~/mywebsite>; #where files are on my computer
    > my $name="Bob Blobkowski";
    > my $url='http://blobkowski.org/';
    > chdir $dir||die;


    There's a low-precedence 'or' that you can use, too.

    chdir $dir or die "Cannot chdir to $dir: $!";


    > print <<EOF;
    ><!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.01//EN"
    > "http://www.w3.org/TR/html4/strict.dtd"><HTML><TITLE>$name\'s site

    It's not necessary to escape that single-quote after $name.

    > map</TITLE>
    ></HEAD><BODY><H1>$name\'s site map</H1>\n<P><A href="$url">$url</A>
    >as of
    > EOF
    > system "date"; print '</P><HR><PRE>';
    > my $p = HTML::HeadParser->new; my $d;


    $d? What's $d? I can figure it out from the code, but a more
    descriptive name would be useful. If someone else needs to edit the
    code they'll appreciate a longer name -- and you might, too, when you
    come back to this program after a while and don't remember all the
    details.

    Newlines and spaces are not a scarce resource.

    > open LS, "ls -R|"||die;
    > while(<LS>){ #order nicer than find(1)
    > chomp;
    > if(s@:$@@){$d=$_;next}


    Why use a non-standard delimiter for the substitution operator when
    you don't need to? s/:$// is pretty clear; s@:$@@ is slightly
    obfuscated, IMHO.

    > if(/\.(txt|html)$/){s@^@$d/@;s/..//;print "<A
    > href=\"$_\">$_</A>\n";


    qq() would make that easier to read; you wouldn't need to escape the
    double-quote each time. See 'perldoc perlop' for quote and quote-
    like operators.
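A minimal sketch of the qq() suggestion above. The value of `$path` is made up for illustration. Both strings come out identical, but the qq() form needs no backslashes:

```perl
use strict;
use warnings;

my $path = "docs/index.html";    # hypothetical file name

# Double-quoted string: every embedded " has to be escaped.
my $escaped = "<A href=\"$path\">$path</A>\n";

# qq() interpolates like double quotes but uses () as delimiters,
# so the embedded " can stay as they are.
my $with_qq = qq(<A href="$path">$path</A>\n);

print $with_qq;
print "identical\n" if $escaped eq $with_qq;
```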

    > if(/\.html$/){$p->parse_file($_);print
    > "\t",$p->header('Title'),"\n";}}}
    > print "</PRE></BODY></HTML>";
    >


    Just because you *can* eliminate whitespace doesn't mean you
    *should*.
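To make the point concrete, here is one way the main loop might look with descriptive names and some whitespace. It is a sketch, not the OP's program: the `ls -R` output is inlined as sample data, the names `@ls_lines`, `$current_dir` and `@links` are invented here, and the HTML::HeadParser title lookup is left out so the snippet runs with core Perl only.

```perl
use strict;
use warnings;

# Sample `ls -R` output, inlined so the sketch runs without a real site tree.
my @ls_lines = split /\n/, <<'EOF';
.:
index.html
notes.txt
pics

./pics:
cat.jpg
index.html
EOF

my $current_dir = '.';
my @links;

for my $line (@ls_lines) {

    # A line ending in ':' names the directory the following entries live in.
    if ($line =~ s/:$//) {
        $current_dir = $line;
        next;
    }

    # Only .txt and .html files go into the site map.
    next unless $line =~ /\.(txt|html)$/;

    my $path = "$current_dir/$line";
    $path =~ s{^\./}{};    # drop the leading "./"

    push @links, qq(<A href="$path">$path</A>);
}

print "$_\n" for @links;
```

It is longer than the original, but each step can be read on its own.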

    That may be why no-one else has responded. (No-one had when I posted
    this, anyway)

    As for bugs, I haven't looked -- I was distracted too much by the
    style.

    --
    David Wall
     
    David K. Wall, Jul 24, 2003
    #2

  3. David K. Wall wrote:
    > Dan Jacobson wrote:
    >
    > > Does perl have an internal "ls" that could be called as easy?
    > > File::Find couldn't give me the ls -R order I prefer I suppose.


    Have a look at File::Find's "preprocess" option.

    % perldoc File::Find
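A rough sketch of how the `preprocess` option could be used to control the visiting order. It builds a throwaway tree in a temporary directory (the file names are hypothetical) and sorts each directory's entries so that plain files come before subdirectories, which is an approximation of the `ls -R` ordering rather than a guaranteed match:

```perl
use strict;
use warnings;
use File::Find;
use File::Path qw(make_path);
use File::Temp qw(tempdir);

# Build a small demo tree; the names are made up for illustration.
my $root = tempdir(CLEANUP => 1);
make_path("$root/a", "$root/b");
for my $file ("z.html", "m.txt", "a/y.txt", "b/x.html", "b/skip.jpg") {
    open my $fh, ">", "$root/$file" or die "Cannot create $root/$file: $!";
    close $fh or die "Cannot close $root/$file: $!";
}

my @found;

find(
    {
        # Receives the readdir() names of the current directory and
        # returns them in the order they should be visited.
        preprocess => sub {
            my @names = grep { $_ ne '.' && $_ ne '..' } @_;
            my @files = sort grep { !-d "$File::Find::dir/$_" } @names;
            my @dirs  = sort grep {  -d "$File::Find::dir/$_" } @names;
            return (@files, @dirs);
        },

        # Called once per entry; keep only .txt and .html files.
        wanted => sub {
            return unless -f $_ && /\.(?:txt|html)$/;
            (my $relative = $File::Find::name) =~ s{^\Q$root\E/}{};
            push @found, $relative;
        },
    },
    $root
);

print "$_\n" for @found;
```

On a tree like this one, the top-level files should be listed before anything under `a/` or `b/`.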

    > > use strict;
    > > require HTML::HeadParser;
    > > my $dir=<~/mywebsite>; #where files are on my computer
    > > my $name="Bob Blobkowski";
    > > my $url='http://blobkowski.org/';
    > > chdir $dir||die;

    >
    > There's a low-precedence 'or' that you can use, too.
    >
    > chdir $dir or die "Cannot chdir to $dir: $!";


    He'll *need* to use the low-precedence version or add some
    parentheses.

    > > open LS, "ls -R|"||die;


    Same thing here:

    % perl -MO=Deparse -e 'open LS, "ls -R|" || die'
    open LS, 'ls -R|';
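The Deparse output above can also be checked at run time. The sketch below uses a path that is assumed not to exist and a lexical filehandle instead of `LS`. With `||` the `die` is attached to the last argument of `open`, so a failed `open` goes unnoticed; with `or` it is attached to the `open` call itself. This applies to list operators such as `open`. perlop documents `chdir` as a named unary operator, where `chdir $foo || die` parses as `(chdir $foo) || die`, though `or` with a message is still the clearer habit.

```perl
use strict;
use warnings;

my $missing = "/no/such/file/for/this/demo";    # assumed not to exist

# Parsed as open(my $fh, "<", ($missing || die ...)).
# $missing is a true value, so die is never evaluated and the
# failed open is silently ignored.
my $continued = 0;
eval {
    open my $fh, "<", $missing || die "never reached\n";
    $continued = 1;
};

# Parsed as (open ...) or die ...: the failure is caught.
my $error = '';
eval {
    open my $fh, "<", $missing or die "Cannot open $missing: $!\n";
    1;
} or $error = $@;

print "|| form carried on after the failed open: $continued\n";
print "or form died with: $error";
```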

    --
    Steve
     
    Steve Grazzini, Jul 24, 2003
    #3
  4. On Thu, 24 Jul 2003, David K. Wall wrote:

    >Dan Jacobson wrote:
    >
    >> print <<EOF;
    >><!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.01//EN"
    >> "http://www.w3.org/TR/html4/strict.dtd"><HTML><TITLE>$name\'s site

    > ^^
    >It's not necessary to escape that single-quote after $name.


    Yes it is. $name's is pre-Perl 5 syntax for $name::s. It's something
    that bites many Perl programmers from time to time.
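A small sketch of the difference, reusing the OP's `$name`. The `${name}` form is an alternative not mentioned in the thread. What the unescaped line prints depends on the perl in use: on perls that still accept `'` as a package separator it shows `[ site map]`, because `$name::s` is an unset package variable.

```perl
use strict;
use warnings;

my $name = "Bob Blobkowski";

# Escaped apostrophe, as in the OP's here-doc: $name is interpolated.
my $escaped = "$name\'s site map";

# Braces also end the variable name explicitly.
my $braced = "${name}'s site map";

# Unescaped: perls that treat ' as a package separator read this as
# the package variable $name::s followed by " site map".
my $unescaped = do {
    no warnings;    # keep the demo quiet about $name::s
    "$name's site map";
};

print "escaped:   [$escaped]\n";
print "braced:    [$braced]\n";
print "unescaped: [$unescaped]\n";
```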

    --
    Jeff Pinyan RPI Acacia Brother #734 2003 Rush Chairman
    "And I vos head of Gestapo for ten | Michael Palin (as Heinrich Bimmler)
    years. Ah! Five years! Nein! No! | in: The North Minehead Bye-Election
    Oh. Was NOT head of Gestapo AT ALL!" | (Monty Python's Flying Circus)
     
    Jeff 'japhy' Pinyan, Jul 24, 2003
    #4
  5. Jeff 'japhy' Pinyan wrote:

    > On Thu, 24 Jul 2003, David K. Wall wrote:
    >
    >>Dan Jacobson wrote:
    >>
    >>> print <<EOF;
    >>><!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.01//EN"
    >>> "http://www.w3.org/TR/html4/strict.dtd"><HTML><TITLE>$name\'s
    >>> site

    >> ^^
    >>It's not necessary to escape that single-quote after $name.

    >
    > Yes it is. $name's is pre-Perl 5 syntax for $name::s. It's
    > something that bites many Perl programmers from time to time.


    Damn. I knew about the pre-Perl 5 syntax, but a quick
    copy/paste/edit/run of code from the OP convinced me that it didn't
    matter any more. But there are *two* places in the here-doc that use
    an escape quote. I edited one and saw the output from the other.
     
    David K. Wall, Jul 24, 2003
    #5