HTML whitespace/comments cruncher

Discussion in 'Perl Misc' started by Garry Heaton, Oct 19, 2003.

  1. Garry Heaton

    Garry Heaton Guest

    Can anyone recommend a perl script for crunching HTML whitespace and
    comments? I wish to make duplicates of HTML files for uploading.

    Garry Heaton
    Garry Heaton, Oct 19, 2003
    #1

  2. ko

    ko Guest

    Garry Heaton wrote:
    > Can anyone recommend a perl script for crunching HTML whitespace and
    > comments? I wish to make duplicates of HTML files for uploading.
    >
    > Garry Heaton
    >


    Use one of the HTML parsing modules. For example:

    http://search.cpan.org/~gaas/HTML-Parser-3.33/

    Download and unpack the distribution, and check out the example scripts
    in the 'eg' directory.

    HTH - keith
    ko, Oct 19, 2003
    #2

  3. Eric J. Roode

    Garry Heaton <> wrote in news:zOtkb.12373$kA.3236929@wards.force9.net:

    > Can anyone recommend a perl script for crunching HTML whitespace and
    > comments? I wish to make duplicates of HTML files for uploading.


    Why not gzip the HTML files? Seems to me that'd be even better.


    A quick Google search turned up a couple of freeware and commercial HTML
    strippers. And I seem to recall there's an Apache module that does it, but
    I'm not sure.

    --
    Eric
    $_ = reverse sort $ /. r , qw p ekca lre uJ reh
    ts p , map $ _. $ " , qw e p h tona e and print

    Eric J. Roode, Oct 19, 2003
    #3
  4. It was a dark and stormy night, and Garry Heaton managed to scribble:

    > Can anyone recommend a perl script for crunching HTML whitespace and
    > comments? I wish to make duplicates of HTML files for uploading.
    >
    > Garry Heaton


    Would you believe I saw some code on the net yesterday that did this, but now I can't find it.

    The basic algorithm used regular expressions and was only a few lines long:
    convert consecutive whitespace characters to a single whitespace character
    remove whitespace from the beginning of lines
    convert consecutive newlines to a single newline
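
The three steps above can be sketched in a few lines of Perl. This is only an illustration, not the code the poster saw: the sub name crunch_whitespace is made up, and step 1 is assumed to mean runs of spaces and tabs (if it also swallowed newlines, steps 2 and 3 would have nothing left to do).

```perl
use strict;
use warnings;

# Hypothetical helper name; a minimal sketch of the three steps listed above.
sub crunch_whitespace {
    my ($html) = @_;
    $html =~ s/[ \t]+/ /g;    # 1. runs of spaces/tabs become one space (assumed reading)
    $html =~ s/^ //mg;        # 2. drop the single space left at the start of any line
    $html =~ s/\n+/\n/g;      # 3. runs of newlines become one newline
    return $html;
}

my $sample = "<html>\n\n   <body>\n\t<p>Hello,    world</p>\n\n\n   </body>\n</html>\n";
print crunch_whitespace($sample);
```

A sketch like this does not remove comments, and it treats every part of the file the same way, so whitespace inside <pre> (or inside scripts and attribute values) is crunched too.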

    gtoomey
    Gregory Toomey, Oct 20, 2003
    #4
  5. Andrew Shitov <> wrote:

    > Look at the code on this page: http://webcode.ru/cgi/despace1/



    It has several bugs in it.

    It open()s FILE, but never reads from it.

    It uses an ampersand on function calls when it does not want the
    semantics that go with using an ampersand on function calls.
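
For anyone wondering what those semantics are: calling &name; with no parentheses hands the callee the caller's current @_, and any &name(...) call bypasses prototypes. A small sketch of the first effect, with sub names invented purely for illustration:

```perl
use strict;
use warnings;

# Hypothetical subs, for illustration only.
sub count_args { return scalar @_ }

sub with_ampersand    { return &count_args; }    # no parens: callee sees the caller's @_
sub without_ampersand { return count_args(); }   # ordinary call: callee gets an empty @_

print with_ampersand(1, 2, 3), "\n";      # prints 3
print without_ampersand(1, 2, 3), "\n";   # prints 0
```

Unless that sharing of @_ is wanted, the plain name(...) form is the safer habit.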

    It will mangle spaces in <pre> sections.
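
One way to sidestep the <pre> problem with a regex-only approach is to split the document on <pre> blocks and crunch only the pieces in between. The sketch below is an assumption-laden illustration, not a fix for the linked script: the sub name is invented, and a regex split like this will not cope with every legal HTML construct (a real parser such as HTML::Parser is more robust).

```perl
use strict;
use warnings;

# Hypothetical helper: crunch whitespace everywhere except inside <pre>...</pre>.
sub crunch_outside_pre {
    my ($html) = @_;
    # The capturing group makes split() return the <pre> blocks too,
    # at the odd indexes of the resulting list.
    my @parts = split m{(<pre\b.*?</pre\s*>)}is, $html;
    for my $i (0 .. $#parts) {
        next if $i % 2;                  # odd index: a <pre> block, leave it untouched
        $parts[$i] =~ s/[ \t]+/ /g;      # collapse runs of spaces/tabs
        $parts[$i] =~ s/^ //mg;          # drop a leading space on each line of the piece
        $parts[$i] =~ s/\n+/\n/g;        # collapse runs of newlines
    }
    return join '', @parts;
}

my $page = "<p>a   b</p>\n\n<pre>  x\n\n   y</pre>\n\n   <p>c</p>\n";
print crunch_outside_pre($page);
```

Note that the leading-space step also fires at the very start of each piece, so a space directly after a closing </pre> tag is dropped as well.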


    --
    Tad McClellan SGML consulting
    Perl programming
    Fort Worth, Texas
    Tad McClellan, Oct 20, 2003
    #5