Handling and recursing subdirectories

Discussion in 'Perl Misc' started by Kloudnyne, Sep 3, 2004.

  1. Kloudnyne

    Kloudnyne Guest

    I'm a newcomer to Perl, and am currently attempting to teach myself
    Perl through using it, but I have currently come across an issue I
    can't seem to see any way around.

    I am trying to write a Perl script that will go through a series of
    directories and their subdirectories, removing javascript, images,
    bots, etc from HTML files in order to provide a text-reader friendly
    version of each page. The actual conversion of any given file has been
    taken care of, thanks to code heavily borrowed from an existing
    script, but I can't seem to work out how I can get it to recurse
    through the various subdirs.

    The snippet of code I've thrown together for it so far is:

    ***

    sub reading {
    do {
    opendir (CURRENTFOLDER, $htmdir) || die 'Ay Señor! Los bandidos have
    raided that directory!'
    while defined($filename = readdir(FOLDER)) = True {
    $nesting = directorycheck(); #nesting tells me how deep we are into
    subdirectories. a zero value is at the root of the process. only
    really intended as a flag for testing
    #dircheck checks to see if our victim this cycle is a
    subdirectory. If it is, (I hope) we'll launch in to a nested subcycle.
    }

    closedir(CURRENTFOLDER); # with a little luck, re-opening the
    previous folder will have the pointer still at the last position
    checked, or else we have uber-recursives
    chop ($htmdir); #prepping the string to ensure that the trailing char
    is NOT a / (not that it should be anyway)
    do {

    }
    until (chop($htmdir) ne '/');
    # now that we've gone back to (and removed) the / nearest to the end
    of the handle, we've effectively gone back to the parent directory
    $nesting -- ;
    }
    until $htmdir = $htmroot;
    }


    sub directorycheck {
    if (-d $filename) {
    dircheck = $nesting + 1 ;
    $htmdir = $filename .=$htmdir;
    chdir ($htmdir);
    } else {
    $txtdir = $htmdir; # sets $txtdir to mirror $htmdir, but in the
    /txt/ directory, where we want our output to be.
    $txtdir =~s/htdocs/txt/; #(hopefully) changes the file path for
    output to the /txt/ equivalent of the current /htdocs/ folder
    parsetxt(); # only parses if we've hit a file, rather than a subdir.
    }
    }

    ***

    where "parsetxt" is the subroutine that handles the actual conversion.
    However, I can't even get this to compile, let alone run it to see if
    it just dies or recurses away to infinity, or whatever.

    The script is intended to run on a linux box acting as a webserver,
    but for purposes of writing/testing I'm using ActivePerl 5.8 on a
    win2k machine.

    My question, after all this explanation, is this: Am I barking up the
    wrong tree here, or am I just missing one little thing that will make
    all this work? If anyone else has a piece of code that will fulfil my
    requirements and make my life easier, you will have my undying
    gratitude, because at this point I'm seriously starting to reconsider
    scripting and just perform the conversions manually.

    Thanks for your time.


    PS: I apologise for the hideous formatting. It's actually quite
    legible on a full-width screen, and I didn't want to disturb the text
    for fear of accidentally altering the code.
     
    Kloudnyne, Sep 3, 2004
    #1
  2. Paul Lalli

    Paul Lalli Guest

    "Kloudnyne" <> wrote in message
    news:...
    > I am trying to write a Perl script that will go through a series of
    > directories and their subdirectories, removing javascript, images,
    > bots, etc from HTML files in order to provide a text-reader friendly
    > version of each page. The actual conversion of any given file has been
    > taken care of, thanks to code heavily borrowed from an existing
    > script, but I can't seem to work out how I can get it to recurse
    > through the various subdirs.
    >
    > The snippet of code I've thrown together for it so far is:


    <snip attempt at manual directory recursion>

    > My question, after all this explanation, is this: Am I barking up the
    > wrong tree here, or am I just missing one little thing that will make
    > all this work? If anyone else has a piece of code that will fulfil my
    > requirements and make my life easier, you will have my undying
    > gratitude, because at this point I'm seriously starting to reconsider
    > scripting and just perform the conversions manually.


    The standard (that is, included with your Perl distribution) module
    File::Find is what you want to use to recurse through directories. Read
    about it by typing the command
    perldoc File::Find
    at your shell prompt. The CPAN modules File::Finder and
    File::Find::Rule also exist if you prefer an alternate syntax.
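
    As a rough illustration of that advice, here is a minimal sketch of
    File::Find collecting every HTML file under a directory tree. The
    helper name find_html_files and the .htm/.html filter are assumptions
    made for this example, not something from the thread; only find() and
    $File::Find::name come from the module itself.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: return the paths of all regular files under $root
# whose names end in .htm or .html (case-insensitive), sorted.
sub find_html_files {
    my ($root) = @_;
    my @found;
    find(
        sub {
            # Inside the callback, $_ is the bare file name (File::Find has
            # changed into the containing directory) and $File::Find::name
            # is the path including $root.
            push @found, $File::Find::name if -f $_ && /\.html?\z/i;
        },
        $root,
    );
    my @sorted = sort @found;
    return @sorted;
}
```

    Calling find_html_files('htdocs') would list the pages at every depth
    without any hand-written directory stack ('htdocs' being a placeholder
    path).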

    In the more general case, whenever you find yourself trying to do
    something in Perl that has most likely been done before (surely you don't
    think you're the only one who's ever needed to recurse through a
    directory structure, do you?), you should always check to see if a
    module exists which already does it. Modules are stored and shared on
    the CPAN, which you can search at http://search.cpan.org

    Give File::Find a shot, and if you have problems with it, feel free to
    ask for help.

    Paul Lalli
     
    Paul Lalli, Sep 3, 2004
    #2
  4. Anno Siegel

    Anno Siegel Guest

    Kloudnyne <> wrote in comp.lang.perl.misc:

    [...]

    > script, but I can't seem to work out how I can get it to recurse
    > through the various subdirs.


    You want File::Find (a standard module).

    [code snipped]

    > PS: I apologise for the hideous formatting. It's actually quite
    > legible on a full-width screen, and I didn't want to disturb the text
    > for fear of accidentally altering the code.


    ....so you left the formatting to Usenet, which really messed it up.

    Anno
     
    Anno Siegel, Sep 3, 2004
    #4
  5. Joe Smith

    Joe Smith Guest

    Kloudnyne wrote:

    > If anyone else has a piece of code that will fulfil my requirements
    > and make my life easier, you will have my undying gratitude...


    use File::Find;
    sub process { print "Found file $_ in $File::Find::dir\n" if -f $_; }
    find(\&process,'/tmp');
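
    Applied to the original task, the same callback idea might look like
    the sketch below. It assumes the output tree should mirror the input
    tree (htdocs to txt, as in the question) and that the conversion
    routine can be passed in as a code reference taking an input path and
    an output path. The name mirror_tree and that calling convention are
    hypothetical; no_chdir is a documented File::Find option.

```perl
use strict;
use warnings;
use File::Find;
use File::Path qw(mkpath);
use File::Basename qw(dirname);

# Hypothetical helper: walk $src_root and call $convert->($in, $out) for
# every .htm/.html file, where $out is the same relative path under
# $dst_root. Returns the output paths that were handed to $convert.
sub mirror_tree {
    my ($src_root, $dst_root, $convert) = @_;

    # Assumption: neither root is the filesystem root; drop trailing slashes.
    s{/+\z}{} for ($src_root, $dst_root);

    my @outputs;
    find(
        {
            no_chdir => 1,    # stay in the current directory; use full paths
            wanted   => sub {
                my $in = $File::Find::name;
                return unless -f $in && $in =~ /\.html?\z/i;

                # Swap the leading source root for the destination root.
                (my $out = $in) =~ s/\A\Q$src_root\E/$dst_root/;

                # Create the matching output directory on first use.
                my $out_dir = dirname($out);
                mkpath($out_dir) unless -d $out_dir;

                $convert->($in, $out);
                push @outputs, $out;
            },
        },
        $src_root,
    );
    return @outputs;
}
```

    A call such as mirror_tree('htdocs', 'txt', \&convert), with convert
    standing in for the existing conversion routine, would then visit every
    page once, with no manual bookkeeping of the current directory or
    nesting depth.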

    -Joe
     
    Joe Smith, Sep 3, 2004
    #5
  6. Kloudnyne

    Kloudnyne Guest

    Joe Smith <> wrote in message news:<P84_c.96904$9d6.59001@attbi_s54>...
    <snip>

    Thanks for your help. I apologise again for my blatantly obvious noobness.
     
    Kloudnyne, Sep 6, 2004
    #6