postprocessing in os.walk

Discussion in 'Python' started by kj, Oct 10, 2009.

  1. kj

    kj Guest

    Perl's directory tree traversal facility is provided by the function
    find of the File::Find module. This function accepts an optional
    callback, called postprocess, that gets invoked "just before leaving
    the currently processed directory." The documentation goes on to
    say "This hook is handy for summarizing a directory, such as
    calculating its disk usage", which is exactly what I use it for in
    a maintenance script.

    This maintenance script is getting long in the tooth, and I've been
    meaning to add a few enhancements to it for a while, so I thought
    that in the process I'd port it to Python, using the os.walk
    function, but I see that os.walk does not have anything like this
    File::Find::find's postprocess hook. Is there a good way to simulate
    it (without having to roll my own File::Find::find in Python)?

    TIA!

    kynn
     
    kj, Oct 10, 2009
    #1

  2. kj

    jordilin Guest

    Well, you could use the alternative os.path.walk instead. You can pass
    a callback as a parameter, which will be invoked every time you
    bump into a new directory. The signature is
    os.path.walk(path, visit, arg). Take a look at the Python library
    documentation.


    On 11 Oct, 00:12, kj wrote:
    > Perl's directory tree traversal facility is provided by the function
    > find of the File::Find module.  This function accepts an optional
    > callback, called postprocess, that gets invoked "just before leaving
    > the currently processed directory."  The documentation goes on to
    > say "This hook is handy for summarizing a directory, such as
    > calculating its disk usage", which is exactly what I use it for in
    > a maintenance script.
    >
    > This maintenance script is getting long in the tooth, and I've been
    > meaning to add a few enhancements to it for a while, so I thought
    > that in the process I'd port it to Python, using the os.walk
    > function, but I see that os.walk does not have anything like this
    > File::Find::find's postprocess hook.  Is there a good way to simulate
    > it (without having to roll my own File::Find::find in Python)?
    >
    > TIA!
    >
    > kynn
     
    jordilin, Oct 11, 2009
    #2

  3. kj

    Dave Angel Guest

    kj wrote:
    > Perl's directory tree traversal facility is provided by the function
    > find of the File::Find module. This function accepts an optional
    > callback, called postprocess, that gets invoked "just before leaving
    > the currently processed directory." The documentation goes on to
    > say "This hook is handy for summarizing a directory, such as
    > calculating its disk usage", which is exactly what I use it for in
    > a maintenance script.
    >
    > This maintenance script is getting long in the tooth, and I've been
    > meaning to add a few enhancements to it for a while, so I thought
    > that in the process I'd port it to Python, using the os.walk
    > function, but I see that os.walk does not have anything like this
    > File::Find::find's postprocess hook. Is there a good way to simulate
    > it (without having to roll my own File::Find::find in Python)?
    >
    > TIA!
    >
    > kynn
    >
    >

    Why would you need a special hook when the os.walk() generator yields
    exactly once per directory? So whatever work you do on the list of
    files you get, you can then put the summary logic immediately after.
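
    As a rough illustration of that suggestion (not code posted in this thread), the summary step can sit at the end of the loop body. The name direct_sizes is hypothetical, and the sketch only counts files directly inside each directory, not those in its subdirectories:

```python
import os

def direct_sizes(root):
    """Yield (dirpath, bytes) for the files directly inside each directory."""
    for dirpath, dirnames, filenames in os.walk(root):
        total = 0
        for name in filenames:
            # lstat reports on the entry itself, so a dangling symlink
            # does not raise here
            total += os.lstat(os.path.join(dirpath, name)).st_size
        # The "summary logic" runs here, exactly once per directory.
        yield dirpath, total
```

    Something like `for path, size in direct_sizes("."): print(path, size)` would then print one line per directory.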

    Or if you really feel you need a special hook, then write a wrapper for
    os.walk(), which takes a hook function as a parameter, and after
    yielding each file in a directory, calls the hook. Looks like about 5
    lines.

    DaveA
     
    Dave Angel, Oct 12, 2009
    #3
  4. kj

    kj Guest

    Dave Angel writes:

    >Why would you need a special hook when the os.walk() generator yields
    >exactly once per directory? So whatever work you do on the list of
    >files you get, you can then put the summary logic immediately after.


    >Or if you really feel you need a special hook, then write a wrapper for
    >os.walk(), which takes a hook function as a parameter, and after
    >yielding each file in a directory, calls the hook. Looks like about 5
    >lines.


    I think you're missing the point. The hook in question has to be
    called *immediately after* all the subtrees that are rooted in
    subdirectories contained in the current directory have been visited
    by os.walk.

    I'd love to see your "5 lines" for *that*.

    kj
     
    kj, Oct 13, 2009
    #4
  5. kj

    Peter Otten Guest

    kj wrote:

    > Dave Angel writes:
    >
    >>Why would you need a special hook when the os.walk() generator yields
    >>exactly once per directory? So whatever work you do on the list of
    >>files you get, you can then put the summary logic immediately after.

    >
    >>Or if you really feel you need a special hook, then write a wrapper for
    >>os.walk(), which takes a hook function as a parameter, and after
    >>yielding each file in a directory, calls the hook. Looks like about 5
    >>lines.

    >
    > I think you're missing the point. The hook in question has to be
    > called *immediately after* all the subtrees that are rooted in
    > subdirectories contained in the current directory have been visited
    > by os.walk.
    >
    > I'd love to see your "5 lines" for *that*.


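    import os

    def find(root, process):
        # topdown=False yields a directory only after all of its
        # subdirectories, so process() acts as a postprocess hook
        for pdf in os.walk(root, topdown=False):
            process(*pdf)

    def process(path, dirs, files):
        print(path)

    find(".", process)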
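
    To see the ordering that topdown=False gives, here is a small sketch that builds a throwaway tree and records the order in which directories are yielded. The helper name visit_order and the directory names a, b and c are made up for the illustration:

```python
import os
import tempfile

def visit_order(root):
    """Return directory paths in the order os.walk(topdown=False) yields them."""
    return [path for path, dirs, files in os.walk(root, topdown=False)]

# Build a throwaway tree: root/a/b and root/c
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "a", "b"))
os.makedirs(os.path.join(root, "c"))

order = visit_order(root)
for path in order:
    print(os.path.relpath(path, root))
# Every directory is listed only after the directories beneath it,
# and the root comes last. The relative order of siblings (a versus c)
# depends on the order in which the OS lists them.
```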

    Peter
     
    Peter Otten, Oct 13, 2009
    #5
  6. kj

    Paul Rubin Guest

    kj writes:
    > I think you're missing the point. The hook in question has to be
    > called *immediately after* all the subtrees that are rooted in
    > subdirectories contained in the current directory have been visited
    > by os.walk.
    >
    > I'd love to see your "5 lines" for *that*.


    I'm having trouble understanding the specification. To find the disk
    usage (in bytes) of a directory:

    import os, stat

    def find_disk_usage(dirname):
        # Sum the sizes of the files in every directory under dirname
        return sum(sum(os.stat(os.path.join(dirpath, filename))[stat.ST_SIZE]
                       for filename in fn_list)
                   for dirpath, dirlist, fn_list in os.walk(dirname))
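
    One caveat with the version above: os.stat follows symbolic links, so a dangling link in the tree raises an error and aborts the whole sum. A more defensive sketch, under the assumption that skipping unreadable entries is acceptable, might look like this (the name disk_usage is hypothetical):

```python
import os

def disk_usage(dirname):
    """Total size in bytes of all files under dirname (symlinks not followed)."""
    total = 0
    for dirpath, dirnames, filenames in os.walk(dirname):
        for name in filenames:
            try:
                # lstat measures the link itself rather than its target
                total += os.lstat(os.path.join(dirpath, name)).st_size
            except OSError:
                # The file vanished or is unreadable; skipping it is an
                # assumed policy, not something specified in the thread.
                pass
    return total
```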
     
    Paul Rubin, Oct 13, 2009
    #6
  7. kj

    Dave Angel Guest

    Peter Otten wrote:
    > kj wrote:
    >
    >
    >> Dave Angel writes:
    >>
    >>
    >>> Why would you need a special hook when the os.walk() generator yields
    >>> exactly once per directory? So whatever work you do on the list of
    >>> files you get, you can then put the summary logic immediately after.
    >>>
    >>> Or if you really feel you need a special hook, then write a wrapper for
    >>> os.walk(), which takes a hook function as a parameter, and after
    >>> yielding each file in a directory, calls the hook. Looks like about 5
    >>> lines.
    >>>

    >> I think you're missing the point. The hook in question has to be
    >> called *immediately after* all the subtrees that are rooted in
    >> subdirectories contained in the current directory have been visited
    >> by os.walk.
    >>
    >> I'd love to see your "5 lines" for *that*.
    >>

    >
    > import os
    >
    > def find(root, process):
    >     for pdf in os.walk(root, topdown=False):
    >         process(*pdf)
    >
    > def process(path, dirs, files):
    >     print(path)
    >
    > find(".", process)
    >
    > Peter
    >
    >
    >
    >

    Thanks Peter,

    To expand it to five lines, and make it the generator I mentioned,

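    import os

    def find(root, process):
        for pdf in os.walk(root, topdown=False):
            # pdf is the (path, dirs, files) tuple for one directory
            for file in pdf[2]:
                yield os.path.join(pdf[0], file)
            # All files of this directory have been yielded: call the hook
            process(*pdf)

    def process(path, dirs, files):
        print("hooked -- " + path)

    for fullpath in find("..", process):
        print(fullpath)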


    This is a generator that yields each file in a directory tree and,
    once all the files below a particular directory have been yielded,
    "immediately" calls the hook.

    DaveA
     
    Dave Angel, Oct 13, 2009
    #7
  8. kj

    Ethan Furman Guest

    kj wrote:
    > Dave Angel writes:
    >>


    [snippetty snip]

    >>Why would you need a special hook when the os.walk() generator yields
    >>exactly once per directory? So whatever work you do on the list of
    >>files you get, you can then put the summary logic immediately after.

    >
    >
    > I think you're missing the point. The hook in question has to be
    > called *immediately after* all the subtrees that are rooted in
    > subdirectories contained in the current directory have been visited
    > by os.walk.
    >
    > I'd love to see your "5 lines" for *that*.
    >
    > kj


    So now that you've seen a couple examples, perhaps you noticed the flag
    "topdown=False"? With that (un)set, I repeat the question -- why do you
    need a hook?
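
    For the disk-usage case from the original question, a hook-free version could be sketched as follows. This is only an illustration under a few assumptions: tree_sizes is a made-up name, sizes are taken with os.lstat, and symlinked directories are not followed (the os.walk default), so they contribute nothing:

```python
import os

def tree_sizes(root):
    """Map each directory under root to the total bytes in its whole subtree."""
    totals = {}
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        # Files directly inside this directory
        size = sum(os.lstat(os.path.join(dirpath, name)).st_size
                   for name in filenames)
        # With topdown=False the subdirectories were yielded earlier, so
        # their totals are already known. Entries that were never walked
        # (for example symlinked directories) count as 0.
        size += sum(totals.get(os.path.join(dirpath, d), 0) for d in dirnames)
        totals[dirpath] = size
    return totals
```

    By the time a directory is reached, the totals of everything beneath it are already in the dictionary, which is the point at which a postprocess hook would have run.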

    ~Ethan~
     
    Ethan Furman, Oct 13, 2009
    #8
