"Maximum recursion depth exceeded"...why?

Thomas Allen · Feb 17, 2009

I must not be understanding something. This is a simple recursive
function that prints all HTML files in argv[1] as it scans the
directory's contents. Why do I get a RuntimeError for recursion depth
exceeded?

#!/usr/bin/env python

import os, sys

def main():
    absToRel(sys.argv[1], sys.argv[2])

def absToRel(dir, root):
    for filename in os.listdir(dir):
        if os.path.isdir(filename):
            absToRel(filename, root)
        else:
            if(filename.endswith("html") or filename.endswith("htm")):
                print filename

if __name__ == "__main__":
    main()

Martin v. Löwis · Feb 17, 2009

Thomas said:
I must not be understanding something. This is a simple recursive
function that prints all HTML files in argv[1] as it scans the
directory's contents. Why do I get a RuntimeError for recursion depth
exceeded?

def main():
    absToRel(sys.argv[1], sys.argv[2])

def absToRel(dir, root):
    for filename in os.listdir(dir):
        if os.path.isdir(filename):

Perhaps you have a symlink somewhere that makes the tree appear to
have infinite depth?

Regards,
Martin
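
One way to test the symlink hypothesis is to refuse to descend into symbolic links at all. The following is a minimal Python 3 sketch, not the original script; the function name `list_html` and the choice to return a list are assumptions made for illustration.

```python
import os

def list_html(top):
    """Return paths of .htm/.html files under top, never following symlinks."""
    found = []
    for name in sorted(os.listdir(top)):
        path = os.path.join(top, name)  # listdir() yields bare names
        if os.path.islink(path):
            # A link pointing back up the tree would otherwise recurse forever.
            # Note this also skips symlinked files, which may or may not be wanted.
            continue
        if os.path.isdir(path):
            found.extend(list_html(path))
        elif name.endswith((".html", ".htm")):
            found.append(path)
    return found
```

If the recursion error disappears with a guard like this, a looping link was the cause; if not, the problem lies elsewhere.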

Thomas Allen · Feb 17, 2009

Please note that I'm not using os.walk(sys.argv[1]) because the
current depth of recursion is relevant to the transformation I'm
attempting. Basically, I'm transforming a live site to a local one and
the live site uses all absolute paths (not my decision...). I planned
on performing the replace like so for each line:

line.replace(root, "../" * depth)

So that a file in the top-level would simply remove all instances of
root, one level down would sub "../", etc.
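
The substitution described above can be sketched as a small function. This is a Python 3 illustration only: `relativize` and the example URL are hypothetical, and `depth` is assumed to be the number of directories between the file and the top level of the site.

```python
def relativize(line, root, depth):
    """Replace every occurrence of the absolute prefix root with depth '../' steps."""
    return line.replace(root, "../" * depth)

# Hypothetical site root, used only for this example.
ROOT = "http://www.example.com/"
link = '<a href="http://www.example.com/about.html">About</a>'

top_level = relativize(link, ROOT, 0)  # "../" * 0 is "", so the prefix is removed
one_down = relativize(link, ROOT, 1)   # the prefix becomes "../"
```

Note that str.replace() returns a new string rather than changing `line` in place, so the result has to be assigned or written out.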

Peter Otten · Feb 17, 2009

Thomas said:
I must not be understanding something. This is a simple recursive
function that prints all HTML files in argv[1] as it scans the
directory's contents. Why do I get a RuntimeError for recursion depth
exceeded?

#!/usr/bin/env python

import os, sys

def main():
    absToRel(sys.argv[1], sys.argv[2])

def absToRel(dir, root):
    for filename in os.listdir(dir):

        filename = os.path.join(dir, filename)

        if os.path.isdir(filename):
            absToRel(filename, root)
        else:
            if(filename.endswith("html") or filename.endswith("htm")):
                print filename

if __name__ == "__main__":
    main()

Without the addition, for a directory and a subdirectory of the same
name, "dir/dir", os.listdir("dir") has "dir" (the child) in the result list,
which triggers an absToRel() call on "dir" (the parent) ad infinitum.

Peter
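
Put together, a corrected traversal that also tracks depth might look like the following. It is a Python 3 sketch (the code in this thread is Python 2); the generator name `walk_html` and the `depth` parameter are assumptions added for illustration.

```python
import os

def walk_html(top, depth=0):
    """Yield (path, depth) for each .htm/.html file at or below top."""
    for name in sorted(os.listdir(top)):
        # Join with the directory being listed; a bare name would be
        # resolved against the current working directory instead.
        path = os.path.join(top, name)
        if os.path.isdir(path):
            yield from walk_html(path, depth + 1)
        elif name.endswith((".html", ".htm")):
            yield path, depth
```

Because each recursive call receives a path one level longer than its parent's, the "dir/dir" case no longer loops back onto the parent.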

Thomas Allen · Feb 17, 2009

I have two problems in this case:

1. I don't know how to reliably map the current filename to an
absolute path beyond the top-most directory because my method of doing
so would be to os.path.join(os.getcwd(), filename)

2. For some reason, only one folder in the directory gets marked as a
directory itself when there are about nine others in the top-most
directory. I don't even know where to begin to solve this one.

I'm sure the first is an easy answer, but what do I need to do to
solve the second?

Peter Otten · Feb 17, 2009

Thomas said:
I have two problems in this case:

1. I don't know how to reliably map the current filename to an
absolute path beyond the top-most directory because my method of doing
so would be to os.path.join(os.getcwd(), filename)

Don't make things more complicated than necessary. If you can do
os.listdir(somedir) you can also do [os.path.join(somedir, fn) for fn in
os.listdir(somedir)].

Thomas said:
2. For some reason, only one folder in the directory gets marked as a
directory itself when there are about nine others in the top-most
directory. I don't even know where to begin to solve this one.

I'm sure the first is an easy answer, but what do I need to do to
solve the second?

If you solve the first properly the second might magically disappear. This
is what my crystal ball tells me because there is no code in sight...

Peter

Thomas Allen · Feb 17, 2009

I'm referring to the same code, but with a print:

for file in os.listdir(dir):
    if os.path.isdir(file):
        print "D", file

in place of the internal call to absToRel...and only one line prints
such a message. I mean, if I can't trust my OS or its Python
implementation (on a Windows box) to recognize a directory, I'm
wasting everyone's time here.

In any case, is this the best way to go about the problem in general?
Or is there already a way to recursively walk a directory, aware of
the current depth?

Thanks,
Thomas

MRAB · Feb 18, 2009

Thomas said:
I must not be understanding something. This is a simple recursive
function that prints all HTML files in argv[1] as it scans the
directory's contents. Why do I get a RuntimeError for recursion depth
exceeded?

#!/usr/bin/env python

import os, sys

def main():
    absToRel(sys.argv[1], sys.argv[2])

def absToRel(dir, root):
    for filename in os.listdir(dir):

os.listdir() returns a list of filenames, not filepaths. Create the
filepath with os.path.join(dir, filename).
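
The effect is easy to reproduce. The sketch below (Python 3, using a throwaway temporary directory; the helper name `classify` and the directory name `sub` are hypothetical) shows that os.path.isdir() on a bare name is checked against the current working directory, which would also explain why only some folders were reported as directories earlier in the thread. The joined path is only absolute if the directory passed in was absolute, but it is always the right path to test.

```python
import os
import tempfile

def classify(top):
    """Return (bare, joined): the names isdir() accepts as bare names vs. joined paths."""
    names = sorted(os.listdir(top))
    bare = [n for n in names if os.path.isdir(n)]  # resolved against os.getcwd()
    joined = [n for n in names if os.path.isdir(os.path.join(top, n))]
    return bare, joined

with tempfile.TemporaryDirectory() as top:
    os.mkdir(os.path.join(top, "sub"))
    bare, joined = classify(top)
    # joined lists "sub"; bare lists it only if the current working
    # directory happens to contain a directory with the same name.
```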

Paul Rubin · Feb 18, 2009

Thomas Allen said:
attempting. Basically, I'm transforming a live site to a local one and

Something wrong with wget -R ?

Thomas Allen · Feb 18, 2009

Christian said:
You are under a wrong assumption. You think os.listdir() returns a list
of absolute path elements. In fact it returns just a list of names. You
have to os.path.join(dir, file) to get an absolute path.

Anyway stop reinventing the wheel and use os.walk() as I already
explained. You can easily spot the depth with "directory.count(os.sep)".
os.path.normpath() helps you to sanitize the path before counting the
number of os.sep.

Christian

If you'd read the messages in this thread you'd understand why I'm not
using os.walk(): I need my code to be aware of the current recursion
depth so that the correct number of "../" are substituted in.

Also, somebody mentioned wget -R...did you mean wget -r? In any case,
I have all of these files locally already and am trying to replace
absolute paths with relative ones so that a colleague can present some
website content at a location with no internet.

Thomas

Thomas Allen · Feb 18, 2009

Christian said:
I'm well aware of your messages and your requirements. However, you
either didn't read or didn't understand what I was trying to tell you.
You don't need to know the recursion depth in order to find the correct
number of "../".

base = os.path.normpath(base)
baselevel = base.count(os.sep)

for root, dirs, files in os.walk(base):
    level = root.count(os.sep) - baselevel
    offset = level * "../"
    ...

See?

Christian

Very clever (and now seemingly obvious)! That certainly is one way to
measure directory depth; I hadn't thought of counting the separator.
Sorry that I misunderstood what you meant there.

Thanks,
Thomas
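
For reference, the separator-counting approach quoted above can be written out as a runnable Python 3 sketch. The function name `html_offsets` is an assumption made for illustration; it pairs each HTML file with the "../" prefix matching its depth below `base`.

```python
import os

def html_offsets(base):
    """Return (path, offset) pairs; offset is '../' repeated once per level below base."""
    base = os.path.normpath(base)   # strip trailing or doubled separators
    baselevel = base.count(os.sep)
    result = []
    for root, dirs, files in os.walk(base):
        dirs.sort()                 # make the traversal order deterministic
        level = root.count(os.sep) - baselevel
        for name in sorted(files):
            if name.endswith((".html", ".htm")):
                result.append((os.path.join(root, name), "../" * level))
    return result
```

A file directly in `base` gets an empty offset, so the absolute prefix is simply removed there, matching the plan described earlier in the thread.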

alex23 · Feb 18, 2009

Paul Rubin said:
Something wrong with wget -R ?

Did you mean wget -r ?

That will just grab the entire site, though. I'm guessing that Thomas'
function absToRel will eventually replace the print with something
that changes links accordingly so the local version is traversable.

rdmurray · Feb 18, 2009

alex23 said:
Did you mean wget -r ?

That will just grab the entire site, though. I'm guessing that Thomas'
function absToRel will eventually replace the print with something
that changes links accordingly so the local version is traversable.

Yeah, but wget -r -k will do that bit of it, too.

--RDM

alex23 · Feb 18, 2009

rdmurray said:
Yeah, but wget -r -k will do that bit of it, too.

Wow, nice, I don't know why I never noticed that. Cheers!

Thomas Allen · Feb 18, 2009

alex23 said:
Wow, nice, I don't know why I never noticed that. Cheers!

Hm...doesn't do that over here. I thought it may have been because of
absolute links (not to site root), but it even leaves things like <a
href="/">. Does it work for you guys?

rdmurray · Feb 19, 2009

Thomas Allen said:
Hm...doesn't do that over here. I thought it may have been because of
absolute links (not to site root), but it even leaves things like <a
href="/">. Does it work for you guys?

It works for me. The sample pages I just tested on it don't use
any href="/" links, but my 'href="/about.html"' got properly
converted to 'href="../about.html"'. (On the other hand my '/contact.html'
got converted to a full external URL...but that's apparently because the
contact.html file doesn't actually exist.)

--RDM

Thomas Allen · Feb 19, 2009

rdmurray said:
It works for me. The sample pages I just tested on it don't use
any href="/" links, but my 'href="/about.html"' got properly
converted to 'href="../about.html"'. (On the other hand my '/contact.html'
got converted to a full external URL...but that's apparently because the
contact.html file doesn't actually exist.)

--RDM

Thanks for the help everyone. The idea of counting the slashes was the
linchpin of this little script, and with a little trial and error, I
successfully generated a local copy of the site. I don't think my
colleague knows what went into this, but he seemed appreciative :^)

Thomas
