download all mib files from a web page

powah · May 27, 2009

I want to download all mib files from the web page:
http://www.juniper.net/techpubs/sof...t/juniper-specific-mibs-junos-nm.html#jN18E19

All mib filenames are of this format:
www.juniper.net/techpubs ... .txt

I wrote this program, but it fails with the following error.
Please help.
Thanks.

Code:

#!/usr/bin/env python
import urllib2,os,urlparse
url="http://www.juniper.net/techpubs/software/junos/junos94/swconfig-net-mgmt/juniper-specific-mibs-junos-nm.html#jN18E19"
page=urllib2.urlopen(url)
f=0
links=[]
data=page.read().split("\n")
for item in data:
    if "www.juniper.net/techpubs" in item:
        httpind=item.index("www.juniper.net/techpubs")
        item=item[httpind:]
        #print "item " + item
        ind=item.index("<")
        links.append(item[:ind]) #grab all links
# download all links
for link in links:
    print "link " + link
    filename=link.split("/")[-1]
    print "downloading ... " + filename
    u=urllib2.urlopen(link)
    p=u.read()
    open(filename,"w").write(p)

$ ~/python/downloadjuniper.py
link www.juniper.net/techpubs/software/junos/junos94/swconfig-net-mgmt/mib-jnx-user-aaa.txt
downloading ... mib-jnx-user-aaa.txt
Traceback (most recent call last):
  File "/home/powah/python/downloadjuniper.py", line 20, in ?
    u=urllib2.urlopen(link)
  File "/usr/lib/python2.4/urllib2.py", line 130, in urlopen
    return _opener.open(url, data)
  File "/usr/lib/python2.4/urllib2.py", line 350, in open
    protocol = req.get_type()
  File "/usr/lib/python2.4/urllib2.py", line 233, in get_type
    raise ValueError, "unknown url type: %s" % self.__original
ValueError: unknown url type: www.juniper.net/techpubs/software/junos/junos94/swconfig-net-mgmt/mib-jnx-user-aaa.txt

$ python
Python 2.4.4 (#1, Oct 23 2006, 13:58:00)
[GCC 4.1.1 20061011 (Red Hat 4.1.1-30)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
My computer runs Fedora Core 6 (FC6) Linux.

Chris Rebert · May 27, 2009

You need to ensure that all URL strings include the protocol to use,
i.e. "http://"

Cheers,
Chris
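
One way to apply this advice is to normalize each extracted link before opening it. The following is a minimal sketch, not the poster's code: it assumes Python 3 (the thread itself uses Python 2's urllib2), and the helper name ensure_scheme is hypothetical.

```python
def ensure_scheme(url, default="http"):
    """Return url unchanged if it already starts with http:// or https://,
    otherwise prefix it with the default scheme."""
    if url.startswith(("http://", "https://")):
        return url
    return default + "://" + url


# The scheme-less link from the traceback above.
link = "www.juniper.net/techpubs/software/junos/junos94/swconfig-net-mgmt/mib-jnx-user-aaa.txt"
print(ensure_scheme(link))
```

Passing every link through a helper like this keeps the fix in one place, so links that already carry a scheme are left alone rather than getting a second "http://" prefix.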

Jeff McNeil · May 27, 2009

There's only a couple dozen of them, right-click->Save As. I'm sure
Juniper would appreciate that much more than an automated crawler.

As far as your ValueError is concerned, consider that
'www.juniper.net' doesn't start with a protocol specification when
passed into urllib2.urlopen.

-Jeff
mcjeff.blogspot.com

powah · May 27, 2009

Juniper's web page is simple; I am learning Python so that I can download
files from more complex web pages and do other things as well.
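
On more complex pages, searching each line for a fixed string tends to become fragile. The following is a minimal sketch of an alternative, assuming the files are linked through ordinary <a href="..."> tags and assuming Python 3 (html.parser and urllib.parse). The class name TxtLinkCollector, the sample HTML, and the example.com base URL are all made up for illustration and do not come from the Juniper page.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class TxtLinkCollector(HTMLParser):
    """Collect absolute URLs of every <a href> that ends in .txt."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value and value.endswith(".txt"):
                # urljoin turns relative links into absolute ones and
                # leaves links that are already absolute untouched.
                self.links.append(urljoin(self.base_url, value))


# Hypothetical page fragment standing in for real downloaded HTML.
sample = '<p><a href="mib-jnx-user-aaa.txt">AAA MIB</a> <a href="index.html">Home</a></p>'
collector = TxtLinkCollector("http://www.example.com/docs/page.html")
collector.feed(sample)
print(collector.links)
```

Because urljoin resolves each href against the page URL, the collected links always include the protocol, which also avoids the "unknown url type" error discussed earlier in the thread.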

powah · May 27, 2009

I fixed one error. Now, if a filename on the page is misspelled, how can
I ignore the error and continue?

Code:

#!/usr/bin/env python
import urllib2,os,urlparse
url="http://www.juniper.net/techpubs/software/junos/junos94/swconfig-net-mgmt/juniper-specific-mibs-junos-nm.html#jN18E19"
page=urllib2.urlopen(url)
f=0
links=[]
data=page.read().split("\n")
for item in data:
    if "www.juniper.net/techpubs" in item:
        httpind=item.index("www.juniper.net/techpubs")
        item=item[httpind:]
        #print "item " + item
        ind=item.index(".txt") + 4
        links.append(item[:ind]) #grab all links
# download all links
for link in links:
    filename=link.split("/")[-1]
    link = "http://" + link
    print "link " + link
    print "downloading ... " + filename
    u=urllib2.urlopen(link)
    p=u.read()
    open(filename,"w").write(p)

$ ~/python/downloadjuniper_onepage.py
link http://www.juniper.net/techpubs/software/junos/junos94/swconfig-net-mgmt/mib-jnx-virtual-chassis.txt
downloading ... mib-jnx-virtual-chassis.txt
Traceback (most recent call last):
  File "/home/powah/python/downloadjuniper_onepage.py", line 7, in ?
    u=urllib2.urlopen(link)
  File "/usr/lib/python2.4/urllib2.py", line 130, in urlopen
    return _opener.open(url, data)
  File "/usr/lib/python2.4/urllib2.py", line 364, in open
    response = meth(req, response)
  File "/usr/lib/python2.4/urllib2.py", line 471, in http_response
    response = self.parent.error(
  File "/usr/lib/python2.4/urllib2.py", line 402, in error
    return self._call_chain(*args)
  File "/usr/lib/python2.4/urllib2.py", line 337, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.4/urllib2.py", line 480, in http_error_default
    raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 404: Not Found

Chris Rebert · May 27, 2009

Read the fine tutorial: http://docs.python.org/tutorial/errors.html

Cheers,
Chris

powah · May 28, 2009

You really should go through the tutorial. It will explain this and
other important things well. But, since I'm feeling generous:

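Replace this (together with the p=u.read() and open(filename,"w").write(p)
lines that follow it, since the new code does both):
u=urllib2.urlopen(link)

with this:
try:
    u = urllib2.urlopen(link)
    p = u.read()
except urllib2.HTTPError:
    pass
else:
    dest = open(filename, "w")
    dest.write(p)
    dest.close()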

--Scott David Daniels

Thanks!
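
Putting the try/except advice into the whole download loop might look like the sketch below. It is written for Python 3 (urllib.request and urllib.error in place of urllib2), which is an assumption, since the thread uses Python 2.4. The function name download_all and its opener and dest_dir parameters are hypothetical, added so the loop can be exercised without touching the network.

```python
import os
import urllib.error
import urllib.request


def download_all(links, dest_dir=".", opener=urllib.request.urlopen):
    """Download each link into dest_dir, skipping links that fail with an
    HTTP error (for example 404). Returns (saved_filenames, skipped_links)."""
    saved, skipped = [], []
    for link in links:
        filename = link.split("/")[-1]
        try:
            with opener(link) as response:
                data = response.read()
        except urllib.error.HTTPError as err:
            # Report the failure and move on to the next link.
            print("skipping", link, "- HTTP error", err.code)
            skipped.append(link)
            continue
        # read() returns bytes in Python 3, so write in binary mode.
        with open(os.path.join(dest_dir, filename), "wb") as dest:
            dest.write(data)
        saved.append(filename)
    return saved, skipped
```

Calling download_all(links) with a list of full "http://..." URLs would then save what it can and report the rest, instead of stopping at the first misspelled filename. Only HTTPError is caught here; other failures, such as an unreachable host, still raise.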
