Ideas on how to parse dynamically generated HTML pages

chad

Let's say there is a site that uses JavaScript to generate menus. More
or less, what happens is that when a person clicks on a URL, a pop-up menu
appears asking the user for some data. How would I go about
automating this? I'm curious because my web spider doesn't actually
pick up the URLs that generate the menu. I'm assuming the actual
link is generated dynamically?

Here is the code I'm using to get the URLs...
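from html.parser import HTMLParser
from urllib.request import urlopen

class Spider(HTMLParser):
    def __init__(self, url):
        HTMLParser.__init__(self)
        req = urlopen(url)
        # feed() expects text, so decode the response bytes first
        self.feed(req.read().decode('utf-8', 'replace'))

    def handle_starttag(self, tag, attrs):
        href = dict(attrs).get('href')  # href is not always the first attribute
        if tag == 'a' and href:
            print("Found Link => %s" % href)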
 
Tim Harig

You have two options:

1. Look at the JavaScript to see what interfaces it uses. If it is
generating menus, then it is getting the data it uses to generate
those menus from somewhere. Once you have found that resource,
you can access it yourself with a request from your Python code.
This is generally the best approach if possible.

2. You can automate a browser through a COM/XPCOM/etc. interface,
which allows you to access the DOM in real time as it is
modified by the JavaScript and even to trigger JavaScript events.
There are libraries that will do this as well. I have used
this on heavy AJAX-style interfaces with mountains of spaghetti
JavaScript that were simply too large and poorly designed to
try to understand.
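
As a rough illustration of option 1, here is a minimal sketch using only the standard library. It assumes the page's script loads its menu entries as JSON from a single endpoint. The endpoint URL, the X-Requested-With header, and the {"items": [{"label": ..., "url": ...}]} payload shape are all hypothetical; the real ones have to be found by reading the site's JavaScript or watching the requests the browser makes.

```python
import json
from urllib.parse import urljoin
from urllib.request import Request, urlopen

# Hypothetical endpoint; replace it with whatever resource the site's
# JavaScript actually requests when it builds the menu.
MENU_ENDPOINT = "http://example.com/menu/items"


def fetch_menu_json(endpoint=MENU_ENDPOINT):
    """Request the menu data directly, roughly as the page's script would."""
    # Some servers only answer AJAX-style requests; this header is an
    # assumption and may be unnecessary for a given site.
    req = Request(endpoint, headers={"X-Requested-With": "XMLHttpRequest"})
    with urlopen(req) as resp:
        return resp.read().decode("utf-8", "replace")


def menu_links(payload, base_url):
    """Return absolute URLs from a JSON payload in the assumed format
    {"items": [{"label": ..., "url": ...}, ...]}."""
    data = json.loads(payload)
    return [urljoin(base_url, item["url"]) for item in data.get("items", [])]


if __name__ == "__main__":
    # Offline sample standing in for the text fetch_menu_json() would return.
    sample = '{"items": [{"label": "Edit", "url": "/edit?id=1"}]}'
    for link in menu_links(sample, "http://example.com/"):
        print("Found Link => %s" % link)
```

Keeping the fetch separate from the parsing means the parsing can be checked against a saved response before any live request is made. If the site returns an HTML fragment instead of JSON, the same HTMLParser approach from the original spider could be applied to that fragment.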
 
