More on urllib and urllib2

Alex Bryan

Okay, so I am having trouble figuring this out. I have already read
the "missing manual" on it, so please don't send me that link again.
Put simply, I want to be able to input a word and get its definition
from dictionary.com. I found a workaround for the search step: I just
put the word in the address itself. For example, to search for cheese,
I can just do:

urllib2.urlopen("http://dictionary.reference.com/browse/cheese")
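
The same workaround can be wrapped in a small helper so that any word, including one with spaces, yields a valid address. This is only a sketch under assumptions: it uses Python 3, where the quoting function lives in urllib.parse (in Python 2 it is urllib.quote), and lookup_url and BASE_URL are hypothetical names, not part of any library.

```python
from urllib.parse import quote

# Base address taken from the example above.
BASE_URL = "http://dictionary.reference.com/browse/"

def lookup_url(word):
    """Return the browse URL for a word, percent-encoding unsafe characters."""
    return BASE_URL + quote(word.strip())

print(lookup_url("cheese"))
```

The result can then be passed to urlopen in place of the hard-coded string.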

However, the actual definition is in JavaScript on the page. I used
Firebug to see it, and the first definition looks like this:

<table class="luna-Ent">
<tbody>
<tr>
<td class="dn" valign="top">1.</td>
<td valign="top">the curd of milk separated from the whey and prepared
in many ways as a food. </td>

The problem is that if I use code like this to get the HTML of that
page in Python:

import urllib2

response = urllib2.urlopen("the website....")
html = response.read()
print html

I get the HTML source of the page, but no table with my definitions.
So what can I do? Also, is there a book, or a better tutorial or
explanation, of urllib2 and urllib? If so, PLEASE let me know
about it; I will be eternally grateful.
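
One way to narrow the problem down is to check whether the markup Firebug shows is present in the raw response at all. Firebug displays the page after its scripts have run, while urlopen only returns what the server sent and runs no JavaScript. A minimal sketch, assuming Python 3 (where urllib2 became urllib.request) and a UTF-8 page; fetch_html, contains_definition_table and the opener parameter are hypothetical names introduced for illustration.

```python
from urllib.request import urlopen

# Class name taken from the Firebug output quoted above.
TABLE_MARKER = 'class="luna-Ent"'

def fetch_html(url, opener=urlopen):
    """Return the body the server sends for url; no JavaScript is executed."""
    with opener(url) as response:
        # Assumption: the page is UTF-8; undecodable bytes are replaced.
        return response.read().decode("utf-8", errors="replace")

def contains_definition_table(html):
    """True if the raw source already contains the definitions table."""
    return TABLE_MARKER in html
```

If contains_definition_table(fetch_html("http://dictionary.reference.com/browse/cheese")) comes back False, the table is probably being built in the browser, and no amount of parsing the fetched source will find it.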
 
CracKPod

It would probably be a good idea to take a look at mechanize:
http://wwwsearch.sourceforge.net/mechanize/
and at BeautifulSoup: http://www.crummy.com/software/BeautifulSoup/
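
To give an idea of what the parsing step involves, here is a rough sketch that pulls the definition text out of the table fragment from the question. It is not BeautifulSoup: it swaps in the standard library's html.parser (Python 3) so that it runs without installing anything. DefinitionParser and extract_definitions are hypothetical names, and the closing tags in SAMPLE were added to complete the fragment.

```python
from html.parser import HTMLParser

# Fragment from the question, with closing tags added to complete it.
SAMPLE = """<table class="luna-Ent">
<tbody>
<tr>
<td class="dn" valign="top">1.</td>
<td valign="top">the curd of milk separated from the whey and prepared
in many ways as a food. </td>
</tr>
</tbody>
</table>"""

class DefinitionParser(HTMLParser):
    """Collect text from <td> cells (except numbering cells) in luna-Ent tables."""

    def __init__(self):
        super().__init__()
        self.table_depth = 0    # greater than 0 while inside a luna-Ent table
        self.in_cell = False    # True while inside a definition cell
        self.parts = []         # text chunks of the current cell
        self.definitions = []   # finished definitions

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "table":
            classes = (attrs.get("class") or "").split()
            if self.table_depth or "luna-Ent" in classes:
                self.table_depth += 1
        elif tag == "td" and self.table_depth and attrs.get("class") != "dn":
            # Cells with class "dn" hold the numbering ("1."), so skip them.
            self.in_cell = True
            self.parts = []

    def handle_endtag(self, tag):
        if tag == "table" and self.table_depth:
            self.table_depth -= 1
        elif tag == "td" and self.in_cell:
            self.in_cell = False
            # Collapse line breaks and repeated spaces into single spaces.
            text = " ".join("".join(self.parts).split())
            if text:
                self.definitions.append(text)

    def handle_data(self, data):
        if self.in_cell:
            self.parts.append(data)

def extract_definitions(html):
    """Return the definition strings found in html."""
    parser = DefinitionParser()
    parser.feed(html)
    parser.close()
    return parser.definitions

print(extract_definitions(SAMPLE))
```

This only helps if the table is actually present in the HTML that was fetched; BeautifulSoup would do the same job in far fewer lines.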

Greetz,
CracKPod
 
