Searching Google?

Oltmans

Hey all,

I want to search Google.com using a specific keyword and I just want
to read back the response using Python. After some thorough Googling I
realized that I probably need a Search API key to do that. Is that
correct? Now, I don't have a search key so is there a workaround?
Please enlighten me.

Thanks,
Oltmans
 
Curt Hash

Oltmans said:
Hey all,

I want to search Google.com using a specific keyword and I just want
to read back the response using Python. After some thorough Googling I
realized that I probably need a Search API key to do that. Is that
correct? Now, I don't have a search key so is there a workaround?
Please enlighten me.

Thanks,
Oltmans

You just need to change your User-Agent so that Google doesn't know a
Python script is making the request:

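# Python 2 (urllib2)
import urllib
import urllib2

# Browser-like User-Agent; urllib2's default identifies the client as Python.
headers = {'User-Agent': 'Mozilla/5.0 (X11; U; Linux i686; en-US; '
                         'rv:1.9.0.6) Gecko/2009020911 Ubuntu/8.04 (hardy) '
                         'Firefox/3.0.6'}
query = 'foo'
# URL-encode the query so spaces and special characters survive.
url = 'http://www.google.com/search?q=' + urllib.quote_plus(query)
req = urllib2.Request(url, headers=headers)
response = urllib2.urlopen(req)
results = response.read()  # raw HTML of the results page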
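
On Python 3, where urllib2 was folded into urllib.request, the same idea might look like the sketch below. The names BROWSER_UA and build_search_request are made up for illustration, and the function only builds the request object without contacting Google.

```python
import urllib.parse
import urllib.request

# Assumption: any browser-like string will do; this is the one from the post above.
BROWSER_UA = ("Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.6) "
              "Gecko/2009020911 Ubuntu/8.04 (hardy) Firefox/3.0.6")


def build_search_request(query, user_agent=BROWSER_UA):
    """Build (but do not send) a search request with a custom User-Agent."""
    # urlencode escapes spaces and special characters in the query.
    url = "http://www.google.com/search?" + urllib.parse.urlencode({"q": query})
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


req = build_search_request("foo bar")
print(req.full_url)  # http://www.google.com/search?q=foo+bar
```

Passing the result to urllib.request.urlopen(req) would then send it, just as urllib2.urlopen does in the Python 2 version.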
 
Johannes Bauer

Curt said:
You just need to change your User-Agent so that Google doesn't know a
Python script is making the request:

Why would Google not send a response if the User-Agent header field is
not a recognized browser? I doubt that's what's happening. More likely
the OP would just like the convenience of a search API instead of
parsing the web page, as other applications do. Parsing is ugly, but it
works.

Regards,
Johannes
 
Alex

Johannes said:
Why would Google not send a response if the User-Agent header field is
not a recognized browser?

Because making automated queries to Google is against its TOS, so
Google blocks any client that doesn't seem to be human.
On the other hand, Google's API does not return exactly the same
results as a normal web query.
My suggestion is: set a browser-like User-Agent, accept gzipped
content to be as friendly as possible, and don't send too many queries
in a short time span.
Parsing the Google results page with BeautifulSoup is a piece of cake.

Alex
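
A rough Python 3 sketch of the three suggestions above (browser-like User-Agent, gzip, spacing out queries) could look like this. Everything named here is an assumption for illustration: the 5-second MIN_INTERVAL is an arbitrary value, the helper names are invented, and the standard library's html.parser stands in for BeautifulSoup so the snippet has no third-party dependencies. It collects every link on a page; picking out the actual result links depends on Google's markup, which is not shown in this thread.

```python
import gzip
import time
import urllib.parse
import urllib.request
from html.parser import HTMLParser

# Assumption: any browser-like string; this one is the example from the thread.
USER_AGENT = ("Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.6) "
              "Gecko/2009020911 Ubuntu/8.04 (hardy) Firefox/3.0.6")
MIN_INTERVAL = 5.0  # hypothetical pause between queries, in seconds

_last_request = None  # time of the previous successful query, if any


def decode_body(raw, content_encoding):
    """Decompress the body if the server honoured Accept-Encoding: gzip."""
    if content_encoding and "gzip" in content_encoding.lower():
        raw = gzip.decompress(raw)
    return raw.decode("utf-8", errors="replace")


def fetch(query):
    """Fetch one results page, keeping queries at least MIN_INTERVAL apart."""
    global _last_request
    if _last_request is not None:
        wait = MIN_INTERVAL - (time.monotonic() - _last_request)
        if wait > 0:
            time.sleep(wait)
    url = "http://www.google.com/search?" + urllib.parse.urlencode({"q": query})
    req = urllib.request.Request(
        url, headers={"User-Agent": USER_AGENT, "Accept-Encoding": "gzip"})
    with urllib.request.urlopen(req) as response:
        _last_request = time.monotonic()
        encoding = response.headers.get("Content-Encoding")
        return decode_body(response.read(), encoding)


class LinkCollector(HTMLParser):
    """Collect href values of <a> tags (stdlib stand-in for BeautifulSoup)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def extract_links(html):
    """Return the href of every anchor found in the given HTML text."""
    collector = LinkCollector()
    collector.feed(html)
    return collector.links
```

Calling extract_links(fetch('foo')) would then give the raw list of links from one results page; urllib does not decompress gzip on its own, which is why decode_body checks the Content-Encoding header.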
 
Tim Wintle

Johannes said:
Why would Google not send a response if the User-Agent header field is
not a recognized browser? I doubt that's what's happening. More likely
the OP would just like the convenience of a search API instead of
parsing the web page, as other applications do. Parsing is ugly, but it
works.

I suspect that it's the other way around - Google has blacklisted the
standard Python user agent rather than whitelisting user agents.

Think about how much power it takes to do a query on Google - if they
provided a search API they would lose out on advertising on the results
- which at the end of the day is their source of income.

It's a pain not to have a search API, but you've got to understand it!
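
For anyone who wants to see what the "standard Python user agent" actually is, a small Python 3 sketch along these lines prints the header a default urllib opener would send. The function name default_user_agent is made up for illustration; whether Google really blocks that value is not something this snippet can confirm.

```python
import urllib.request


def default_user_agent():
    """Return the User-Agent a default urllib opener would send."""
    opener = urllib.request.build_opener()
    # addheaders is a list of (name, value) pairs added to every request.
    return dict(opener.addheaders).get("User-agent")


# Prints something like "Python-urllib/3.x"; the suffix depends on the interpreter.
print(default_user_agent())
```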
 
