python and http POST

galileo228 · Feb 11, 2010

Hey All,

Been teaching myself Python for a few weeks, and am trying to write a
program that will go to a url, enter a string in one of the search
fields, submit the search, and return the contents of the search
result.

I'm using httplib2.

My two particular questions:

1) When I set my 'body' var (i.e. body = {'query': 'search_term'}),
how do I know what the key should be? In other words, how do I tell
Python which form field on the web page I'm visiting I'd like to
fill in? Do I simply go to the webpage itself and look at the HTML
source? If so, which tag or attribute tells me the name of the
key?

2) Once Python has filled in the form properly, how can I tell it to
'submit' the search?

Thanks all!

Matt

Ken Seehart · Feb 11, 2010

"Use tamperdata to view and modify HTTP/HTTPS headers and post
parameters... "

https://addons.mozilla.org/en-US/firefox/addon/966

Enjoy,
Ken

Terry Reedy · Feb 11, 2010

This
http://groups.csail.mit.edu/uid/sikuli/
*might* help you.

Javier Collado · Feb 12, 2010

Hello,

I haven't used httplib2, but you can certainly use any other
alternative to send HTTP requests:
- urllib/urllib2
- mechanize

As for how to find the form you're looking for, you may:
- create the HTTP request on your own with urllib2. To find out which
variables you need to post, you can use the tamperdata Firefox addon as
suggested (I haven't used that one) or httpfox (I have, and it works
great).
- use mechanize to locate the form for you, fill the data in and click
on the submit button.
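
The first option above (building the request yourself) can be sketched roughly as follows. This is a minimal sketch, not the poster's code: it assumes Python 3, where urllib.parse.urlencode and urllib.request.Request take the place of Python 2's urllib.urlencode and urllib2.Request. The URL, the field name 'query' and the helper build_form_post are hypothetical. The request is only built here, not sent. Many servers will only parse the body as form data if the Content-Type header is set, so the sketch sets it.

```python
# Python 3 sketch; urllib.parse.urlencode replaces Python 2's urllib.urlencode.
from urllib.parse import urlencode
from urllib.request import Request


def build_form_post(url, fields):
    """Return an unsent POST request carrying `fields` as form data."""
    # urlencode turns {'query': 'search_term'} into 'query=search_term'.
    body = urlencode(fields).encode("ascii")
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    return Request(url, data=body, headers=headers, method="POST")


# Hypothetical URL and field name, for illustration only.
req = build_form_post("http://example.com/search", {"query": "search_term"})
print(req.get_method(), req.full_url)
print(req.data)
```

"Submitting" the form is then just sending the request, for example with urllib.request.urlopen(req).read(), which returns whatever page the server sends back.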

Additionally, you may want to scrape some data that may be useful for
your requests. For that, BeautifulSoup is a good solution (with some
Firebug help to visually locate what you're looking for).
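
To illustrate the scraping step, here is a minimal sketch that lists each form's action, method and named fields. It swaps in the standard library's html.parser for BeautifulSoup so that it runs without extra installs, and it assumes Python 3. The class name FormFieldLister and the sample HTML are made up for the example.

```python
from html.parser import HTMLParser


class FormFieldLister(HTMLParser):
    """Collect each form's action, method, and named input fields."""

    def __init__(self):
        super().__init__()
        self.forms = []
        self._current = None  # the form currently being parsed, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self._current = {
                "action": attrs.get("action") or "",
                # Forms without a method attribute are submitted with GET.
                "method": (attrs.get("method") or "get").lower(),
                "fields": {},
            }
            self.forms.append(self._current)
        elif tag in ("input", "select", "textarea") and self._current is not None:
            name = attrs.get("name")
            if name:  # fields without a name are not sent by browsers
                self._current["fields"][name] = attrs.get("value") or ""

    def handle_endtag(self, tag):
        if tag == "form":
            self._current = None


# Hypothetical page source, for illustration only.
SAMPLE = """
<form action="/search" method="post">
  <input type="text" name="query" value="">
  <input type="hidden" name="lang" value="en">
  <input type="submit" value="Go">
</form>
"""

lister = FormFieldLister()
lister.feed(SAMPLE)
for form in lister.forms:
    print(form["method"], form["action"], form["fields"])
```

The keys of the printed dictionary are the names to use in the request body, and the action and method say where and how to send it.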

Best regards,
Javier

P.S. Some examples here:
http://www.packtpub.com/article/web-scraping-with-python
http://www.packtpub.com/article/web-scraping-with-python-part-2

galileo228 · Feb 13, 2010

Thank you all for your responses, and Javier thank you for your longer
response. I've just downloaded mechanize and beautifulsoup and will
start to play around.

From a pure learning standpoint, however, I'd really like to learn how
to use the python post method (without mechanize) to go to a webpage,
fill in a form, click 'submit', follow the redirect to the results
page, and download content.

For example, if I go to google.com, use firebug and click on the
search bar, the following HTML is highlighted:

<input value="" title="Google Search" class="lst" size="55" name="q"
maxlength="2048"
onblur="google&&google.fade&&google.fade()"
autocomplete="off">

So if I were to use the 'post' method, how can I tell from the code
above what the ID of the searchbar is? Is it 'value', 'name', or
neither?

Assuming that the ID is 'name', then to search google for the term
'olympics' would the proper code be:

import httplib2
import urllib
data = {'q':'olympics'}
body = urllib.urlencode(data)
h = httplib2.Http()
resp, content = h.request("http://www.google.com", method="POST", body=body)
print content

Does content return the content of the 'search results' page? And if
not, how do I tell python to do that?
Finally, must I transmit headers, or are they optional?

Thanks all for your continued help!

Matt
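
On the question of which attribute is the key: in an HTML form the 'name' attribute is what gets sent, so {'q': 'olympics'} has the right shape. A form is not always submitted with POST, though. The sketch below assumes that the search form uses GET and that its action is /search. Neither detail is stated in the thread, so both should be checked against the page's <form> tag. It also assumes Python 3, and build_get_url is a hypothetical helper. With GET, the encoded fields go into the URL rather than the body.

```python
from urllib.parse import urlencode, urljoin


def build_get_url(page_url, action, fields):
    """Return the URL a browser would request for a GET form submission."""
    # Resolve the form's action against the page it appeared on.
    action_url = urljoin(page_url, action)
    # For GET, the encoded name=value pairs become the query string.
    return action_url + "?" + urlencode(fields)


# Assumed action and method; check the page's <form> tag before relying on them.
url = build_get_url("http://www.google.com/", "/search", {"q": "olympics"})
print(url)
```

Fetching that URL, for example with urllib.request.urlopen(url), plays the role of clicking 'submit'.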
