How to Encode Parameters into an HTML Parsing Script


SMERSH009X

I've written a script that navigates various URLs on a website and
fetches their contents.
The URLs are fed from a list, "urlList". Everything works splendidly
until I need to encode parameters for a particular URL.
For example, how would I navigate to the encoded URL
http://online.investools.com/landing.iedu?signedin=true rather than
just http://online.investools.com/landing.iedu?
How can I modify the script to urlencode the parameters
{signedin:true} and associate them with a specific URL from
urlList?
Thank you!


import datetime, time, re, os, sys, traceback, smtplib, string, urllib2, urllib, inspect
from urllib2 import build_opener, HTTPCookieProcessor, Request
opener = build_opener(HTTPCookieProcessor)
from urllib import urlencode

def urlopen2(url, data=None, user_agent='urlopen2'):
    """Opens Our URLS """
    if hasattr(data, "__iter__"):
        data = urlencode(data)
    headers = {'User-Agent' : user_agent} # User-Agent for Unspecified Browser
    return opener.open(Request(url, data, headers))

def badCharCheck(host,url):
    try:
        page = urlopen2("http://"+host+".investools.com/"+url+"", ())
        pageRead= page.read()
        print "Loading:",url
        #print pageRead
    except:
        print "Failed: ", traceback.format_tb(sys.exc_info()[2]),'\n'


if __name__ == '__main__':
    host= "online"
    urlList = ["landing.iedu","sitemap.iedu"]
    print "\n","***** Begin BadCharCheck for", host
    for url in urlList:
        badCharCheck(host,url)

    print'***** TEST FINISHED! Total Runs:'
    sys.exit()

OUTPUT:
***** Begin BadCharCheck for online
Loading: landing.iedu
Loading: sitemap.iedu
***** TEST FINISHED! Total Runs:
 

Gabriel Genellina


If you want to use GET, append '?' plus the encoded parameters to the
desired url:

py> data = {'signedin':'true', 'another':42}
py> print urlencode(data)
signedin=true&another=42

Do not use the data argument to urlopen.
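
As a rough sketch of how this advice might be applied to the script in the question: each entry in urlList could carry its own optional parameters, and a small helper could append '?' plus the encoded parameters when there are any. The tuple layout of urlList and the helper name build_url are assumptions made for illustration, not part of the original script. The import is written so the snippet runs on either Python 2 (as in the thread) or Python 3.

```python
try:
    from urllib import urlencode          # Python 2, as used in the original script
except ImportError:
    from urllib.parse import urlencode    # Python 3

# Assumed layout: each entry pairs a page with its optional query parameters.
urlList = [
    ("landing.iedu", {"signedin": "true"}),
    ("sitemap.iedu", None),
]

def build_url(host, path, params=None):
    """Return the GET URL, appending '?' plus the encoded params if any (hypothetical helper)."""
    url = "http://" + host + ".investools.com/" + path
    if params:
        url += "?" + urlencode(params)
    return url

for path, params in urlList:
    print(build_url("online", path, params))
```

The resulting string can then be passed to urlopen2 as the url, leaving data at its default of None so that the request remains a GET.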
 

SMERSH009X


Sweet! I love this python group
 
