python - HTML processing - need tips

wipit

I need to process an HTML form in Python. I'm using urllib2 and
HTMLParser to handle the HTML. Getting to the page I want on the site
takes several steps, the first of which is logging in with a
username and password. The HTML for the login consists of two edit
boxes (User ID and Password) and a Submit button that uses ASP.NET
client-side validation, as follows (formatted for clarity):

<tr>
<td align="right"><b>User ID:</b>
</td>
<td align="left"><input name="txtUserName" id="txtUserName"
type="text" maxlength="63" /></td>
<td><span id="vEmail" controltovalidate="txtUserName"
errormessage="Valid Email format is required" isvalid="False"
evaluationfunction="RegularExpressionValidatorEvaluateIsValid"
validationexpression="\w+([-+.']\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*"
style="color:Red;font-size:Smaller;font-weight:bold;">Valid Email
format is required</span>
</td>
</tr>
<tr>
<td align="right"><b>Password:</B>
</td>
<td align="left"><input name="txtUserPass" id="txtUserPass"
type="password" maxlength="49" /></td>
<td>&nbsp;</td>
</tr>
<tr >
<td>&nbsp;</td>
<td align="left"><input type="submit" name="loginButton"
value="Submit" onclick="if (typeof(Page_ClientValidate) == 'function')
Page_ClientValidate(); " language="javascript" id="loginButton" />
<td>&nbsp;</td>
</tr>

I've read the relevant posts on this topic and have already looked
at mechanize and ClientForm. It appears I can't use them, for two
reasons: 1) they can't handle client-side validation, and 2) the
button doesn't actually reside in a form, and I haven't been able to
find any Python code that obtains a handle to a submit control and
simulates clicking on it.

I've tried sending the server a POST request as follows:

import urllib
from urllib2 import Request, urlopen

# Form fields, keyed by the input names on the login page
loginParams = urllib.urlencode({'txtUserName': theUsername,
                                'txtUserPass': thePassword})
txheaders = {'User-agent': 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'}
# url1 points to the secure page seen following login
req = Request(url1, loginParams, txheaders)  # supplying data makes this a POST
handle = urlopen(req)

But this doesn't work. I don't understand what Page_ClientValidate()
does and haven't found any documentation on it that is useful for my
purposes. I basically need to submit this information to the site by
simulating the onclick event from Python. As far as I understand, I
need a solution to the two points above (getting past client-side
validation and simulating a click on a non-form button). Any help on
this, or on other relevant issues I might have missed, would be great!

Many thanks,
Pythonner
 
Gabriel Genellina

Another approach would be to use HTTPDebugger
<http://www.softx.org/debugger.html> to see exactly what the browser
submits, and then build a compatible Request.
On many sites you don't even need to *get* the login page, let alone
parse it; just posting the right Request is enough to log in successfully.
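
A minimal sketch of that idea, assuming Python 3, where `urllib.request` and `urllib.parse` take the place of `urllib2` and `urllib.urlencode`. The URL, the credentials, the helper name `build_login_request`, and the decision to send `loginButton=Submit` along with the two text fields are all assumptions for illustration. The real field list should be whatever the debugger shows the browser sending, which on ASP.NET pages may well include hidden fields too.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Hypothetical URL: replace with the address the browser actually posts to.
LOGIN_URL = "https://example.com/login.aspx"


def build_login_request(url, username, password, extra_fields=None):
    """Build a POST request carrying the fields a browser would submit."""
    fields = {
        "txtUserName": username,
        "txtUserPass": password,
        # Assumption: the server also expects the submit button's name/value.
        "loginButton": "Submit",
    }
    if extra_fields:
        # Any additional fields observed in the captured browser request.
        fields.update(extra_fields)
    body = urlencode(fields).encode("ascii")
    headers = {
        "User-Agent": "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)",
        "Content-Type": "application/x-www-form-urlencoded",
    }
    # Supplying a body makes urllib send this as a POST.
    return Request(url, data=body, headers=headers)


req = build_login_request(LOGIN_URL, "user@example.com", "secret")
print(req.get_method())
print(req.data.decode("ascii"))
# response = urlopen(req)  # uncomment to actually send the request
```

Nothing here runs or imitates the page's JavaScript: client-side validation only happens in the browser, so a script that posts the same fields never encounters it.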



Gabriel Genellina
'@'.join(('gagsl-py','.'.join(('yahoo','com','ar'))))





 
wipit

I figured it out... Just turned the POST request into a GET to see what
was getting appended to the URL - thanks Gabe!
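
For readers wanting to try the same diagnostic, here is one hedged reading of it in Python 3: encode the fields as a query string and look at the resulting URL. The post does not say exactly how the switch to GET was made, so the helper `as_get_url`, the URL, and the sample values below are hypothetical.

```python
from urllib.parse import urlencode


def as_get_url(url, fields):
    """Return the URL that a GET submission of `fields` would produce."""
    separator = "&" if "?" in url else "?"
    return url + separator + urlencode(fields)


# Hypothetical URL and credentials, using the field names from the login page.
url = as_get_url(
    "https://example.com/login.aspx",
    [("txtUserName", "user@example.com"), ("txtUserPass", "secret")],
)
print(url)
```

Comparing this string with the URL the browser produces for a GET submission shows which fields, if any, are missing from the script's request.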
 
