Login problem with a site, help please

james27 · Oct 5, 2009

Hello,
I'm new to Python and have a problem with mechanize.
I have used mechanize before without trouble, but I can't get a login to work properly on one site.
I have been looking for a solution for several days without success.
The login itself does not seem to be the problem; what I can't do is retrieve the HTML source of the page that opens afterwards.
All I get back is a small piece of HTML like this:

<html>
<script language=javascript>
location.replace("http://www.naver.com");
</script>
</html>

I want to retrieve the full HTML source, but I can't. I have tried twill, mechanize, urllib and so on.
I'm out of ideas. Can anyone help me?

Here is the full source code.
Thanks in advance!

# -*- coding: cp949 -*-
import sys,os
import mechanize, urllib
import cookielib
import re
import BeautifulSoup

# Form fields sent to the login page, URL-encoded as the POST body.
params = urllib.urlencode({
    'url': 'http://www.naver.com',
    'svctype': '',
    'viewtype': '',
    'postDataKey': '',
    'encpw': '3a793b174d976d8a614467eb0466898230f39ca68a8ce2e9c866f9c303e7c96a17c0e9bfd02b958d88712f5799abc5d26d5b6e2dfa090e10e236f2afafb723d42d2a2aba6cc3f268e214a169086af782c22d0c440c876a242a4411860dd938c4051acce987',
    'encnm': '100003774',
    'saveID': '0',
    'enctp': '1',
    'smart_level': '1',
    'id': 'lbu142vj',
    'pw': 'wbelryl',
    'x': '24',
    'y': '4',
})

# POST the login form and print whatever the server sends back.
rq = mechanize.Request("http://nid.naver.com/nidlogin.login", params)
rs = mechanize.urlopen(rq)
data = rs.read()
print data

# Then request the mail page and print its HTML.
rq = mechanize.Request("http://mail2.naver.com")
rs = mechanize.urlopen(rq)
data = rs.read()
print data
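
For reference, if the session cookie is not being carried from the login request to the second request, one opener with a cookie jar can be used for both. This is only a sketch under assumptions: it uses the Python 3 standard library instead of the Python 2 mechanize code above, the names make_session_opener and login_then_fetch are made up for illustration, and whether mechanize.urlopen already keeps cookies between calls is not something this thread establishes. It does not execute any JavaScript.

```python
import http.cookiejar
import urllib.parse
import urllib.request


def make_session_opener():
    """Return (opener, jar); the opener stores and resends cookies via jar."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar)
    )
    return opener, jar


def login_then_fetch(opener, login_url, form_fields, page_url):
    """POST form_fields to login_url, then GET page_url with the same opener."""
    body = urllib.parse.urlencode(form_fields).encode("ascii")
    # Passing a body makes this a POST; cookies set here land in the jar.
    opener.open(login_url, body).read()
    # The same opener sends those cookies back on the follow-up request.
    return opener.open(page_url).read()
```

With the URLs from the post this would be called as login_then_fetch(opener, "http://nid.naver.com/nidlogin.login", fields, "http://mail2.naver.com"), where fields is the same dictionary of form values.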

Diez B. Roggisch · Oct 5, 2009

Your problem is that the site uses JavaScript to replace itself. Mechanize
can't do anything about that. You might have more luck with scripting a
browser. No idea if there are any special packages available for that
though.

Diez
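
For a stub as simple as the one quoted in the question, where the target URL is a literal string inside location.replace(...), the address can be pulled out with a regular expression and then requested by hand. This is a minimal sketch in Python 3, not a general answer: find_js_redirect is a hypothetical helper, and it does nothing for pages that build the URL or the content with real JavaScript logic.

```python
import re

# Matches location.replace("...") or location.replace('...') with a literal URL.
_REDIRECT_RE = re.compile(r"location\.replace\(\s*(['\"])(.+?)\1\s*\)")


def find_js_redirect(html):
    """Return the literal URL passed to location.replace(), or None."""
    match = _REDIRECT_RE.search(html)
    if match is None:
        return None
    return match.group(2)


stub = """<html>
<script language=javascript>
location.replace("http://www.naver.com");
</script>
</html>"""

print(find_js_redirect(stub))
```

The returned URL can then be fetched with the same opener or browser object that performed the login, so that any session cookies are sent along.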

james27 · Oct 5, 2009

Still looking for a good solution.
Anyway, thanks Diez.

lkcl · Oct 12, 2009

Diez B. Roggisch said:
Your problem is that the site uses JavaScript to replace itself. Mechanize
can't do anything about that. You might have more luck with scripting a
browser. No idea if there are any special packages available for that
though.

yes, there are. i've mentioned this a few times on comp.lang.python
(so you can search for them), and have the instances documented here:

http://wiki.python.org/moin/WebBrowserProgramming

basically, you're not going to like this, but you actually need
a _full_ web browser engine, and to _execute_ the javascript.
then, after a suitable period of time (or after the engine's
"stopped executing" callback has been called, if it has one)
you can then node-walk the DOM of the engine, grab the engine's
document.body.innerHTML property, or use the engine's built-in
XPath support (if it has it) to find specific parts of the DOM
faster than if you extracted the text (into lxml etc).
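
as a rough illustration of the steps above (load the page in a real engine, give the javascript time to run, then read document.body.innerHTML), here is a sketch in Python 3. it assumes the third-party Selenium package and a Firefox driver are installed, which is one option and not something named in this thread; rendered_body_html, settle_seconds and main are made-up names, and the fixed sleep stands in for the "suitable period of time".

```python
import time


def rendered_body_html(driver, url, settle_seconds=2.0):
    """Load url in a scripted browser and return the body HTML after JS runs.

    driver is any object with get(url) and execute_script(js), such as a
    Selenium WebDriver instance.
    """
    driver.get(url)
    # Crude wait so that scripts such as location.replace() can finish.
    time.sleep(settle_seconds)
    return driver.execute_script("return document.body.innerHTML;")


def main():
    # Imported here so the helper above can be used without Selenium installed.
    from selenium import webdriver

    driver = webdriver.Firefox()
    try:
        print(rendered_body_html(driver, "http://www.naver.com"))
    finally:
        driver.quit()
```

calling main() opens a browser window, so it is left for the reader to invoke. a login step would have to be driven through the same browser object before fetching the protected page.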

you should not be shocked by this - by the fact that it takes
a whopping 10 or 20mb library, including a graphical display
mechanism, to execute a few bits of javascript.

also, if you ask him nicely, flier liu is currently working on
http://code.google.com/p/pyv8 and on implementing the W3C DOM
standard as a "daemon" service (i.e. with no GUI component) and
he might be able to help you out. the pyv8 project comes with
an example w3c.py file which implements DOM partially, but i
know he's done a lot more.

so - it's all doable, but for a given value of "do"

l.
