Website data-mining.

C

Coogan

Hi--


I'm using Python for the first time to make a plug-in for Firefox.
The goal of this plug-in is to take the source code from a website
and use the metadata and body text for different kinds of analysis.
My question is: How can I retrieve data from a website? I'm not even
sure if this is possible through Python. Any help?




nieu
 
S

SMERSH009

Hi--

I'm using Python for the first time to make a plug-in for Firefox.
The goal of this plug-in is to take the source code from a website
and use the metadata and body text for different kinds of analysis.
My question is: How can I retrieve data from a website? I'm not even
sure if this is possible through Python. Any help?

nieu

How about this? it will fetch the HTML source of the page.

import datetime, time, re, os, sys, traceback, smtplib, string,\
urllib2, urllib, inspect
from urllib2 import build_opener, HTTPCookieProcessor, Request
opener = build_opener(HTTPCookieProcessor)
from urllib import urlencode

def urlopen2(url, data=None, user_agent='urlopen2'):
"""Opens Our URLS """
if hasattr(data, "__iter__"):
data = urlencode(data)
headers = {'User-Agent' : user_agent}
return opener.open(Request(url, data, headers))

###TESTCASES START HERE###
def publishedNotes():
page = urlopen2("http://www.yourURL.com", ())
pageRead = page.read()
print pageRead

if __name__ == '__main__':
publishedNotes()

sys.exit()
 
M

Miki

Hello,
I'm using Python for the first time to make a plug-in for Firefox.
The goal of this plug-in is to take the source code from a website
and use the metadata and body text for different kinds of analysis.
My question is: How can I retrieve data from a website? I'm not even
sure if this is possible through Python. Any help?
Have a look at http://www.myinterestingfiles.com/2007/03/playboy-germany-ads.html
for getting the data and at http://www.crummy.com/software/BeautifulSoup/
for handling it.

HTH.
 
P

Paul Boddie

Jay said:
Well, it's certainly interesting, but I'm not sure how it might help the OP get data from a website...

A case of the Freudian clipboard, perhaps? ;-)

Paul
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,774
Messages
2,569,596
Members
45,142
Latest member
DewittMill
Top