getting title, description for webpages

  • Thread starter रवींदर ठाकुर (ravinder thakur)

रवींदर ठाकुर (ravinder thakur)

hello friends,


i am trying to find a generic way of getting the title and
description of a webpage, like the ones google shows in its search
results. is there any _easy_ method we can use for this? i have
tried the google apis, but apparently they don't contain the titles/
descriptions of all web pages. i will be doing this in python.


thanks
ravinder thakur
 

Jukka K. Korpela

Scripsit रवींदर ठाकुर (ravinder thakur):
i am trying to find some generic way of getting the title and
description of webpages [...] i will be doing this in python.

Try googling with words like
python html parse

The first hit I got is
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/286269
which might suit your needs.

It's probably easier to write two good HTML parsers than to decide which
of them is better. But for extracting the <title> element and the <meta>
element with name="description", any good or half-good parser should do.
Just make sure you recognize the tag and attribute names and the value
"description" in a case-insensitive manner, and do not change the case of
anything in the title and description you extract (unless you really
want to).
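
For illustration, here is a minimal sketch of that approach. It assumes Python 3 and the standard library's html.parser module, not the recipe linked above. The names TitleDescriptionParser and extract_title_and_description are made up for this example, and fetching the page itself is left out.

```python
from html.parser import HTMLParser


class TitleDescriptionParser(HTMLParser):
    """Collect the <title> text and the content of <meta name="description">.

    Hypothetical helper class for illustration only.
    """

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.title = None
        self.description = None
        self._in_title = False
        self._title_parts = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag and attribute names already lowercased,
        # so only the attribute *value* needs an explicit
        # case-insensitive comparison.
        if tag == "title" and self.title is None:
            self._in_title = True
        elif tag == "meta" and self.description is None:
            attr_map = dict(attrs)
            name = attr_map.get("name") or ""
            if name.lower() == "description":
                # Keep the extracted text exactly as the page wrote it.
                self.description = attr_map.get("content")

    def handle_data(self, data):
        if self._in_title:
            self._title_parts.append(data)

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self.title = "".join(self._title_parts).strip()


def extract_title_and_description(html_text):
    """Return (title, description); either may be None if not found."""
    parser = TitleDescriptionParser()
    parser.feed(html_text)
    parser.close()
    return parser.title, parser.description
```

Calling extract_title_and_description() on the decoded text of a page returns a (title, description) pair, with None for whichever element the page lacks. Only the first <title> and the first matching <meta> are kept, which is a simplification.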
 
