Parse a table in an HTML page

Discussion in 'Python' started by antonio_wn8, Oct 28, 2008.

  1. antonio_wn8

    antonio_wn8 Guest

    Hi all,
    I need to read and parse a table in an HTML page.

    I’m using the following script:
    http://trac.davidgrant.ca/browser/src/python/misc/siteuptime/TableParser.py

    It works fine, except that it drops the link in the href attribute.

    Example:

    String to parse:
    <tr><td><a href='vaffa.html'>elog</a></td><td>normal text</td></tr>

    Output:
    [[['elog', 'normal text']]]

    As you can see, the href information is missing.
    How can I get this value, 'vaffa.html'?

    thanks,
    Antonella
     
    antonio_wn8, Oct 28, 2008
    #1

  2. Have you looked at Beautiful Soup?
    http://www.crummy.com/software/BeautifulSoup/

    antonio_wn8 wrote:
    > Hi all,
    > I need to read and parse a table in an HTML page.
    >
    > I’m using the following script:
    > http://trac.davidgrant.ca/browser/src/python/misc/siteuptime/TableParser.py
    >
    > It works fine, except that it drops the link in the href attribute.
    >
    > Example:
    >
    > String to parse:
    > <tr><td><a href='vaffa.html'>elog</a></td><td>normal text</td></tr>
    >
    > Output:
    > [[['elog', 'normal text']]]
    >
    > As you can see, the href information is missing.
    > How can I get this value, 'vaffa.html'?




    --
    Thomas Guettler, http://www.thomas-guettler.de/
    E-Mail: guettli (*) thomas-guettler + de
     
    Thomas Guettler, Oct 28, 2008
    #2

  3. antonio_wn8 wrote:
    > I need to read and parse a table in an HTML page.
    >
    > I’m using the following script:
    > http://trac.davidgrant.ca/browser/src/python/misc/siteuptime/TableParser.py
    >
    > It works fine, except that it drops the link in the href attribute.
    >
    > Example:
    >
    > String to parse:
    > <tr><td><a href='vaffa.html'>elog</a></td><td>normal text</td></tr>
    >
    > Output:
    > [[['elog', 'normal text']]]


    You should try lxml.html. It gives you various tools like XPath to look for
    specific elements and helper functions to find the links in an HTML document.

    http://codespeak.net/lxml/
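
A minimal sketch of the XPath approach, assuming a current lxml release is installed. The function name `table_rows` and the `(text, href)` tuple layout are made up for illustration, and the sample row from the question is wrapped in a `<table>` element so that the fragment is well-formed:

```python
import lxml.html

# Sample row from the question, wrapped in <table> (an assumption for this sketch).
HTML = (
    "<table><tr><td><a href='vaffa.html'>elog</a></td>"
    "<td>normal text</td></tr></table>"
)


def table_rows(html):
    """Return one list per <tr>; each cell is a (text, href-or-None) tuple."""
    root = lxml.html.fromstring(html)
    rows = []
    for tr in root.xpath(".//tr"):
        cells = []
        for td in tr.xpath("./td"):
            # Collect the href attributes of any links inside this cell.
            hrefs = td.xpath(".//a/@href")
            text = td.text_content().strip()
            cells.append((text, hrefs[0] if hrefs else None))
        rows.append(cells)
    return rows


print(table_rows(HTML))
```

Only the first link in each cell is kept here; a cell with several links would need the whole `hrefs` list instead.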

    Stefan
     
    Stefan Behnel, Oct 28, 2008
    #3
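
If a third-party library is not an option, the missing href can also be captured with the standard library alone. The following is a sketch under assumptions, not the code of the linked TableParser.py: it uses Python 3's `html.parser.HTMLParser`, and the names `LinkAwareTableParser` and `parse_table` are hypothetical. The idea is to read the `href` attribute in `handle_starttag`, since text-only parsers usually look at `handle_data` and never see attributes:

```python
from html.parser import HTMLParser


class LinkAwareTableParser(HTMLParser):
    """Collect table rows as lists of (text, href-or-None) tuples."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the <tr> currently being parsed
        self._cell = None  # text parts and href of the current <td>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = {"text": [], "href": None}
        elif tag == "a" and self._cell is not None:
            # attrs is a list of (name, value) pairs; keep the first link only.
            if self._cell["href"] is None:
                self._cell["href"] = dict(attrs).get("href")

    def handle_data(self, data):
        if self._cell is not None:
            self._cell["text"].append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            text = "".join(self._cell["text"]).strip()
            self._row.append((text, self._cell["href"]))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def parse_table(html):
    parser = LinkAwareTableParser()
    parser.feed(html)
    parser.close()
    return parser.rows


row = "<tr><td><a href='vaffa.html'>elog</a></td><td>normal text</td></tr>"
print(parse_table(row))
```

This sketch ignores `<th>` cells and nested tables, which a real page may contain.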
