BeautifulSoup

Discussion in 'Python' started by elsa, Sep 2, 2009.

  1. elsa

    elsa Guest

    Hi all,

    if I have some HTML that looks like this:

    <area coords="427,724,432,732" href="http://BioCyc.org/ECOLI/NEW-IMAGE?
    type=GENE-IN-CHROM-BROWSER&amp;object=EG12309" onmouseover="return
    overlib('&lt;b&gt;Gene:&lt;/b&gt; yjtD&lt;BR&gt;&lt;b&gt;Product:&lt;/
    b&gt; predicted rRNA methyltransferase, subunit of predicted rRNA
    methyltransferase&lt;BR&gt;&lt;b&gt;Intergenic distances (bp):&lt;/
    b&gt; yjjY&lt; +400 yjtD +214 &gt;thrL');"><b>Gene:</b> yjtD<br /
    ><b>Product:</b> predicted rRNA methyltransferase, subunit of

    predicted rRNA methyltransferase<br /><b>Intergenic distances (bp):</
    b> yjjY< +400 yjtD +214 >thrL');" onmouseout="return nd();">
    </area>

    is there an easy way to use BeautifulSoup to extract just the value of
    the href attribute?

    Thanks,

    elsa
     
    elsa, Sep 2, 2009
    #1

  2. Peter Otten

    Peter Otten Guest

    elsa wrote:

    > if I have some HTML that looks like this:
    >
    > <area coords="427,724,432,732" href="http://BioCyc.org/ECOLI/NEW-IMAGE?
    > type=GENE-IN-CHROM-BROWSER&amp;object=EG12309" onmouseover="return
    > overlib('&lt;b&gt;Gene:&lt;/b&gt; yjtD&lt;BR&gt;&lt;b&gt;Product:&lt;/
    > b&gt; predicted rRNA methyltransferase, subunit of predicted rRNA
    > methyltransferase&lt;BR&gt;&lt;b&gt;Intergenic distances (bp):&lt;/
    > b&gt; yjjY&lt; +400 yjtD +214 &gt;thrL');"><b>Gene:</b> yjtD<br /
    >><b>Product:</b> predicted rRNA methyltransferase, subunit of

    > predicted rRNA methyltransferase<br /><b>Intergenic distances (bp):</
    > b> yjjY< +400 yjtD +214 >thrL');" onmouseout="return nd();">
    > </area>
    >
    > is there an easy way to use BeautifulSoup to extract just the value of
    > the href attribute?


    >>> from BeautifulSoup import BeautifulSoup as BS
    >>> html = "<area ..."
    >>> BS(html).find("area")["href"]
    u'http://BioCyc.org/ECOLI/NEW-IMAGE?\ntype=GENE-IN-CHROM-BROWSER&object=EG12309'
     
    Peter Otten, Sep 2, 2009
    #2
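
For readers on a current setup, here is a minimal sketch of the same lookup, assuming Python 3 and the `bs4` package (BeautifulSoup 4) rather than the BeautifulSoup 3 import used in the reply. The `html` string is a shortened, hypothetical stand-in for the `<area>` tag in the question, and the whitespace cleanup at the end is an optional extra, not part of the original answer.

```python
from bs4 import BeautifulSoup

# Shortened stand-in for the <area> tag from the question (hypothetical sample).
# The href is wrapped across two lines, as in the original post.
html = (
    '<area coords="427,724,432,732" '
    'href="http://BioCyc.org/ECOLI/NEW-IMAGE?\n'
    'type=GENE-IN-CHROM-BROWSER&amp;object=EG12309" '
    'onmouseout="return nd();"></area>'
)

# "html.parser" is the parser bundled with the standard library.
soup = BeautifulSoup(html, "html.parser")
area = soup.find("area")

# .get() returns None for a missing attribute instead of raising KeyError,
# which area["href"] would do.
href = area.get("href") if area is not None else None

# The wrapped markup leaves a newline inside the attribute value;
# removing all whitespace yields a usable URL.
clean_href = "".join(href.split()) if href else None

print(repr(href))
print(clean_href)
```

Indexing with `area["href"]`, as in the reply above, works the same way in `bs4`; `.get("href")` is just the more forgiving choice when some tags may lack the attribute.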