cannot get html content of tag with BeautifulSoup

someone

Hello,

Does anyone know how to get the HTML contents of a tag with
BeautifulSoup? For example, I'd like to get all the HTML inside the first
<p> tag, i.e. <span id="foo">This is paragraph</span> <b>one</b>., as a
unicode object.

p.contents gives me a list which I cannot join:
TypeError: sequence item 0: expected string, Tag found

Thanks!


from BeautifulSoup import BeautifulSoup
import re

doc = ['<html><head><title>Page title</title></head>',
       '<body><p id="firstpara" align="center"><span id="foo">This is paragraph</span> <b>one</b>.</p>',
       '<p id="secondpara" align="blah">This is paragraph <b>two</b>.</p>',
       '</body></html>']
soup = BeautifulSoup(''.join(doc))
#print soup.prettify()
r = re.compile(r'<[^<]*?/?>')
for i, p in enumerate(soup.findAll('p')):
    #print type(p) #<class 'BeautifulSoup.Tag'>
    #print type(p.contents) #list
    content = "".join(p.contents) #fails

    p_without_html = r.sub(' ', content)
    print p_without_html
 
someone

p.renderContents() was what I was looking for.
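
For anyone reading this on a newer setup, here is a minimal sketch of the same idea. It assumes BeautifulSoup 4 (the "bs4" package) on Python 3 rather than the BeautifulSoup 3 import used above, and the "html.parser" backend is my choice, not something from the original post. As far as I know, renderContents() still exists in bs4 only as a deprecated compatibility method that returns encoded bytes, so decode_contents() is used here to get a text string. The join works once each child is converted with str(), and get_text() stands in for the regex that stripped the tags.

```python
from bs4 import BeautifulSoup  # assumption: BeautifulSoup 4, not the BS3 module

html = (
    '<html><head><title>Page title</title></head>'
    '<body><p id="firstpara" align="center"><span id="foo">This is '
    'paragraph</span> <b>one</b>.</p>'
    '<p id="secondpara" align="blah">This is paragraph <b>two</b>.</p>'
    '</body></html>'
)

# "html.parser" is the stdlib-backed parser; chosen here to avoid extra installs
soup = BeautifulSoup(html, "html.parser")
first_p = soup.find("p")

# Inner HTML of the tag as a text (unicode) string
inner_html = first_p.decode_contents()

# The join from the question works once every child is turned into a string;
# for this input it yields the same result as decode_contents()
joined = "".join(str(child) for child in first_p.contents)

# Text only, with all markup dropped (replaces the regex substitution)
plain_text = first_p.get_text()

print(inner_html)
print(plain_text)
```

The original join failed because p.contents holds Tag and string objects, not plain strings, so each item has to be serialised before joining.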