Beautiful Soup library question

meyerkp · Mar 10, 2006

Hi all,

I'm trying to extract some information from an HTML file using
Beautiful Soup. The strings I want to get are after <br> tags, e.g.:


<font>
<br> this info
<br> more info
<br> and more info
</font>


I can navigate to the first <br> tag using find_next_sibling, but how do
I get the string after each <br>?
br.contents is empty.

Thanks for any ideas.

Erik Max Francis · Mar 10, 2006

I'm not familiar with Beautiful Soup specifically, but this isn't how
the <br> tag works. Unlike a tag like <li> or <p>, which need not be
closed in HTML, <br> does not contain anything; it's just a line break.
If it were XHTML, it would be <br/>, indicating that it's a
standalone tag.

Instead you want to traverse the contents of the font tag, taking into
account line breaks that you encounter.
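
The traversal described above could be sketched roughly as follows. This uses Python 3's standard-library html.parser rather than Beautiful Soup, and the class name TextAfterBr, the helper text_after_breaks, and the sample markup are all made up for illustration; it assumes the strings sit directly inside a single <font> element.

```python
from html.parser import HTMLParser


class TextAfterBr(HTMLParser):
    """Collect the text that follows each <br> inside a <font> element."""

    def __init__(self):
        super().__init__()
        self.in_font = False   # True while inside <font>...</font>
        self.after_br = False  # True once a <br> has been seen in the font
        self.lines = []        # one entry per <br> encountered

    def handle_starttag(self, tag, attrs):
        if tag == "font":
            self.in_font = True
        elif tag == "br" and self.in_font:
            # <br> holds nothing itself; start a new entry for what follows.
            self.after_br = True
            self.lines.append("")

    def handle_endtag(self, tag):
        if tag == "font":
            self.in_font = False
            self.after_br = False

    def handle_data(self, data):
        # Text arrives as a separate event, not as part of the <br>.
        if self.in_font and self.after_br:
            self.lines[-1] += data


def text_after_breaks(html):
    parser = TextAfterBr()
    parser.feed(html)
    parser.close()
    return [line.strip() for line in parser.lines]


# Hypothetical sample markup, modelled on the question.
sample = "<font><br> this info<br> more info<br> and more info</font>"
print(text_after_breaks(sample))
```

The point it illustrates is that the parser reports the text as its own event between tags, so the strings belong to the enclosing font element and not to any <br>.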

Enigma Curry · Mar 11, 2006

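Here's how I print each line after the <br>'s:

import BeautifulSoup as Soup

# Read the whole HTML file into a string.
page = open("test.html").read()
soup = Soup.BeautifulSoup(page)

# A <br> has no contents; the text that follows it is the next node.
for br in soup.fetch('br'):
    print br.next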
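
For anyone reading this later, here is a rough equivalent for Python 3, assuming the separately installed bs4 package (Beautiful Soup 4) rather than the module used in the post above. The helper name text_after_breaks and the inline sample markup are made up for illustration; next_sibling is used so that only text sitting directly after a <br> is picked up.

```python
from bs4 import BeautifulSoup, NavigableString

# Hypothetical sample markup, modelled on the question.
html = "<font><br> this info<br> more info<br> and more info</font>"
soup = BeautifulSoup(html, "html.parser")


def text_after_breaks(soup):
    """Return the string that directly follows each <br>, if there is one."""
    lines = []
    for br in soup.find_all("br"):
        node = br.next_sibling
        # Skip a <br> that is followed by another tag or by nothing at all.
        if isinstance(node, NavigableString):
            lines.append(str(node).strip())
    return lines


print(text_after_breaks(soup))
```

The isinstance check is a design choice: two <br> tags in a row would otherwise yield a tag where a string is expected.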
