Extract Information from Tables in html

Jackie Wang · Sep 5, 2008

Dear all,

Here is some HTML code:

<td valign="top" headers="col4">

Premier Community Bank of Southwest Florida
<br />
Fort Myers, FL

</td>

My question is: how can I extract the strings to get this result?
Premier Community Bank of Southwest Florida; Fort Myers, FL

Thanks a lot

Jackie

Stefan Behnel · Sep 5, 2008

Hi,

Jackie said:
Here is some HTML code:

<td valign="top" headers="col4">

Premier Community Bank of Southwest Florida
<br />
Fort Myers, FL

</td>

My question is: how can I extract the strings to get this result?
Premier Community Bank of Southwest Florida; Fort Myers, FL

Use lxml.html. Something like this should do what you want:
... print( td.xpath("normalize-space()") )

Tweak as you see fit. Tree iteration is at your service in case you need more.
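
For anyone who wants something to paste and run, here is a minimal sketch of the idea, not a definitive implementation. It assumes the cell sits inside a table (the wrapping <table><tr> is added here only so the fragment parses cleanly) and that the headers="col4" attribute is a good enough way to find it. The variable names are made up for illustration. Note that normalize-space() gives one space-separated string, so the "; " separator from the question is produced by iterating over the text pieces instead.

```python
import lxml.html

# Sample markup from the question; the <table><tr> wrapper is an assumption
# added so that the fragment is a well-formed table.
html = """
<table><tr>
<td valign="top" headers="col4">

Premier Community Bank of Southwest Florida
<br />
Fort Myers, FL

</td>
</tr></table>
"""

doc = lxml.html.fromstring(html)

# Pick the cell by its headers attribute (assumed to identify the column).
td = doc.xpath('//td[@headers="col4"]')[0]

# normalize-space() trims the text and collapses whitespace runs to one space.
flat = td.xpath("normalize-space()")

# itertext() yields each text piece in document order; clean up each piece,
# drop the empty ones, and join them with "; ".
parts = [" ".join(text.split()) for text in td.itertext()]
joined = "; ".join(part for part in parts if part)

print(flat)
print(joined)
```

The second form relies on the <br /> splitting the cell's text into two pieces. If a cell contains other inline markup (links, bold text), the pieces will be split at those tags as well, so the joining rule may need adjusting.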

http://codespeak.net/lxml/

Stefan
