Extract Information from Tables in HTML

Jackie Wang

Dear all,

Here is some HTML code:

<td valign="top" headers="col4">

Premier Community Bank of Southwest Florida
<br />
Fort Myers, FL

</td>

My question is how I can extract the strings and get the results:
Premier Community Bank of Southwest Florida; Fort Myers, FL

Thanks a lot

Jackie
 
Stefan Behnel

Hi,

Jackie said:
Here is some HTML code:

<td valign="top" headers="col4">

Premier Community Bank of Southwest Florida
<br />
Fort Myers, FL

</td>

My question is how I can extract the strings and get the results:
Premier Community Bank of Southwest Florida; Fort Myers, FL

Use lxml.html. Something like this should do what you want:
... print( td.xpath("normalize-space()") )
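The setup lines of that snippet did not survive, so here is a minimal self-contained sketch of the same idea. The `<table><tr>` wrapper, the `SAMPLE` and `cell_texts` names, and the `headers`-based XPath selector are assumptions made for illustration, not part of the original post.

```python
from lxml import html

# Markup from the question, wrapped in <table><tr> (an addition for this
# sketch) so the parser sees the cell in a valid table context.
SAMPLE = """<table><tr>
<td valign="top" headers="col4">

Premier Community Bank of Southwest Florida
<br />
Fort Myers, FL

</td>
</tr></table>"""


def cell_texts(markup, headers="col4"):
    """Return the whitespace-normalized text of each matching <td>."""
    doc = html.fromstring(markup)
    # $h is an XPath variable, filled in from the keyword argument.
    cells = doc.xpath("//td[@headers=$h]", h=headers)
    # normalize-space() collapses every run of whitespace, including the
    # line break around <br />, into a single space.
    return [td.xpath("normalize-space()") for td in cells]


for text in cell_texts(SAMPLE):
    print(text)
```

Note that `normalize-space()` joins the two lines with a single space, not with the "; " separator shown in the question.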

Tweak as you see fit, tree iteration is at your service in case you need more.
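If you want the "; " separator from the question, tree iteration could look roughly like the sketch below. It walks the text chunks of the cell with `itertext()` and joins the non-empty ones. The `<table><tr>` wrapper, the `SAMPLE` and `join_cell_parts` names, and the `headers`-based selector are assumptions made for illustration.

```python
from lxml import html

# Markup from the question, wrapped in <table><tr> (an addition for this
# sketch) so the parser sees the cell in a valid table context.
SAMPLE = """<table><tr>
<td valign="top" headers="col4">

Premier Community Bank of Southwest Florida
<br />
Fort Myers, FL

</td>
</tr></table>"""


def join_cell_parts(markup, headers="col4", sep="; "):
    """Join the text chunks of each matching <td> with the given separator."""
    doc = html.fromstring(markup)
    results = []
    for td in doc.xpath("//td[@headers=$h]", h=headers):
        # itertext() yields each text chunk inside the cell in document
        # order; split/join collapses the whitespace within a chunk.
        parts = (" ".join(chunk.split()) for chunk in td.itertext())
        # Skip chunks that were only whitespace.
        results.append(sep.join(p for p in parts if p))
    return results


for line in join_cell_parts(SAMPLE):
    print(line)
```

The text before and after the `<br />` comes out as separate chunks, which is why joining on "; " works here. Cells with nested markup may yield more chunks than you expect, so check the result against your real pages.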

http://codespeak.net/lxml/

Stefan
 
