rh0dium
Hi all,
I am trying to parse a table into a dictionary and I am having all
kinds of fun. Can someone please help me out?
What I want is this:
dic={'Division Code':'SALS','Employee':'LOO ABLE'}
Here is what I have:
html="""<table> <tr valign="top"><td width="24"><img
src="/icons/ecblank.gif" border="0" height="1" width="1" alt=""
/></td><td width="129"><b><font size="2" face="Arial">Division Code:
</font></b></td><td width="693"><font size="2"
face="Arial">SALS</font></td></tr> <tr valign="top"><td width="24"><img
src="/icons/ecblank.gif" border="0" height="1" width="1" alt="" /> <td
width="129"><b><font size="2" face="Arial">Employee:
</font></b></td> <td width="693"><font size="2"
face="Arial">LOO</font><b><font size="2" face="Arial"> </font></b><font
size="2" face="Arial">ABLE</font></td></tr></table> """
from BeautifulSoup import BeautifulSoup
soup = BeautifulSoup()
soup.feed(html)
dic = {}
for row in soup('table')[0]('tr'):
    column = row('td')
    print column[1].findNext('font').string.strip(), column[2].findNext('font').string.strip()
    dic[column[1].findNext('font').string.strip()] = column[2].findNext('font').string.strip()
for key in dic.keys():
    print key, dic[key]
The problem is that I am missing the last name, ABLE. How can I get ALL
of the text? Clearly something is wrong with how I read the font string, but
I am not sure what.
Please and thanks!!
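
For what it's worth, in the code above findNext('font') stops at the first font tag inside the cell, so only LOO is read and the later font tag holding ABLE is never visited. One possible direction is to collect every text fragment inside each cell and join them. The following is only a minimal sketch under assumptions: it swaps in Python 3's standard html.parser instead of BeautifulSoup, the names CellTextParser, table_to_dict, and SAMPLE_HTML are hypothetical, the sample is a condensed copy of the table above, and stripping the trailing colon from the label is assumed from the desired output.

```python
from html.parser import HTMLParser

# Condensed copy of the table in the post (same nesting, including the
# unclosed first <td> in the second row). Hypothetical sample data.
SAMPLE_HTML = """<table>
<tr><td><img src="/icons/ecblank.gif" alt="" /></td>
<td><b><font size="2">Division Code:
</font></b></td><td><font size="2">SALS</font></td></tr>
<tr><td><img src="/icons/ecblank.gif" alt="" /> <td><b><font size="2">Employee:
</font></b></td> <td><font size="2">LOO</font><b><font size="2"> </font></b><font
size="2">ABLE</font></td></tr></table>"""


class CellTextParser(HTMLParser):
    """Collect the full text of every <td>, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []      # one list of cell strings per <tr>
        self._cell = None   # text fragments of the <td> being read, if any

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag == "td":
            # A new <td> also ends a previous <td> that was never closed.
            self._close_cell()
            self._cell = []

    def handle_endtag(self, tag):
        if tag in ("td", "tr"):
            self._close_cell()

    def handle_data(self, data):
        # Keep text only while inside a cell; nested tags do not matter.
        if self._cell is not None:
            self._cell.append(data)

    def _close_cell(self):
        if self._cell is not None and self.rows:
            # Join all fragments, then collapse runs of whitespace.
            text = " ".join("".join(self._cell).split())
            self.rows[-1].append(text)
        self._cell = None


def table_to_dict(html):
    """Map the second cell of each row (label) to the third cell (value)."""
    parser = CellTextParser()
    parser.feed(html)
    parser.close()
    return {row[1].rstrip(":"): row[2] for row in parser.rows if len(row) >= 3}


dic = table_to_dict(SAMPLE_HTML)
for key in dic:
    print(key, dic[key])
```

The point of the sketch is that the value is built from all text under the cell rather than from a single font tag, so it does not depend on how many font or b tags the page wraps around each word.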