Sandeep Guria
Hi!
I am trying to build a web scraper that fetches fundamental data for
listed companies from finance websites.
Let me show an example:
<tbody>
<tr><td>PE ratio</td><td class="numericalColumn">16.83</td><td>14/02/11</td></tr>
<tr><td>EPS (Rs)</td><td class="numericalColumn">10.59</td><td>Mar, 10</td></tr>
<tr><td>Sales (Rs crore)</td><td class="numericalColumn">13,963.81</td><td>Dec, 10</td></tr>
<tr><td>Face Value (Rs)</td><td class="numericalColumn">10</td><td> </td></tr>
<tr><td>Net profit margin (%)</td><td class="numericalColumn">17.72</td><td>Mar, 10</td></tr>
<tr><td>Last dividend (%)</td><td class="numericalColumn">30</td><td>18/01/11</td></tr>
<tr><td>Return on average equity</td><td class="numericalColumn">13.69</td><td>Mar, 10</td></tr>
</tbody>
I want to extract the value '16.83' from the above HTML, so this is what I do:
I parse the HTML file and save it into doc.
I search doc for the inner text 'PE ratio'.
Then I select the next element using next_sibling.
But I am getting an error:
C:\Users\Administrator\Documents>ruby scraper.rb
scraper.rb:9:in `<main>': undefined method `next_sibling' for #<Hpricot::Elements[{elem <td> "PE ratio" </td>}]> (NoMethodError)
I'll be grateful for any suggestions.
Sorry about the formatting of the HTML text!
Attachments:
http://www.ruby-forum.com/attachment/5911/scraper.rb
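
For what it's worth, the error message says that next_sibling was called on an Hpricot::Elements object, i.e. on the whole collection returned by the search, not on a single <td> element. The sketch below illustrates the "pick one element first, then step to its sibling" idea. It is only a sketch: it uses REXML instead of Hpricot so that it runs without extra gems, the helper name value_after is made up for illustration, and only two rows of the table are kept.

```ruby
require 'rexml/document'

# Trimmed copy of the table from the post (only two rows kept).
html = <<~TABLE
  <tbody>
  <tr><td>PE ratio</td><td class="numericalColumn">
  16.83</td><td>14/02/11</td></tr>
  <tr><td>EPS (Rs)</td><td class="numericalColumn">
  10.59</td><td>Mar, 10</td></tr>
  </tbody>
TABLE

# Hypothetical helper (not part of any library): returns the stripped text
# of the cell that follows the cell labelled +label+, or nil if the label
# is not found or has no following cell.
def value_after(doc, label)
  # The search returns a collection of <td> elements; take ONE element
  # out of it before asking for a sibling.
  cell = REXML::XPath.match(doc, '//td').find { |td| td.text.to_s.strip == label }
  return nil unless cell

  sibling = cell.next_element # next sibling that is an element, or nil
  sibling && sibling.text.to_s.strip
end

doc = REXML::Document.new(html)
puts value_after(doc, 'PE ratio') # prints 16.83
```

With Hpricot the analogous change is presumably to take a single element out of the search result (for example with .first, since the result behaves like an array) before calling next_sibling; that part is an untested assumption, as the attached scraper.rb is not shown here.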