Perl HTML::TableExtract Question

Paul

Hi !

I hope someone can help.

I want to extract data from a table with 2 columns.

A sample of the table can be generated with:-

"http://moneycentral.msn.com/investo...Q=1&CFA=1&CFQ=1&TYS=1&ITT=1&ITP=1&Type=Equity"

(Sorry about the long URL :) )

What I want is the field in the top table labelled "Tot. Shares Out."

My current code is:-

#!/usr/bin/perl -w


use strict;
use HTML::TableExtract;


my $inFile = "/home/mas/development/URLTemp.tmp";
my $te = HTML::TableExtract->new( headers => [ 'Fundamental Data', '*' ]);
$te->parse_file( $inFile );
foreach my $ts ( $te->table_states ) {
    foreach my $row ( $ts->rows ) {
        print join( ",", @$row, "," ), "\n";
    }
}


But this seems to pick up the table lower down the page. That wouldn't be
so bad, as it repeats the value I need, but how do I get an
un-labelled column?

Any help would be appreciated.

Paul
 
Paul

Just a bit more info on this: the ", '*'" entry in the headers list doesn't
work; in fact it returns empty data. Without it, the module assumes that
the rows below the header are what is wanted and returns:-

Market Capitalization,,
Earnings/Share,,

The real question is: "How do I specify a row with a NULL header?"
 
Tad McClellan

Paul said:
> What I want is the field from the top table Labelled - "Tot. Shares Out."
> my $te = HTML::TableExtract->new( headers => [ 'Fundamental Data', '*' ]);

The headers approach will not work since there are no headers
on the table that contains the data that you are after.

> "How do I get an
> un-labelled column ????"

Positionally.

"Tot. Shares Out." is the 7th column in the 12th row of the table
at depth=2 and count=1.

> Any help would be appreciated.


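my $te = HTML::TableExtract->new( depth => 2, count => 1 );
$te->parse_file( $inFile );
my ($ts) = $te->table_states;    # the single table matching depth/count
my $total_outstanding = ($ts->rows)[11]->[6];    # 12th row, 7th column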
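
For anyone wondering how to find the depth and count in the first place, here is a minimal sketch of the positional approach. It makes several assumptions: the HTML below is a made-up stand-in for the saved page (the values "10.00" and "1.5 Bil" are invented), so the row and column indexes are [1] and [1] here rather than the [11] and [6] that apply to the real page. It sticks to the table_states method used earlier in this thread; as far as I know, newer releases of the module also call it tables. The first pass uses no constraints so that every table is reported with its coordinates, and the second pass pulls the cell by position.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TableExtract;

# Hypothetical sample page: one outer table (depth 0), one table nested
# inside it (depth 1), and two tables nested inside that (depth 2).
my $html = <<'HTML';
<table><tr><td>
  <table><tr><td>
    <table><tr><td>first inner table</td></tr></table>
    <table>
      <tr><td>Last Price</td><td>10.00</td></tr>
      <tr><td>Tot. Shares Out.</td><td>1.5 Bil</td></tr>
    </table>
  </td></tr></table>
</td></tr></table>
HTML

# Pass 1: with no constraints every table is captured, so we can list
# the depth/count coordinates and pick the table we are after.
my $all = HTML::TableExtract->new;
$all->parse($html);
$all->eof;
foreach my $ts ( $all->table_states ) {
    my ( $depth, $count ) = $ts->coords;
    my @rows = $ts->rows;
    print "depth=$depth count=$count rows=", scalar(@rows), "\n";
}

# Pass 2: ask only for the table at depth 2, count 1 and index into it.
my $te = HTML::TableExtract->new( depth => 2, count => 1 );
$te->parse($html);
$te->eof;
my ($ts) = $te->table_states;
die "No table found at depth 2, count 1\n" unless $ts;

# Rows and columns are zero-based: second row, second column.
my $value = ( $ts->rows )[1]->[1];
print "Tot. Shares Out. = $value\n";
```

Bear in mind that positional extraction is brittle: if the site changes its layout, the depth, count, row and column numbers all have to be rediscovered with the first pass.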
 
Paul

Thanks for that Tad !! I got the same answer at about 0230 in the
morning :-(

It seems the page isn't very well constructed.

I spent lots of time looking for the new version of HTML::TableExtract
which is supposed to address rows as well as columns but could only find
fleeting references to it.

Regards.
 
