Question: need to parse web pages to extract data

Troll

Hi,

The site is:
www.homepriceguide.com.au

A sample page with data can be seen at:
http://www.homepriceguide.com.au/sn...?action=view&suburbORpostcode=6153&source=apm

The only thing that changes is the postcode so the next page in line will
be:
http://www.homepriceguide.com.au/sn...?action=view&suburbORpostcode=6154&source=apm

and so on for each postcode.

What I'm trying to do is extract the price info and save it to a file where
each record has the postcode as its ID. Last year I wrote a script that went
through the site, gathered the data for me and dumped the results in a
file. Unfortunately it has gone missing. Can someone please remind me
which module is best to use here (I'm mainly concerned with the parsing
side right now)? I have not coded for about 12 months so I'm a bit rusty, but
hopefully it will all come back.

Let me know if the above is not clear.

Thanks in advance.
Voitec
 
Peter Wyzl

Troll said:
Can someone pls remind me which module is best to be used here (I'm mainly
concerned with the parsing side right now)?

Start with the LWP set of modules....
 
Voitec

Much obliged :)
Thanks very much Peter.


 
dan baker

Troll said:
What I'm trying to do is to extract price info

Try the LWP::Simple module: use get() to grab the page source, then extract
what you want from it.

d
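
For anyone landing here later, the suggestion above can be sketched roughly as follows. This is only an illustration under stated assumptions: the URL path is elided in the thread, so $URL_TEMPLATE is a hypothetical placeholder you would replace with the real address, and the helper names fetch_page, extract_prices and format_record are made up for this example. The price pattern simply assumes prices appear as dollar amounts in the page text; the real markup was never shown, and a proper HTML parser such as HTML::TreeBuilder or HTML::TableExtract would likely be more robust than a regex.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical URL template: the real path is elided in the thread, so
# substitute the full address yourself. %s is replaced by the postcode.
my $URL_TEMPLATE =
    'http://www.example.com/PATH?action=view&suburbORpostcode=%s&source=apm';

# Download one page with LWP::Simple's get(), which returns undef on failure.
sub fetch_page {
    my ($postcode) = @_;
    require LWP::Simple;    # loaded lazily so the parsing code runs without it
    return LWP::Simple::get(sprintf $URL_TEMPLATE, $postcode);
}

# Pull every dollar amount out of the page. This is an assumption about
# the markup; adjust the pattern once you have looked at the real source.
sub extract_prices {
    my ($html) = @_;
    (my $text = $html) =~ s/<[^>]*>/ /g;    # crude tag stripping
    my @prices = $text =~ /\$\s*(\d{1,3}(?:,\d{3})+|\d+)/g;
    s/,//g for @prices;                     # "350,000" becomes "350000"
    return @prices;
}

# One record per line: the postcode as the ID, then the prices, tab separated.
sub format_record {
    my ($postcode, @prices) = @_;
    return join("\t", $postcode, @prices) . "\n";
}

# When run as a script, treat each command-line argument as a postcode.
unless (caller) {
    for my $postcode (@ARGV) {
        my $html = fetch_page($postcode);
        if (!defined $html) {
            warn "Could not fetch page for postcode $postcode\n";
            next;
        }
        print format_record($postcode, extract_prices($html));
    }
}
```

Saved as, say, prices.pl (a made-up name), it could be run as "perl prices.pl 6153 6154 > prices.txt" to get one line per postcode in the output file.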
 
