Need help with parsing data

Shan · Aug 9, 2006

So I need code that will go through a list of URLs (formatted as
http://www.google.com) and for each url get the following information:

1. The URL that follows href= inside the tag that starts with <link
rel="alternate" and ends with />

So if there is <link rel="alternate" type="application/atom+xml"
title="Atom" href="http://hello.typepad.com/hello/atom.xml" /> I want
the http://hello.typepad.com/hello/atom.xml

2. Everything between the following tags <title> and </title>
So if there is <title>hello, typepad</title> I want hello, typepad

3. everything between the tags <h2 id="banner-description"> and </h2>

4. Finally, I would like the results to be saved to a delimited file in
the following format:

column 1: original url
column 2: data obtained from step 1
column 3: data obtained from step 2
column 4: data obtained from step 3

If there is no result for any one of the steps, a null should be saved.
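
A minimal sketch of that row layout, in core Perl only. The request does not name a delimiter or say what a "null" looks like on disk, so the tab delimiter, the literal string NULL, and the helper name format_row are all assumptions made for illustration.

```perl
use strict;
use warnings;

# Build one output row: the original URL followed by the three extracted fields.
# Assumptions (not stated in the request): tab as the delimiter and the
# literal string "NULL" as the placeholder for a missing result.
sub format_row {
    my ($url, @fields) = @_;

    # Force exactly three data fields so every row has four columns.
    $#fields = 2;

    my @cells = map {
        my $v = (defined($_) && length($_)) ? $_ : 'NULL';
        $v =~ s/[\t\r\n]+/ /g;    # embedded tabs or newlines would break the row
        $v;
    } ($url, @fields);

    return join("\t", @cells) . "\n";
}

# Example: step 1 and step 3 found nothing for this URL.
print format_row('http://www.google.com', undef, 'Google', undef);
```

To save the rows, open a filehandle for writing and print each formatted row to it instead of to standard output.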

I would like to thank whoever can provide me with the code in advance,
Thank you.

DJ Stunks · Aug 9, 2006

Shan said:
I would like to thank whoever can provide me with the code in advance,
Thank you.

It is highly unlikely that anyone will do so for a simple "thanks".
Check out jobs.perl.org for someone willing to follow orders in return
for compensation.

-jp

John Bokma · Aug 10, 2006

Shan said:
So I need code that will go through a list of URLs (formatted as
http://www.google.com) and for each url get the following information:

1. The url after the href= within the following tags <link
rel="alternate" and />

So if there is <link rel="alternate" type="application/atom+xml"
title="Atom" href="http://hello.typepad.com/hello/atom.xml" /> I want
the http://hello.typepad.com/hello/atom.xml

2. everything between the following tags <title> and </title>
so if there is <title>hello, typepad</title> I want hello, typepad

3. everything between the tags <h2 id="banner-description"> and </h2>

I use HTML::TreeBuilder for this, since it makes life really easy. See
http://johnbokma.com/perl/ for several examples (Web automation).

For example 3. can be done as:

use HTML::TreeBuilder;

my $root = HTML::TreeBuilder->new_from_content( $content );

:
:

# collect the text of every matching <h2> for column 4
my @column4;
push @column4, $_->as_trimmed_text
    for $root->look_down( _tag => 'h2', id => 'banner-description' );
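
Items 1 and 2 can probably be handled the same way. The following is only a sketch under assumptions: the sample page and the helper name extract_fields are made up for illustration, and it takes the first match for each item, returning undef when nothing is found.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical helper: returns (feed URL, title text, banner text),
# with undef for anything that is not present in the page.
sub extract_fields {
    my ($content) = @_;
    my $root = HTML::TreeBuilder->new_from_content($content);

    # Item 1: href of the first <link rel="alternate" ...> element.
    my $link = $root->look_down( _tag => 'link', rel => 'alternate' );
    my $feed = $link ? $link->attr('href') : undef;

    # Item 2: text inside <title>.
    my $title_el = $root->look_down( _tag => 'title' );
    my $title    = $title_el ? $title_el->as_trimmed_text : undef;

    # Item 3: text inside <h2 id="banner-description">.
    my $h2     = $root->look_down( _tag => 'h2', id => 'banner-description' );
    my $banner = $h2 ? $h2->as_trimmed_text : undef;

    $root->delete;    # free the parse tree
    return ( $feed, $title, $banner );
}

# Made-up sample page for illustration only.
my $sample = <<'HTML';
<html><head>
<title>hello, typepad</title>
<link rel="alternate" type="application/atom+xml"
      title="Atom" href="http://hello.typepad.com/hello/atom.xml" />
</head><body>
<h2 id="banner-description">A sample description</h2>
</body></html>
HTML

my ( $feed, $title, $banner ) = extract_fields($sample);
print defined $_ ? $_ : 'NULL', "\n" for $feed, $title, $banner;
```

In scalar context look_down returns only the first matching element, which is why each lookup is checked before its text or attribute is read.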

I would like to thank whoever can provide me with the code in advance,
Thank you.

I can provide the code, and forms to thank me are here:
http://johnbokma.com/wish-list.html

Either Object Oriented Perl or Perl Best Practices would be fine with me
since directly and indirectly you will contribute back to the Perl
community.

Tad McClellan · Aug 10, 2006

Shan said:
Subject: Need help with parsing data

What part is it that you need help with?

(You should use a module that understands XHTML data if you need
to process XHTML data.)

I would like to thank whoever can provide me with the code in advance,

What makes you think that someone will write your program for you?

Shan · Aug 10, 2006

Thanks for your advice. I will work on writing a script today and see
what kind of results I get.
