Shan
I need code that will go through a list of URLs (formatted like
http://www.google.com) and, for each URL, get the following information:
1. The URL in the href= attribute of the <link rel="alternate" ... />
tag.
So if there is <link rel="alternate" type="application/atom+xml"
title="Atom" href="http://hello.typepad.com/hello/atom.xml" /> I want
http://hello.typepad.com/hello/atom.xml
2. Everything between the tags <title> and </title>
So if there is <title>hello, typepad</title> I want hello, typepad
3. Everything between the tags <h2 id="banner-description"> and </h2>
4. Finally, I would like the results to be saved to a delimited file in
the following format:
column 1: original url
column 2: data obtained from step 1
column 3: data obtained from step 2
column 4: data obtained from step 3
If there is no result for any one of the steps, a null should be saved.
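
To make the request concrete, here is a rough sketch of the four steps in Python, using only the standard library. It is one possible approach under some assumptions, not a definitive solution: the names PageFields, extract_fields, to_row, fetch and scrape are made up for illustration, a tab is assumed as the delimiter, the literal text null is written for missing values, only the first matching tag on each page is kept, and a page that cannot be fetched is assumed to produce a row of nulls.

```python
import csv
import urllib.request
from html.parser import HTMLParser


class PageFields(HTMLParser):
    """Collects the three fields from steps 1-3 (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.feed_url = None   # step 1: href of <link rel="alternate" ...>
        self.title = None      # step 2: text inside <title>
        self.banner = None     # step 3: text inside <h2 id="banner-description">
        self._capture = None   # name of the field currently being read
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "link" and self.feed_url is None:
            rel = (attrs.get("rel") or "").lower().split()
            if "alternate" in rel and attrs.get("href"):
                self.feed_url = attrs["href"]
        elif tag == "title" and self.title is None:
            self._capture, self._buffer = "title", []
        elif (tag == "h2" and self.banner is None
              and attrs.get("id") == "banner-description"):
            self._capture, self._buffer = "banner", []

    def handle_data(self, data):
        if self._capture:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        # Close the capture only on the matching end tag.
        if (self._capture, tag) in (("title", "title"), ("banner", "h2")):
            text = "".join(self._buffer).strip()
            setattr(self, self._capture, text or None)
            self._capture = None


def extract_fields(html):
    """Return (feed_url, title, banner); None where a field is missing."""
    parser = PageFields()
    parser.feed(html)
    parser.close()
    return parser.feed_url, parser.title, parser.banner


def to_row(url, fields):
    """Build one output row, writing the text 'null' for missing fields."""
    return [url] + [v if v is not None else "null" for v in fields]


def fetch(url, timeout=10):
    """Download a page and decode it (UTF-8 assumed if no charset is sent)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")


def scrape(urls, out_path, delimiter="\t"):
    """Step 4: write one delimited row per URL."""
    with open(out_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f, delimiter=delimiter)
        for url in urls:
            try:
                fields = extract_fields(fetch(url))
            except (OSError, ValueError):
                # Assumption: an unreachable or malformed URL gives all nulls.
                fields = (None, None, None)
            writer.writerow(to_row(url, fields))
```

Something like scrape(["http://www.google.com"], "results.tsv") would then write one row per URL to results.tsv (a hypothetical file name). The csv module quotes any field that contains the delimiter, so titles with tabs or line breaks should not break the columns.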
Thanks in advance to whoever can provide me with the code.