get (url) until match is found

Lydia Shawn

hi,
i need someone's good advice solving the following problem:
i am matching an html page for certain triggers and want to grep only
the text in between two triggers.
as soon as the last trigger has been matched, the html page should not
be downloaded any further

here is what i got, which works well, but it requires the entire html
page to be downloaded before it starts matching:

input: test.htm
bla1
trigger1 bla2 trigger2
bla3

output:
bla2

assuming bla3 is a long text, i do not wish to waste bandwidth
downloading it, if the match has already been found.

the script i wrote:

require 5.004;
use LWP::Simple;

$return = get("http://test/test.htm");
$before = 'trigger1';
$after = 'trigger2';
($match) = $return =~ /$before(.*?)$after/si;
print $match;


is there a way i can combine the get command with something like
"until match = anything" ?

any help would be greatly appreciated!
thanks in advance!
lydia
 
Ben Morrow

> i am matching an html page for certain triggers and want to grep only
> the text in between two triggers.
> as soon as the last trigger has been matched, the html page should not
> be downloaded any further

Read perldoc lwpcook "LARGE DOCUMENTS".

Ben
 
Brian McCauley

> i need someone's good advice solving the following problem:

Others have given you a fish, but I would like to show you how you
could have caught it yourself...

> $return = get("http://test/test.htm");
> $before = 'trigger1';
> $after = 'trigger2';
> ($match) = $return =~ /$before(.*?)$after/si;
> print $match;


> is there a way i can combine the get command with something like
> "until match = anything" ?

So what you are saying is that you are using LWP::Simple and you need
more control.

So let's take a look at the first paragraph of the DESCRIPTION perldoc
of LWP::Simple.


This interface is intended for those who want a simplified view
of the libwww-perl library. It should also be suitable for
one-liners. If you need more control [...] you should use the
full object oriented interface provided by the "LWP::UserAgent"
module.

If you now look at the SYNOPSIS of LWP::UserAgent you'll find an
example like...

$response = $ua->request($request, \&callback, 4096);

sub callback { my($data, $response, $protocol) = @_; .... }

Well that looks like what you were after.
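
For what it's worth, here is a minimal sketch of how that callback form might be wired up for the trigger problem; treat it as one possible approach, not a definitive implementation. The sub names make_matcher and fetch_between are made up for this example. The triggers are quoted with \Q...\E on the assumption that they are literal strings (drop the quoting if they are meant to be regexes). The whole buffer is kept so that a trigger split across two chunks still matches. Calling die() inside the callback is the documented way to abort an LWP::UserAgent request.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a closure that is fed chunks of the page.
# It returns the text between $before and $after once both have arrived,
# and undef until then.
sub make_matcher {
    my ($before, $after) = @_;
    my $buffer = '';
    return sub {
        my ($chunk) = @_;
        $buffer .= $chunk;    # keep everything: a trigger may span two chunks
        return $buffer =~ /\Q$before\E(.*?)\Q$after\E/si ? $1 : undef;
    };
}

# Hypothetical helper: fetch $url in chunks of roughly 4096 bytes and stop
# reading as soon as the match is complete.
sub fetch_between {
    my ($url, $before, $after) = @_;
    require LWP::UserAgent;    # loaded lazily; pulls in HTTP::Request
    require HTTP::Request;
    my $feed = make_matcher($before, $after);
    my $match;
    my $ua = LWP::UserAgent->new;
    $ua->request(
        HTTP::Request->new(GET => $url),
        sub {
            my ($data) = @_;
            $match = $feed->($data);
            # die() in the callback aborts the transfer; LWP records the
            # message in the X-Died response header.
            die "match found\n" if defined $match;
        },
        4096,
    );
    return $match;
}

# Only touch the network when called with: url before after
if (@ARGV == 3) {
    my $found = fetch_between(@ARGV);
    print defined $found ? "$found\n" : "no match\n";
}
```

Saved under a name of your choosing (say fetch.pl), it would be run as: perl fetch.pl http://test/test.htm trigger1 trigger2. Note that a few bytes past the second trigger will usually still arrive, since data comes in chunks and the network stack buffers ahead.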

--
\\ ( )
. _\\__[oo
.__/ \\ /\@
. l___\\
# ll l\\
###LL LL\\
 
