regular expression

dj · Jul 2, 2003

Hi,
I am writing a script that parses an HTML file (which has been retrieved as
a scalar by LWP::UserAgent). The script looks for everything in between the
first <P> tag and the last </P> tag, with any number of <P> and </P> tags in
between. I am sure I have done something like this before, but for the life
of me I can't remember how... (maybe I did it before in lex). Anyone got
any neato suggestions?

Thanks for any help,
Drew

Nicholas Knight · Jul 2, 2003

Are you looking for the /s modifier? It makes '.' match newlines, too. I'd
probably do it like this (the 'i' is to ignore case, as some people
capitalize all tags and some don't):

/<p>(.*)<\/p>/si
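
For anyone who wants to try the effect of the two modifiers, here is a minimal sketch. It assumes the opening tag is a bare <p> with no attributes; the helper name between_p and the sample string are made up for illustration and are not from the original script.

```perl
use strict;
use warnings;

# Hypothetical helper: returns everything between the first <p> and the
# last </p> in the given string, or undef if the pattern does not match.
sub between_p {
    my ($html) = @_;
    # /s lets '.' match newlines; /i treats <p> and <P> alike.
    # The greedy (.*) stretches from the first opening tag to the last
    # closing tag, so inner <p> and </p> tags end up inside the capture.
    my ($inner) = $html =~ /<p>(.*)<\/p>/si;
    return $inner;
}

# Made-up sample input standing in for the page fetched by LWP::UserAgent.
my $page  = "<html><body>\n<P>first</P>\n<p>second\nline</p>\n</body></html>";
my $inner = between_p($page);
my $shown = defined $inner ? $inner : "no match";
print "$shown\n";
```

Without /s the match would have to fit on a single line, because '.' stops at a newline. A pattern like this only works for simple, attribute-free tags.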

dj · Jul 2, 2003

Hi Nicholas,

Yep, I had something along these lines:

while ($_ =~ s/.+<P>(.+)<\/P>.+/$1/gsi) {
    print;
}

but no substitution occurs. I have tried a few combinations, but nothing
matches.

Martien Verbruggen · Jul 2, 2003

A _very_ simpleminded approach could do this:

my ($stuff) = /<p>(.*)<\/p>/i;

Addition: You also need the /s flag to match newlines. But again, I
wouldn't use it.

Martien
