-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
(e-mail address removed) (fatted) wrote in
Abigail said:
Desmo ([email protected]) wrote on MMMDXCIX September MCMXCIII in
<URL:news:[email protected]>:
Assuming the string is in $str:
my @filenames = $str =~ /filename="([^"]*)"/g;
Can you explain how the part of the regexp that's inside the () works?
I just don't follow it. Is there any reason not to use (.*?) instead
of what you have, ([^"]*)?
Speed. It's faster to greedily slurp up "as many non-quote characters
as possible" than it is to non-greedily find "as few of any character as
possible".
When you use a greedy expression, the RE engine zips forward as far as it
can, then backs off until the next atom matches. In the above case,
it'll zip past all the non-quote characters and will match the next atom
(quote), and so will not have to back off at all.
When you use a non-greedy expression, the RE engine plods forward one
matching character at a time, each time stopping to check if the next
atom matches. In the above case, it'll match . (any character) and each
time check to see if the next character is a quote, which most of the
time it won't be.
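To make that concrete, here is a small sketch that runs both forms over
the same input. The sample strings are made up for illustration. On
ordinary single-line data the two patterns capture the same text; one
place they differ is a value containing a newline, because . does not
match a newline unless /s is in effect, while [^"] does.

```perl
use strict;
use warnings;

# Hypothetical input, invented for this example.
my $str = 'a filename="one.txt" b filename="two.txt" c filename=""';

# Greedy negated class: runs over every non-quote character in one go.
my @greedy = $str =~ /filename="([^"]*)"/g;

# Non-greedy dot: extends one character at a time until a quote follows.
my @lazy = $str =~ /filename="(.*?)"/g;

print scalar(@greedy), " captures: ", join(',', @greedy), "\n";
print scalar(@lazy),   " captures: ", join(',', @lazy),   "\n";

# A value with an embedded newline: [^"] matches "\n", plain . does not.
my $multi = qq{filename="two\nlines"};
my @class_hits = $multi =~ /filename="([^"]*)"/g;
my @dot_hits   = $multi =~ /filename="(.*?)"/g;

print "negated class found ", scalar(@class_hits), " match(es)\n";
print "lazy dot found ",      scalar(@dot_hits),   " match(es)\n";
```

So besides speed, the negated class also says exactly what you mean:
"everything up to the next quote", whatever those characters are.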
The actual situation inside the RE engine is a bit more complex, because
it does some fancy optimizations here and there, and has a bit more
smarts than I described above. But the above is conceptually correct,
and is the way to think about greed vs non-greed if you want to write
faster expressions.
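If you'd rather measure than take my word for it, something along these
lines should do, using the standard Benchmark module. The test data here
is an assumption (500 made-up filename attributes with longish values),
and because of the optimizations mentioned above the gap may be large,
small, or absent depending on your perl version and your data.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical test data: 500 attributes, each with a 200+ character value.
my $str = join ' ',
    map { sprintf 'filename="%s%d.txt"', 'x' x 200, $_ } 1 .. 500;

# Run each pattern for at least one CPU second and print a comparison table.
cmpthese(-1, {
    negated_class => sub { my @f = $str =~ /filename="([^"]*)"/g },
    lazy_dot      => sub { my @f = $str =~ /filename="(.*?)"/g },
});
```

Swap in a sample of your real input for $str before drawing conclusions.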
- --
Eric
$_ = reverse sort qw p ekca lre Js reh ts
p, $/.r, map $_.$", qw e p h tona e; print
-----BEGIN PGP SIGNATURE-----
Version: PGPfreeware 7.0.3 for non-commercial use <http://www.pgp.com>
iQA/AwUBPw3qWmPeouIeTNHoEQKW7ACg3RfO1WM0D1c+4/ogCRccLVoRUAsAoMDe
COVQikVe3hvOhbuEf/PR28sl
=rK1g
-----END PGP SIGNATURE-----