Grabbing certain section of a string

Desmo · Jul 9, 2003

If I have a rather long string like "blah blah
blah.....filename="intent.doc"...blah blah
blah...filename="second.doc"....blah blah", and I want to assign the words
between the quotation marks after filename= to variables, is there a quick
and easy way of doing it?

--

Ryan Carrier
ISA CCST III
Fraser Papers, Inc.
(207) 728-8601
(e-mail address removed)

Michael Budash · Jul 9, 2003

Desmo said:
If I have a rather long string like "blah blah
blah.....filename="intent.doc"...blah blah
blah...filename="second.doc"....blah blah", and I want to assign the words
between the quotation marks after filename= to variables, is there a quick
and easy way of doing it?

@filenames = $longstring =~ /"([^"]+)"/g;

hth-
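
One caveat, sketched under an assumption: the pattern above captures every quoted run, not only the ones after filename=. The sample string below is hypothetical (the name="attachment" part is not in the original question) and is there only to show the difference once other quoted text appears.

```perl
use strict;
use warnings;

# Hypothetical sample: one quoted value that does NOT follow filename=,
# and two that do.
my $longstring = 'name="attachment" filename="intent.doc" blah filename="second.doc"';

# Any quoted run, regardless of what precedes it.
my @any_quoted = $longstring =~ /"([^"]+)"/g;

# Only quoted runs that directly follow filename=.
my @filenames = $longstring =~ /filename="([^"]+)"/g;

print "any quoted: @any_quoted\n";
print "filenames:  @filenames\n";
```

If the only quoted text in the string is the filenames, as in the question, both patterns return the same list.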

fatted · Jul 10, 2003

Abigail said:
Desmo wrote on MMMDXCIX September MCMXCIII:

Assuming the string is in $str:

my @filenames = $str =~ /filename="([^"]*)"/g;

Can you explain how the part of the regexp that's inside the () works?
I just don't follow. Is there any reason not to use (.*?) instead
of what you have, ([^"]*)?

Eric J. Roode · Jul 10, 2003


fatted said:
Abigail said:
Assuming the string is in $str:

my @filenames = $str =~ /filename="([^"]*)"/g;

Can you explain how the part of the regexp that's inside the () works?
I just don't follow. Is there any reason not to use (.*?) instead
of what you have, ([^"]*)?

Speed. It's faster to greedily slurp up "as many non-quote characters
as possible" than it is to non-greedily find "as few of any character as
possible".

When you use a greedy expression, the RE engine zips forward as far as it
can, then backs off until the next atom matches. In the above case,
it'll zip past all the non-quote characters and will match the next atom
(quote), and so will not have to back off at all.

When you use a non-greedy expression, the RE engine plods forward one
matching character at a time, each time stopping to check if the next
atom matches. In the above case, it'll match . (any character) and each
time check to see if the next character is a quote, which most of the
time it won't be.

The actual situation inside the RE engine is a bit more complex, because
it does some fancy optimizations here and there, and has a bit more
smarts than I described above. But the above is conceptually correct,
and is the way to think about greed vs non-greed if you want to write
faster expressions.
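
For anyone who wants to check this on their own perl, here is a minimal sketch using the core Benchmark module. The test string is only modeled on the one in the question, and the labels negated_class and lazy_dot are names made up for this example. No timings are shown because they depend on the perl version and the input.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Test string modeled on the one in the question (an assumption, not real data).
my $str = 'blah blah blah.....filename="intent.doc"...blah blah '
        . 'blah...filename="second.doc"....blah blah';

# Both patterns should extract the same names from this input.
my @greedy = $str =~ /filename="([^"]*)"/g;
my @lazy   = $str =~ /filename="(.*?)"/g;

print "greedy: @greedy\n";
print "lazy:   @lazy\n";

# Run each pattern for about one CPU second and print a comparison table.
cmpthese(-1, {
    negated_class => sub { my @f = $str =~ /filename="([^"]*)"/g },
    lazy_dot      => sub { my @f = $str =~ /filename="(.*?)"/g },
});
```

Apart from speed, the two are not always interchangeable: without the /s modifier, . does not match a newline, while [^"] does, so the patterns can give different results when a quoted value spans lines.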

--
Eric
$_ = reverse sort qw p ekca lre Js reh ts
p, $/.r, map $_.$", qw e p h tona e; print

