RegEx replace "bracketed text"

hrrglburf

I have text from which I need to strip all tags that start with a
'{' and end with a '}', including whatever is between the braces, but
nothing outside of them. I tried writing my own regular expression, but
I'm not good at it. Can anyone give me an example?
 
news reader

Something like

s/\{.*?\}//g;


An example may be useful to avoid misunderstandings.

The situation gets a little more complicated if '\}' may be part of a tag.

The current example would read in:
dasdfsafdsadsa{dsadas}d dasdasd{dasdas} fsddfsf{dsadas}


and spit out
dasdfsafdsadsad dasdasd fsddfsf




bye


N.
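
For anyone who wants to try this, here is a minimal sketch of the substitution wrapped in a sub, run on the sample line above. The sub names are made up for illustration. The second sub is only one possible way to cope with '\}' inside a tag, and it assumes that a backslash escapes the next character, which nobody in this thread has actually specified.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Remove every {...} tag. Assumes tags do not nest and contain no '}'.
sub strip_tags {
    my ($text) = @_;
    $text =~ s/\{.*?\}//g;
    return $text;
}

# Variant that tolerates a backslash-escaped brace such as '\}' in a tag.
# ASSUMPTION: a backslash escapes the following character.
sub strip_tags_escaped {
    my ($text) = @_;
    # Inside the braces, accept either an ordinary character (neither
    # backslash nor '}') or a backslash followed by any character.
    $text =~ s/\{(?:[^\\}]|\\.)*\}//gs;
    return $text;
}

print strip_tags('dasdfsafdsadsa{dsadas}d dasdasd{dasdas} fsddfsf{dsadas}'), "\n";
# prints: dasdfsafdsadsad dasdasd fsddfsf

print strip_tags_escaped('keep{drop\}still dropped}keep'), "\n";
# prints: keepkeep
```

With the plain version, the second input would lose only '{drop\}' and leave 'still dropped}' behind, which is the complication mentioned above.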
 
Gunnar Hjalmarsson

Dale said:
MD> Assuming that tags don't nest and that matching '{' and '}'
MD> are not separated by line-ends, the following works:

MD> s/\{.*?\}//g

A more efficient solution is:
s/\{[^}]*\}//g

With the /s modifier this will work across line-ends.

The /s modifier isn't needed for that.
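
A small sketch to check the two assumptions quoted above (no line-ends inside a tag, no nesting). The test strings are invented for illustration. It shows that '.' needs /s to cross a newline, that the negated class [^}] crosses it without any modifier, and that neither pattern copes with nested tags.

```perl
use strict;
use warnings;

# Made-up test strings, not taken from the original post.
my $multiline = "before{tag split\nover two lines}after";
my $nested    = "a{outer{inner}rest}b";

# '.' does not match "\n" by default, so nothing is removed here.
(my $dot = $multiline) =~ s/\{.*?\}//g;

# With /s, '.' matches "\n" as well.
(my $dot_s = $multiline) =~ s/\{.*?\}//gs;

# [^}] matches any character except '}', newline included.
(my $class = $multiline) =~ s/\{[^}]*\}//g;

# Nested tags: the match ends at the first '}', leaving debris behind.
(my $nested_out = $nested) =~ s/\{[^}]*\}//g;

print "dot unchanged\n"      if $dot eq $multiline;
print "dot with /s: $dot_s\n";    # beforeafter
print "class:       $class\n";    # beforeafter
print "nested:      $nested_out\n";   # arest}b
```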
 
Peter J. Holzer

Marc said:
Dale Henderson said:
"MD" == Marc Dashevsky writes:
MD> Assuming that tags don't nest and that matching '{' and '}'
MD> are not separated by line-ends, the following works:

MD> s/\{.*?\}//g

A more efficient solution is:
s/\{[^}]*\}//g

Thanks. Would you explain the reasons for the increased efficiency?
I don't know how to even start the analysis.

Theoretically both should be about the same speed since both require
only a linear scan for a single character without backtracking. A simple
benchmark shows that the first expression is slightly faster on my
system:


#!/usr/bin/perl
use strict;
use warnings;

use Benchmark ':all';

my $s = "aaaaaa{bbbbbbbbbbbb}cccccccccc{ddddddddd}eeeeeee";

cmpthese(100000,
    {
        nongreedy => sub {
            local $_ = $s;
            s/\{.*?\}//g;
        },
        class => sub {
            local $_ = $s;
            s/\{[^}]*\}//g;
        },
    }
);
__END__
              Rate     class nongreedy
class      90909/s        --      -13%
nongreedy 104167/s       15%        --
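
The string above is very short, so the result may not carry over to other inputs. Here is a sketch of the same comparison with longer tags, in case someone wants to repeat it. The tag length and count are arbitrary choices of mine, and I make no claim about which pattern wins on any particular perl version.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical test data: 50 tags with 200 characters inside each.
my $tag = 'text{' . ('b' x 200) . '}';
my $s   = $tag x 50;

sub strip_nongreedy { my ($t) = @_; $t =~ s/\{.*?\}//g;   return $t }
sub strip_class     { my ($t) = @_; $t =~ s/\{[^}]*\}//g; return $t }

# Timing is only meaningful if both give the same result.
die "results differ" unless strip_nongreedy($s) eq strip_class($s);

# A negative count means: run each sub for at least that many CPU seconds.
cmpthese(-1, {
    nongreedy => sub { strip_nongreedy($s) },
    class     => sub { strip_class($s) },
});
```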
 
Dr.Ruud

Dale Henderson schreef:
Marc Dashevsky:
<unattributed>
s/\{.*?\}//g
A more efficient solution is: s/\{[^}]*\}//g
Thanks. Would you explain the reasons for the increased
efficiency? I don't know how to even start the analysis.

It's spelled out in the Owl book :)

That could be old news.

As I understand it, the non-greedy operator gives up too easily.

That could have been optimized already. The patterns /[^x]*x/ and /.*?x/
have a lot in common.
 