Regexp to search over several lines in one string


d99alu

Hi!

I have a string, and I want to remove everything behind the ">"
character. The string contains new line characters that I don't want
to remove.

my $string = "line1
line2>
line3";

Why don't I get a match and replacement with this?

$string =~ s/^([^>]*>)/$1/;

I would expect the string to contain:

"line1
line2>"

But it still contains "line3"!!!

Why is this?
Any suggestions for how to do this in another (working) manner?

Best Regards,
Andreas - Sweden
 

Dr.Ruud

(e-mail address removed) wrote:
I have a string, and I want to remove everything behind the ">"
character. The string contains new line characters that I don't want
to remove.

s/(?<=>).*//s;

See perldoc perlre, search for "look-behind".
 

Gunnar Hjalmarsson

I have a string, and I want to remove everything behind the ">"
character. The string contains new line characters that I don't want
to remove.

my $string = "line1
line2>
line3";

Why don't I get a match and replacement with this?

$string =~ s/^([^>]*>)/$1/;

It does match, but since you capture everything, and insert the captured
string using $1, nothing gets changed.

I would expect the string to contain:

"line1
line2>"

But it still contains "line3"!!!

Why is this?

Because your regex does not match the "line3" portion of the string.
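A minimal sketch of the two points above, using the sample string from the question. The names $before, $count and $matched are only illustrative and do not appear in the original posts.

```perl
use strict;
use warnings;

my $string = "line1\nline2>\nline3";
my $before = $string;

# [^>] also matches newlines, so the pattern matches "line1\nline2>".
# The replacement then puts exactly that text back, and "\nline3"
# was never part of the match in the first place.
my $count   = ( $string =~ s/^([^>]*>)/$1/ );
my $matched = $1;

print "substitutions made: $count\n";
print "string unchanged\n" if $string eq $before;
```

So the substitution succeeds, yet the string is the same before and after.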

Any suggestions for how to do this in another (working) manner?

One way to remove everything after the '>' character would be:

$string =~ s/[^>]+$//;

However, that removes the newline between "line2>" and "line3" as well...

This removes everything after '>' but newlines:

$string =~ s{([^>]+)$}{
    my $rm = $1;
    $rm =~ s/.+//g;
    $rm;
}e;
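To compare the two substitutions side by side, here is a small sketch that runs both on copies of the sample string. The variable names ($original, $plain, $keep_nl) are made up for the illustration, and the second substitution is a compact rewrite of the one above, not a different technique.

```perl
use strict;
use warnings;

my $original = "line1\nline2>\nline3";

# Variant 1: drop everything after the last '>', newline included.
( my $plain = $original ) =~ s/[^>]+$//;

# Variant 2: same span, but only the non-newline characters are deleted
# ('.' does not match "\n" without the /s modifier).
( my $keep_nl = $original ) =~ s{([^>]+)$}{ (my $rm = $1) =~ s/.+//g; $rm }e;

# Print both results with newlines made visible.
for my $result ( $plain, $keep_nl ) {
    ( my $visible = $result ) =~ s/\n/\\n/g;
    print "\"$visible\"\n";
}
```

One caveat: if the string contains no '>' at all, [^>]+$ matches the whole string, so variant 1 empties it and variant 2 leaves only its newlines.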
 

Dr.Ruud

Dr.Ruud wrote:
d99alu:

s/(?<=>).*//s;

See perldoc perlre, search for "look-behind".

I also forgot the newline. Maybe this does what you need:

s/(?<=>).*/\n/s;

(doesn't keep any of the original newlines; even adds one when none was
there)
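A short sketch of the look-behind variants on the sample string from the question, in case it helps to see them run. The variable names are illustrative only.

```perl
use strict;
use warnings;

my $original = "line1\nline2>\nline3";

# (?<=>) asserts that a '>' sits immediately before the match position
# without consuming it; /s lets '.' match newlines, so '.*' runs to the
# end of the string.
( my $cut = $original ) =~ s/(?<=>).*//s;

# Replacing with "\n" instead leaves exactly one newline after the '>'.
( my $cut_nl = $original ) =~ s/(?<=>).*/\n/s;

for my $result ( $cut, $cut_nl ) {
    ( my $visible = $result ) =~ s/\n/\\n/g;
    print "\"$visible\"\n";
}
```

Note that with several '>' characters in the string, the match starts right after the first one, since the regex engine takes the leftmost position where the look-behind succeeds.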
 

Gunnar Hjalmarsson

Gunnar said:
I have a string, and I want to remove everything behind the ">"
character. The string contains new line characters that I don't want
to remove.

my $string = "line1
line2>
line3";

Why don't I get a match and replacement with this?

$string =~ s/^([^>]*>)/$1/;

It does match, but since you capture everything, and insert the captured
string using $1, nothing gets changed.

I have a feeling that the code above actually is an attempt to do:

if ( $string =~ /^([^>]*>)/ ) {
    $string = $1;
}

That replaces the content of _$string_ with what was captured in the
regex. However, it's accomplished via the m// operator, while you were
using the s/// operator.

I recommend that you read up on both those operators in "perldoc perlop".
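As a rough illustration of the difference between the two operators, the sketch below does the job once with m// plus an assignment and once with s///. The variable $other is made up for the example. The s/// form has to match the part that should disappear, which is why it needs a trailing .* and the /s modifier.

```perl
use strict;
use warnings;

my $string = "line1\nline2>\nline3";

# m// only tests and captures; the assignment makes the actual change.
if ( $string =~ /^([^>]*>)/ ) {
    $string = $1;
}

# s/// replaces what it matched, so the unwanted tail must be part of
# the match: '.*' with /s swallows the rest, newlines included.
( my $other = "line1\nline2>\nline3" ) =~ s/^([^>]*>).*/$1/s;

print "same result\n" if $string eq $other;
```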
 

Dr.Ruud

Gunnar Hjalmarsson wrote:
Petr Vileta:
$string =~ s/^([^>]*>).*$/$1/s;

The '$' character is redundant after .*

Yes, in this case (because of the s-modifier) it is.

$ echo "abcd" |perl -pe 's/(.*)$/"<".++$i."=$1:".length($1).">\n"/ge'
<1=abcd:4>
<2=:0>

<3=:0>


$ echo "abcd" |perl -pe 's/(.*)$/"<".++$i."=$1:".length($1).">\n"/sge'
<1=abcd
:5>
<2=:0>
 
