Regexp to search over several lines in one string

Discussion in 'Perl Misc' started by d99alu@efd.lth.se, Jan 27, 2008.

  1. Guest

    Hi!

    I have a string, and I want to remove everything behind the ">"
    character. The string contains new line characters that I don't want
    to remove.

    my $string = "line1
    line2>
    line3";

    Why don't I get a match and replacement with this?

    $string =~ s/^([^>]*>)/$1/;

    I would expect the string to contain:

    "line1
    line2>"

    But it still contains "line3"!!!

    Why is this?
    Any suggestions for how to do this in another (working) manner?

    Best Regards,
    Andreas - Sweden
     
    d99alu, Jan 27, 2008
    #1

  2. Dr.Ruud Guest

    d99alu wrote:

    > I have a string, and I want to remove everything behind the ">"
    > character. The string contains new line characters that I don't want
    > to remove.


    s/(?:<=>).*//s;

    See perldoc perlre, search for "look-behind".

    --
    Affijn, Ruud

    "Gewoon is een tijger."
     
    Dr.Ruud, Jan 27, 2008
    #2

  3. Gunnar Hjalmarsson Guest

    d99alu wrote:
    > I have a string, and I want to remove everything behind the ">"
    > character. The string contains new line characters that I don't want
    > to remove.
    >
    > my $string = "line1
    > line2>
    > line3";
    >
    > Why don't I get a match and replacement with this?
    >
    > $string =~ s/^([^>]*>)/$1/;


    It does match, but since you capture everything, and insert the captured
    string using $1, nothing gets changed.

    > I would expect the string to contain:
    >
    > "line1
    > line2>"
    >
    > But it still contains "line3"!!!
    >
    > Why is this?


    Because your regex does not match the "line3" portion of the string.
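
    A minimal sketch of both points, assuming the sample string from the
    original post (the names $copy, $count and $matched are only for
    illustration): s/// reports one substitution, yet the string is
    unchanged because the matched text is replaced with itself, and
    "line3" lies outside the match.

```perl
use strict;
use warnings;

my $string = "line1\nline2>\nline3";
my $copy   = $string;

# In scalar context s/// returns the number of substitutions performed.
my $count   = ( $copy =~ s/^([^>]*>)/$1/ );
my $matched = $1;    # what the capture group held

print "substitutions: $count\n";
print "matched: [$matched]\n";
print "string unchanged\n" if $copy eq $string;
```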

    > Any suggestions for how to do this in another (working) manner?


    One way to remove everything after the '>' character would be:

    $string =~ s/[^>]+$//;

    However, that removes the newline between "line2>" and "line3" as well...

    This removes everything after '>' but newlines:

    $string =~ s{([^>]+)$}{
        my $rm = $1;
        $rm =~ s/.+//g;
        $rm;
    }e;
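
    The two variants side by side, as a sketch on the sample string
    ($original, $plain and $keep_nl are names made up for this example).
    Both work on the text after the last '>'; if the string contains no
    '>' at all, the pattern matches the whole string.

```perl
use strict;
use warnings;

my $original = "line1\nline2>\nline3";

# Variant 1: drop everything after the last '>', newlines included.
( my $plain = $original ) =~ s/[^>]+$//;

# Variant 2: drop only the non-newline characters after the last '>'.
( my $keep_nl = $original ) =~ s{([^>]+)$}{
    my $rm = $1;
    $rm =~ s/.+//g;    # '.' does not match "\n" without /s
    $rm;
}e;

printf "plain:   %d chars\n", length $plain;
printf "keep_nl: %d chars\n", length $keep_nl;
```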

    --
    Gunnar Hjalmarsson
    Email: http://www.gunnar.cc/cgi-bin/contact.pl
     
    Gunnar Hjalmarsson, Jan 27, 2008
    #3
  4. John W. Krahn Guest

    Dr.Ruud wrote:
    > d99alu wrote:
    >
    >> I have a string, and I want to remove everything behind the ">"
    >> character. The string contains new line characters that I don't want
    >> to remove.

    >
    > s/(?:<=>).*//s;


    ITYM: s/(?<=>).*//s;
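
    To see the difference, here is a small sketch on the sample string
    from the original post ($typo and $fixed are illustrative names).
    The mistyped form is a non-capturing group looking for the literal
    text "<=>", so it never matches here; the look-behind form removes
    everything after the first '>'.

```perl
use strict;
use warnings;

my $string = "line1\nline2>\nline3";

# (?:<=>) is a non-capturing group that looks for the literal text "<=>".
( my $typo = $string ) =~ s/(?:<=>).*//s;

# (?<=>) is a look-behind: the match must start right after a '>'.
( my $fixed = $string ) =~ s/(?<=>).*//s;

print "typo version changed nothing\n" if $typo eq $string;
print "fixed: [$fixed]\n";
```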


    John
    --
    Perl isn't a toolbox, but a small machine shop where you
    can special-order certain sorts of tools at low cost and
    in short order. -- Larry Wall
     
    John W. Krahn, Jan 27, 2008
    #4
  5. Dr.Ruud Guest

    John W. Krahn wrote:
    > Dr.Ruud:
    >> d99alu:


    >>> I have a string, and I want to remove everything behind the ">"
    >>> character. The string contains new line characters that I don't want
    >>> to remove.

    >>
    >> s/(?:<=>).*//s;

    >
    > ITYM: s/(?<=>).*//s;


    Yes. (aaargh, oops again)

    --
    Affijn, Ruud

    "Gewoon is een tijger."
     
    Dr.Ruud, Jan 27, 2008
    #5
  6. Dr.Ruud Guest

    Dr.Ruud wrote:
    > d99alu:


    >> I have a string, and I want to remove everything behind the ">"
    >> character. The string contains new line characters that I don't want
    >> to remove.

    >
    > s/(?:<=>).*//s;
    >
    > See perldoc perlre, search for "look-behind".


    I also forgot the newline. Maybe this does what you need:

    s/(?<=>).*/\n/s;

    (doesn't keep any of the original newlines; even adds one when none was
    there)

    --
    Affijn, Ruud

    "Gewoon is een tijger."
     
    Dr.Ruud, Jan 27, 2008
    #6
  7. Gunnar Hjalmarsson, Jan 27, 2008
    #7
  8. Gunnar Hjalmarsson Guest

    Gunnar Hjalmarsson wrote:
    > d99alu wrote:
    >> I have a string, and I want to remove everything behind the ">"
    >> character. The string contains new line characters that I don't want
    >> to remove.
    >>
    >> my $string = "line1
    >> line2>
    >> line3";
    >>
    >> Why don't I get a match and replacement with this?
    >>
    >> $string =~ s/^([^>]*>)/$1/;

    >
    > It does match, but since you capture everything, and insert the captured
    > string using $1, nothing gets changed.


    I have a feeling that the code above actually is an attempt to do:

    if ( $string =~ /^([^>]*>)/ ) {
        $string = $1;
    }

    That replaces the content of _$string_ with what was captured in the
    regex. However, it's accomplished via the m// operator, while you were
    using the s/// operator.

    I recommend that you read up on both those operators in "perldoc perlop".
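
    A rough sketch of the two routes, assuming the sample string from the
    original post ($via_match and $via_subst are names invented here).
    With m// the assignment does the work; with s/// the pattern itself
    has to consume the part that should disappear.

```perl
use strict;
use warnings;

my $string = "line1\nline2>\nline3";

# m// only tests and captures; the assignment does the actual change.
my $via_match = $string;
if ( $via_match =~ /^([^>]*>)/ ) {
    $via_match = $1;
}

# s/// changes the string itself, so the pattern must also consume the
# tail that should go; /s lets '.' match newlines.
( my $via_subst = $string ) =~ s/^([^>]*>).*/$1/s;

print "same result\n" if $via_match eq $via_subst;
```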

    --
    Gunnar Hjalmarsson
    Email: http://www.gunnar.cc/cgi-bin/contact.pl
     
    Gunnar Hjalmarsson, Jan 27, 2008
    #8
  9. Dr.Ruud Guest

    Gunnar Hjalmarsson wrote:
    > Petr Vileta:


    >> $string =~ s/^([^>]*>).*$/$1/s;

    >
    > The '$' character is redundant after .*


    Yes, in this case (because of the s-modifier) it is.

    $ echo "abcd" |perl -pe 's/(.*)$/"<".++$i."=$1:".length($1).">\n"/ge'
    <1=abcd:4>
    <2=:0>

    <3=:0>


    $ echo "abcd" |perl -pe 's/(.*)$/"<".++$i."=$1:".length($1).">\n"/sge'
    <1=abcd
    :5>
    <2=:0>
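
    The same effect can be reproduced without the shell. This is only a
    sketch ($line, @without_s and @with_s are illustrative names): it
    collects the matches of the pattern in list context and counts them,
    with and without the s-modifier.

```perl
use strict;
use warnings;

my $line = "abcd\n";    # what perl -p sees for the echo above

# In list context, //g with one capture group returns every $1.
my @without_s = $line =~ /(.*)$/g;
my @with_s    = $line =~ /(.*)$/sg;

printf "without /s: %d matches\n", scalar @without_s;
printf "with /s:    %d matches\n", scalar @with_s;
```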

    --
    Affijn, Ruud

    "Gewoon is een tijger."
     
    Dr.Ruud, Jan 28, 2008
    #9