Stupid regex problem, s/// catching extra letter

Discussion in 'Perl Misc' started by Jason C, Jul 18, 2012.

  1. Jason C

    Jason C Guest

    I know better than to work late at night, but sometimes it just can't be helped :)

    I'm doing a simple s///, converting "www." to "http://www." when "www." occurs without a preceding "http://". Here's what I'm doing:

    $text = "www.example.com";
    $text =~ s#[^(http://)]www\.#http://www\.#gi;
    print $text;

    If $text is this, though:

    $text = "<div>www.example.com</div>";

    the regex is catching the > in <div>, printing:

    <divhttp://www.example.com</div>

    Where am I screwing up?
     
    Jason C, Jul 18, 2012
    #1

  2. Jason C

    Jason C Guest

    On Wednesday, July 18, 2012 12:57:00 AM UTC-4, thepoet wrote:
    > What you're trying to do is a zero width negative look-behind
    > assertion.
    > s#(?<!http://)www\.#http://www.#gi should do the trick.
    > The "(?<!...)" tells the regex engine to only match the following
    > pattern if it is not preceded by the pattern in the look-behind,
    > without capturing anything.
    >
    > "perldoc perlre" has good explanations for character classes
    > and look-around assertions.
    >
    > -Chris


    Thanks for the help, Chris. Character classes aren't exactly intuitive when a symbol changes definition completely based on context, so I'm still struggling with that a little.

    The modification you suggested was perfect, though! Thanks again :)
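
For readers following along, here is a minimal sketch of the look-behind substitution quoted above, run against the inputs from the first post. The `add_scheme` helper name is hypothetical and only there to make the example easy to call; the pattern itself is the one Chris suggested.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): prefix bare "www." with "http://".
sub add_scheme {
    my ($text) = @_;
    # (?<!http://) is zero-width: it only checks what precedes "www."
    # and consumes nothing, so the character before "www." is left alone.
    $text =~ s#(?<!http://)www\.#http://www.#gi;
    return $text;
}

print add_scheme("www.example.com"), "\n";             # http://www.example.com
print add_scheme("<div>www.example.com</div>"), "\n";  # <div>http://www.example.com</div>
print add_scheme("http://www.example.com"), "\n";      # unchanged
```

One caveat, assuming the input may also contain secure links: the look-behind only excludes "http://", so "https://www.example.com" would still be rewritten. A second assertion such as `(?<!https://)` placed next to the first would be one way to cover that case.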
     
    Jason C, Jul 18, 2012
    #2

  3. Jason C writes:
    > Thanks for the help, Chris. Character classes aren't exactly
    > intuitive when a symbol changes definition completely based on
    > context, so I'm still struggling with that a little.


    A character class denotes an unordered set of characters, meaning

    [^http://]
    [^htp:/]
    [^:pppppth/]
    [^:/hpt]
    [^h:t/p]

    all represent the same set, and each matches exactly one character
    that is not h, t, p, : or /. But you wanted to match the string
    http://, and a regex matching a string is just the string itself,
    in other words, THIS sequence of characters.
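
The point about unordered sets can be checked directly. The sketch below probes each of the five classes listed above with a few sample characters, then reruns the pattern from the first post to show where the '>' goes. The `accepted_by` helper and the choice of probe characters are assumptions made for illustration.

```perl
use strict;
use warnings;

# The five classes from the post above, kept as strings so they can be printed.
my @classes = ('[^http://]', '[^htp:/]', '[^:pppppth/]', '[^:/hpt]', '[^h:t/p]');

# Hypothetical helper: return the probe characters that a class matches.
sub accepted_by {
    my ($class, @probes) = @_;
    my $re = qr/\A$class\z/;    # the class must match exactly one character
    return join '', grep { /$re/ } @probes;
}

# Every class rejects h, t, p, : and / and accepts anything else.
for my $class (@classes) {
    print "$class accepts: ", accepted_by($class, 'h', 't', 'p', ':', '/', '>', 'w'), "\n";
}

# The pattern from the first post: the class consumes the '>' before "www.",
# and the replacement does not put it back.
my $text = "<div>www.example.com</div>";
(my $broken = $text) =~ s#[^(http://)]www\.#http://www.#gi;
print "$broken\n";    # <divhttp://www.example.com</div>
```

Because the class has to consume one character, the same pattern also leaves a string that starts with "www." untouched: there is nothing in front of it for the class to match.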
     
    Rainer Weikusat, Jul 18, 2012
    #3
