Match a long string in regex

Discussion in 'Perl Misc' started by Alont, Sep 29, 2004.

  1. Alont

    Alont Guest

    Original string (it's all one line):
    javascript:if(confirm('http://validator.w3.org/ \n\nThis file was not
    retrieved by Teleport Pro, because it is addressed on a domain or path
    outside the boundaries set for its Starting Address. \n\nDo you want
    to open it from the server?'))window.location='

    Because the 'http://validator.w3.org/' part varies, I wrote:

    javascript:if\(confirm\('(.*|(\\n)*) \\n\\nThis file was not
    retrieved by Teleport Pro, because it is addressed on a domain or path
    outside the boundaries set for its Starting Address\. \\n\\nDo you
    want to open it from the server?'))window.location='

    I have tried changing it again and again,
    but it never matches. Why?

    This is all the code:

    sub deleteTrash {
    my $original = "javascript:if\(confirm\('(.*|(\\n)*) \\n\\nThis file
    was not retrieved by Teleport Pro, because it is addressed on a domain
    or path outside the boundaries set for its Starting Address\.
    \\n\\nDo you want to open it from the
    server\?'\)\)window\.location='";
    my $body = shift;
    if($body =~ s/$original//g)
    {
    return 0;
    }
    else
    {
    return $body;
    }
    }

    --
    Your fault as a Government is My failure as a Citizen.
     
    Alont, Sep 29, 2004
    #1

  2. Paul Lalli

    Paul Lalli Guest

    "Alont" <> wrote in message
    news:415aa574.30926250@130.133.1.4...
    > I have tried/change again and again,
    > but it can't match, why?
    >
    > this is all the code:
    >
    > sub deleteTrash {
    > my $original = "javascript:if\(confirm\('(.*|(\\n)*) \\n\\nThis file
    > was not retrieved by Teleport Pro, because it is addressed on a domain
    > or path outside the boundaries set for its Starting Address\.
    > \\n\\nDo you want to open it from the
    > server\?'\)\)window\.location='";


    The four embedded \n sequences here actually need to be doubly escaped.
    That is, the string should contain: \\\\n\\\\n rather than \\n\\n. This
    is because the pattern passes through two rounds of backslash
    processing: first double-quotish interpolation, then the regular
    expression engine.
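
    The two rounds of backslash processing can be seen by printing the
    strings before they reach the regex engine. This is only a minimal
    sketch: $text, $once and $twice are illustrative names, not part of
    the original code, and the sample text is shortened.

```perl
use strict;
use warnings;

# The text to find contains a literal backslash followed by "n",
# not a real newline character. q{} leaves \n untouched.
my $text = q{Address. \n\nDo you want};

# Double-quoted: each "\\n" collapses to the two characters \n, which
# the regex engine then reads as "match a newline character".
my $once = "Address\\. \\n\\nDo you";

# Doubling again leaves \\n in the string, which the regex engine reads
# as "match a backslash, then the letter n".
my $twice = "Address\\. \\\\n\\\\nDo you";

print "once:  $once\n";     # prints: once:  Address\. \n\nDo you
print "twice: $twice\n";    # prints: twice: Address\. \\n\\nDo you

print "once matches\n"  if $text =~ /$once/;     # not printed
print "twice matches\n" if $text =~ /$twice/;    # printed
```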

    Additionally, because you are using double quotes to create $original,
    \( is reduced to a plain parenthesis, which the regex engine then
    interprets as "start capture". I would suggest changing to
    single quotes here:
    my $original = q[javascript:if\(confirm\('(.*|\\n*) \\\\n\\\\nThis file
    was not retrieved by Teleport Pro, because it is addressed on a domain
    or path outside the boundaries set for its Starting Address\.
    \\\\n\\\\nDo you want to open it from the
    server\?'\)\)window\.location='];
    #all the above on one line

    One more thing - I don't think you're doing what you think you're doing
    with (.*|\n*). That will match any sequence of non-newlines, or any
    sequence of newlines. It will not match any sequence of characters
    which contains newlines. For that, stick with .* and add the /s switch
    to the pattern match.
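
    The difference between the alternation and .* with /s can be checked
    on a small sample. A minimal sketch, assuming a made-up $body whose
    variable part contains a real line break; $alt and $dot are
    illustrative names:

```perl
use strict;
use warnings;

# Hypothetical sample: the part between the quotes spans a line break.
my $body = "confirm('first line\nsecond line')";

# Each branch has to cover the whole span by itself, so a mix of
# ordinary characters and a newline fits neither branch.
my $alt = $body =~ /confirm\('(.*|\n*)'\)/ ? 1 : 0;

# With /s, "." also matches a newline, so a single .* covers the mix.
my $dot = $body =~ /confirm\('(.*)'\)/s ? 1 : 0;

print "alternation: $alt, dot with /s: $dot\n";
# prints: alternation: 0, dot with /s: 1
```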

    Paul Lalli


    > my $body = shift;
    > if($body =~ s/$original//g)
    > {
    > return 0;
    > }
    > else
    > {
    > return $body;
    > }
    > }
    >
    > --
    > Your fault as a Government is My failure as a Citizen.
     
    Paul Lalli, Sep 29, 2004
    #2

  3. Alont

    Alont Guest

    "Paul Lalli" <>Wrote at Wed, 29 Sep 2004 13:11:43 GMT:
    >
    >The four embedded \n sequences here actually need to be doubly escaped.
    >That is, the string should contain: \\\\n\\\\n rather than \\n\\n. This
    >is because the pattern passes through two rounds of backslash
    >processing: first double-quotish interpolation, then the regular
    >expression engine.
    >
    >Additionally, because you are using double quotes to create $original,
    >\( is reduced to a plain parenthesis, which the regex engine then
    >interprets as "start capture". I would suggest changing to
    >single quotes here:
    >my $original = q[javascript:if\(confirm\('(.*|\\n*) \\\\n\\\\nThis file
    >was not retrieved by Teleport Pro, because it is addressed on a domain
    >or path outside the boundaries set for its Starting Address\.
    >\\\\n\\\\nDo you want to open it from the
    >server\?'\)\)window\.location='];
    >#all the above on one line
    >
    >One more thing - I don't think you're doing what you think you're doing
    >with (.*|\n*). That will match any sequence of non-newlines, or any
    >sequence of newlines. It will not match any sequence of characters
    >which contains newlines. For that, stick with .* and add the /s switch
    >to the pattern match.
    >
    >Paul Lalli
    >


    now it works, thank you :)
    --
    Your fault as a Government is My failure as a Citizen.
     
    Alont, Oct 1, 2004
    #3
  4. Alont wrote:
    > original string:(it's one line):
    > javascript:if(confirm('http://validator.w3.org/ \n\nThis file was not
    > retrieved by Teleport Pro, because it is addressed on a domain or path
    > outside the boundaries set for its Starting Address. \n\nDo you want
    > to open it from the server?'))window.location='
    >
    > because 'http://validator.w3.org/' is variable, so I wrote:
    >
    > javascript:if\(confirm\('(.*|(\\n)*) \\n\\nThis file was not
    > retrieved by Teleport Pro, because it is addressed on a domain or path
    > outside the boundaries set for its Starting Address\. \\n\\nDo you
    > want to open it from the server?'))window.location='
    >
    > I have tried/change again and again,
    > but it can't match, why?
    >
    > this is all the code:
    >
    > sub deleteTrash {
    > my $original = "javascript:if\(confirm\('(.*|(\\n)*) \\n\\nThis file
    > was not retrieved by Teleport Pro, because it is addressed on a domain
    > or path outside the boundaries set for its Starting Address\.
    > \\n\\nDo you want to open it from the
    > server\?'\)\)window\.location='";
    > my $body = shift;
    > if($body =~ s/$original//g)
    > {
    > return 0;
    > }
    > else
    > {
    > return $body;
    > }
    > }
    >


    A day late and a dollar short

    FYI
    What I've found to help me GREATLY with large complex regexp is to use
    the x option. It makes reading MUCH easier.

    Instead of having
    $line =~ /^\s*(some_regex) (another_regex) .../;   (you get the picture)

    I like
    $line =~ /^\s*
    (some_regex)\s+ # looking for this
    (another_regex)\s+ # looking for that
    ...
    /x;

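    I've had if/then/else chains where the regexes fill one complete page.
    Without the x option they would be almost impossible to write or debug.
    What the x does is
    1. ignore whitespace in the pattern (unless it is escaped or inside a
       character class)
    2. ignore comments (from # to the end of the line)
    3. ignore line breaks in the pattern

    For big hairy regexps, use the x option.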
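
    Applied to the long string from the start of this thread, the x
    option could look like the sketch below. It is not code from the
    thread: strip_teleport_link, $teleport_re and $sample are hypothetical
    names, and the non-greedy .*? is an assumption about how the variable
    address should be matched. Because x ignores bare spaces, every
    literal space is written as "\ ".

```perl
use strict;
use warnings;

# Pattern for the Teleport Pro wrapper. "\\n" matches a backslash
# followed by the letter n, which is what the page text contains.
my $teleport_re = qr{
    javascript:if\(confirm\('
    .*?                        # the address, which varies
    \ \\n\\n                   # space, then backslash-n twice
    This\ file\ was\ not\ retrieved\ by\ Teleport\ Pro,
    \ because\ it\ is\ addressed\ on\ a\ domain\ or\ path
    \ outside\ the\ boundaries\ set\ for\ its\ Starting\ Address\.
    \ \\n\\n
    Do\ you\ want\ to\ open\ it\ from\ the\ server\?
    '\)\)window\.location='
}xs;

# Return a copy of the text with every wrapper removed.
sub strip_teleport_link {
    my ($body) = @_;
    (my $clean = $body) =~ s/$teleport_re//g;
    return $clean;
}

# q{} keeps the \n sequences as literal backslash-n pairs.
my $sample = q{<a href="javascript:if(confirm('http://validator.w3.org/ \n\nThis file was not retrieved by Teleport Pro, because it is addressed on a domain or path outside the boundaries set for its Starting Address. \n\nDo you want to open it from the server?'))window.location='http://validator.w3.org/'">};

print strip_teleport_link($sample), "\n";
# prints: <a href="http://validator.w3.org/'">
```

    The stray quote left after the address is expected here, since the
    pattern from the thread stops at the opening quote of window.location.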

    --
    ___ _ ____ ___ __ __
    / _ )(_) / /_ __ / _ \___ _/ /_/ /____ ___
    / _ / / / / // / / ___/ _ `/ __/ __/ _ \/ _ \
    /____/_/_/_/\_, / /_/ \_,_/\__/\__/\___/_//_/
    /___/
    Texas Instruments ASIC Circuit Design Methodology Group
    Dallas, Texas, 214-480-4455,
     
    Billy N. Patton, Oct 4, 2004
    #4
  5. Billy N. Patton <> wrote:

    > What I've found to help me GREATLY with large complex regexp is to use
    > the x option. It makes reading MUCH easier.



    Which is why it will be "on" by default in Perl 6.


    --
    Tad McClellan SGML consulting
    Perl programming
    Fort Worth, Texas
     
    Tad McClellan, Oct 4, 2004
    #5
