Simple REGEX-Question

Discussion in 'Perl Misc' started by Reinhard Glauber, Jan 21, 2006.

  1. OK, I have read the perldoc about regexes, but I don't understand (maybe because I'm German ;-) )
    the thing with the question mark.

    I am searching for 'München' in a string and also want the 4 lines before it.

    I was told to use:

    $html =~ /(.*\n)(?:.*\n){4}.*München /;


    but why not simply say:

    $html =~ /(.*\n){4}.*München /;

    thanks
     
    Reinhard Glauber, Jan 21, 2006
    #1

  2. Xicheng Guest

    Reinhard Glauber wrote:
    > OK, I have read the perldoc about regexes, but I don't understand (maybe because I'm German ;-) )
    > the thing with the question mark.
    >
    > I am searching for 'München' in a string and also want the 4 lines before it.
    >
    > I was told to use:
    >
    > $html =~ /(.*\n)(?:.*\n){4}.*München /;

    That is not quite right: it puts 6 lines into $&, and the last one
    ends at 'München ' (without the trailing newline "\n"). I guess this
    is not what you wanted.
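
    The point above can be checked with a small sketch. The sample data is
    made up for illustration, and the offsets in @- and @+ are used only to
    look at the matched text without touching $&:

```perl
use strict;
use warnings;

# Hypothetical sample data: 'München ' appears on line 6 of 7.
my $html = join '', map { "line $_\n" } 1 .. 5;
$html .= "line 6 München and more\nline 7\n";

my ( $whole, $first, @lines );
if ( $html =~ /(.*\n)(?:.*\n){4}.*München / ) {
    $first = $1;                                     # only the first line
    $whole = substr $html, $-[0], $+[0] - $-[0];     # same text as $&
    @lines = split /\n/, $whole;
}

print scalar(@lines), " lines in the whole match\n"; # 6 lines in the whole match
print "first capture: $first";                       # first capture: line 1
print "last piece: '$lines[-1]'\n";                  # last piece: 'line 6 München '
```

    So the whole match spans five complete lines plus the start of the
    'München ' line, while $1 holds only the first of them.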

    > but why not simply say:
    > $html =~ /(.*\n){4}.*München /;

    If you need exactly 5 lines in total, including the full contents of
    the 'München ' line, you may try something like:
    $html =~ /((?:.*\n){4}.*München.*\n)/ and print $1;
    Better to use a backreference instead of '$&' to print your data.
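
    As for the "?:" in the original question, here is a minimal sketch with
    made-up sample data. It compares a capturing group under {4} with the
    non-capturing (?:...) form wrapped in one outer capture:

```perl
use strict;
use warnings;

# Hypothetical sample data: 'München ' sits on the fifth line.
my $html = "a\nb\nc\nd\ne München here\nf\n";

# Capturing group under {4}: $1 keeps only the last repetition.
my ($last) = $html =~ /(.*\n){4}.*München /;
print "capturing: $last";            # capturing: d

# (?:...) only groups; the outer parentheses capture all five lines.
my ($block) = $html =~ /((?:.*\n){4}.*München.*\n)/;
print "non-capturing:\n$block";      # lines a, b, c, d and 'e München here'
```

    A repeated capturing group is overwritten on each repetition, which is
    why the plain (.*\n){4} version does not hand back all four lines in $1.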

    Xicheng
     
    Xicheng, Jan 21, 2006
    #2

  3. Anno Siegel Guest

    Xicheng wrote in comp.lang.perl.misc:
    > Reinhard Glauber wrote:
    > > OK, I have read the perldoc about regexes, but I don't understand (maybe because I'm German ;-) )
    > > the thing with the question mark.
    > >
    > > I am searching for 'München' in a string and also want the 4 lines before it.
    > >
    > > I was told to use:
    > >
    > > $html =~ /(.*\n)(?:.*\n){4}.*München /;

    > That is not quite right: it puts 6 lines into $&, and the last one
    > ends at 'München ' (without the trailing newline "\n"). I guess this
    > is not what you wanted.
    >
    > > but why not simply say:
    > > $html =~ /(.*\n){4}.*München /;

    > If you need exactly 5 lines in total, including the full contents of
    > the 'München ' line, you may try something like:
    > $html =~ /((?:.*\n){4}.*München.*\n)/ and print $1;
    > Better to use a backreference instead of '$&' to print your data.


    Terminology alert!

    "Backreference" refers to the use of the escapes "\1", "\2", etc. inside
    a regex to refer back to earlier captures in the current match. The
    variables $1, $2, etc. are variously called "capture variables", "digit
    variables" or other things, but "backreference" is best reserved for the
    escaped form.
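
    A small sketch of the difference, using a made-up input string with a
    doubled word (nothing here comes from the original poster's data):

```perl
use strict;
use warnings;

# Hypothetical input containing a doubled word.
my $text = "this is is a test";

my $doubled;
# \1 is a backreference: inside the regex it must match the same
# text that capture group 1 matched earlier in this match.
if ( $text =~ /\b(\w+) \1\b/ ) {
    # $1 is a capture variable: it is read after the match succeeded.
    $doubled = $1;
    print "doubled word: $doubled\n";   # doubled word: is
}
```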

    Anno
    --
    If you want to post a followup via groups.google.com, don't use
    the broken "Reply" link at the bottom of the article. Click on
    "show options" at the top of the article, then click on the
    "Reply" at the bottom of the article headers.
     
    Anno Siegel, Jan 23, 2006
    #3