Can somebody explain?

Discussion in 'Perl Misc' started by Vittal, Aug 10, 2004.

  1. Vittal

    Vittal Guest

    Hello All,

    I am new to Perl and have been going through some Perl code.

    In one of the files I found the following snippet:

    $temp = qr{
    \(
    (?:
    (?>[^()]+ )
    |
    (??{ $temp })
    )*
    \)
    }x;

    $cchat = qr/((\W)?(\*?\*?\w+)\s*($temp))/;

    Can somebody explain to me what these two lines do?

    Thanks
    -Vittal
     
    Vittal, Aug 10, 2004
    #1

  2. Vittal wrote:
    > In one of the file I found the following snippet:
    >
    > $temp = qr{
    > \(
    > (?:
    > (?>[^()]+ )
    > |
    > (??{ $temp })
    > )*
    > \)
    > }x;
    >
    > $cchat = qr/((\W)?(\*?\*?\w+)\s*($temp))/;
    >
    > Can somebody explain me what these two lines do?


    Can't you read the explanation in the file where you found the code?

    Just a thought.

    Otherwise, the first (and the most complicated) part, is explained in
    "perldoc perlre":
    http://www.perldoc.com/perl5.8.4/pod/perlre.html#(--{-code-})

    --
    Gunnar Hjalmarsson
    Email: http://www.gunnar.cc/cgi-bin/contact.pl
     
    Gunnar Hjalmarsson, Aug 10, 2004
    #2

  3. Vittal wrote:
    > I am new to Perl and I have been going through some of the perl code.
    >
    > In one of the file I found the following snippet:
    >
    > $temp = qr{
    > \(
    > (?:
    > (?>[^()]+ )
    > |
    > (??{ $temp })
    > )*
    > \)
    > }x;


    This defines a precompiled regex such that the pattern /$temp/ will match a
    string starting with a '(' and ending at the _matching_ ')' (i.e.
    there can be any number of nested (...) in between).

    To understand why, you'd have to understand the (??{...}), (?:...) and
    (?>...) regex constructs. These are explained better in 'perlre' than I
    could do, so look there.

    The qr// operator is used to define a precompiled regex and is documented
    in perlop. (Note that, as with any quote-like operator in Perl, the so-called
    qr// operator can use alternate delimiters, so it can also be written qr{}.)

    The /x regex modifier on the end of the qr// makes unescaped whitespace
    inside the regex non-significant, which allows the regex to be laid out more
    readably.
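
    As a rough illustration of the points above, here is the same pattern
    laid out with /x comments and tried against two strings. The "my"
    declaration, the sample strings and the variable names $first and
    $second are assumptions made for this sketch; they are not from the
    original file.

```perl
use strict;
use warnings;

# Declare the variable first so the (??{ ... }) block can refer to it.
my $temp;
$temp = qr{
    \(                  # an opening parenthesis
    (?:                 # group without capturing
        (?> [^()]+ )    # a run of non-parenthesis characters, no backtracking
      |
        (??{ $temp })   # or a nested balanced group, matched recursively
    )*
    \)                  # the matching closing parenthesis
}x;

# Made-up sample text: the match starts at the first "(" and stops at
# the ")" that balances it, not at the first ")" it sees.
my ($first) = 'f(a, g(b, c), d) and (unrelated)' =~ /($temp)/;
print "$first\n";     # prints "(a, g(b, c), d)"

# An unbalanced "(" cannot start a match, so the inner group is found.
my ($second) = 'a (b (c)' =~ /($temp)/;
print "$second\n";    # prints "(c)"
```

    The captures are taken in list context here rather than through $&,
    which is only a stylistic choice for the sketch.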


    > $cchat = qr/((\W)?(\*?\*?\w+)\s*($temp))/;


    This defines another pre-compiled regex that matches something followed
    by a balanced (...).

    The "something" is rather odd but there's no point trying to explain it.
    All I'd be doing is telling you what each of the constructs means.
    Much better to just look up the constructs used in perlre. If there's
    something you don't understand then come back and say what it is that
    you don't understand.
     
    Brian McCauley, Aug 10, 2004
    #3
  4. On Tue, 10 Aug 2004, Brian McCauley wrote:

    >> $cchat = qr/((\W)?(\*?\*?\w+)\s*($temp))/;

    >
    >This defines another pre-compiled regex that matches something followed
    >by a balanced (...).
    >
    >The "something" is rather odd but there's no point trying to explain it.


    I'd say it looks like it's matching a C function definition (although
    ignoring the return *type* of the function, and only capturing the level
    of dereferencing needed). But it does a poor job; it's too specific in
    its formulation.
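
    To make that guess concrete, here is a small sketch that runs $cchat
    against a C-style function definition. The C line and the variable
    names $c_line, $name and $args are invented for this example, so treat
    it as an assumption about the intended input, not as what the original
    script actually processed.

```perl
use strict;
use warnings;

my $temp;
$temp = qr{ \( (?: (?> [^()]+ ) | (??{ $temp }) )* \) }x;
my $cchat = qr/((\W)?(\*?\*?\w+)\s*($temp))/;

# Hypothetical C source line, made up for this example.
my $c_line = 'static char **split_words(const char *s, int n) {';

my ($name, $args);
if ($c_line =~ $cchat) {
    # Group 3 holds the asterisks plus the name, group 4 the argument list.
    ($name, $args) = ($3, $4);
    print "name: $name\n";    # prints "name: **split_words"
    print "args: $args\n";    # prints "args: (const char *s, int n)"
}
```

    Note that the return type ("static char") is skipped entirely, which
    fits the observation that only the level of dereferencing is captured.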

    --
    Jeff "japhy" Pinyan % How can we ever be the sold short or
    RPI Acacia Brother #734 % the cheated, we who for every service
    RPI Corporation Secretary % have long ago been overpaid?
    http://japhy.perlmonk.org/ %
    http://www.perlmonks.org/ % -- Meister Eckhart
     
    Jeff 'japhy' Pinyan, Aug 10, 2004
    #4
  5. J. Romano

    J. Romano Guest

    (Vittal) wrote in message news:<>...
    >
    > In one of the file I found the following snippet:
    >
    > $temp = qr{
    > \(
    > (?:
    > (?>[^()]+ )
    > |
    > (??{ $temp })
    > )*
    > \)
    > }x;
    >
    > $cchat = qr/((\W)?(\*?\*?\w+)\s*($temp))/;
    >
    > Can somebody explain me what these two lines do?


    Dear Vittal,

    The good news is that the programmer who wrote that script did not
    create the first line him/herself. He/She just copied it straight
    from the "perlre" documentation. To read what it does, type "perldoc
    perlre" at a DOS or Unix prompt and search for the word "postponed".
    You'll see the exact same piece of code and the explanation that this
    regular expression matches a parenthesized group.

    In other words, you can use $temp like this:

    if ("I saw a color (blue)." =~ m/$temp/)
    {
        print $&; # this prints the match "(blue)"
    }

    It will also work with nested parentheses, like this:

    if ("I already ate (I ate one (1) pizza)." =~ m/$temp/)
    {
        print $&; # this prints "(I ate one (1) pizza)"
    }

    The bad news is that the programmer who wrote that script didn't feel
    the need to add comments to explain the purpose of those regular
    expressions. It has been said that it's almost always easier to
    create your own regular expressions than to understand one that's
    already written, and in this case that's definitely true. If it
    wasn't for the fact that the first line was specifically mentioned in
    "perldoc perlre", I'm sure that I would not have been able to figure
    out what it was looking for.

    The second regular expression is a little easier to figure out.
    Let's take it a bit at a time:

    > $cchat = qr/((\W)?(\*?\*?\w+)\s*($temp))/;


    The only parts of the $cchat regular expression that are not optional
    are the \w+ part (which matches at least one "word" character (that
    is, a letter, digit, or underscore)) and $temp, which matches a
    parenthesized expression. Optionally, there may be whitespace (any
    amount) between the non-optional parts. Also, there may be one or two
    asterisks before the \w+ part. There could also be an optional
    non-"word" character before the word characters (which would appear
    before the asterisks, if the asterisks happen to exist).
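
    The description above can be checked with a short sketch that prints
    what each capture group picks up. The sample lines and the variable
    names are made up for illustration. One thing it shows is that a
    leading asterisk with nothing in front of it is taken by the optional
    \W group, since "*" is itself a non-word character.

```perl
use strict;
use warnings;

my $temp;
$temp = qr{ \( (?: (?> [^()]+ ) | (??{ $temp }) )* \) }x;
my $cchat = qr/((\W)?(\*?\*?\w+)\s*($temp))/;

# Hypothetical sample lines, not taken from the original script.
for my $line ('some_text(x)', '%*some_text (x)', '**some_text (x)', 'some_text') {
    # In list context the match returns groups 1 to 4, or nothing on failure.
    my ($whole, $prefix, $name, $parens) = $line =~ $cchat;
    if (defined $whole) {
        $prefix = '' unless defined $prefix;
        print "prefix='$prefix' name='$name' parens='$parens'\n";
    }
    else {
        print "no match: $line\n";
    }
}
# prints:
#   prefix='' name='some_text' parens='(x)'
#   prefix='%' name='*some_text' parens='(x)'
#   prefix='*' name='*some_text' parens='(x)'
#   no match: some_text
```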

    Was my explanation confusing? If you didn't think so, then you're
    a super genius. I didn't expect it to be very easy to follow (as they
    say, it's not easy to understand a regular expression that
    you didn't write), so I always recommend writing a few comments (for
    any non-simple regular expression) that show a few sample matches.
    The writer of that program you're reading should have included a few
    comments like this:

    # The $temp regular expression was taken right out of
    # the "perldoc perlre" documentation. It matches a
    # parenthesized expression (that may or may not contain
    # nested parentheses):
    $temp = ... ;

    # The purpose of the $cchat regular expression is ...
    # It matches all of the following lines:
    # some_text(parenthesized expression)
    # some_text (parenthesized expression)
    # *some_text(parenthesized expression)
    # **some_text (parenthesized expression)
    # %*some_text (parenthesized expression)
    # ^**some_text(parenthesized expression)
    # &some_text (parenthesized expression)
    # !some_text(parenthesized expression)
    $cchat = qr/((\W)?(\*?\*?\w+)\s*($temp))/;

    Because the original programmer didn't explain the purpose of the
    $cchat regular expression, it's difficult for us to figure it out for
    sure. The closest we can come to figuring it out is to examine sample
    matches and deduce the purpose from there.

    If you ever add more code to this program, do yourself and the
    future maintainers of the program a favor and add comments to document
    your regular expressions. Include the purpose of the regular
    expression (in plain English or whatever language is the main language
    spoken at work) and include a few sample matches (because sometimes
    looking at sample matches helps a person understand much better than
    looking at the regular expression itself). It also helps the
    debugging process a lot.
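
    One possible way to follow this advice is to put the comments inside
    the pattern itself with the /x modifier. The sketch below is only an
    illustration: the stated purposes and the sample matches in the
    comments are my reading of the pattern, not something the original
    author documented.

```perl
use strict;
use warnings;

# Matches a balanced, possibly nested, parenthesized group.
# Taken from the "postponed regular subexpression" example in perlre.
#   sample matches:  (blue)   (I ate one (1) pizza)
my $temp;
$temp = qr{ \( (?: (?> [^()]+ ) | (??{ $temp }) )* \) }x;

# Matches a name followed by a parenthesized group.
#   sample matches:  some_text(x)   *some_text (x)   &some_text(x)
my $cchat = qr{
    (                   # capture 1: the whole match
        (\W)?           # capture 2: optional single non-word character
        (\*?\*?\w+)     # capture 3: up to two asterisks, then the name
        \s*             # optional whitespace
        ($temp)         # capture 4: the balanced parenthesized group
    )
}x;

# Made-up input to show the documented captures in use.
if ('x = *foo (a (b) c);' =~ $cchat) {
    print "name: $3\n";      # prints "name: *foo"
    print "parens: $4\n";    # prints "parens: (a (b) c)"
}
```

    With /x, the whitespace and comments inside the pattern are ignored by
    the regex engine, so this should behave the same as the one-line form.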

    When you write code, please put comments in it that explain
    to anyone who didn't write the code what the code is doing and its
    purpose. A lot of coders avoid doing this, giving many excuses as to
    why they shouldn't. Some of the excuses are:

    * I don't need to include comments because I write
    non-esoteric (simple) code that anyone can understand.

    * Comments eventually become outdated (and outdated comments
    are wrong) and wrong comments are worse than no comments
    at all (because they are misleading).

    * I don't need to include comments because I write
    "self-documenting" code. Comments are a sign that the
    code is impossible to understand without outside help,
    and I don't write code like that.

    Don't fall into those traps! I may be offending some die-hard
    programmers here who adhere to one or more of the above traps I
    listed, but I sincerely believe that comments and documentation are
    vital to writing programs (especially when writing programs that will
    be read by other people) -- even if the comments and documentation
    become outdated (outdated comments and documentation may not be
    correct in everything they say, but at least they provide important
    hints to anyone trying to understand, debug, and maintain the code).

    I hope this helps, Vittal.

    -- Jean-Luc Romano
     
    J. Romano, Aug 11, 2004
    #5
  6. Anno Siegel

    Anno Siegel Guest

    J. Romano <> wrote in comp.lang.perl.misc:
    > (Vittal) wrote in message
    > news:<>...


    [...]

    > When you write code, please put comments in your code that explains
    > to anyone who didn't write the code what the code is doing and its
    > purpose. A lot of coders avoid doing this, giving many excuses as to
    > why they shouldn't. Some of the excuses are:
    >
    > * I don't need to include comments because I write
    > non-esoteric (simple) code that anyone can understand.
    >
    > * Comments eventually become outdated (and outdated comments
    > are wrong) and wrong comments are worse than no comments
    > at all (because they are misleading).
    >
    > * I don't need to include comments because I write
    > "self-documenting" code. Comments are a sign that the
    > code is impossible to understand without outside help,
    > and I don't write code like that.
    >
    > Don't fall into those traps! I may be offending some die-hard
    > programmers


    I am one of the die-hard programmers who has occasionally offered points
    of view that resemble those you misrepresent as "excuses" and "traps".

    > here who adhere to one or more of the above traps I
    > listed, but I sincerely believe that comments and documentation are
    > vital to writing programs (especially when writing programs that will
    > be read by other people) -- even if the comments and documentation
    > become outdated (outdated comments and documentation may not be
    > correct in everything they say, but at least they provide important
    > hints to anyone trying to understand, debug, and maintain the code).


    What is offensive is not your opposition but your misrepresentation.

    Let me first set the scope. We are talking about *comments*, (not
    documentation in general, as you chose to drag in), and, specifically,
    I'm talking about micro-commenting single statements of code. So-called
    block comments (as might precede a sub definition or a group of such)
    are another issue altogether. Further, we are talking about comments
    in Perl, or a similarly high-level language.

    Within that scope, I maintain that comments should be the rare exception,
    not the rule.

    What you present as a stance of hubris ("I don't write code like that")
    is really an exhortation not to write code that needs comments. That may
    not be possible in assembler, but in Perl and similar languages it is.
    Perl has complex data structures that can be treated as units and all
    house-keeping (length of strings, number of elements in an array, keys
    of a hash, etc.) is taken care of.

    That allows a programmer to work in units that are meaningful in terms
    of the overall process and not fiddle with stuff below that level. Usually,
    what you do on the process level is obvious. If you feel the need to
    explain some code, that is usually a sign that you haven't found the
    right data structure and/or algorithm yet. So don't paste it over
    with an explanatory comment, rewrite it so that it doesn't need one.

    Anno
     
    Anno Siegel, Aug 11, 2004
    #6
  7. Vittal

    Vittal Guest

    Hello Jean,

    Thank you very much for replying in detail.

    Yes, the person who wrote the code has not given many comments
    other than saying "matches the parenthesis".

    I tried to break the two statements into smaller chunks but could not.
    These two statements took my night's sleep away :)

    Now I have a little understanding of these lines.

    Thanks again!
    -Vittal
     
    Vittal, Aug 12, 2004
    #7