In one of the files I found the following snippet:
$temp = qr{
    \(
    (?:
        (?> [^()]+ )
      |
        (??{ $temp })
    )*
    \)
}x;
$cchat = qr/((\W)?(\*?\*?\w+)\s*($temp))/;
Can somebody explain to me what these two lines do?
Dear Vittal,
The good news is that the programmer who wrote that script did not
create the first line him/herself. He/She just copied it straight
from the "perlre" documentation. To read what it does, type "perldoc
perlre" at a DOS or Unix prompt and search for the word "postponed".
You'll see the exact same piece of code and the explanation that this
regular expression matches a parenthesized group.
In other words, you can use $temp like this:
if ("I saw a color (blue)." =~ m/$temp/)
{
    print $&; # this prints the match "(blue)"
}
It will also work with nested parentheses, like this:
if ("I already ate (I ate one (1) pizza)." =~ m/$temp/)
{
    print $&; # this prints "(I ate one (1) pizza)"
}
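For anyone who wants to try this out, here is a minimal, self-contained sketch of the two tests above. The helper name first_group is made up for this illustration and is not part of the original script. $temp is declared with "our" on the assumption that the script runs under "use strict"; that keeps it a package variable the (??{ ... }) block can see at match time.

```perl
use strict;
use warnings;

# Package variable, so the postponed subexpression (??{ $temp })
# can look it up when the match actually runs.
our $temp;
$temp = qr{
    \(
    (?:
        (?> [^()]+ )      # a run of non-parenthesis characters
      |
        (??{ $temp })     # or a nested parenthesized group
    )*
    \)
}x;

# Hypothetical helper: returns the first balanced parenthesized
# group found in $text, or undef if there is none.
sub first_group {
    my ($text) = @_;
    return $text =~ /($temp)/ ? $1 : undef;
}

print first_group("I saw a color (blue)."), "\n";                 # (blue)
print first_group("I already ate (I ate one (1) pizza)."), "\n";  # (I ate one (1) pizza)
```

Using a capture group and $1 instead of $& is only a style choice here; both give the same text for these inputs.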
The bad news is that the programmer who wrote that script didn't feel
the need to add comments to explain the purpose of those regular
expressions. It has been said that it's almost always easier to
create your own regular expressions than to understand one that's
already written, and in this case that's definitely true. If it
weren't for the fact that the first line was specifically mentioned in
"perldoc perlre", I'm sure that I would not have been able to figure
out what it was looking for.
The second regular expression is a little easier to figure out.
Let's take it a bit at a time:
$cchat = qr/((\W)?(\*?\*?\w+)\s*($temp))/;
The only parts of the $cchat regular expression that are not optional
are the \w+ part, which matches at least one "word" character (that
is, a letter, digit, or underscore), and $temp, which matches a
parenthesized expression. Optionally, there may be whitespace (any
amount) between those two non-optional parts. Also, there may be one
or two asterisks before the \w+ part. Finally, there may be a single
non-"word" character before the word characters (it would appear
before the asterisks, if the asterisks happen to exist).
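To see which piece of the text lands in which capture group, a small sketch like the following may help. The helper cchat_parts, its hash keys, and the sample strings are invented for this illustration; only the two regular expressions come from the script.

```perl
use strict;
use warnings;

our $temp;
$temp = qr{ \( (?: (?> [^()]+ ) | (??{ $temp }) )* \) }x;
my $cchat = qr/((\W)?(\*?\*?\w+)\s*($temp))/;

# Hypothetical helper: returns the four captures as a hash reference,
# or nothing at all when $cchat does not match.
sub cchat_parts {
    my ($text) = @_;
    return unless $text =~ $cchat;
    return { whole => $1, prefix => $2, name => $3, group => $4 };
}

my $parts = cchat_parts("x = %**foo (a (b) c);");
for my $key (qw(whole prefix name group)) {
    my $value = defined $parts->{$key} ? $parts->{$key} : '(undef)';
    print "$key: $value\n";
}
# whole:  %**foo (a (b) c)
# prefix: %
# name:   **foo
# group:  (a (b) c)
```

One thing to watch for: an asterisk is itself a non-"word" character, so when nothing else comes before the asterisks, the optional (\W) grabs the first one. For the input "**foo(x)" the prefix is "*" and the name is "*foo". When there is no leading non-word character at all, as in "foo(x)", the prefix capture is undefined.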
Was my explanation confusing? If you didn't think so, then you're
a super genius. I didn't expect it to be very easy to follow (as they
say, it's not very easy to understand a regular expression that
you didn't write), so I always recommend writing a few comments (with
any non-simple regular expression) that show a few sample matches.
The writer of that program you're reading should have included a few
comments like this:
# The $temp regular expression was taken right out of
# the "perldoc perlre" documentation. It matches a
# parenthesized expression (that may or may not contain
# nested parentheses):
$temp = ... ;
# The purpose of the $cchat regular expression is ...
# It matches all of the following lines:
# some_text(parenthesized expression)
# some_text (parenthesized expression)
# *some_text(parenthesized expression)
# **some_text (parenthesized expression)
# %*some_text (parenthesized expression)
# ^**some_text(parenthesized expression)
# &some_text (parenthesized expression)
# !some_text(parenthesized expression)
$cchat = qr/((\W)?(\*?\*?\w+)\s*($temp))/;
Because the original programmer didn't explain the purpose of the
$cchat regular expression, it's difficult for us to figure it out for
sure. The closest we can come to figuring it out is to examine sample
matches and deduce the purpose from there.
If you ever add more code to this program, do yourself and the
future maintainers of the program a favor and add comments to document
your regular expressions. Include the purpose of the regular
expression (in plain English or whatever language is the main language
spoken at work) and include a few sample matches (because sometimes
looking at sample matches helps a person understand much better than
looking at the regular expression itself). It also helps the
debugging process a lot.
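As one possible way of doing that, the /x modifier lets the comments live inside the regular expression itself. This is only a sketch of how $cchat might be documented: the purpose line is left as a placeholder because the original script never states it, and the sample lines are invented.

```perl
use strict;
use warnings;

our $temp;
$temp = qr{ \( (?: (?> [^()]+ ) | (??{ $temp }) )* \) }x;

# Purpose: (fill in; the original script does not say)
# Sample matches:
#   some_text(parenthesized expression)
#   **some_text (parenthesized (nested) expression)
#   %*some_text (parenthesized expression)
my $cchat = qr/
    (                      # capture 1: the whole match
        (\W)?              # capture 2: one optional non-word character
        ( \*? \*? \w+ )    # capture 3: up to two asterisks, then word characters
        \s*                # optional whitespace
        ( $temp )          # capture 4: a balanced parenthesized group
    )
/x;

# Check that the documented samples really do match.
for my $line ("some_text(parenthesized expression)",
              "**some_text (parenthesized (nested) expression)",
              "%*some_text (parenthesized expression)") {
    print "matched: $1\n" if $line =~ $cchat;
}
```

Be careful with comments inside a /x pattern: variables are still interpolated there, and the pattern delimiter still ends the pattern, so the comments above avoid dollar signs and slashes.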
When you write code, please put comments in your code that explain
to anyone who didn't write the code what the code is doing and its
purpose. A lot of coders avoid doing this, giving many excuses as to
why they shouldn't. Some of the excuses are:
* I don't need to include comments because I write
non-esoteric (simple) code that anyone can understand.
* Comments eventually become outdated (and outdated comments
are wrong) and wrong comments are worse than no comments
at all (because they are misleading).
* I don't need to include comments because I write
"self-documenting" code. Comments are a sign that the
code is impossible to understand without outside help,
and I don't write code like that.
Don't fall into those traps! I may be offending some die-hard
programmers here who adhere to one or more of the above traps I
listed, but I sincerely believe that comments and documentation are
vital to writing programs (especially when writing programs that will
be read by other people) -- even if the comments and documentation
become outdated (outdated comments and documentation may not be
correct in everything they say, but at least they provide important
hints to anyone trying to understand, debug, and maintain the code).
I hope this helps, Vittal.
-- Jean-Luc Romano