Can somebody explain?

Vittal · Aug 10, 2004

Hello All,

I am new to Perl and have been reading through some Perl code.

In one of the files I found the following snippet:

$temp = qr{
    \(
    (?:
        (?>[^()]+ )
      |
        (??{ $temp })
    )*
    \)
}x;

$cchat = qr/((\W)?(\*?\*?\w+)\s*($temp))/;

Can somebody explain to me what these two lines do?

Thanks
-Vittal

Gunnar Hjalmarsson · Aug 10, 2004

Vittal said:
In one of the files I found the following snippet:

$temp = qr{
    \(
    (?:
        (?>[^()]+ )
      |
        (??{ $temp })
    )*
    \)
}x;

$cchat = qr/((\W)?(\*?\*?\w+)\s*($temp))/;

Can somebody explain to me what these two lines do?

Can't you read the explanation in the file where you found the code?

Just a thought.

Otherwise, the first (and most complicated) part is explained in
"perldoc perlre":
http://www.perldoc.com/perl5.8.4/pod/perlre.html#(--{-code-})

Brian McCauley · Aug 10, 2004

Vittal said:
I am new to Perl and have been reading through some Perl code.

In one of the files I found the following snippet:

$temp = qr{
    \(
    (?:
        (?>[^()]+ )
      |
        (??{ $temp })
    )*
    \)
}x;

This defines a precompiled regex such that the pattern /$temp/ will match a
string starting with a '(' and ending at the _matching_ ')' (i.e. there
can be any number of nested (...) in between).

To understand why, you'd have to understand the (??{...}), (?:...) and
(?>...) regex constructs. These are explained better in 'perlre' than I
could manage, so look there.

The qr// operator is used to define a precompiled regex and is documented
in perlop. (Note that, as with any quote-like operator in Perl, the
so-called qr// operator can use alternate delimiters, so it can also be
written qr{}.)

The /x regex modifier on the end of the qr// makes unescaped whitespace
inside the regex non-significant, which allows the regex to be laid out
more readably.
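
For readers who want the three constructs spelled out, here is a minimal annotated sketch. It assumes the pattern opens and closes with escaped literal parentheses, as in the perlre example; the names $balanced and first_balanced and the sample strings are made up for illustration.

```perl
use strict;
use warnings;

# Declared before the assignment so the code block inside the
# pattern can refer to the variable under "use strict".
my $balanced;
$balanced = qr{
    \(                      # a literal opening parenthesis
    (?:                     # group without capturing, repeated by the * below
        (?> [^()]+ )        # a run of non-parenthesis characters, never given back
      |                     # or
        (??{ $balanced })   # a nested group, matched by this same pattern
    )*
    \)                      # the closing parenthesis that balances the first one
}x;

# Hypothetical helper: the first balanced group in $text, or undef.
sub first_balanced {
    my ($text) = @_;
    return $text =~ /($balanced)/ ? $1 : undef;
}

for my $text ('f(a, b)', 'g(h(1), (2))', 'no parens here', 'unclosed (') {
    my $found = first_balanced($text);
    print defined $found
        ? "'$text' contains $found\n"
        : "'$text' has no balanced group\n";
}
```

The (??{ ... }) block is evaluated at match time and its result is used as a sub-pattern, which is what lets the pattern call itself for each nested level.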

$cchat = qr/((\W)?(\*?\*?\w+)\s*($temp))/;

This defines another precompiled regex that matches something followed
by a balanced (...).

The "something" is rather odd but there's no point trying to explain it.
All I'd be doing is telling you what each of the constructs means.
Much better to just look up the constructs used in perlre. If there's
something you don't understand then come back and say what it is that
you don't understand.

Jeff 'japhy' Pinyan · Aug 10, 2004

Brian McCauley said:
This defines another precompiled regex that matches something followed
by a balanced (...).

The "something" is rather odd but there's no point trying to explain it.

I'd say it looks like it's matching a C function definition (although
ignoring the return *type* of the function, and only capturing the level
of dereferencing needed). But it does a poor job; it's too specific in
its formulation.
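
If that reading is right, the third and fourth capture groups would hold the function name (with any leading asterisks) and the argument list. The sketch below tries the pattern on a few C-style declarations; describe_call and the sample declarations are hypothetical, and the $temp pattern is assumed to start and end with escaped literal parentheses as in perlre.

```perl
use strict;
use warnings;

my $temp;
$temp = qr{ \( (?: (?> [^()]+ ) | (??{ $temp }) )* \) }x;
my $cchat = qr/((\W)?(\*?\*?\w+)\s*($temp))/;

# Hypothetical helper: returns (name, pointer depth, argument list),
# or an empty list when the pattern does not match.
sub describe_call {
    my ($text) = @_;
    return unless $text =~ $cchat;
    my ($name, $args) = ($3, $4);
    my $depth = ($name =~ tr/*//);   # count the asterisks captured with the name
    $name =~ s/^\*+//;               # then strip them from the name
    return ($name, $depth, $args);
}

for my $decl ('int main(int argc, char **argv)',
              'char *strdup (const char *s)',
              'void **table(void)') {
    my ($name, $depth, $args) = describe_call($decl) or next;
    printf "%-32s name=%s depth=%d args=%s\n", $decl, $name, $depth, $args;
}
```

Note that the return type never appears in any capture: the match only begins at the non-word character in front of the name, which fits the remark that the type is ignored.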

--
Jeff "japhy" Pinyan % How can we ever be the sold short or
RPI Acacia Brother #734 % the cheated, we who for every service
RPI Corporation Secretary % have long ago been overpaid?
http://japhy.perlmonk.org/ %
http://www.perlmonks.org/ % -- Meister Eckhart

J. Romano · Aug 11, 2004

Vittal said:
In one of the files I found the following snippet:

$temp = qr{
    \(
    (?:
        (?>[^()]+ )
      |
        (??{ $temp })
    )*
    \)
}x;

$cchat = qr/((\W)?(\*?\*?\w+)\s*($temp))/;

Can somebody explain to me what these two lines do?

Dear Vittal,

   The good news is that the programmer who wrote that script did not
create the first line from scratch. It was copied straight from the
"perlre" documentation. To read what it does, type "perldoc
perlre" at a DOS or Unix prompt and search for the word "postponed".
You'll see the exact same piece of code and the explanation that this
regular expression matches a parenthesized group.

   In other words, you can use $temp like this:

if ("I saw a color (blue)." =~ m/$temp/)
{
    print $&;  # this prints the match "(blue)"
}

   It will also work with nested parentheses, like this:

if ("I already ate (I ate one (1) pizza)." =~ m/$temp/)
{
    print $&;  # this prints "(I ate one (1) pizza)"
}

The bad news is that the programmer who wrote that script didn't feel
the need to add comments to explain the purpose of those regular
expressions. It has been said that it's almost always easier to
create your own regular expressions than to understand one that's
already written, and in this case that's definitely true. If it
wasn't for the fact that the first line was specifically mentioned in
"perldoc perlre", I'm sure that I would not have been able to figure
out what it was looking for.

The second regular expression is a little easier to figure out.
Let's take it a bit at a time:

$cchat = qr/((\W)?(\*?\*?\w+)\s*($temp))/;

The only parts of the $cchat regular expression that are not optional
are the \w+ part (which matches at least one "word" character (that
is, a letter, digit, or underscore)) and $temp, which matches a
parenthesized expression. Optionally, there may be whitespace (any
amount) between the non-optional parts. Also, there may be one or two
asterisks before the \w+ part. There could also be an optional
non-"word" character before the word characters (which would appear
before the asterisks, if the asterisks happen to exist).
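
To see how those pieces land in the capture groups, here is a small sketch. The helper cchat_parts and the sample strings are illustrative only, and $temp is assumed to be the perlre pattern with escaped literal parentheses at both ends.

```perl
use strict;
use warnings;

my $temp;
$temp = qr{ \( (?: (?> [^()]+ ) | (??{ $temp }) )* \) }x;
my $cchat = qr/((\W)?(\*?\*?\w+)\s*($temp))/;

# Hypothetical helper: returns the four captures (whole match, optional
# non-word character, name with asterisks, parenthesized part), using ''
# for a group that did not take part in the match.
sub cchat_parts {
    my ($text) = @_;
    return unless $text =~ $cchat;
    return map { defined $_ ? $_ : '' } ($1, $2, $3, $4);
}

for my $text ('some_text(a)', '*some_text (a)', '%*some_text (a (b))') {
    my @parts = cchat_parts($text);
    printf "%-20s whole=[%s] prefix=[%s] name=[%s] parens=[%s]\n",
        $text, @parts;
}
```

One detail this brings out: since an asterisk is itself a non-word character and (\W)? is tried first, a single leading asterisk with nothing before it ends up in the second group rather than in the third.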

   Was my explanation confusing? If you didn't think so, then you're
a super genius. I didn't expect it to be very easy to follow (as they
say, it's not very easy to understand a regular expression that
you didn't write), so I always recommend writing a few comments (with
any non-simple regular expression) that show a few sample matches.
The writer of that program you're reading should have included a few
comments like this:

# The $temp regular expression was taken right out of
# the "perldoc perlre" documentation. It matches a
# parenthesized expression (that may or may not contain
# nested parentheses):
$temp = ... ;

# The purpose of the $cchat regular expression is ...
# It matches all of the following lines:
# some_text(parenthesized expression)
# some_text (parenthesized expression)
# *some_text(parenthesized expression)
# **some_text (parenthesized expression)
# %*some_text (parenthesized expression)
# ^**some_text(parenthesized expression)
# &some_text (parenthesized expression)
# !some_text(parenthesized expression)
$cchat = qr/((\W)?(\*?\*?\w+)\s*($temp))/;

Because the original programmer didn't explain the purpose of the
$cchat regular expression, it's difficult for us to figure it out for
sure. The closest we can come to figuring it out is to examine sample
matches and deduce the purpose from there.

If you ever add more code to this program, do yourself and the
future maintainers of the program a favor and add comments to document
your regular expressions. Include the purpose of the regular
expression (in plain English or whatever language is the main language
spoken at work) and include a few sample matches (because sometimes
looking at sample matches helps a person understand much better than
looking at the regular expression itself). It also helps the
debugging process a lot.
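
One way to keep such sample matches honest is to run them. The sketch below is only an illustration: the positive samples are taken from the comment block above, the negative ones are invented, and $temp is assumed to be the perlre pattern with escaped literal parentheses.

```perl
use strict;
use warnings;

my $temp;
$temp = qr{ \( (?: (?> [^()]+ ) | (??{ $temp }) )* \) }x;
my $cchat = qr/((\W)?(\*?\*?\w+)\s*($temp))/;

# Samples from the documenting comments; each should be matched in full.
my @should_match = (
    'some_text(parenthesized expression)',
    '**some_text (parenthesized expression)',
    '^**some_text(parenthesized expression)',
    '&some_text (parenthesized expression)',
);

# Invented negative cases: no parentheses, unbalanced, and no name.
my @should_not_match = ('some_text', 'some_text(unbalanced', '(no name)');

my $failures = 0;
for my $text (@should_match) {
    my $ok = ($text =~ $cchat) && $1 eq $text;
    print(($ok ? 'ok     ' : 'NOT OK '), "expect match:    $text\n");
    $failures++ unless $ok;
}
for my $text (@should_not_match) {
    my $ok = $text !~ $cchat;
    print(($ok ? 'ok     ' : 'NOT OK '), "expect no match: $text\n");
    $failures++ unless $ok;
}
print "$failures unexpected result(s)\n";
```

If the pattern is later edited, rerunning the samples shows at once whether the comments still describe what it matches.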

   When you write code, please put comments in your code that explain
to anyone who didn't write the code what the code is doing and its
purpose. A lot of coders avoid doing this, giving many excuses as to
why they shouldn't. Some of the excuses are:

* I don't need to include comments because I write
non-esoteric (simple) code that anyone can understand.

* Comments eventually become outdated (and outdated comments
are wrong) and wrong comments are worse than no comments
at all (because they are misleading).

* I don't need to include comments because I write
"self-documenting" code. Comments are a sign that the
code is impossible to understand without outside help,
and I don't write code like that.

Don't fall into those traps! I may be offending some die-hard
programmers here who adhere to one or more of the above traps I
listed, but I sincerely believe that comments and documentation are
vital to writing programs (especially when writing programs that will
be read by other people) -- even if the comments and documentation
become outdated (outdated comments and documentation may not be
correct in everything they say, but at least they provide important
hints to anyone trying to understand, debug, and maintain the code).

I hope this helps, Vittal.

-- Jean-Luc Romano

Anno Siegel · Aug 11, 2004

J. Romano said:
[...]

When you write code, please put comments in your code that explain
to anyone who didn't write the code what the code is doing and its
purpose. A lot of coders avoid doing this, giving many excuses as to
why they shouldn't. Some of the excuses are:

* I don't need to include comments because I write
non-esoteric (simple) code that anyone can understand.

* Comments eventually become outdated (and outdated comments
are wrong) and wrong comments are worse than no comments
at all (because they are misleading).

* I don't need to include comments because I write
"self-documenting" code. Comments are a sign that the
code is impossible to understand without outside help,
and I don't write code like that.

Don't fall into those traps! I may be offending some die-hard
programmers

I am one of the die-hard programmers who has occasionally offered points
of view that resemble those you misrepresent as "excuses" and "traps".

here who adhere to one or more of the above traps I
listed, but I sincerely believe that comments and documentation are
vital to writing programs (especially when writing programs that will
be read by other people) -- even if the comments and documentation
become outdated (outdated comments and documentation may not be
correct in everything they say, but at least they provide important
hints to anyone trying to understand, debug, and maintain the code).

What is offensive is not your opposition but your misrepresentation.

Let me first set the scope. We are talking about *comments*, (not
documentation in general, as you chose to drag in), and, specifically,
I'm talking about micro-commenting single statements of code. So-called
block comments (as might precede a sub definition or a group of such)
are another issue altogether. Further, we are talking about comments
in Perl, or a similarly high-level language.

Within that scope, I maintain that comments should be the rare exception,
not the rule.

What you present as a stance of hubris ("I don't write code like that")
is really an exhortation not to write code that needs comments. That may
not be possible in assembler, but in Perl and similar languages it is.
Perl has complex data structures that can be treated as units and all
house-keeping (length of strings, number of elements in an array, keys
of a hash, etc.) is taken care of.

That allows a programmer to work in units that are meaningful in terms
of the overall process and not fiddle with stuff below that level. Usually,
what you do on the process level is obvious. If you feel the need to
explain some code, that is usually a sign that you haven't found the
right data structure and/or algorithm yet. So don't paste it over
with an explanatory comment, rewrite it so that it doesn't need one.

Anno

Vittal · Aug 12, 2004

Hello Jean,

Thank you very much for replying in detail.

Yes, the person who wrote the code has not given many comments
other than saying "matches the parenthesis".

I tried to break the two statements into smaller chunks but could not.
These two statements cost me a night's sleep.

Now I have a little understanding of these lines.

Thanks again!
-Vittal

