Does anybody know...

Ondra

Hello,
does anybody know what is this:

my $tag
$line =~ /<(\S+)[^>]*?>/;
$tag = $1 || ""; /*****exactly this***/
 
Peter Hickman

Ondra said:
Hello,
does anybody know what is this:

my $tag
$line =~ /<(\S+)[^>]*?>/;
$tag = $1 || ""; /*****exactly this***/

Looks like a regex to match htlp tags and extract the body of the tag.
 
Peter Hickman

Peter said:
Ondra said:
Hello,
does anybody know what is this:
my $tag
$line =~ /<(\S+)[^>]*?>/;
$tag = $1 || ""; /*****exactly this***/


Looks like a regex to match htlp tags and extract the body of the tag.

Or even HTML! My fingers are cold this morning.
 
Tad McClellan

Ondra said:
Subject: Does anybody know...


Please put the subject of your article in the Subject of your article.

does anybody know what is this:

my $tag
$line =~ /<(\S+)[^>]*?>/;
$tag = $1 || ""; /*****exactly this***/


I know what it is *not*. It is not Perl code: it has syntax errors
(no semicolon after "my $tag", and /* ... */ is not a Perl comment).

It is a bug: you should never use the dollar-digit variables unless
you have first ensured that the match *succeeded*.

It is a bug since it tries to parse HTML with pattern matching
rather than with a real HTML parser. There are a bunch of examples
in the Perl FAQ of legal HTML that will break it...
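
A minimal sketch of how ordinary markup can trip the pattern. The helper name first_tag is hypothetical and only wraps the regex from the thread so it can be called on a few sample lines; it is an illustration, not a recommended way to read HTML.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the capture from the thread's pattern,
# or an empty string when the line does not match at all.
sub first_tag {
    my ($line) = @_;
    return $line =~ /<(\S+)[^>]*>/ ? $1 : '';
}

print first_tag('<p class="x">'), "\n";   # p
print first_tag('<b>bold</b>'),   "\n";   # b>bold</b
print first_tag('<br/>'),         "\n";   # br/
```

The greedy \S+ runs to the next whitespace, not to the end of the tag name, so any tag that is not followed by a space captures far more than its name.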

The non-greediness serves no purpose: [^>]* can never run past the
closing ">", so the "?" changes nothing.

It could be replaced with this:

my $tag = '';
$tag = $1 if $line =~ /<(\S+)[^>]*>/;


The

$tag = $1 || "";

is a common idiom for selecting a default value.

Note that it will do the wrong thing if the data contains "<0>":
$1 is then the string "0", which is false, so $tag ends up empty.
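
The "<0>" case can be reproduced with a short sketch. The variable names $with_or and $checked are made up for this illustration; the pattern is the one from the thread.

```perl
use strict;
use warnings;

my $line = '<0>';

# The idiom from the thread: the capture is "0", which is false,
# so the empty-string default is chosen instead.
$line =~ /<(\S+)[^>]*>/;
my $with_or = $1 || '';

# Testing the match itself keeps whatever was captured, including "0".
my $checked = '';
$checked = $1 if $line =~ /<(\S+)[^>]*>/;

print "with ||: '$with_or'\n";   # with ||: ''
print "checked: '$checked'\n";   # checked: '0'
```

On Perl 5.10 or later, the defined-or operator ($1 // '') would also keep "0", but it still does not tell you whether the match succeeded.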
 
Ondra

Bernard El-Hagin said:
Ondra said:
Hello,
does anybody know what is this:

my $tag
$line =~ /<(\S+)[^>]*?>/;
$tag = $1 || ""; /*****exactly this***/

Looks like a regex to match htlp tags and extract the body of the tag.


That doesn't answer the OP's question.


$tag = $1 || "";


means "set $tag to the value of $1 if the value of $1 is true, otherwise
set $tag to the empty string".

Yes, that's it. Thank you.
 
David Oswald

Ondra said:
Hello,
does anybody know what is this:

my $tag
$line =~ /<(\S+)[^>]*?>/;
$tag = $1 || ""; /*****exactly this***/

Honestly, it's a mistake.
You should never rely on $1 in any way unless you first ensure that the
match succeeded. A failed match does not reset $1, so it may still hold
the capture from an earlier successful match. That being the case, you
cannot assume that $1 will be 'undef' or 'false' just because the most
recent pattern match failed.

You really should be checking the success of the match before fiddling with
$1 et al.
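
A small sketch of the stale-capture problem, assuming two made-up input strings ($first and $second) that are matched one after the other in the same scope:

```perl
use strict;
use warnings;

my $first  = '<body>';
my $second = 'no tag on this line';

# First match succeeds and sets $1 to "body".
$first =~ /<(\S+)[^>]*>/;
my $tag1 = $1 || '';

# Second match fails, but $1 still holds "body" from the first match.
$second =~ /<(\S+)[^>]*>/;
my $tag2 = $1 || '';

# Checking the match result avoids picking up the stale value.
my $tag3 = '';
$tag3 = $1 if $second =~ /<(\S+)[^>]*>/;

print "unchecked: '$tag2'\n";   # unchecked: 'body'
print "checked:   '$tag3'\n";   # checked:   ''
```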
 
