Ondra said:
Subject: Does anybody know...
Please put the subject of your article in the Subject of your article.
does anybody know what is this:
my $tag
$line =~ /<(\S+)[^>]*?>/;
$tag = $1 || ""; /*****exactly this***/
I know what it is *not*. It is not Perl code: it has syntax errors
(no semicolon after "my $tag", and a C-style comment).
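It is a bug: you should never use the dollar-digit variables ($1, $2, ...)
unless you have first ensured that the match *succeeded*. After a failed
match they still hold whatever the last successful match left in them.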
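
A minimal sketch of how that goes wrong. The two input strings are made
up for illustration; the pattern is the one from the question, minus the
non-greedy modifier:

```perl
use strict;
use warnings;

# Hypothetical inputs: one line with a tag, one without.
my $first  = '<em class="x"> text';
my $second = 'plain text, no tag';

$first =~ /<(\S+)[^>]*>/;     # match succeeds, $1 is 'em'
my $tag1 = $1;

$second =~ /<(\S+)[^>]*>/;    # match fails, $1 is left untouched
my $tag2 = $1;                # still 'em' from the earlier match

print "tag1=$tag1 tag2=$tag2\n";   # tag1=em tag2=em
```

The second line contains no tag at all, yet $tag2 ends up as "em".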
It is a bug since it tries to parse HTML with pattern matching
rather than with a real HTML parser. There are a bunch of examples
in the Perl FAQ of legal HTML that will break it...
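
To give a rough idea, here are a few inputs that trip the pattern up.
They are chosen for illustration, not quoted from the FAQ, and the
variable names are made up:

```perl
use strict;
use warnings;

my $re = qr/<(\S+)[^>]*>/;

# Each input is legal markup (or legal script content) that the
# pattern misreads.
my @inputs = (
    '<b>bold</b>',        # \S+ runs past the first '>'
    '<!-- <i> -->',       # a comment is not a tag
    'if (a<b && a>c)',    # script text, no tag at all
);

my %got;
for my $html (@inputs) {
    $got{$html} = $html =~ $re ? $1 : '(no match)';
    print "$html => $got{$html}\n";
}
# <b>bold</b> => b>bold</b
# <!-- <i> --> => !--
# if (a<b && a>c) => b
```

A real parser (HTML::Parser, for instance) deals with all of these.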
The non-greediness serves no purpose: [^>]* can never run past the
closing ">", so the greedy and non-greedy forms match the same text.
It could be replaced with this:
my $tag = '';
$tag = $1 if $line =~ /<(\S+)[^>]*>/;
The
$tag = $1 || "";
is a common idiom for selecting a default value.
Note that it will do the wrong thing if the data contains "<0>",
because the string "0" is false in Perl and so gets replaced by
the default...
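
A small sketch comparing the two forms on such data. The input line is
invented for the purpose:

```perl
use strict;
use warnings;

my $line = 'value <0> here';    # hypothetical input

# The idiom from the question: "0" is false, so the default wins.
$line =~ /<(\S+)[^>]*>/;
my $with_or = $1 || "";

# Testing the match itself keeps the captured "0".
my $with_if = '';
$with_if = $1 if $line =~ /<(\S+)[^>]*>/;

print "with ||: '$with_or'\n";   # with ||: ''
print "with if: '$with_if'\n";   # with if: '0'
```

On Perl 5.10 or later the defined-or operator (//) would also keep the
"0", but it does nothing about a match that failed.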