two regexes

Matija Papec

my @str = (
'="foo bar" ..',
'=foobar ..',
);
for (@str) {
if (/="(.+?)"/) { print $1 }
elsif (/=(\S+)/) { print $1 }
print "\n";
}


I would like to match each value from @str with only one regex. I came up with:
if ( /="(.+?)"|=(\S+)/ ) { print $1 || $2 }

but I'm not sure whether this is the best way to do the match.
 
Greg Bacon

: I would like to match each value from @str with only one regex. I come
: to,
: if ( /="(.+?)"|=(\S+)/ ) { print $1 || $2 }
:
: but I'm not sure if this is the best matching solution?

How about the following?

#! /usr/local/bin/perl

use warnings;
use strict;

my @str = (
'="foo bar" ..',
'=foobar ..',
);

for (@str) {
print $_, ":\n";

if (/="(.+?)"|=(\S+)/) {
my $hit = defined $1 ? $1 : $2;

print " got [$hit]\n";
}
else {
print " no match\n";
}
}

Greg
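
One detail worth spelling out: $1 || $2 falls through to $2 whenever $1 is false, so a quoted value such as "0" would be lost, which is why the version above tests defined $1 instead. As a minimal sketch, assuming Perl 5.10 or later, a branch-reset group (?|...) lets both alternatives capture into $1, so no choice between $1 and $2 is needed. The helper name value_of is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: return the value that follows '=', or undef on no match.
sub value_of {
    my ($s) = @_;
    # (?|...) is a branch-reset group: every alternative numbers its
    # captures starting from the same index, so both land in $1.
    return $s =~ /=(?|"(.+?)"|(\S+))/ ? $1 : undef;
}

for my $s ('="foo bar" ..', '=foobar ..', '="0" ..') {
    my $hit = value_of($s);
    print "$s: ", (defined $hit ? "got [$hit]" : "no match"), "\n";
}
```

On older perls the two-capture form with the defined test remains the portable choice.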
 
Matija Papec

: I would like to match each value from @str with only one regex. I come
: to,
: if ( /="(.+?)"|=(\S+)/ ) { print $1 || $2 }
:
: but I'm not sure if this is the best matching solution?

: How about the following?

It's fine, but I was thinking about an /alternative/ method for "match
everything between quotes OR match all consecutive non-whitespace chars".
Since I need this to parse HTML forms, I ended up with:

my $atr = qr/(?:["'](.*?)["']|([^>\s]+))/;


And here is the reinvented wheel, with some known limitations (it doesn't
support a group of checkboxes with the same name).

=================================================
use strict;

my(%Q, $doc1);
{ local $/;
$doc1 = <DATA>;
}

$Q{IME} = 'user input';
$Q{_split} = ','; #for multiple select values

ParseForm(\$doc1);
print $doc1;

sub ParseForm {
##################################################
#
# fill in the form in $$doc from the input parameters
#
##################################################

use vars qw/%Q/;

my($doc, $i) = @_;
$doc ||= \$_;
$i ||= \%Q;

#atr regex
my $atr = qr/(?:["'](.*?)["']|([^>\s]+))/;

#text|hidden|password|checkbox|radio
$$doc =~ s{(<input.+?)(\s*/?>)}{

my($tag, $ending) = ($1, $2);
my $type = lc join'', $tag =~ /type=$atr/i;

if ($type) {
my $name = join'', $tag =~ /name=$atr/i;
my $value = $i->{$name};

#text|hidden|password
if ($type =~ /text|hidden|password/) {
$value = '' unless defined $value;
$value =~ s/"/&quot;/g;
$tag =~ s/\s+value=$atr//i;
$tag .= qq{ value="$value"};
}
#checkbox|radio
elsif ($type =~ /checkbox|radio/) {
$tag =~ s/\s+checked//i;
if (($type eq 'checkbox' and defined $value) or
($tag =~ /value=$atr/i and ($1||$2) eq $value))
{
$tag .= ' checked';
}
}
}

"$tag$ending";
}iges;
#select|textarea
$$doc =~ s{(<(select|textarea).+?>)(.*?)(</\2>)}{

my($tag, $type, $cont, $ending) = ($1, lc $2, $3, $4);
my $name = join'', $tag =~ /name=$atr/i;
my $value = $i->{$name};

#textarea
if ($type eq 'textarea') { $cont = $value }
#select
else {
$cont =~ s/\s+selected(?=>)//ig;
my $split = $i->{'_split'};
my @vals = $split ? (split /$split/, $value) : $value;
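# \Q...\E escapes regex metacharacters in the value and keeps $_ from being read as $_[...]
$cont =~ s/(<[^>]+?=(?:["']\Q$_\E["']|\Q$_\E))/$1 selected/ for @vals;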
}

"$tag$cont$ending";
}iges;
}

__DATA__
<form ....>
<input type=text name=IME size=50 maxlength=50 value="Default">
</form>
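
The join '', $tag =~ /name=$atr/i idiom above works because a list-context match returns both capture groups and only one of them is defined. With warnings enabled, joining the undefined one triggers an "uninitialized value" warning, so here is a minimal sketch of the same lookup that filters with grep instead. The helper name attr_value and the \b anchor in front of the attribute name are my additions, not part of the original code.

```perl
use strict;
use warnings;

my $atr = qr/(?:["'](.*?)["']|([^>\s]+))/;

# Hypothetical helper: return the value of attribute $name in $tag, or undef.
sub attr_value {
    my ($tag, $name) = @_;
    # The match yields (quoted, unquoted); keep whichever group is defined.
    my ($val) = grep { defined } $tag =~ /\b\Q$name\E=$atr/i;
    return $val;
}

my $tag = '<input type=text name=IME size=50 maxlength=50 value="Default"';
print join(', ', map { "$_=" . attr_value($tag, $_) } qw(type name value)), "\n";
```

Returning undef for a missing attribute keeps "absent" distinct from an empty value such as value="".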
 
Greg Bacon

: It's fine but I was thinking about /alternative/ method for "match
: everything between quotes OR match all consecutive non white chars". Since I
: need this to parse html forms, I ended up with,
:
: my $atr = qr/(?:["'](.*?)["']|([^>\s]+))/;
:
: And here is the reinvented wheel with some known limitations(it doesn't
: support group of checkboxes with same names).

Is there a reason you're not using CGI.pm or another CGI module
from the CPAN that doesn't have this limitation?

Greg
 
Matija Papec

: everything between quotes OR match all consecutive non white chars". Since I
: need this to parse html forms, I ended up with,
:
: my $atr = qr/(?:["'](.*?)["']|([^>\s]+))/;
:
: And here is the reinvented wheel with some known limitations(it doesn't
: support group of checkboxes with same names).

: Is there a reason you're not using CGI.pm or another CGI module
: from the CPAN that doesn't have this limitation?

AFAIK, CGI.pm can't deal with predefined forms, and the solutions on CPAN seem
bloated. :!
 
Steven Kuo

: I would like to match each value from @str with only one regex. I come to,
: if ( /="(.+?)"|=(\S+)/ ) { print $1 || $2 }
:
: but I'm not sure if this is the best matching solution?

It's fine. Other alternatives are likely to be ugly. For example,

my @str = (
'="foo bar" ..',
'=foobar ..',
);

for (@str) {
if (/=(")?((??{ $1 ? q{.+?(?=")} : q{\\S+} }))/) {
print $2, "\n";
}
}
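
For comparison, the same "quote or no quote" decision can be written with a conditional pattern, (?(1)yes|no), which avoids running code inside the regex. This is only a sketch and not something proposed in the thread; the helper name quoted_or_bare is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: return the value after '=', or undef if nothing matches.
sub quoted_or_bare {
    my ($s) = @_;
    # Group 1 records whether an opening quote was seen. (?(1)A|B) then tries
    # A (text up to the next quote) if it was, and B (a non-whitespace run) if not.
    return $s =~ /=(")?((?(1).+?(?=")|\S+))/ ? $2 : undef;
}

for my $s ('="foo bar" ..', '=foobar ..') {
    my $hit = quoted_or_bare($s);
    print defined $hit ? $hit : 'no match', "\n";
}
```

Whether this reads better than the plain two-alternative regex is a matter of taste; the value always ends up in $2.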
 
Tad McClellan

Matija Papec said:
: I have forms stored in html files and need to populate them with user or db
: input.


That is client-side then?

If so, then the server-side CGI module is of course not the Right Tool.

The Perl FAQ points to the Right Tool:

perldoc -q " form "

How do I automate an HTML form submission?

(despite the fact that the question has nothing to do with HTML...)


Or are you talking about something else?

(maybe I still don't know what a "predefined form" is?)
 
Michael Budash

: That is client-side then?
:
: If so, then the server-side CGI module is of course not the Right Tool.
:
: The Perl FAQ points to the Right Tool:
:
: perldoc -q " form "
:
: How do I automate an HTML form submission?
:
: (despite the fact that the question has nothing to do with HTML...)
:
: Or are you talking about something else?
:
: (maybe I still don't know what a "predefined form" is?)

I think the OP is simply talking about HTML templates. HTML::Template
or Template Toolkit sound appropriate here...
 
ko

Matija Papec said:
: (e-mail address removed) (Greg Bacon) wrote:
: [snip]
: : Is there a reason you're not using CGI.pm or another CGI module
: : from the CPAN that doesn't have this limitation?
:
: Afaik CGI.pm, can't deal with predefined forms, and solutions from cpan seem
: bloated. :!

I'm not sure how you're defining bloated - maybe not loading modules
if it's not necessary? If you don't mind looking into HTML::TreeBuilder
(you also need HTML-Parser and HTML-Tagset; maybe that's what you mean
by bloated), the code itself isn't that bad:

====CODE

#!/usr/bin/perl -w
use strict;

use HTML::TreeBuilder;

my $html;
{
local $/;
$html = <DATA>;
}

my $root = HTML::TreeBuilder->new;
$root->parse($html); # parse_file method to pass in file/filehandle
$root->eof;
my @input_tags = $root->look_down('_tag', 'input'); # 'input' or any other tag
foreach my $tag (@input_tags) {
if ( $tag->attr('type') =~ /text/i ) {
$tag->attr('value', 'user input'); # to delete, pass undef as the second argument
} elsif ( $tag->attr('type') =~ /checkbox/i) {
# your test to determine whether is checked
$tag->attr('checked', 'checked');
} else {
# other stuff
}
print $tag->as_HTML;
}

__DATA__
<form>
<input maxlength=50 name="IME" size=50 type="text" value="Default">
<input name="Q1" type="CHECKBOX" value="A1">A1
<input name="Q1" type="CHECKBOX" value="A2">A2
<input name="Q1" type="CHECKBOX" value="A3">A3
</form>

====END

The attr() method allows you to read/manipulate any tag's attribute.
And it will also work for a group of checkboxes with the same name :)

HTH - keith
 
Matija Papec

: It's fine. Other alternatives are likely to be ugly. For example,
:
: my @str = (
: '="foo bar" ..',
: '=foobar ..',
: );
:
: for (@str) {
: if (/=(")?((??{ $1 ? q{.+?(?=")} : q{\\S+} }))/) {

Tnx.. I did look at this for some time! :)
 
Matija Papec

: I'm not sure how you're defining bloated - maybe not loading modules
: if its not necessary? If you don't mind looking into HTML::TreeBuilder
: (you also need HTML-Parser and HTML-Tagset, maybe that's what you mean
: by bloated), the code itself isn't that bad:

Yes, maybe "bloated" is the wrong word; "overkill" probably describes
the situation better. :)

: __DATA__
: <form>
: <input maxlength=50 name="IME" size=50 type="text" value="Default">
: <input name="Q1" type="CHECKBOX" value="A1">A1
: <input name="Q1" type="CHECKBOX" value="A2">A2
: <input name="Q1" type="CHECKBOX" value="A3">A3
: </form>
:
: ====END
:
: The attr() method allows you to read/manipulate any tag's attribute.
: And it will also work for a group of checkboxes with the same name :)

Tnx, that is what I needed; I'll try some benchmarks and decide.
 
