CGI input variable - regular expression question

G Klinedinst

Hi all,
I am writing a CGI program which will be receiving 5 variables via
the POST method and using them to query a DB. Everything is working
fine but I would like to limit the characters people can enter. My
problem is that Perl's built-in character classes aren't sufficient for me, and I
can't really figure out the proper way to do it.
I want to allow ONLY: A-Z a-z 1-9 _ / -
As you can see, that is two more characters than Perl's \w and !\W cover.
I can easily do something like this:

if( !( $name =~ /[\W]/ ))

but that excludes my hyphen and forward slash. I could also OR it with
a search for the hyphen and forward slash, like this:

if( !( $name =~ /[\W]/ ) || ( $name =~ /[-\/]/ ) )

but that isn't what I want either because that will allow weird
characters as long as a hyphen or slash exists.

Can anyone point me in the right direction? Just maybe a suggestion of
a flag or something which would be useful and then I will figure it
out from there. Thanks a bunch.

-GPK
 
Gunnar Hjalmarsson

G said:
I would like to limit the characters people can enter. My problem
is that perl's built in types aren't sufficient for me, and I can't
really figure out the proper way to do it.
I want to allow ONLY: A-Z a-z 1-9 _ / -
As you can see that is two extra characters than the Perl \w and
!\W.

if ( $name !~ /[^\w\/-]/ ) {
# Okay
}
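
As a quick check of the negated-class test above, here is a minimal sketch that runs it over a few sample values. The helper name is_allowed and the sample strings are made up for illustration. Two things worth noticing: \w also accepts the digit 0, and an empty string passes because it contains no disallowed character.

```perl
use strict;
use warnings;

# Hypothetical helper: true when $name contains no character outside
# the set \w, "/" and "-". An empty string also passes this test.
sub is_allowed {
    my ($name) = @_;
    return $name !~ /[^\w\/-]/ ? 1 : 0;
}

for my $sample ('abc_DEF-12/3', 'bad name', 'semi;colon', '') {
    printf "%-16s => %s\n", "'$sample'", is_allowed($sample) ? 'ok' : 'rejected';
}
```

If empty input should be rejected as well, a separate length check on $name is one simple way to do it.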
 
Richard Gration

Hi all,
I am writing a CGI program which will be receiving 5 variables via
the POST method and using them to query a DB. Everything is working fine
but I would like to limit the characters people can enter. My problem is
that perl's built in types aren't sufficient for me, and I can't really
figure out the proper way to do it.
I want to allow ONLY: A-Z a-z 1-9 _ / -


You can use a character class like this:

if ($name =~ m([-A-Za-z1-9_/]+) ) {

The hyphen must be the first (or last) character in the class, or be
escaped, to avoid it being seen as a range operator like the other hyphens.
The regex above will match a string which consists of 1 or more of the
characters in the []'s.

BTW, are you sure you don't want to allow zero?

Rich
 
A. Sinan Unur

(e-mail address removed) (G Klinedinst) wrote in
I want to allow ONLY: A-Z a-z 1-9 _ / -
As you can see that is two extra characters than the Perl \w and !\W.
I can easily do something like this:

if( !( $name =~ /[\W]/ ))

First, as an aside:

\W already stands for a character class, no need to surround it with [ and
]. Also,

if($name =~ /\w\)

is more readily comprehensible and does the same thing as the statement
above. No need for clutter.

However, as you have shown it, the regex will match if $name contains at
least one character in the class whereas I assume you want to match strings
that consist only of the allowable characters.

So, what is wrong with:

if($name =~ m!([A-Za-z1-9_\-/])+!)

The parentheses are there because I am assuming your script will be running
in taint mode (as it should), and you'll need to untaint your variables.
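
A minimal sketch of the untainting idea, under a few assumptions: the helper name clean_name and the sample values are hypothetical, the ^ and $ anchors are taken from the anchored form quoted later in the thread, and the "+" has been moved inside the parentheses so that $1 holds the whole value rather than only the last matched character.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the validated value taken from the
# capture group, or undef when the input contains other characters.
sub clean_name {
    my ($input) = @_;
    # The "+" sits inside the parentheses so $1 is the whole string;
    # with ([...])+ it would be only the last matched character.
    if ($input =~ m!^([A-Za-z1-9_\-/]+)$!) {
        return $1;
    }
    return;
}

my $good = clean_name('reports/2_a-b');
my $bad  = clean_name('reports; rm');

print defined $good ? "accepted: $good\n" : "rejected\n";
print defined $bad  ? "accepted: $bad\n"  : "rejected\n";
```

Under taint mode (perl -T), the value returned from $1 is the one to pass on to the database code; the original variable stays tainted.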
Can anyone point me in the right direction?

perldoc perlre
 
Gunnar Hjalmarsson

Richard said:
G Klinedinst said:
I am writing a CGI program which will be receiving 5 variables
via the POST method and using them to query a DB. Everything is
working fine but I would like to limit the characters people can
enter. My problem is that perl's built in types aren't sufficient
for me, and I can't really figure out the proper way to do it.
I want to allow ONLY: A-Z a-z 1-9 _ / -

You can use character groups like this:

if ($name =~ m([-A-Za-z1-9_/]+) ) {

That tests something else. You need to anchor the beginning and the
end to use that approach:

if ($name =~ m(^[-A-Za-z1-9_/]+$) )
---------------^---------------^
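
To make the difference concrete, here is a small sketch comparing the unanchored and anchored tests on a value that starts with allowed characters and then contains a disallowed one. The sample strings and variable names are made up for illustration. The last two lines show a related detail: "$" also matches just before a trailing newline, while \z matches only at the very end of the string.

```perl
use strict;
use warnings;

# Hypothetical sample value: allowed characters followed by a ";".
my $name = 'abc;rm';

# Unanchored: succeeds as soon as one allowed character is found anywhere.
my $unanchored = $name =~ m([-A-Za-z1-9_/]+) ? 1 : 0;

# Anchored: the whole string must consist of allowed characters.
my $anchored = $name =~ m(^[-A-Za-z1-9_/]+$) ? 1 : 0;

# A value with a trailing newline, as might arrive from a form or a file.
my $with_newline = "abc\n";
my $dollar = $with_newline =~ m(^[-A-Za-z1-9_/]+$)  ? 1 : 0;
my $strict = $with_newline =~ m(^[-A-Za-z1-9_/]+\z) ? 1 : 0;

print "unanchored=$unanchored anchored=$anchored dollar=$dollar strict=$strict\n";
```

Whether a trailing newline matters depends on how the value is used afterwards; \z is the stricter choice when it does.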
 
Eric Schwartz

A. Sinan Unur said:
Alan J. Flavell said:
if( !( $name =~ /[\W]/ )) [..]
if($name =~ /\w\)

is more readily comprehensible and does the same thing as the statement
above.

Would you like to reconsider that?

Unfortunately, I am not sure why I need to. Could you explain please?

You said:

if($name =~ /\w\)

ITYM:

if($name =~ /\w/)

Note important difference in the direction of the final slash.

-=Eric
 
A. Sinan Unur

You said:

if($name =~ /\w\)

ITYM:

if($name =~ /\w/)

Note important difference in the direction of the final slash.

Arrrrgh! I'll stop posting before I break the new record for the number of
errors in a single post.
 
Alan J. Flavell

if( !( $name =~ /[\W]/ )) [..]
if($name =~ /\w\)

is more readily comprehensible and does the same thing as the statement
above.

Would you like to reconsider that?

Unfortunately, I am not sure why I need to.

Aside from the backslash typo, it seems to me that the two tests
aren't equivalent to each other - consider the null string?
 
A. Sinan Unur

if( !( $name =~ /[\W]/ ))
[..]
if($name =~ /\w\)

is more readily comprehensible and does the same thing as the
statement above.

Would you like to reconsider that?

Unfortunately, I am not sure why I need to.

Aside from the backslash typo, it seems to me that the two tests
aren't equivalent to each other - consider the null string?

Point well taken. Given the empty string, ($name =~ /\w/) will be false
whereas !($name =~ /[\W]/) will be true, because the empty string contains
no characters at all and therefore no non-word character to match.
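
The edge case can be checked directly. This is just a sketch of the two tests from the thread applied to an empty $name; the variable names for the results are made up for illustration.

```perl
use strict;
use warnings;

my $name = '';   # the edge case under discussion

# Needs at least one word character, so it fails on the empty string.
my $has_word_char = ($name =~ /\w/) ? 1 : 0;

# Only asks that no non-word character is present, so it succeeds.
my $no_nonword_char = !($name =~ /[\W]/) ? 1 : 0;

print "has_word_char=$has_word_char no_nonword_char=$no_nonword_char\n";
```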

Are you a mathematician by any chance?
 
G Klinedinst

\W already stands for a character class, no need to surround it with [ and
].

Thanks for the tip. I will use the way you demonstrated in the future.

is more readily comprehensible and does the same thing as the statement
above. No need for clutter.

Definitely. RegExps are hairy enough as it is.


From your 2nd post:

if($name =~ m!^([A-Za-z1-9_\-/])+$!)

This expression works perfectly. My interpretation is that by using the
beginning and end of string symbols "^" and "$", together with the "+"
character, you are saying to match 1 or more of the listed characters
between the beginning and end of the string. Since everything between
those two anchors has to come from the listed characters, no other
characters are allowed anywhere in the string.
Got it, I think. :) Looks like I need to go back and read the fine
print on regexps again. Thanks again.
 
Alan J. Flavell

Are you a mathematician by any chance?

I think I'm going to take that as an obtuse compliment. ;-)

(My degree is in Physics[0], but Maths seemed to be an important part
of that too. Not that the genuine mathematicians would have been
satisfied with the sort that we did. "Far too little rigour" they
said.[1])

cheers

[0] "Physicists write FORTRAN in any language"

[1] please, let's skip the topical gag that springs inevitably to mind
at this spam-polluted juncture.
 
G Klinedinst

The hyphen must be the first character to avoid it being seen as a range
operator like the other hyphens. The regex above will match a string
which consists of 1 or more of the characters in the []'s.

I think that is what was giving me the issues, as well as not anchoring it.

BTW, are you sure you don't want to allow zero?

Yes, I did want to allow zero; that was a typo. Thanks for the help.
 
A. Sinan Unur

(e-mail address removed) (G Klinedinst) wrote in
\W already stands for a character class, no need to surround it with
[ and ]. Also,

Thanks for the tip. I will use the way you demonstrated in the future.

is more readily comprehensible and does the same thing as the
statement above. No need for clutter.

Definitely. RegExps are hairy enough as it is.

Please note Alan's response, however. If $name = '',
($name =~ /\w/) will be false whereas !($name =~ /\W/) will be true.

Thanks again.

You are welcome.
 
Alan J. Flavell

Please note Alan's response, however. If $name = '',
($name =~ /\w/) will be false whereas !($name =~ /\W/) will be true.

But I forgot to mention the possibility of coding !~ as the opposite
of =~ , which seems useful here.
 
A. Sinan Unur

I think I'm going to take that as an obtuse compliment. ;-)

It was indeed for catching the implicit assumption that $name is nonempty.
Thank you indeed for the correction.

Sinan.
 
