more stripping

Michael Hill

I have this input from a <textarea> object that is being submitted to a
script.

The input looks like:

<path fill="none" stroke="#000000" d="M0.437,185.49l87-156"/>
<path fill="none" stroke="#000000" d="M87.437,29.49l140-29"/>
<path fill="none" stroke="#000000" d="M227.437,0.49l39,118"/>
<path fill="none" stroke="#000000" d="M266.437,118.49l-104,160"/>
<path fill="none" stroke="#000000" d="M159.437,276.49l-32-101"/>
<path fill="none" stroke="#000000" d="M127.437,175.49l-127,10"/>

I'd like to get to the point where the output of:
foreach $i (@arr)
{
($x, $y) = @$i;
print "d=$x,$y<br>";
}

should be:
d="0.437,185.49"
d="87.437,29.49"
d="227.437,0.49"
d="266.437,118.49"
d="159.437,276.49"
d="127.437,175.49"

This is where I am:
****************************************************************
read(STDIN, $buffer, $ENV{'CONTENT_LENGTH'});
@pairs = split(/&/, $buffer);
foreach $pair (@pairs)
{
($name, $value) = split(/=/, $pair);
$value =~ s/%09//g; #strip the tabs out all of them
$value =~ s/%3C//g; #strip the < all of them
$value =~ s/%2F%3E//g; #strip the /> all of them
$value =~ s/%22//g; #strip the " all of them
$value =~ s/%23//g; #strip the # all of them
$value =~ s/%3D/=/g; #change %3D to = all of them
$value =~ s/%2C/,/g; #change %2C to , all of them
$value =~ s/%0D%0A//g; #strip out the carriage returns
$value =~ s/path//g; #strip out the word path .....
# hmmm what if i have 'PATH' or Path or paTH? Need mod here
if ( $name eq 'path' )
{
$path = $value;
}
}

@arr = split(/+/, $path);
foreach $i (@arr)
{
($x, $y) = @$i;
print "d=$x,$y<br>";
}

Any help is appreciated.

Mike
 
Paul Lalli

I don't understand why you're doing any of this. Why are you taking so
much effort to remove the stuff you don't want, instead of just taking
what you *do* want?

foreach $line (@pairs) {
($x, $y) = $line =~ /M(\d+\.\d+),(\d+\.\d{2})/;
push @arr, [$x, $y];
}

Now @arr is populated the way you claim to want it.

The above can really be shortened up even more, but I've left it like this
in the hope of clarity.

Paul Lalli
 
Tad McClellan

^^^^^^^^^^^^^^
^^^^^^^^^^^^^^ I worked there for 12 years
This is where I am:
read(STDIN, $buffer, $ENV{'CONTENT_LENGTH'});


You are in a bad place, and I don't mean Lockheed Martin. :)

Hand rolled "parsing" of form data is very easy to get wrong.

Let a module, such as CGI.pm, handle all of that for you.
 
Tore Aursand

Subject: more stripping

Could have been an excellent Subject. :)
The input looks like:

<path fill="none" stroke="#000000" d="M0.437,185.49l87-156"/>
<path fill="none" stroke="#000000" d="M87.437,29.49l140-29"/>
<path fill="none" stroke="#000000" d="M227.437,0.49l39,118"/>
<path fill="none" stroke="#000000" d="M266.437,118.49l-104,160"/>
<path fill="none" stroke="#000000" d="M159.437,276.49l-32-101"/>
<path fill="none" stroke="#000000" d="M127.437,175.49l-127,10"/>

Really? Where do you get this input from?
This is where I am:
****************************************************************
read(STDIN, $buffer, $ENV{'CONTENT_LENGTH'});
@pairs = split(/&/, $buffer);
foreach $pair (@pairs)
{
($name, $value) = split(/=/, $pair);
$value =~ s/%09//g; #strip the tabs out all of them
$value =~ s/%3C//g; #strip the < all of them
$value =~ s/%2F%3E//g; #strip the /> all of them
$value =~ s/%22//g; #strip the " all of them
$value =~ s/%23//g; #strip the # all of them
$value =~ s/%3D/=/g; #change %3D to = all of them
$value =~ s/%2C/,/g; #change %2C to , all of them
$value =~ s/%0D%0A//g; #strip out the carriage returns
$value =~ s/path//g; #strip out the word path .....
# hmmm what if i have 'PATH' or Path or paTH? Need mod here
if ( $name eq 'path' )
{
$path = $value;
}
}

@arr = split(/+/, $path);
foreach $i (@arr)
{
($x, $y) = @$i;
print "d=$x,$y<br>";
}

Don't you _ever_ - and I really mean _ever_ - try to hand-roll this on
your own. Let CGI.pm do that for you (untested):

#!/usr/bin/perl
#
use strict;
use warnings;
use CGI;

my $CGI = CGI->new();
my $params = $CGI->Vars();
foreach ( keys %$params ) {
print $_ . ' = ' . $params->{$_} . "\n";
}

I never use the Vars() function, though, as I tend to always know what I'm
looking for in the CGI request.


--
Tore Aursand
"Writing is a lot like sex. At first you do it because you like it.
Then you find yourself doing it for a few close friends and people you
like. But if you're any good at all, you end up doing it for money."
-- Unknown
 
Michael Hill

I don't understand why you're doing any of this. Why are you taking so
much effort to remove the stuff you don't want, instead of just taking
what you *do* want?

Good point!
foreach $line (@pairs) {
($x, $y) = $line =~ /M(\d+\.\d+),(\d+\.\d{2})/;
push @arr, [$x, $y];
}

Now @arr is populated the way you claim to want it.

I am just getting:

x: 0.437 y: 185.49

It is only storing the first occurrence.

Mike
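
A likely explanation for the single result: the whole textarea arrives as one form value, so the loop body runs once, and a match without /g only returns the first hit. A minimal sketch, assuming the decoded textarea text is already in one string ($path and the sample data here are illustrative, not taken from the real request), is to loop over a global match:

```perl
use strict;
use warnings;

# Assumed sample: the decoded textarea contents as one string.
my $path = join "\n",
    '<path fill="none" stroke="#000000" d="M0.437,185.49l87-156"/>',
    '<path fill="none" stroke="#000000" d="M87.437,29.49l140-29"/>',
    '<path fill="none" stroke="#000000" d="M227.437,0.49l39,118"/>';

my @arr;
# In scalar context, /g resumes after the previous match on each pass.
while ( $path =~ /d="M(\d+\.\d+),(\d+\.\d+)/g ) {
    push @arr, [ $1, $2 ];
}

# One line per captured coordinate pair.
print qq{d="$_->[0],$_->[1]"\n} for @arr;
```

Each element of @arr is then a two-element array reference, which is the shape the original print loop expects.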
 
Michael Hill

Don't you _ever_ - and I really mean _ever_ - try to hand-roll this on
your own. Let CGI.pm do that for you (untested):

Sorry, I'd rather "hand roll" it.
a) If I let modules do everything I need then I'll never remember anything,
b) Modules take up memory and unless I am working on something where I "have-to-have" some functionality, then I choose not to.

For obvious reasons the modules I use most are Date_Manip.pm and GD.pm

Mike
 
Eric Schwartz

Michael Hill said:
Don't you _ever_ - and I really mean _ever_ - try to hand-roll this on
your own. Let CGI.pm do that for you (untested):


Sorry, I'd rather "hand roll" it.
a) If I let modules do everything I need then I'll never remember
anything,

Why do you need to remember it? I've never done CGI in a language
where I needed to roll my own, and in every case so far, the native
CGI modules have done a better job parsing CGI than I would have done
on my own. Perhaps you have time to constantly read and keep up on
every little detail of the CGI spec; if so, more power to you. I
prefer to spend my time doing interesting and creative work, and leave
the drudgery of parsing CGI input to others.

Not to mention your code has at least two serious and obvious bugs
that took me about three seconds to spot, and ones that CGI.pm handles
for you without your even knowing about it. You may not have tripped
over them yet, but they're there-- would you rather debug it by hand
when the problem finally hits you, or use a module that will handle
CGI variables correctly? I know which I'd pick, but if you choose the
other alternative, you have no-one to blame but yourself.
b) Modules take up memory and unless I am working on something where I
"have-to-have" some functionality, then I choose not to.

CGI.pm compiles its functions on the fly, so it doesn't take up that
much more memory. Furthermore, I find that 90% of my time spent
maintaining a program is reading through for intent; CGI.pm makes that
vastly clearer at the expense of a little more memory, and that's
worth it to me-- RAM is cheap; my time isn't. And finally, if you're
really hard up for memory, there's always CGI::Lite, which
still has your hand-rolled code beat both for lack of bugs and
presence of features.
For obvious reasons the modules I use most are Date_Manip.pm and GD.pm

I suggest you add a CGI module of some kind to that list. But hey,
have fun with it if you don't. Just don't expect very many people to
care. :)

-=Eric
 
Tad McClellan

HTML is for machines, we are people.

Please do not post in HTML.

Don't you _ever_ - and I really mean _ever_ - try to hand-roll this on
your own. Let CGI.pm do that for you (untested):

Sorry, I'd rather "hand roll" it.


You are too silly to spend time on then, so long.

*plonk*



There are many common security exploits that rely on hand-rolled
form parsing.

If you are not an expert in security, then you should perhaps
consider using code from such a person. Else we'll be hearing
about your system crash on the 6 o'clock news...
 
gnari

[please post in text-only]

Michael Hill said:
Don't you _ever_ - and I really mean _ever_ - try to hand-roll this on
your own. Let CGI.pm do that for you (untested):

you should work on your attributions.
Sorry, I'd rather "hand roll" it.
a) If I let modules do everything I need then I'll never remember
anything,

Au contraire: if you use modules to do the complex but standard stuff, you
can remember what your program is about. Your program is not about the
intricacies of HTTP and CGI; it is about your interaction with the user and
the content of your output.
b) Modules take up memory ...
For obvious reasons the modules I use most are Date_Manip.pm and GD.pm

This is really funny. Date::Manip is a real dinosaur of a module (I like
it, but you just complained about the memory usage of modules).
And then there is the idea that you would not use CGI.pm, which takes care
of a complicated task that one would prefer not to know all the details of,
while you use an enormous module for date manipulations.

gnari
 
Joe Smith

Michael said:
a) If I let modules do everything I need then I'll never remember anything,

If you let modules do everything you need then you'll be getting results
that are quicker, more accurate (better error handling), and more
maintainable. "Hand rolled" code has been proven over and over again
to be more buggy. Especially when malicious users can (and will)
try to find deficiencies in your CGI code.
-Joe
 
Jürgen Exner

Michael said:
Well, I definitely have been receiving some rather fierce remarks
about using the cgi modules. Many good points were made; most deal
with the ease of coding, but the compelling ones deal with security.

Can anyone speak to what security holes may be there?

Google may be very educational (hint: this topic has been discussed a few
times before).

jue
 
Michael Hill

Google may be very educational (hint: this topic has been discussed a few
times before).

jue

I'm not asking for a comprehensive listing of security holes using cgi,
but rather inherent security holes because I am using *hand rolled* cgi
instead of the cgi module.
 
Tore Aursand

I'm not asking for a comprehensive listing of security holes using cgi,
but rather inherent security holes because I am using *hand rolled* cgi
instead of the cgi module.

And that _has_ been discussed before. Try searching for 'cgi security' in
the *perl* groups from <http://groups.google.com/>.
 
gnari

Michael Hill said:
Well, I definitely have been receiving some rather fierce remarks about
using the cgi modules. Many good points were made; most deal with the ease
of coding, but the compelling ones deal with security.

Can anyone speak to what security holes may be there?

for example, I have seen roll-your-owners use unsafe methods to
import querystring/form params into program symboltable.
like cgi?foo=1&bar=x being evaled into $foo and $bar.

gnari
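
To make the symbol-table point concrete, here is a rough sketch; the parameter names and the @pairs data are hypothetical. The unsafe pattern appears only as a comment, and the alternative keeps every value in one hash and reads only the keys the script expects:

```perl
use strict;
use warnings;

# Hypothetical decoded name/value pairs, as a hand-rolled parser might yield.
my @pairs = ( [ foo => 1 ], [ bar => 'x' ], [ is_admin => 1 ] );

# Unsafe pattern (deliberately left as a comment):
#   no strict 'refs'; ${"main::$name"} = $value;
# The client picks the variable name, so a request can overwrite any
# package variable the script relies on.

# Safer: store everything in one hash...
my %param;
$param{ $_->[0] } = $_->[1] for @pairs;

# ...and copy out only the names the script actually expects.
my %expected = map { $_ => 1 } qw(foo bar);
my %clean    = map { $_ => $param{$_} } grep { $expected{$_} } keys %param;

print "$_ = $clean{$_}\n" for sort keys %clean;
```

With this layout an unexpected name such as is_admin never becomes a variable; it is simply ignored.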
 
Michael Hill

for example, I have seen roll-your-owners use unsafe methods to
import querystring/form params into program symboltable.
like cgi?foo=1&bar=x being evaled into $foo and $bar.

gnari

And how does the cgi.pm module help there?
If those params are unsafe the developer should be using 'post' instead
of a 'querystring'.
 
gnari

Michael Hill said:
And how does the cgi.pm module help there?
by not using those unsafe methods, of course
If those params are unsafe the developer should be using 'post' instead
of a 'querystring'.

that makes no difference at all

gnari
 
Eric Schwartz

Michael Hill said:
And how does the cgi.pm module help there?

By not doing it. You can get the values of parameters with the
param() function, or by using Vars(). Personally, I prefer param(),
as I (almost) always know what parameters I'm supposed to be
accepting.
If those params are unsafe the developer should be using 'post' instead
of a 'querystring'.

It's not the params that are unsafe, it's automatically vivifying
variables based on them that's unsafe. Even PHP developers realized
the error of their ways quite some time ago, and have turned that
behaviour off by default.

Furthermore, any idiot can spend about 30 seconds with LWP::Simple and
fake a POST request as easily as a GET. If you think POST is somehow
'more secure', that's another good indication you should be using
CGI.pm instead of rolling your own, as it indicates (to me) that
you're less likely to check POST variables for sanity. Which brings
up another nice thing about CGI.pm: it doesn't care if you use GET or
POST; its API is the same either way.

Also, your code doesn't correctly handle URI decoding, multi-valued
parameters, or failed or incomplete read()s. All of which
are pretty common for hand-rolled code, and all of which are handled
correctly in CGI.pm. Frankly, as someone else has pointed out, if you
can afford to use Date::Manip, you can afford to use CGI.pm, and doing
that will vastly clean up your code, allowing it to focus more on what
it does, and less on how it does it.

-=Eric
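
For a sense of what "correctly handle URI decoding and multi-valued parameters" involves, here is a minimal sketch using core Perl only. The sub names url_decode and parse_form and the sample body are made up for illustration. It is not a replacement for CGI.pm: it ignores failed or short reads, multipart uploads, and character encodings.

```perl
use strict;
use warnings;

# Decode one application/x-www-form-urlencoded component.
sub url_decode {
    my ($s) = @_;
    $s =~ tr/+/ /;                               # '+' encodes a space
    $s =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;   # %XX encodes one byte
    return $s;
}

# Parse a raw query or body string into name => [values], keeping repeats.
sub parse_form {
    my ($raw) = @_;
    my %form;
    for my $pair ( split /[&;]/, $raw ) {
        # Limit of 2 so an '=' inside the value is not lost.
        my ( $name, $value ) = split /=/, $pair, 2;
        next unless defined $name && length $name;
        $value = '' unless defined $value;
        push @{ $form{ url_decode($name) } }, url_decode($value);
    }
    return \%form;
}

# Hypothetical request body, used only for illustration.
my $form = parse_form('tag=a&tag=b&path=d%3D%22M0.437%2C185.49%22+x');
print "$_: @{ $form->{$_} }\n" for sort keys %$form;
```

Decoding every %XX sequence in one pass, rather than substituting a handful of known codes, is what keeps unexpected characters from slipping through undecoded.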
 
thumb_42

Michael Hill said:
I'm not asking for a comprehensive listing of security holes using cgi,
but rather inherent security holes because I am using *hand rolled* cgi
instead of the cgi module.

In general I don't see many *security problems* with rolling your own parsing.
(except maybe someone giving you bogus data or posting way more data
than they really ought to, or feeding you data that somehow chokes up your
regexes into doing things they shouldn't.)

Where you may run into trouble is in how you use those parsed values, but
the exact same things can be said for CGI.pm, or indeed reading it straight
from the terminal.

Incidentally, I think parsing it yourself is a good exercise. Maybe not that
great all the time, or in production code, or w/out having a good reason for
it, but.. for simple stuff if you want to parse it yourself purely for the
fun of it, go ahead. (I should hope we've all tried it once or twice, just
to see what it's like.. I've hand-parsed a few times for kicks, but I
usually use CGI.pm.)

The only "good reasons" I've ever had (and they really weren't that great of
'reasons') were times before CGI.pm was standard, and for file uploading.
Part of it, I'll admit, was purely because I enjoyed it and wanted to explore
models that were different than CGI.pm.

I wondered what it'd be like if each parameter had attributes associated to
it, or if regardless of file upload or regular CGI it would save over a
certain chunk-size to a temp file, and treat file uploads the same as
regular form variables (unless the sizes exceeded a certain amount). Or,
what if I wanted the exact order of the form variables, or... wanted some
sort of callback subroutine for each line of input, or something akin to
XML::Parser with a handler for start, chardata, end? I really don't see
anything wrong with exploring those possibilities and kicking them around a
bit to see the pros and cons.

If someone tells you to have a closed mind and not explore stuff.. Well, I'd
close my mind toward those people. :)

Just keep in mind that Lincoln Stein has probably encountered and addressed
more problems (a LOT more) than anyone else in the area of
parsing CGI variables; he knows his stuff. But that shouldn't prevent you
from exploring it yourself if you're interested in it.

Jamie
 
Ben Morrow

In general I don't see many *security problems* with rolling your own parsing.
(except maybe someone giving you bogus data or posting way more data
than they really ought to, or feeding you data that somehow chokes up your
regexes into doing things they shouldn't.)

err... and what are these if not security problems? And pretty serious
ones at that.

Ben
 
Gunnar Hjalmarsson

Ben said:
err... and what are these if not security problems? And pretty
serious ones at that.

They are. But once again a thread in this group has degenerated into
giving the impression that you are automatically protected from such
problems if you use CGI.pm, which AFAIK is not true. Such issues
reasonably need to be addressed irrespective of the method for parsing
CGI data.
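
As a rough illustration of checks that apply whichever parser supplied the value, the sketch below caps the input length and accepts only the coordinate shape the script expects. The limit, the digit bounds, and the sub name extract_points are assumptions for the example, not recommendations:

```perl
use strict;
use warnings;

# Hypothetical cap on the textarea value; tune it to the real application.
my $MAX_LEN = 10_000;

# Return a list of [x, y] pairs, or an empty list if the value is
# missing or longer than the cap.
sub extract_points {
    my ($value) = @_;
    return unless defined $value && length($value) <= $MAX_LEN;

    my @points;
    # Bounded quantifiers keep each captured number to a sane size.
    while ( $value =~ /d="M(\d{1,6}\.\d{1,6}),(\d{1,6}\.\d{1,6})/g ) {
        push @points, [ $1, $2 ];
    }
    return @points;
}

my @points = extract_points('<path d="M0.437,185.49l87-156"/>');
print qq{d="$_->[0],$_->[1]"\n} for @points;
```

Taking only what matches the expected pattern, rather than stripping what is unwanted, means anything unanticipated in the input is dropped by default.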
 
