arrange form data in same order as on form

Tintin

Gunnar Hjalmarsson said:
To me, a piece of code that does what it's _intended_ to do is not
"buggy". It may have _limitations_, but limitations and bugs are not
the same thing.

If I want my program to print today's date in ISO 8601 format, I may
use this code:

my $time = time;
sub myDate {
    my @t = (gmtime $time)[3..5];
    sprintf '%d-%02d-%02d', $t[2] += 1900, ++$t[1], $t[0];
}
print myDate();

I could have used your Time::Format module instead, but if I don't
need a variety of date and time formats in my program, I likely
wouldn't have done so.

Time::Format includes some nice tools for time formatting, no doubt.
Nevertheless, that fact wouldn't make you claim that my myDate()
function is "buggy", right?

Your analogy is not a good one. An ISO 8601 date format has very rigid
parameters, whereas CGI data is, by its very nature, variable.
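For comparison, here is a minimal sketch of the same ISO 8601 output using strftime() from the core POSIX module. The names my_date and iso_date are hypothetical, and unlike the quoted myDate() they take the epoch time as an argument so that the two can be checked against each other.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Hand-rolled formatter in the spirit of the quoted myDate(), but taking
# the epoch time as a parameter (an assumption made for this sketch).
sub my_date {
    my ($epoch) = @_;
    my ($mday, $mon, $year) = (gmtime $epoch)[3 .. 5];
    return sprintf '%d-%02d-%02d', $year + 1900, $mon + 1, $mday;
}

# Core-module alternative: POSIX::strftime fed with the same UTC fields.
sub iso_date {
    my ($epoch) = @_;
    return strftime('%Y-%m-%d', gmtime($epoch));
}

my $now = time;
print my_date($now), "\n";
print iso_date($now), "\n";
```

Both subs should print the same date for any given epoch value; which one is preferable is the kind of judgment the thread is arguing about.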
 
Gunnar Hjalmarsson

Tintin said:
Your analogy is not a good one. An ISO 8601 date format has very
rigid parameters, whereas CGI data is, by its very nature, variable.

True, but not all potential variations are applicable to every program
that parses CGI data. For instance, if you want a program to parse only
POSTed data, it's not buggy because it isn't prepared to handle
potential variations in data submitted via GET. Limited? Yes.
Inflexible? Yes. Buggy? No.

The only point of my example was to illustrate that distinction.
Call a spade a spade! :)
 
Tintin

Gunnar Hjalmarsson said:
True, but not all potential variations are applicable to every program
that parses CGI data. For instance, if you want a program to parse only
POSTed data, it's not buggy because it isn't prepared to handle
potential variations in data submitted via GET. Limited? Yes.
Inflexible? Yes. Buggy? No.

The only point of my example was to illustrate that distinction.
Call a spade a spade! :)

I agree with you, to a certain degree, about not calling code buggy. I
suppose you could argue that various Microsoft products that don't conform
to standards are limited and not buggy because they are deliberately
designed that way. However, the typical newbie, or anyone else who writes
"limited" CGI parsing code, generally does not write it deliberately with
limitations. In most cases, I think it is fair to say they are writing code
which they think works for all occasions.
 
Eric J. Roode

True, but not all potential variations are applicable to every program
that parses CGI data. For instance, if you want a program to parse only
POSTed data, it's not buggy because it isn't prepared to handle
potential variations in data submitted via GET. Limited? Yes.
Inflexible? Yes. Buggy? No.

I just can't believe that anyone would advocate writing one's own limited
CGI parsing code from scratch, against using the robust, flexible CGI.pm
off the shelf.

--
Eric
$_ = reverse sort $ /. r , qw p ekca lre uJ reh
ts p , map $ _. $ " , qw e p h tona e and print

 
Gunnar Hjalmarsson

Tintin said:
I agree with you, to a certain degree, about not calling code buggy.
... However, the typical newbie, or anyone else who writes "limited"
CGI parsing code, generally does not write it deliberately with
limitations. In most cases, I think it is fair to say they are
writing code which they think works for all occasions.

Probably true. In those cases they have probably copied and tweaked
code that they don't fully understand. *That* is what's blameworthy,
not necessarily the code in itself.
 
Gunnar Hjalmarsson

Eric said:
I just can't believe that anyone would advocate writing one's own
limited CGI parsing code from scratch, against using the robust,
flexible CGI.pm off the shelf.

One situation where doing so makes sense is when efficiency matters.

I have a program, where I believe it would be indefensible to have it
load CGI.pm. Maybe that's why I'm so sensible about this. :)
 
Randal L. Schwartz

Gunnar> One situation where doing so makes sense is when efficiency matters.

Gunnar> I have a program, where I believe it would be indefensible to have it
Gunnar> load CGI.pm. Maybe that's why I'm so sensible about this. :)

You *do* realize that CGI.pm uses a "compile as you go" mechanism?
Very little of the file is loaded unless you specifically ask for it.
Do not be confused by its sheer size.

I'd bet it'd be hard to get something that is even *twice* as efficient
that has all the security provisions and knowledge accumulated over
the years in CGI.pm.

Please show me your code that is more than twice as efficient as CGI.pm,
and yet still as secure.
 
Alan J. Flavell

I have a program, where I believe it would be indefensible to have it
load CGI.pm.

Then it's probably indefensible to run it from the traditional CGI in
the first place: you should be looking to run it from mod_perl or
other persistent environment, where the overhead of loading CGI.pm is
no longer of any relevance since it's not being done per-invocation
any more.

Maybe that's why I'm so sensible about this. :)

"sensitive", maybe. "sensible"? - I'd have to reserve judgment until
I saw the full implications, including the security review and some
sensible assessment of the implications for long-term maintainability.

But since I probably couldn't afford the effort to do that security
review and maintainability assessment, I'd probably go with CGI.pm
anyway. I fear this is going to stir up the trolls again, but they're
fairly well plonked, so I'm just going to have my say and then leave
it at that.

cheers
 
Gunnar Hjalmarsson

Randal said:
You *do* realize that CGI.pm uses a "compile as you go" mechanism?

Yep. Actually it was you who called my attention to it a few months
ago. :)

Randal said:
I'd bet it'd be hard to get something that is even *twice* as efficient
that has all the security provisions and knowledge accumulated over
the years in CGI.pm.

Please show me your code that is more than twice as efficient as CGI.pm,
and yet still as secure.

I don't claim it to be as secure as CGI.pm, but I believe that the
security of the program *as a whole* is sufficient. (Nor do I claim
that it serves as general-purpose code for parsing CGI data, of
course.)

This is the code I'm currently using *in that particular program*:

if ($ENV{'REQUEST_METHOD'} eq 'POST') {
    read (STDIN, $rlmain::data, $ENV{'CONTENT_LENGTH'});
} else {
    $rlmain::data = $ENV{'QUERY_STRING'};
}
$rlmain::data =~ tr/+/ /;
for (split /[&;]/, $rlmain::data) {
    my ($name, $value) = split /=/;
    $name = 'ringid' if lc $name eq 'ringid';
    $name = 'siteid' if lc $name eq 'siteid';
    $name = 'offset' if lc $name eq 'offset';
    $value =~ s/%(..)/pack("c",hex($1))/ge;
    $value =~ tr/\r//d; # Windows fix
    $rlmain::data{$name} = $value;
}

Comments:

- I should probably have it check the size of STDIN and whether the
read() statement is successful.

- The program does not acknowledge any field names that don't match
/^\w+$/, so I don't unescape the names.

- The program does not contain any multi-value fields.

- The program is run in taint mode.
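The first comment above (checking the size of the input and the result of read()) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the code used in the program: the function names read_form_data and parse_form_data, the $MAX_BODY limit, and the decision to die on errors are all hypothetical. The sketch also limits split to two parts, decodes only valid hex escapes, and skips names that don't match /^\w+$/, in line with the second comment; it does not reproduce the case normalization of ringid, siteid and offset.

```perl
use strict;
use warnings;

# Hypothetical size cap; choose a limit that suits the application.
my $MAX_BODY = 64 * 1024;

# Return the raw form data, verifying CONTENT_LENGTH and the read() result.
sub read_form_data {
    my ($env, $fh) = @_;

    if (($env->{REQUEST_METHOD} || '') ne 'POST') {
        return defined $env->{QUERY_STRING} ? $env->{QUERY_STRING} : '';
    }

    my $len = $env->{CONTENT_LENGTH};
    die "Missing or malformed CONTENT_LENGTH\n"
        unless defined $len && $len =~ /^\d+$/;
    die "Request body too large\n" if $len > $MAX_BODY;

    # read() may return fewer bytes than requested, so loop until done.
    my ($data, $got) = ('', 0);
    while ($got < $len) {
        my $n = read($fh, $data, $len - $got, $got);
        die "read() failed: $!\n" unless defined $n;
        last if $n == 0;    # premature end of input
        $got += $n;
    }
    die "Short read: expected $len bytes, got $got\n" if $got != $len;
    return $data;
}

# Split the raw data into single-valued fields.
sub parse_form_data {
    my ($raw) = @_;
    my %field;
    for my $pair (split /[&;]/, $raw) {
        my ($name, $value) = split /=/, $pair, 2;
        next unless defined $name && $name =~ /^\w+$/;
        $value = '' unless defined $value;
        $value =~ tr/+/ /;
        $value =~ s/%([0-9A-Fa-f]{2})/chr hex $1/ge;
        $value =~ tr/\r//d;
        $field{$name} = $value;
    }
    return %field;
}
```

In a CGI script this would be called along the lines of parse_form_data(read_form_data(\%ENV, \*STDIN)).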

This is the web site for the program: http://www.ringlink.org/
 
Gunnar Hjalmarsson

Alan said:
Then it's probably indefensible to run it from the traditional CGI
in the first place:

Some may claim it is. (For some reason that comment wasn't unexpected.
;-) )

Alan said:
you should be looking to run it from mod_perl or other persistent
environment, where the overhead of loading CGI.pm is no longer of
any relevance since it's not being done per-invocation any more.

I have already done that, so the program is prepared to be (and is
actually in a few cases) run under mod_perl. However, there are
hundreds or 1,000+ users, and most of them don't have access to
mod_perl...

Alan said:
"sensitive", maybe.

Hmm.. Yes, of course. It wasn't my intention to claim that I'm
sensible, even if *I* think I am. :)

Alan said:
"sensible"? - I'd have to reserve judgment until I saw the full
implications, including the security review and some sensible
assessment of the implications for long-term maintainability.

Even if I provided a link in my reply to Randal, I ask you to please
not do that, Alan, at least not yet...

I started to write that program more than three years ago, and at that
time my programming experience basically consisted of having modified
a couple of Matt's Scripts. :) One thing that bothers me is all those
global scalar variables, so I'm sure you wouldn't find the program
easily maintained. Sooner or later I'll do a redesign, but I'll wait
until I have learned the basics of OOP.
 
Gunnar Hjalmarsson

Thanks for your comments!

Purl said:
A quick comment on this line above. All browsers submit

\r\n

when ENTER is pressed and the cursor is inside a text area box.
This is not specific to Windows, under those conditions.

Blank lines were added on Windows, unlike Unix/Linux, when submitting
multiple-line entries via textarea fields, even when I wasn't dealing
with data that had been read from a file. I haven't dug into it
very deeply, but the above line does make a difference.

Purl said:
Adding security features to your method, is so very easy. A quick
example for html enabled form input,

$value =~ s/`/&#96;/g;

As regards that aspect of security, maybe I should have added that
data submitted by users who are not logged in, and which ends up on
generated HTML pages, is converted by this sub:

sub htmlize {
    $_[0] =~ s/&/&amp;/g;
    $_[0] =~ s/"/&quot;/g;
    $_[0] =~ s/</&lt;/g;
    $_[0] =~ s/>/&gt;/g;
    return $_[0];
}

That conversion is done at a later stage, since (non-converted) HTML
is occasionally included in email messages.

Wouldn't that take care of the risk with backticks as well?
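One way to check the backtick question is to run a sample string through the sub. The sketch below repeats the four substitutions from htmlize and adds a hypothetical wrapper, htmlized_copy, that works on a copy; htmlize itself modifies its argument in place because it operates on $_[0]. None of the four substitutions touches a backtick, so the backticks in the sample pass through unchanged. Whether that matters depends on whether the value ever reaches a shell or some other context that interprets backticks.

```perl
use strict;
use warnings;

# Same four substitutions as the htmlize sub in the post above.
# Note: $_[0] is an alias, so the caller's variable is changed.
sub htmlize {
    $_[0] =~ s/&/&amp;/g;
    $_[0] =~ s/"/&quot;/g;
    $_[0] =~ s/</&lt;/g;
    $_[0] =~ s/>/&gt;/g;
    return $_[0];
}

# Hypothetical non-mutating wrapper: escapes a copy, leaves the input alone.
sub htmlized_copy {
    my ($text) = @_;
    htmlize($text);
    return $text;
}

my $input = q{<b onclick="x">`ls` & more</b>};
my $copy  = htmlized_copy($input);

print "$copy\n";     # markup characters escaped, backticks still present
print "$input\n";    # original string untouched by the wrapper
```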
 
Gunnar Hjalmarsson

Purl said:
As you know, the degree of risk of input data is directly related
to "what" a program does. If a program does not call any functions
susceptible to backtick syntax, no problem.

Contrasting this, our Chahta Chat is susceptible to hostile html
tags. Nonetheless, we want our visitors to be able to enjoy fancy
fonts, colors, pictures and all that.

For html, like you, processing outside a read and parse takes care
of this.

my (@bad_word_list) = ("<applet", "<blockquote", "<body", "<dl", "<form",
"<head", "<html", "<ol", "<object", "<plaintext",
"<script", "<strike", "<xmp", "<ul", "<h1", "<h2",
"<h3", "<h4", "<h5", "<h6", "|/", "<embed",
"face=symbol", "face=system", "strnps", ... others

Now you are talking about a desire to allow users to modify the
program-generated HTML, which reminds me of this ciwac thread:

http://groups.google.se/groups?th=eeb2ba0a37e50722

Even if this is an important security matter as regards CGI scripts, I
suppose it's off topic for this group. Nevertheless, it's worth
noting that it needs to be handled outside the initial CGI parsing
routine, whether that is done with the help of CGI.pm or not.
 
Eric J. Roode

One situation where doing so makes sense is when efficiency matters.

I have a program, where I believe it would be indefensible to have it
load CGI.pm. Maybe that's why I'm so sensible about this. :)

If I recall correctly, CGI.pm has only about 200 lines of code that get
compiled when the module is first loaded. If the time it takes to
compile those 200 lines makes a difference in the execution of your
program, then I suspect Perl/CGI is the wrong technology to be using.
:) I'd suggest mod_perl, FastCGI, or maybe even writing the CGI input
parsing code in C and loading it via XS.

What sort of timing did you use to determine that CGI.pm was slowing you
down? I keep hearing that CGI.pm is slow and inefficient, but I have
never seen any numbers to back it up, and in my (admittedly anecdotal)
experience, I haven't seen a problem with it.
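For anyone who wants numbers rather than guesses, a first rough measurement of module load time can be taken with the core Time::HiRes module. This is only a sketch: the helper name time_require is hypothetical, the figures depend entirely on the machine and Perl build, and CGI.pm may not be installed at all on recent Perls, in which case the sketch just reports that.

```perl
use strict;
use warnings;
use Time::HiRes qw(gettimeofday tv_interval);

# Time how long it takes to require a module in this process.
# Only the first require of a module does real work; later calls are no-ops.
sub time_require {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;
    my $start   = [gettimeofday];
    my $ok      = eval { require $file; 1 };
    my $elapsed = tv_interval($start);
    return ($ok ? 1 : 0, $elapsed);
}

for my $module (qw(CGI POSIX)) {
    my ($ok, $secs) = time_require($module);
    printf "%-6s %s in %.4f s\n",
        $module, ($ok ? 'loaded' : 'not available'), $secs;
}
```

A single run like this is noisy, so it would need repeating in fresh processes before drawing any conclusion.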

 
Eric J. Roode

This method you display will be a minimum eight-hundred percent
more efficient than CGI.pm and will be an average, a very large
average, one-thousand-three-hundred percent more efficient.
That is one-hundred-thirty times faster, not just twice as fast.

How do you calculate that?

(by the way, 1300% is 13 times faster, not 130 times).
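The percentages can be checked with a little arithmetic. The sketch below uses two hypothetical helpers for the two common readings of such a figure: "P percent as fast" is a ratio of P/100, while "P percent faster" is a ratio of 1 + P/100. Under either reading, 1300% comes out at 13 or 14 times, not 130.

```perl
use strict;
use warnings;

# "P percent as fast" means a speed ratio of P / 100.
sub ratio_from_percent_of {
    my ($p) = @_;
    return $p / 100;
}

# "P percent faster" (or "more efficient") means a ratio of 1 + P / 100.
sub ratio_from_percent_more {
    my ($p) = @_;
    return 1 + $p / 100;
}

for my $p (800, 1300) {
    printf "%d%% as fast = %gx; %d%% faster = %gx\n",
        $p, ratio_from_percent_of($p), $p, ratio_from_percent_more($p);
}
```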

 
Gunnar Hjalmarsson

Eric said:
If I recall correctly, CGI.pm has only about 200 lines of code that
get compiled when the module is first loaded. If the time it
takes to compile those 200 lines makes a difference in the
execution of your program, then I suspect Perl/CGI is the wrong
technology to be using. :) I'd suggest mod_perl, FastCGI, or
maybe even writing the CGI input parsing code in C and loading it
via XS.

Please see my reply to Alan about that.

Eric said:
What sort of timing did you use to determine that CGI.pm was
slowing you down?

None.

It's just that the program by its very nature may be used in such a
way that repeated calls put quite some load on the server. For that
reason I'm trying to avoid unnecessary load, and since I have already
"reinvented the wheel", continuing not to use CGI.pm is an easy
contribution to that goal.
 
Eric J. Roode

None.

It's just that the program by its very nature may be used in such a
way that repeated calls put quite some load on the server. For that
reason I'm trying to avoid unnecessary load, and since I have already
"reinvented the wheel", continuing not to use CGI.pm is an easy
contribution to that goal.

Wait, let me get this straight -- you have no idea whether CGI.pm is faster
or slower than your own code, yet you choose to stick to your own code in
the belief that it contributes to your goal of avoiding unnecessary load?

 
Gunnar Hjalmarsson

Eric said:
Wait, let me get this straight -- you have no idea whether CGI.pm
is faster or slower than your own code, yet you choose to stick to
your own code in the belief that it contributes to your goal of
avoiding unnecessary load?

Yes, I have an idea. I'm sure that CGI.pm is slower. I thought the code
I posted made that apparent to anybody who has an idea of what CGI.pm
is about.

However, I can't tell *how much* slower since I haven't measured it.

Why are you making such a fuss about that?
 
Ben Morrow

Gunnar Hjalmarsson said:
Yes, I have an idea. I'm sure that CGI.pm is slower. I thought the code
I posted made that apparent to anybody who has an idea of what CGI.pm
is about.

By no means. In general, guessing that a particular piece of code will
run slower or faster than another is a dodgy business. The *only* way
to tell is to run benchmarks.

Premature optimisation is the root of all evil, &c...

Ben
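As an illustration of what running benchmarks could look like here, the sketch below uses the core Benchmark module to time a hand-rolled parser written in the style of the code posted earlier in the thread, and CGI.pm's parser if that module happens to be installed. The sample query string, the sub name parse_by_hand, and the iteration count are assumptions. It measures parsing only; for a plain CGI script the per-request cost of starting Perl and loading modules would have to be measured separately.

```perl
use strict;
use warnings;
use Benchmark qw(timethese cmpthese);

# Hypothetical sample input for the comparison.
my $query = 'ringid=42&siteid=7&offset=10&note=hello+world%21';

# Hand-rolled parser in the style of the code posted earlier in the thread.
sub parse_by_hand {
    my ($raw) = @_;
    my %field;
    (my $data = $raw) =~ tr/+/ /;
    for (split /[&;]/, $data) {
        my ($name, $value) = split /=/, $_, 2;
        $value = '' unless defined $value;
        $value =~ s/%([0-9A-Fa-f]{2})/chr hex $1/ge;
        $field{$name} = $value;
    }
    return \%field;
}

my %candidate = (by_hand => sub { parse_by_hand($query) });

# CGI.pm is not bundled with recent Perls, so time it only if it loads.
if (eval { require CGI; 1 }) {
    $candidate{cgi_pm} = sub {
        my $q = CGI->new($query);
        return scalar $q->param('ringid');
    };
}

# Run each candidate a fixed number of times, then print a comparison table.
my $results = timethese(5000, \%candidate, 'none');
cmpthese($results);
```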
 
Gunnar Hjalmarsson

Ben said:
By no means.

Please, Ben, how about reading the code before making such a comment?

Ben said:
In general, guessing that a particular piece of code will run
slower or faster than another is a dodgy business. The *only* way
to tell is to run benchmarks.

Yes, in general.
 
