PB with euro sign and checkbox in multipart/form-data

Yohan N. Leder

Hi,

What do you think about the pb explained in this test script called
form2dump.pl :

#!/usr/bin/perl -w
# Script written to demonstrate the bug explained below:
# PB    : a € sign in any form field corrupts the beginning of the
#         multipart/form-data in STDIN (the first lines, with the boundary
#         and the first field declaration, are truncated)
# CAUSE : a checkbox without any value (unchecked) triggers this problem
#         - without <input type='checkbox' name='chk'>, it works
#         - with <input type='checkbox' name='chk'> checked, it works
# NB    : strange, because an unchecked box shouldn't be sent!
# IDEA  : I've tried to provide a hidden field with the same name as the
#         checkbox, which would submit an 'off' value when the checkbox is
#         unchecked, but both values are sent when the checkbox is checked
# SOL   : ?

print "Content-type: text/html; charset=iso-8859-1\n\n";
if (($ENV{'QUERY_STRING'} || '') =~ /add/)
{
    # Dump the raw request body exactly as it arrives on STDIN
    binmode STDIN;    # keep the bytes untouched (matters on Windows)
    read STDIN, my $buff, $ENV{'CONTENT_LENGTH'} || 0;
    print "<b>Multipart/form-data (ok because no binary data inside)</b><hr>$buff";
}
else
{
    # First request: display the test form
    print <<FORM;
<form action='/cgi-bin/form2dump.pl?add'
method='post' enctype='multipart/form-data' accept-charset='iso-8859-1'>
<input type='text' name='txt1'><br>
<input type='text' name='txt2'><br>
<input type='text' name='txt3'><br>
<input type='text' name='txt4'><br>
<input type='text' name='txt5'><br>
<input type='submit'>
<input type='checkbox' name='chk' value='on'>
</form>
FORM
}
exit 0;
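
For the IDEA noted in the script's comments (hidden field plus checkbox with the same name), one possible way to cope with both values arriving is to let the last one win. This is only a sketch: the helper name last_value_wins is made up for illustration, and it assumes the hidden field is placed before the checkbox in the form and that the browser submits fields in document order.

```perl
#!/usr/bin/perl -w
use strict;

# Hypothetical helper (not part of form2dump.pl): collapse repeated field
# names so that the LAST submitted value wins.
# Arguments: a list of [name, value] pairs in the order they arrived.
sub last_value_wins {
    my @pairs = @_;
    my %field;
    foreach my $pair (@pairs) {
        my ($name, $value) = @$pair;
        $field{$name} = $value;    # a later pair overwrites an earlier one
    }
    return %field;
}

# Checked box: the hidden 'off' arrives first, then the checkbox 'on'.
my %checked = last_value_wins(['chk', 'off'], ['chk', 'on']);

# Unchecked box: only the hidden 'off' is submitted.
my %unchecked = last_value_wins(['chk', 'off']);

print "checked: $checked{chk}\n";      # prints "checked: on"
print "unchecked: $unchecked{chk}\n";  # prints "unchecked: off"
```

With this approach the script never has to care whether one or two 'chk' values were sent.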
 
Brian McCauley

Yohan said:
> What do you think about the pb explained in this test script

I think it appears that you are talking about a web browser bug.

> # PB : € sign in any form field corrupt beginning of multipart/form-data
> <form action='/cgi-bin/form2dump.pl?add'
> method='post' enctype='multipart/form-data' accept-charset='iso-8859-1'>

There is no € (Euro) in iso-8859-1.

> <input type='text' name='txt1'><br>

If your browser is allowing you to type a € into one of those
fields then it is broken and you should file a bug report.

Having let you enter invalid data, you say your browser then gets
confused about constructing the multipart/form-data response, but
obviously there would be no point in my attempting to reproduce this, as my
browser doesn't exhibit the first bug and so can't be the one you are
using.

Anyhow, I can't see how any of this relates to Perl.
 
Brian McCauley

Brian said:
> I think it appears that you are talking about a web browser bug.
>
> There is no € (Euro) in iso-8859-1
>
> If your browser is allowing you to type a € into one of those
> fields then it is broken and you should file a bug report.

I have just noticed this bug is present in the browser I'm using after
all.

> Having let you enter invalid data, you say your browser then gets
> confused about constructing the multipart/form-data response, but
> obviously there would be no point in my attempting to reproduce this, as my
> browser doesn't exhibit the first bug and so can't be the one you are
> using.

My browser does not exhibit the second bug. I'm guessing it may have to
do with multi-byte encodings and the "Content-length" header.

> Anyhow, I can't see how any of this relates to Perl.

I stand by that.
 
Tad McClellan

Yohan N Leder said:
> What do you think about the pb explained in this test script
                              ^^

I read the whole program, but I did not see any explanation
of Peanut Butter...
 
Yohan N. Leder

> I stand by that.

It's only related to Perl because I'm using a Perl test script to expose
the bug and, obviously, if I post this same script in a group
not related to Perl, I'll hear: "no no, it's not related to our group,
because your script is in Perl and we don't know Perl"... Well, I'm
trying in another group all the same...

Also, my tests are done using IE6 FR under Win2K Pro and ActivePerl 5.8.7.
 
Yohan N. Leder

> There is no € (Euro) in iso-8859-1

Right, but I'm able to enter it using the [Alt-Gr] + E combination on
an Azerty keyboard. I've tried this on three stations using IE6 and Win2K
or XP and, I don't know why, it works.
 
Dr.Ruud

Yohan N. Leder wrote:
> # PB : € sign in any form field corrupt beginning of

You are including a character >127 in a posting that has no Content-Type
header field that explains which character is meant there.
Also there is no Content-Transfer-Encoding header field about 8bits. The
default is 7bits (ASCII), so newsclients will set the highest bit of all
out-of-band characters to zero.

To find the error in the HTML that you generate, read
http://en.wikipedia.org/wiki/ISO_8859-1
 
Tad McClellan

Yohan N Leder said:
> Did you run the script?


No.


You seem to have missed my point.

Please don't use cutesy spellings in Usenet posts.

"PB" might mean "peanut butter".

"problem" clearly means "problem", so use that instead.
 
Yohan N. Leder

> You are including a character >127 in a posting that has no Content-Type

The HTML page I generated contains "Content-type: text/html;
charset=iso-8859-1"
 
Dr.Ruud

Yohan N. Leder wrote:
> The HTML page I generated contains "Content-type: text/html;
> charset=iso-8859-1"

As you have been told before, and as you can check on the
wikipedia.org page about that charset, ISO-8859-1 does not contain a
Euro sign.
 
Yohan N. Leder

> Yohan N. Leder wrote:
>
> As you have been told before, and as you can check on the
> wikipedia.org page about that charset, ISO-8859-1 does not contain a
> Euro sign.

Agreed, but the fact is that some browsers (a known IE bug, as stated
by someone in alt.html) do more than just send an unrepresentable
character: they corrupt the whole posted content.

Because of this, and after some discussion in two threads here (including
the current one) and one in alt.html, I've added some trivial
checks on the multipart/form-data content as it arrives on
STDIN. Thus, even if the bug remains (corrupted content due to the
presence of a character outside the charset plus an unchecked checkbox), I no
longer try to parse malformed multipart content. The result of these basic checks
is visible in the sentence starting with "Integrity :" after submitting
the form here: <http://yohannl.tripod.com/cgi-bin/form2dump.pl>.
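
The kind of basic checks described above could look roughly like the sketch below. It is not the code running at the URL above: the helper name multipart_looks_intact is hypothetical, and the three checks (byte count equals CONTENT_LENGTH, body opens with the boundary announced in CONTENT_TYPE, body closes with the final boundary) are only assumptions about what "integrity" might reasonably cover.

```perl
#!/usr/bin/perl -w
use strict;

# Hypothetical helper: returns 1 if $body looks like a complete
# multipart/form-data payload, 0 otherwise.
#   $content_type    : value of $ENV{'CONTENT_TYPE'}
#   $body            : raw bytes read from STDIN
#   $expected_length : value of $ENV{'CONTENT_LENGTH'}
sub multipart_looks_intact {
    my ($content_type, $body, $expected_length) = @_;

    # The boundary is announced in the Content-Type request header.
    return 0 unless $content_type =~ /boundary=("?)([^";]+)\1/i;
    my $boundary = $2;
    my $opening  = "--" . $boundary;
    my $closing  = "--" . $boundary . "--";

    # The number of bytes read must match the announced length.
    return 0 unless length($body) == $expected_length;

    # The body must start with the opening boundary line ...
    return 0 unless index($body, $opening) == 0;

    # ... and end with the closing boundary, optionally followed by a newline.
    return 0 unless $body =~ /\Q$closing\E(?:\r?\n)?\z/;

    return 1;
}

# Usage example with a made-up boundary and a single text field.
my $type = 'multipart/form-data; boundary=XyZ';
my $good = "--XyZ\r\nContent-Disposition: form-data; name=\"txt1\"\r\n\r\nabc\r\n--XyZ--\r\n";
my $bad  = substr($good, 20);    # simulate a truncated beginning

print "good: ", multipart_looks_intact($type, $good, length $good), "\n";
print "bad: ",  multipart_looks_intact($type, $bad,  length $bad),  "\n";
```

In a real CGI script the three arguments would come from $ENV{'CONTENT_TYPE'}, the buffer read from STDIN, and $ENV{'CONTENT_LENGTH'}.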

So now I have to choose between using the ISO-8859-15 charset and moving to UTF-8
(which sounds like a nightmare with Perl 5.00503, which I have to keep in mind even if
I'm developing under 5.6 and 5.8)... to be able to accept this euro
sign.

What do you think about ISO-8859-15? Any major drawbacks?
 
Dr.Ruud

Yohan N. Leder wrote:
> rvtol:
>
> OK, solved (added DOCTYPE and html/head/body tags) : however, it
> doesn't change anything about form submission behavior.

The response page should use "&euro;" to disambiguate the character
0xA4, because in the active encoding of the page (ISO-8859-1), the
character 0xA4 is the currency-character "&curren;" and not specifically
the Euro-sign.

See http://en.wikipedia.org/wiki/ISO_8859-15 about the 8 differences
between it and Latin1.
 
Yohan N. Leder

> The response page should use "&euro;" to disambiguate the character
> 0xA4, because in the active encoding of the page (ISO-8859-1), the
> character 0xA4 is the currency-character "&curren;" and not specifically
> the Euro-sign.
>
> See http://en.wikipedia.org/wiki/ISO_8859-15 about the 8 differences
> between it and Latin1.

OK, thanks, I'll read it asap. Nevertheless, what do you think? While waiting
to move to UTF-8 with a more recent Perl interpreter, do you think it's
better to use ISO-8859-1 and convert the euro sign to an HTML entity on receipt
(i.e. after reading STDIN), or to use ISO-8859-15 instead (without HTML
entity conversion)?
 
Dr.Ruud

Yohan N. Leder wrote:
> rvtol:
>
> OK, thanks, I'll read it asap. Nevertheless, what do you think?
> While waiting to move to UTF-8 with a more recent Perl interpreter, do you
> think it's better to use ISO-8859-1 and convert the euro sign to an HTML
> entity on receipt (i.e. after reading STDIN), or to use ISO-8859-15
> instead (without HTML entity conversion)?

In this specific (border) case, I would certainly convert from 0xA4 to
"&euro;" or even to "EUR&nbsp;". I have never seen semantic usage of the
"&curren;"; it is mainly used for ASCII-art (and of course in texts and
tables about HTML).
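
A minimal sketch of that conversion, for a single field value already extracted from the posted data. The helper name euro_to_entity is made up for illustration. Handling byte 0x80 as well is an assumption on my side: it is where Windows-1252 puts the euro sign, and a browser may send that byte even though the page declares ISO-8859-1.

```perl
#!/usr/bin/perl -w
use strict;

# Hypothetical helper: replace raw euro bytes in a field value by "&euro;".
sub euro_to_entity {
    my ($value) = @_;
    $value =~ s/\xA4/&euro;/g;    # 0xA4: euro in ISO-8859-15, currency sign in ISO-8859-1
    $value =~ s/\x80/&euro;/g;    # 0x80: euro in Windows-1252 (assumed browser behavior)
    return $value;
}

print euro_to_entity("Price: 10 \xA4"), "\n";    # prints "Price: 10 &euro;"
```

Only plain byte substitutions are used, so this should not depend on the Unicode support of newer Perl versions.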
 
