PB with euro sign and checkbox in multipart/form-data

Discussion in 'Perl Misc' started by Yohan N. Leder, May 18, 2006.

  1. Hi,

    What do you think about the pb explained in this test script called
    form2dump.pl :

    #!/usr/bin/perl -w
    # Script written to solve the bug explained below :
    # PB : € sign in any form field corrupt beginning of multipart/form-data
    # in STDIN (1st lines with boundary & 1st field declaration truncated)
    # CAUSE : a checkbox without any value (unchecked) causes this pb
    # - without <input type='checkbox' name='chk'>, it works
    # - with <input type='checkbox' name='chk'> checked, it works
    # NB : strange because an unchecked box shouldn't be sent !
    # IDEA : I've tried to provide a hidden field with the same name as the
    # checkbox, which would submit an 'off' value when the checkbox is
    # unchecked, but both values are sent when the checkbox is checked
    # SOL : ?

    print "Content-type: text/html; charset=iso-8859-1\n\n";
    if ($ENV{'QUERY_STRING'} =~ /add/)
    {
    read STDIN, my $buff, $ENV{'CONTENT_LENGTH'};
    print "<b>Multipart/form-data (ok because no binary data inside)</b><hr>$buff";
    }
    else
    {
    print <<FORM;
    <form action='/cgi-bin/form2dump.pl?add'
    method='post' enctype='multipart/form-data' accept-charset='iso-8859-1'>
    <input type='text' name='txt1'><br>
    <input type='text' name='txt2'><br>
    <input type='text' name='txt3'><br>
    <input type='text' name='txt4'><br>
    <input type='text' name='txt5'><br>
    <input type='submit'>
    <input type='checkbox' name='chk' value='on'>
    </form>
    FORM
    }
    exit 0;
    Yohan N. Leder, May 18, 2006
    #1
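The hidden-field idea in the script's comments can still work if the server keeps only the last value received for the field. A minimal sketch, assuming the hidden field is placed before the checkbox in the form so that the checkbox value, when sent, arrives last; `checkbox_state` is a hypothetical helper, not part of the original script:

```perl
use strict;
use warnings;

# Hypothetical helper: given every value submitted under one field name,
# in the order received, return the effective checkbox state.
sub checkbox_state {
    my @values = @_;
    return 'off' unless @values;   # nothing submitted at all
    return $values[-1];            # the checkbox value, if sent, comes last
}

print checkbox_state('off'), "\n";         # unchecked: only the hidden field
print checkbox_state('off', 'on'), "\n";   # checked: hidden field, then checkbox
```

This only handles the duplicate values; it does not address the corrupted request body itself.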

  2. Yohan N. Leder wrote:

    > What do you think about the pb explained in this test script


    I think it appears that you are talking about a web browser bug.

    > # PB : € sign in any form field corrupt beginning of multipart/form-data


    > <form action='/cgi-bin/form2dump.pl?add'
    > method='post' enctype='multipart/form-data' accept-charset='iso-8859-1'>


    There is no € (Euro) in iso-8859-1

    > <input type='text' name='txt1'><br>


    If your browser is allowing you to type a € into one of those
    fields then it is broken and you should file a bug report.

    Having let you enter invalid data, you say your browser then gets
    confused about constructing the multipart/form-data response, but
    obviously there would be no point in my attempting to reproduce this as my
    browser doesn't exhibit the first bug so can't be the one you are
    using.

    Anyhow I can't see how any of this relates to Perl.
    Brian McCauley, May 18, 2006
    #2

  3. Brian McCauley wrote:
    > Yohan N. Leder wrote:
    >
    > > What do you think about the pb explained in this test script

    >
    > I think it appears that you are talking about a web browser bug.
    >
    > > # PB : € sign in any form field corrupt beginning of multipart/form-data

    >
    > > <form action='/cgi-bin/form2dump.pl?add'
    > > method='post' enctype='multipart/form-data' accept-charset='iso-8859-1'>

    >
    > There is no € (Euro) in iso-8859-1
    >
    > > <input type='text' name='txt1'><br>

    >
    > If your browser is allowing you to type a € into one of those
    > fields then it is broken and you should file a bug report.


    I have just noticed this bug is present in the browser I'm using after
    all.

    > Having let you enter invalid data, you say your browser then gets
    > confused about constructing the multipart/form-data response, but
    > obviously there would be no point in my attempting to reproduce this as my
    > browser doesn't exhibit the first bug so can't be the one you are
    > using.


    My browser does not exhibit the second bug. I'm guessing it may be to
    do with multi-byte encodings and the "Content-length" header.

    > Anyhow I can't see how any of this relates to Perl.


    I stand by that.
    Brian McCauley, May 18, 2006
    #3
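The guess about multi-byte encodings and "Content-length" rests on the fact that a header counting characters instead of bytes would be wrong as soon as one character takes more than one byte. A small sketch of that mismatch, assuming Perl 5.8 or later for the core Encode module; it illustrates the guess only and is not a diagnosis of the browser bug:

```perl
use strict;
use warnings;
use Encode qw(encode);

my $text = "5 \x{20AC}";   # "5 €" as a Perl character string (3 characters)

# Compare the character count with the byte count in several encodings.
for my $charset ('UTF-8', 'cp1252', 'iso-8859-15') {
    my $bytes = encode($charset, $text);
    printf "%-12s %d characters, %d bytes\n",
        $charset, length($text), length($bytes);
}
```

Only the UTF-8 row shows a difference (3 characters, 5 bytes); in the single-byte charsets the two counts agree.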
  4. Yohan N Leder <> wrote:

    > What do you think about the pb explained in this test script

    ^^
    ^^

    I read the whole program, but I did not see any explanation
    of Peanut Butter...


    --
    Tad McClellan SGML consulting
    Perl programming
    Fort Worth, Texas
    Tad McClellan, May 18, 2006
    #4
  5. In article <>,
    says...
    > > Anyhow I can't see how any of this relates to Perl.

    >
    > I stand by that.
    >


    It's only related to Perl because I'm using a Perl test script to expose
    the bug and, obviously, if I post this same script in a group
    not related to Perl, I'll hear: "no no, it's not related to our group,
    because your script is in Perl and we don't know Perl"... Well, I'm
    trying all the same in another group...

    Also, my tests are done using IE6 FR under Win2K Pro and ActivePerl 5.8.7
    Yohan N. Leder, May 18, 2006
    #5
  6. In article <>,
    says...
    > I read the whole program, but I did not see any explanation
    > of Peanut Butter...
    >


    Did you run the script ?
    Yohan N. Leder, May 19, 2006
    #6
  7. In article <>,
    says...
    > There is no € (Euro) in iso-8859-1
    >


    Right, but I'm able to enter it using the [Alt-Gr] + E combination on
    an AZERTY keyboard. I've tried this on three stations using IE6 and Win2K
    or XP and, I don't know why, but it works.
    Yohan N. Leder, May 19, 2006
    #7
  8. Yohan N. Leder wrote:

    > # PB : € sign in any form field corrupt beginning of


    You are including a character >127 in a posting that has no Content-Type
    header field that explains which character is meant there.
    Also there is no Content-Transfer-Encoding header field about 8bits. The
    default is 7bits (ASCII), so newsclients will set the highest bit of all
    out-of-band characters to zero.

    To find the error in the HTML that you generate, read
    http://en.wikipedia.org/wiki/ISO_8859-1

    --
    Affijn, Ruud

    "Gewoon is een tijger."
    Dr.Ruud, May 19, 2006
    #8
  9. Yohan N Leder <> wrote:
    > In article <>,
    > says...
    >> I read the whole program, but I did not see any explanation
    >> of Peanut Butter...
    >>

    >
    > Did you run the script ?



    No.


    You seem to have missed my point.

    Please don't use cutesy spellings in Usenet posts.

    "PB" might mean "peanut butter".

    "problem" clearly means "problem", so use that instead.


    --
    Tad McClellan SGML consulting
    Perl programming
    Fort Worth, Texas
    Tad McClellan, May 19, 2006
    #9
  10. In article <>,
    says...
    > "PB" might mean "peanut butter".
    >


    Sorry, it was out of habit, since 'pb' usually means 'problem' in French
    Yohan N. Leder, May 22, 2006
    #10
  11. In article <>,
    says...
    > You are including a character >127 in a posting that has no Content-Type
    >


    The HTML page I generated contains "Content-type: text/html;
    charset=iso-8859-1"
    Yohan N. Leder, May 22, 2006
    #11
  12. Yohan N. Leder wrote:

    > The HTML page I generated contains "Content-type: text/html;
    > charset=iso-8859-1"


    As you have been told before, and as you can check on the
    wikipedia.org page about that charset, ISO-8859-1 does not contain a
    Euro sign.

    --
    Affijn, Ruud

    "Gewoon is een tijger."
    Dr.Ruud, May 22, 2006
    #12
  13. In article <>,
    says...
    > Yohan N. Leder wrote:
    >
    > > The HTML page I generated contains "Content-type: text/html;
    > > charset=iso-8859-1"

    >
    > As you have been told before, and as you can check on the
    > wikipedia.org page about that charset, ISO-8859-1 does not contain a
    > Euro sign.
    >
    >


    Agreed, but the fact is that with some browsers (a known IE bug, as stated
    by someone in alt.html), rather than just sending an unrepresentable
    character, it corrupts the whole posted content.

    Because of this, and after some discussion here in two threads (including
    the current one) and one in alt.html, I've added some trivial
    checks on the multipart/form-data content as it arrives on
    STDIN. Thus, even if the bug remains (corrupted content due to the
    presence of a character outside the charset + an unchecked checkbox), I'm
    not trying to parse malformed multipart content. The results of these basic
    checks are visible in the sentence starting with "Integrity :" after
    submitting the form here : <http://yohannl.tripod.com/cgi-bin/form2dump.pl>.

    So, now, I have to choose between using the ISO-8859-15 charset and going
    to UTF-8 (which sounds like a nightmare with Perl 5.00503, which I have to
    keep in mind even if I'm developing under 5.6 and 5.8)... to be able to
    accept this euro sign.

    What do you think about ISO-8859-15 ? What are the major drawbacks, if any ?
    Yohan N. Leder, May 22, 2006
    #13
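The post above does not show its checks, so the following is only a guess at what "trivial checks" on a multipart/form-data body might look like: confirm that the body starts with the opening boundary and ends with the closing one. The sub name `multipart_looks_intact` and the sample boundary are hypothetical, and this is not the author's actual code:

```perl
use strict;
use warnings;

# Hypothetical check: returns 1 when the body starts with the opening
# boundary delimiter and ends with the closing one, 0 otherwise.
sub multipart_looks_intact {
    my ($content_type, $body) = @_;

    # Extract the boundary parameter from the Content-Type header.
    my ($boundary) = $content_type =~ /\bboundary="?([^";]+)"?/i
        or return 0;

    return 0 unless index($body, "--$boundary\r\n") == 0;    # first delimiter
    return 0 unless $body =~ /\r\n--\Q$boundary\E--\s*\z/;   # closing delimiter
    return 1;
}

my $type = 'multipart/form-data; boundary=----7d6abc';
my $good = "------7d6abc\r\n"
         . "Content-Disposition: form-data; name=\"txt1\"\r\n\r\n"
         . "hello\r\n"
         . "------7d6abc--\r\n";
my $bad  = substr($good, 20);   # simulate a truncated beginning

print multipart_looks_intact($type, $good) ? "ok\n" : "corrupt\n";
print multipart_looks_intact($type, $bad)  ? "ok\n" : "corrupt\n";
```

In a CGI script the two arguments would come from `$ENV{'CONTENT_TYPE'}` and the data read from STDIN. A check like this only detects the damage; it does not repair it.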
  14. Yohan N. Leder wrote:

    > The results of these basic
    > checks are visible in the sentence starting with "Integrity :"
    > after submitting the form here :
    > <http://yohannl.tripod.com/cgi-bin/form2dump.pl>.


    That page isn't valid html.

    <http://validator.w3.org/check?uri=http://yohannl.tripod.com/cgi-bin/form2dump.pl&verbose=1&doctype=Inline>

    --
    Affijn, Ruud

    "Gewoon is een tijger."
    Dr.Ruud, May 22, 2006
    #14
  15. Yohan N. Leder, May 23, 2006
    #15
  16. Yohan N. Leder wrote:
    > rvtol:


    >> That page isn't valid html.
    >>
    >> <http://validator.w3.org/check?uri=http://yohannl.tripod.com/cgi-bin/form2dump.pl&verbose=1&doctype=Inline>

    >
    > OK, solved (added DOCTYPE and html/head/body tags) : however, it
    > doesn't change anything about form submission behavior.


    The response page should use "&euro;" to disambiguate the character
    0xA4, because in the active encoding of the page (ISO-8859-1), the
    character 0xA4 is the currency-character "&curren;" and not specifically
    the Euro-sign.

    See http://en.wikipedia.org/wiki/ISO_8859-15 about the 8 differences
    between it and Latin1.

    --
    Affijn, Ruud

    "Gewoon is een tijger."
    Dr.Ruud, May 23, 2006
    #16
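The charset point made throughout the thread can be checked directly. A short sketch, assuming Perl 5.8 or later for the core Encode module, that asks three charsets to encode U+20AC; cp1252 is included because the "=80" seen earlier in the thread matches the Windows code page position of the euro sign:

```perl
use strict;
use warnings;
use Encode qw(encode);

my $euro = "\x{20AC}";   # U+20AC EURO SIGN

for my $charset ('iso-8859-1', 'iso-8859-15', 'cp1252') {
    my $copy  = $euro;   # encode() may modify its input when CHECK is set
    # FB_CROAK makes encode() die instead of substituting a "?".
    my $bytes = eval { encode($charset, $copy, Encode::FB_CROAK) };
    if (defined $bytes) {
        printf "%-12s 0x%s\n", $charset, uc(unpack('H*', $bytes));
    }
    else {
        printf "%-12s cannot represent the euro sign\n", $charset;
    }
}
```

ISO-8859-1 is rejected, ISO-8859-15 yields 0xA4 and cp1252 yields 0x80, so the same byte 0xA4 means "¤" under one declared charset and "€" under the other.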
  17. In article <>,
    says...
    > The response page should use "&euro;" to disambiguate the character
    > 0xA4, because in the active encoding of the page (ISO-8859-1), the
    > character 0xA4 is the currency-character "&curren;" and not specifically
    > the Euro-sign.
    >
    > See http://en.wikipedia.org/wiki/ISO_8859-15 about the 8 differences
    > between it and Latin1.
    >


    OK, thanks, I'll read it asap. Nevertheless, what do you think ? While
    waiting to move to UTF-8 with a more recent Perl interpreter, do you think
    it's better to use ISO-8859-1 and convert the euro sign to an HTML entity on
    receipt (i.e. after reading STDIN), or to use ISO-8859-15 instead (without
    HTML entity conversion) ?
    Yohan N. Leder, May 23, 2006
    #17
  18. Yohan N. Leder wrote:
    > rvtol:


    >> The response page should use "&euro;" to disambiguate the character
    >> 0xA4, because in the active encoding of the page (ISO-8859-1), the
    >> character 0xA4 is the currency-character "&curren;" and not
    >> specifically the Euro-sign.
    >>
    >> See http://en.wikipedia.org/wiki/ISO_8859-15 about the 8 differences
    >> between it and Latin1.

    >
    > OK, thanks, I'll read it asap. Nevertheless, what do you think ?
    > While waiting to move to UTF-8 with a more recent Perl interpreter, do
    > you think it's better to use ISO-8859-1 and convert the euro sign to an
    > HTML entity on receipt (i.e. after reading STDIN), or to use ISO-8859-15
    > instead (without HTML entity conversion) ?


    In this specific (border) case, I would certainly convert from 0xA4 to
    "&euro;" or even to "EUR&nbsp;". I have never seen semantic usage of the
    "&curren;"; it is mainly used for ASCII-art (and of course in texts and
    tables about HTML).

    --
    Affijn, Ruud

    "Gewoon is een tijger."
    Dr.Ruud, May 23, 2006
    #18
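The conversion suggested above can be a single substitution on the received bytes. A minimal sketch, assuming the field value is still a plain byte string as read from STDIN; `euro_to_entity` is a hypothetical helper, and which byte to pass depends on what the browser actually sent (0xA4 for ISO-8859-15, 0x80 for Windows-1252):

```perl
use strict;
use warnings;

# Hypothetical helper: replace the euro byte with an HTML entity.
# $euro_byte is "\xA4" for ISO-8859-15 input, "\x80" for Windows-1252 input.
sub euro_to_entity {
    my ($value, $euro_byte) = @_;
    $value =~ s/\Q$euro_byte\E/&euro;/g;
    return $value;
}

print euro_to_entity("12 \xA4", "\xA4"), "\n";   # prints "12 &euro;"
```

Note that with 0xA4 this also rewrites any genuine "¤" in the input, which is the trade-off accepted in the reply above.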