mod_perl/CGI character encoding issues


sy crisp

I have a web page with a textarea into which users can enter text
including symbols such as á é £.

When I enter the text 'a á £ $' and submit the form under CGI I get
'a \xc3\xa1 \xc2\xa3 $' returned via cgi->param.

Running the same code under mod_perl returns 'a á £ $' which
looks as though every UTF-8 byte has been treated as a Latin-1
character and then encoded as UTF-8 a second time.
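
To illustrate the suspected double encoding, here is a minimal sketch. It assumes the core Encode module is available and that the fault is a Latin-1 decode followed by a second UTF-8 encode; that is a guess at the mechanism, not something confirmed for mod_perl. The variable names are made up for the example.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# The UTF-8 bytes a browser sends for the text 'a á £ $'.
my $utf8_bytes = "a \xc3\xa1 \xc2\xa3 \$";

# Assumed fault, step 1: every byte is read as one Latin-1 character.
my $misread = decode('ISO-8859-1', $utf8_bytes);

# Assumed fault, step 2: those characters are encoded as UTF-8 again,
# so each original high byte becomes two bytes.
my $double_encoded = encode('UTF-8', $misread);

# Print the result as hex so the extra bytes are visible on any terminal.
(my $hex = unpack('H*', $double_encoded)) =~ s/(..)/$1 /g;
print "$hex\n";
```

Comparing the hex dump with the bytes logged under mod_perl should show whether this is really what is happening.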

Below is an example script to demonstrate the problem.

Can anyone tell me what I need to do to stop this double encoding of
form data under mod_perl?

Or failing that, how I can convert from 'a á £ $' back to 'a
\xc3\xa1 \xc2\xa3 $'?
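
For the conversion back, something along these lines might work. It is only a sketch, assuming the Encode module and assuming the data really has been through exactly one extra Latin-1 to UTF-8 round; undo_double_encoding is a hypothetical helper name, not part of CGI.pm or mod_perl.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: undo exactly one round of UTF-8 double encoding.
sub undo_double_encoding {
    my ($bytes) = @_;

    # Strip the outer UTF-8 layer. If the input was double-encoded,
    # every resulting character has a code point below 0x100.
    my $chars = decode('UTF-8', $bytes);

    # Map each character back to the single byte it came from.
    # Characters above 0xFF cannot be mapped and are substituted by
    # Encode, so this is only safe on data known to be double-encoded.
    return encode('ISO-8859-1', $chars);
}

# Assumed shape of the value returned under mod_perl.
my $from_mod_perl = "a \xc3\x83\xc2\xa1 \xc3\x82\xc2\xa3 \$";
my $fixed = undo_double_encoding($from_mod_perl);

(my $hex = unpack('H*', $fixed)) =~ s/(..)/$1 /g;
print "$hex\n";
```

This treats the symptom rather than the cause, so it would have to be applied only under mod_perl and only to values known to be affected.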



#!/usr/local/bin/perl -w

use strict;
use CGI;

binmode STDERR;

my $q = CGI->new;
print $q->header("text/html; charset=utf-8");
print "<HTML><BODY>\n";

my $msg = $q->param('AAAA');
print "<br>AAAA: $msg\n" if $msg;
print STDERR "AAAA: $msg\n" if $msg;

print $q->start_form(-method=>'GET');
print $q->textarea(-name=>'AAAA');
print $q->submit(-name=>'DOIT');
print $q->end_form();
print $q->end_html();

CGI-> AAAA: a \xc3\xa1 \xc2\xa3 $

MOD_PERL-> AAAA: a á £ $



sy crisp

Some further information that might be useful:

Apache version 2.0.52 with AddDefaultCharset set to UTF-8

mod_perl is version 2.0.0

Perl is version 5.8.6 is version 3.10
