display VARCHAR(mysql) and special chars in html


Jonas Meurer

hello,

my script selects a comment saved as VARCHAR in MySQL and displays it
inside an html page.

the problem is that the comment contains several special characters, such as
mysterious utf-8 hyphens, german umlauts, etc.

i could write a function to parse the comment and substitute special
chars with the relevant html code, but maybe this already exists in some
module?

if not, it'll be hard work, as i have to consider many special chars, and
at least iso-8859-1* and utf-8 as charmaps.

bye
jonas
 

Radovan Garabik

Jonas Meurer said:
hello,

my script selects a comment saved as VARCHAR in MySQL and displays it
inside an html page.

the problem is that the comment contains several special characters, such as
mysterious utf-8 hyphens, german umlauts, etc.

i could write a function to parse the comment and substitute special
chars with the relevant html code, but maybe this already exists in some
module?

just make the page in utf-8, and you'll save yourself a lot of trouble

if not, it'll be hard work, as i have to consider many special chars, and
at least iso-8859-1* and utf-8 as charmaps.

if you insist...
a = u'\u010c'
a.encode('ascii', 'xmlcharrefreplace')
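
A note on the snippet above: `xmlcharrefreplace` only applies when encoding a unicode string, so bytes read from the database have to be decoded first. A minimal sketch in Python 3 syntax, assuming the column holds UTF-8 bytes; `to_ascii_html` is a hypothetical helper name, not part of any library:

```python
import html

def to_ascii_html(raw, source_encoding="utf-8"):
    """Decode bytes from the database and return ASCII-only HTML text."""
    # Assumption: raw is a bytes object encoded in source_encoding.
    text = raw.decode(source_encoding)
    # Escape &, < and > so the comment cannot inject markup.
    escaped = html.escape(text, quote=False)
    # Replace every non-ASCII character with a numeric character reference.
    return escaped.encode("ascii", "xmlcharrefreplace").decode("ascii")

print(to_ascii_html("Grüße \u2013 <b>".encode("utf-8")))
```

If the stored bytes are really iso-8859-1, pass that as `source_encoding`; decoding with the wrong charset will either raise an error or produce wrong characters.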


--
-----------------------------------------------------------
| Radovan Garabík http://melkor.dnp.fmph.uniba.sk/~garabik/ |
| __..--^^^--..__ garabik @ kassiopeia.juls.savba.sk |
-----------------------------------------------------------
Antivirus alert: file .signature infected by signature virus.
Hi! I'm a signature virus! Copy me into your signature file to help me spread!
 

Jonas Meurer

just make the page in utf-8, and you'll save yourself a lot of trouble

ok, how do i do this? simply add a second line with this?
# -*- encoding: utf-8 -*-

i use utf8 locales on my machine anyway.
if you insist...
a = u'\u010c'
a.encode('ascii', 'xmlcharrefreplace')

this fails as the comment contained several chars that couldn't be
converted.

i've changed my plans, and now will transform the comments to html
before saving them in mysql. this way, the comment never contains
special chars unless they weren't filtered out when saved in mysql.

do any filters exist to transform plain text to html? otherwise i might
use third-party products, such as text2html.

what do you think?

bye
jonas
 

Steve Holden

Jonas said:
ok, how do i do this? simply add a second line with this?
# -*- encoding: utf-8 -*-

i use utf8 locales on my machine anyway.




this fails as the comment contained several chars that couldn't be
converted.

i've changed my plans, and now will transform the comments to html
before saving them in mysql. this way, the comment never contains
special chars unless they weren't filtered out when saved in mysql.

do any filters exist to transform plain text to html? otherwise i might
use third-party products, such as text2html.

what do you think?
I think you should store your data with a known encoding, then encode it
as necessary for transmission. That way you can provide it in the forms
most relevant to different clients.
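
As a sketch of that approach (Python 3 syntax): assuming the comment is kept as Unicode text internally, the same stored text can be encoded per client, falling back to numeric character references for anything the target charset lacks. `encode_for_client` is a hypothetical helper, not an existing API:

```python
def encode_for_client(text, charset):
    """Encode stored text for one client; characters the charset cannot
    represent become numeric character references instead of errors."""
    return text.encode(charset, "xmlcharrefreplace")

# Sample comment with an umlaut, an en dash and a euro sign.
comment = "K\u00e4se \u2013 10 \u20ac"

utf8_bytes = encode_for_client(comment, "utf-8")         # everything fits
latin1_bytes = encode_for_client(comment, "iso-8859-1")  # dash and euro become references
```

The charset used here has to match whatever the page announces to the browser.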

regards
Steve
 

Wolfram Kraus

Jonas said:
hello,

my script selects a comment saved as VARCHAR in MySQL and displays it
inside an html page.

the problem is that the comment contains several special characters, such as
mysterious utf-8 hyphens, german umlauts, etc.

i could write a function to parse the comment and substitute special
chars with the relevant html code, but maybe this already exists in some
module?

if not, it'll be hard work, as i have to consider many special chars, and
at least iso-8859-1* and utf-8 as charmaps.

bye
jonas
If I understand you correctly, just put

<meta http-equiv="CONTENT-TYPE" content="text/html; charset=utf-8">

somewhere in the <head> section of your HTML page.

HTH,
Wolfram
 

Radovan Garabik

Wolfram Kraus said:
If I understand you correctly, just put

<meta http-equiv="CONTENT-TYPE" content="text/html; charset=utf-8">

somewhere in the <head> section of your HTML page.

... and make sure the default charset of your HTTP server is *OFF* (or
UTF-8), since it overrides the per-page setting (most unfortunate).
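
If the page is generated by a script, the script can send the charset in the HTTP header itself. A rough sketch in Python 3 syntax, assuming the script runs as plain CGI and the server passes the script's Content-Type header through unchanged; `build_response` is a hypothetical helper and `body_text` is assumed to be already HTML-escaped:

```python
import sys

def build_response(body_text):
    """Return a full CGI response (header plus page) as UTF-8 bytes."""
    page = (
        "<html><head>\n"
        '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">\n'
        "</head><body>\n"
        + body_text +
        "\n</body></html>\n"
    )
    # The HTTP header names the same charset as the meta tag.
    header = "Content-Type: text/html; charset=utf-8\r\n\r\n"
    return header.encode("ascii") + page.encode("utf-8")

if __name__ == "__main__":
    # Write bytes directly so the locale cannot re-encode the output.
    sys.stdout.buffer.write(build_response("Gr\u00fc\u00dfe"))
```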

 

deelan

Jonas Meurer wrote:
(...)
i've changed my plans, and now will transform the comments to html
before saving them in mysql. this way, the comment never contains
special chars unless they weren't filtered out when saved in mysql.

do any filters exist to transform plain text to html? otherwise i might
use third-party products, such as text2html.

what do you think?

as you may know, mysql 4.1 offers utf-8 support. it would be
wise to keep everything as utf-8: db, html generation and finally
serve, with correct HTTP headers, pages encoded as utf-8.

to do this you might have to fiddle with mysql settings and make
sure that, after issuing a:

show variables;

almost all of these settings:

character_set_client      latin1
character_set_connection  latin1
character_set_database    latin1
character_set_results     latin1
character_set_server      latin1
character_set_system      utf8

use utf-8 (as you can see my copy of mysql does not), otherwise i
think bad things will occur.
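
a small sketch (Python 3 syntax) of checking those settings from a script. it assumes the rows come back from the database driver as (name, value) pairs; the driver call itself is left out, and `non_utf8_settings` is a hypothetical helper name:

```python
def non_utf8_settings(rows):
    """Return the (name, value) pairs whose character set is not UTF-8."""
    return [(name, value) for name, value in rows
            if not value.lower().startswith("utf8")]

# Values copied from the SHOW VARIABLES output quoted above.
rows = [
    ("character_set_client", "latin1"),
    ("character_set_connection", "latin1"),
    ("character_set_database", "latin1"),
    ("character_set_results", "latin1"),
    ("character_set_server", "latin1"),
    ("character_set_system", "utf8"),
]

for name, value in non_utf8_settings(rows):
    print(name, value)
```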

if you prefer to filter out weird characters and
encode them as html &#xxxx; entities, textile[1] does the job
just fine; you can specify input and output encoding.

cheers,
deelan.

[1] http://dealmeida.net/en/Projects/PyTextile/
 
