Strange string formatting error

sjp

I've written some helper Perl scripts to pull data from a MySQL database
and write it to a text file for inclusion on a web front page that
I am using.

When I output the data directly from a script, I get strange garbled
characters rather than expected punctuation.

http://test.bluesites.net/topstory.html

When the page is filtered through Drupal's rendering engine, the problem
goes away:

http://test.bluesites.net

I suspect that this has something to do with the strings being encoded in
some format that my Perl interpreter isn't set up to recognize, but that the
PHP interpreter used by Drupal handles. I haven't been able to describe
the problem in a way that turns up useful leads on Google or Google Groups.
Can someone point me in the right direction?

TIA
 
sjp

What do you see when you run the script in a terminal window instead of
a browser?

Depends on the terminal. If I'm on my Linux box, everything looks fine.
PuTTY on Windows gives me an umlaut over an 'a' every place where I have
a single quotation mark.
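
One way to check what is actually in the data is to dump the bytes. The sketch below is only an illustration: it assumes the "single quotation mark" is the curly apostrophe U+2019 and that it arrives from the database encoded as UTF-8, neither of which is confirmed above.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(encode);

# Assumption: the apostrophe is U+2019 (right single quotation mark)
# and is stored as UTF-8.
my $bytes = encode('UTF-8', "\x{2019}");

# Print each byte as two hex digits.
my $hex = uc(join(' ', unpack('(H2)*', $bytes)));
print "$hex\n";    # E2 80 99
```

If that assumption holds, the first byte, 0xE2, is an 'a' with a circumflex accent in ISO-8859-1, and the other two bytes are control codes there. A terminal set to a Latin-1 style character set would then show a single accented 'a' where the apostrophe should be.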
 
Joe Smith

sjp said:
I've written some helper perl scripts to pull data from a mysql database
and output the data to a textfile for inclusion on a web front page that
I am using.

When I output the data directly from a script, I get strange garbled
characters rather than expected punctuation.

http://test.bluesites.net/topstory.html

Using Firefox with View -> Character Encoding -> Western (ISO-8859-1):
the “rule of law†everywhere...This isn’t a trivial matter.

Using Firefox with View -> Character Encoding -> Unicode (UTF-8):
the “rule of law” everywhere ... This isn’t a trivial matter.

Getting rid of the “smart quotes” and using plain " and ' instead:
the "rule of law" everywhere ... This isn't a trivial matter.

The latter is what the drupal rendering engine is doing.
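
The two Firefox results can be reproduced without a browser. This is a minimal sketch using the core Encode module. The sample sentence is taken from the page, and decoding as cp1252 is an assumption about how the browser treats the "Western" setting.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(encode decode);

# Sample text with a curly apostrophe, written with an escape so this
# source file stays plain ASCII.
my $text = "This isn\x{2019}t a trivial matter.";

# What ends up in the file: the UTF-8 byte sequence.
my $bytes = encode('UTF-8', $text);

# Reading those bytes as a single-byte Western encoding gives one
# character per byte (assumption: the browser uses windows-1252 here).
my $as_western = decode('cp1252', $bytes);

# Reading the same bytes as UTF-8 recovers the original text.
my $as_utf8 = decode('UTF-8', $bytes);

binmode(STDOUT, ':encoding(UTF-8)');
print "Western: $as_western\n";
print "UTF-8:   $as_utf8\n";
```

The "Western" line should show the same three-character garbage in place of the apostrophe as the ISO-8859-1 view above, while the "UTF-8" line shows the curly apostrophe intact. The bytes in the file are identical in both cases; only the declared or assumed character set differs.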

-Joe



http://www.inwap.com/mybin/list-files.pl?unsmartquote

#!/usr/local/bin/perl -p
# Name: unsmartquote 04-Sep-2001
# Purpose: Translates charset=windows-1252 to charset=iso-8859-1
# Fixes apostrophes, open and close double quotes, the British pound
# sign, and long dashes. (When using MSIE's "Save As" function, the
# saved HTML is converted from the standard international character set
# to Microsoft's 1252 code page. This script undoes the damage.)

tr/\306\364\366\372/'""\243/; # Single&double quotes, British pound
s/\371/--/g; # em-dash
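
The script above works on single-byte input. Since the Firefox test suggests the page in question is UTF-8, a variant that decodes first may be closer to what is needed. This is only a sketch under that assumption: the helper name unsmart and the list of characters handled are my own choices for illustration, not part of the original script.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(encode decode);

# Hypothetical helper: takes UTF-8 bytes, returns UTF-8 bytes with
# "smart" punctuation replaced by plain ASCII equivalents.
sub unsmart {
    my ($bytes) = @_;
    my $text = decode('UTF-8', $bytes);
    $text =~ tr/\x{2018}\x{2019}\x{201C}\x{201D}/''""/;    # curly quotes
    $text =~ s/\x{2014}/--/g;                              # em-dash
    $text =~ s/\x{2026}/.../g;                             # ellipsis
    return encode('UTF-8', $text);
}

# Sample input built from escapes, then encoded as it would be on disk.
my $sample = encode('UTF-8',
    "the \x{201C}rule of law\x{201D} everywhere\x{2026} This isn\x{2019}t a trivial matter.");

print unsmart($sample), "\n";
```

Replacing the characters loses the typographic quotes. The alternative is to leave the data alone and have the page declare charset=UTF-8 so browsers decode it correctly.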
 
