Displaying utf8 text in perl -d


lbova99

Please forgive the cross-post. I asked this question originally
on perl.beginners, because that's what I am. But since I haven't
had any feedback, I thought I'd ask here ...

I frequently work with non-English utf8 text, using perl 5.8.8 on
Solaris 10. I can include the following lines in my code and work
successfully (I'm writing these from memory, so please forgive my
syntax):

binmode STDIN, ':utf8';
binmode STDOUT, ':utf8';
binmode STDERR, ':utf8';
use utf8;

This allows my apps to work gracefully with utf8 data. However, when
I use the debugger (perl -d), I have always had some problems.
Basically, I could see the utf8 literals in my code, but anytime I "p"
or "x", the characters above the low ASCII range simply disappeared.
They were still in the variables, which I could tell by counting the
length of strings, but they vanished from the visible output.

Recently, I found a clue to this problem, and now I include the
following in my app:

binmode $DB::OUT, ':utf8';

Eureka! Now the utf8 data is visible when I "p" or "x". This is a
great improvement. However, now the utf8 literals in my code are
mangled. They display with some form of "^_" instead of displaying as
"themselves".
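
A minimal sketch of why the debugger's output handle may need its own binmode call: I/O layers are set per filehandle, so a layer on STDOUT does not carry over to any other handle. The in-memory handles and variable names below are made up for illustration and stand in for STDOUT and $DB::OUT; this is not a reproduction of the debugger itself.

```perl
use strict;
use warnings;

# Write one string to two in-memory handles: one plain, one with :utf8.
my $text = "caf\x{e9}";    # 4 characters; the last is U+00E9

my ($plain_buf, $utf8_buf) = ('', '');
open my $plain_fh, '>', \$plain_buf or die "open failed: $!";
open my $utf8_fh,  '>', \$utf8_buf  or die "open failed: $!";
binmode $utf8_fh, ':utf8';    # same kind of call as for STDOUT or $DB::OUT

print {$plain_fh} $text;
print {$utf8_fh}  $text;
close $plain_fh;
close $utf8_fh;

# Byte counts differ: U+00E9 is one byte without the layer, two with it.
my $plain_bytes = length $plain_buf;
my $utf8_bytes  = length $utf8_buf;
print "plain: $plain_bytes bytes, utf8: $utf8_bytes bytes\n";
```

On a UTF-8 terminal, the single byte written by the plain handle is not a valid sequence, which may be one reason such characters fail to show up.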

I am a lot better off this way than I used to be, but I'm sure there's
some more magic to be applied to this problem. As a beginner, I spend
a lot of time in the debugger, so a solution would be most helpful.

Thanks,
Lou
 

Ilya Zakharevich

[A complimentary Cc of this posting was sent to

> binmode $DB::OUT, ':utf8';

Yes, as expected.

> Eureka! Now the utf8 data is visible when I "p" or "x". This is a
> great improvement. However, now the utf8 literals in my code are
> mangled. They display with some form of "^_" instead of displaying as
> "themselves".

Under the debugger, the source code is stored in special arrays; see
the docs. You need to find out in which format it is stored. I expect
that internally (as accessible from C) it is stored as utf8 C strings,
but that the code which translates these C strings to Perl strings
does not mark them with the HAVE-UTF8 flag.
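
A rough sketch of how that guess could be tested, assuming the source lines really are raw UTF-8 bytes without the flag. The helper name decode_source_line is hypothetical, and the sample line is simulated so the snippet runs outside the debugger. Under perl -d the lines themselves should be reachable through the @{"_<$filename"} arrays described in perldebguts.

```perl
use strict;
use warnings;

# Hypothetical helper: return a character-string version of one source line.
# Assumes an unflagged line holds raw UTF-8 bytes, as suspected above.
sub decode_source_line {
    my ($line) = @_;
    return $line if utf8::is_utf8($line);    # already a character string
    my $copy = $line;
    # utf8::decode works in place; fall back to the original line on failure
    return utf8::decode($copy) ? $copy : $line;
}

# Simulated source line: my $s = "<U+00E9>"; stored as raw UTF-8 bytes.
my $raw     = "my \$s = \"\xC3\xA9\";";
my $decoded = decode_source_line($raw);

# One two-byte sequence collapses into one character after decoding.
my $raw_len = length $raw;
my $dec_len = length $decoded;
print "raw length: $raw_len, decoded length: $dec_len\n";
```

If the decoded length of a real source line comes out shorter than the raw length in the same way, that would support the idea that the flag is simply missing.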

Please report,
Ilya

P.S. You need something like

print join q( ), map ord, split //, $string;

to get understandable info.
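
A small sketch of what that one-liner reveals, wrapped in a helper so it can be reused. The name code_points and the sample strings are invented for illustration. The same text gives different dumps depending on whether Perl holds it as characters or as raw UTF-8 bytes.

```perl
use strict;
use warnings;

# Hypothetical helper: space-separated code points of a string,
# the same idea as the one-liner above but returned for reuse.
sub code_points {
    my ($string) = @_;
    return join ' ', map { ord } split //, $string;
}

my $chars = "\x{263A}!";        # one wide character plus '!'
my $bytes = "\xE2\x98\xBA!";    # the same text as raw UTF-8 bytes

my $chars_dump = code_points($chars);
my $bytes_dump = code_points($bytes);
print "characters: $chars_dump\n";
print "bytes:      $bytes_dump\n";
```

A dump with one number per visible character suggests a properly decoded string; a run of values between 128 and 255 where one character was expected suggests undecoded bytes.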
 
