Alan J. Flavell
In response to an emailed question from a contributor here[1], which
involved the DATA filehandle with some character data provided in
utf-8, I have noticed that if I "use utf8;" in the program, then the
DATA filehandle appears to be treated as utf8 also. If I don't
"use utf8;" in the code, I can still get the DATA filehandle treated
as utf8 by issuing an appropriate binmode() on it, such as:
binmode DATA, ':utf8'; # if one follows the documentation
binmode DATA, ':encoding(utf8)'; # if one follows Ben Morrow[2]
[2] for well-argued reasons, I should add!
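For anyone who wants to try this, here is a minimal sketch of the second variant. It assumes the file is saved as utf-8 and that the terminal accepts utf-8 output; the two sample words in the DATA section are made up for illustration and are not from the original question. I have spelled the layer ':encoding(UTF-8)', which is the strict form of the same idea.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Deliberately no "use utf8;" here: everything above __DATA__ is us-ascii.
# Ask for the DATA section to be decoded as utf-8 explicitly.
binmode DATA,   ':encoding(UTF-8)';
binmode STDOUT, ':encoding(UTF-8)';

while (my $line = <DATA>) {
    chomp $line;
    # Once the handle decodes utf-8, length() counts characters, not bytes.
    printf "%s: %d characters\n", $line, length $line;
}

__DATA__
café
naïve
```

Commenting out the "binmode DATA" line should make length() report bytes rather than characters for the accented words, which is a quick way to see whether the layer is in effect.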
In this particular case, the Perl source code is in us-ascii, so
the distinction between utf-8 and iso-8859-1 is invisible; but I
suppose it would not always be so.
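To make that distinction concrete, here is a small sketch using the core Encode module. The byte string is a hypothetical sample of my own choosing; it shows the same two bytes read once as utf-8 and once as iso-8859-1, and that pure us-ascii bytes come out the same either way.

```perl
use strict;
use warnings;
use Encode qw(decode);

# The two bytes 0xC3 0xA9 are one character (e-acute) in utf-8,
# but two separate characters when read as iso-8859-1.
my $bytes = "\xC3\xA9";

my $as_utf8   = decode('UTF-8',      $bytes);
my $as_latin1 = decode('ISO-8859-1', $bytes);

printf "as utf-8:      %d character(s)\n", length($as_utf8);
printf "as iso-8859-1: %d character(s)\n", length($as_latin1);

# us-ascii bytes decode identically under both encodings,
# which is why the difference is invisible for ascii-only source.
my $ascii = "binmode DATA";
my $same  = decode('UTF-8', $ascii) eq decode('ISO-8859-1', $ascii);
print "ascii decodes identically\n" if $same;
```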
I don't recall reading any guidance on correct use of these features.
Would it be over-ambitious to have program code containing actual
high-half iso-8859-1 characters, followed by DATA containing utf-8
encoded data, and hope to process both correctly? Is anyone familiar
with this area?
[1] P.S.
To my email correspondent - both of the addresses I have for you are
spitting back my email to you today, with some confused babble about
address error or virus. I haven't got a virus, I assure you, though
maybe they have!!!