How to detect text file encoding in Perl

Alan J. Flavell

Ilya> To the contrary. There seems to be a substantial number of
Ilya> non-specialists who still believe that the term "ASCII" has some
Ilya> unique meaning nowadays. It does not.

There's an ANSI standard that authoritatively refutes that claim; an
IANA assignment (http://www.iana.org/assignments/character-sets citing
RFC1345) which codifies the meaning of the term as it's to be used on
the Internet; and, in more practical terms, there's a whole body of
IETF standards-track RFCs whose meaning would be destroyed if "ASCII"
did not mean what it means: a formally-defined 7-bit encoding
*standard*, ANSI X3.4, and its ISO 646 counterpart.
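
To illustrate that 7-bit definition, here is a minimal sketch in Python (not from the thread; the helper name `is_seven_bit_ascii` is hypothetical). It tests only whether every byte of a string falls in the range 0x00-0x7F, which is a necessary condition for data to be ASCII in the strict sense described above:

```python
def is_seven_bit_ascii(data: bytes) -> bool:
    """Return True if every byte is in the 7-bit range 0x00-0x7F."""
    return all(byte <= 0x7F for byte in data)


# Hypothetical sample inputs, chosen only for illustration.
print(is_seven_bit_ascii(b"plain text"))
print(is_seven_bit_ascii("caf\u00e9".encode("latin-1")))  # contains byte 0xE9
```

A single byte with the high bit set is enough to show that the data is in some 8-bit or multi-byte encoding, although the check alone cannot say which one.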

Other (mis)usages of the term by non-specialists are widespread, I
know, but they're still authoritatively wrong, whatever you or I might
happen to think personally.

Ilya> Wikipedia *seriously* disagrees with you:

In this case I'd agree with it; but that's hardly the world's most
authoritative source of information.

I thought that needed to be placed on the record, but now I'll try to
resist any further trolling attempts. :-{
 
Dr.Ruud

Ilya Zakharevich schreef:
[My wishful thinking is the same as for the authors of this entry;
the difference is that I understand that it is hopeless to fight the
"new wave" of M$/Apple-derived jargon.]


http://en.wikipedia.org/wiki/American_National_Standards_Institute

<quote>
In Microsoft Windows, the phrase "ANSI" refers to the Windows ANSI code
pages. Most of these are fixed width though there are some variable
width ones for ideographic languages. Some of these are very close to
the ISO-8859 series leading many to falsely assume that they are
identical.
</quote>
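
The "very close but not identical" point in the quote can be checked mechanically. The following is a rough sketch in Python, assuming windows-1252 and ISO-8859-1 as the pair to compare (the quote names neither) and using a hypothetical helper `differing_bytes`. It relies on Python's bundled `cp1252` and `latin-1` codec tables and lists the byte values that the two decode differently:

```python
def differing_bytes(codec_a: str, codec_b: str) -> list:
    """List byte values that two single-byte codecs decode differently."""

    def decode(value: int, codec: str):
        try:
            return bytes([value]).decode(codec)
        except UnicodeDecodeError:
            return None  # byte is unassigned in this codec's table

    return [v for v in range(256) if decode(v, codec_a) != decode(v, codec_b)]


diff = differing_bytes("cp1252", "latin-1")
print([hex(v) for v in diff])

# One concrete byte, shown with ascii() so the output stays 7-bit clean.
print(ascii(b"\x80".decode("cp1252")), ascii(b"\x80".decode("latin-1")))
```

With these codec tables the differences are confined to 0x80-0x9F, where ISO-8859-1 has C1 control codes and windows-1252 has printable characters or unassigned positions.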
 
Ilya Zakharevich

[A complimentary Cc of this posting was NOT [per weedlist] sent to
Dr.Ruud
<quote>
In Microsoft Windows, the phrase "ANSI" refers to the Windows ANSI code
pages. Most of these are fixed width though there are some variable
width ones for ideographic languages. Some of these are very close to
the ISO-8859 series leading many to falsely assume that they are
identical.
</quote>

Not enough. When I was trying to find why a friend's Mac "won't
work", I found that OS X docs also mention ASCII to mean something
unfathomable... [BTW, the solution was that OS X file system(s) just
does not accept non-UTF-8 file names - but I did not find it
documented anywhere when it was important.]

So it is not only M$ which murks the water...
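
For the file-name remark above, a minimal sketch in Python of one way to test whether a name's bytes are well-formed UTF-8 before handing them to a file system. The helper `is_valid_utf8` and the sample names are hypothetical, and the sketch says nothing about how OS X itself validates names:

```python
def is_valid_utf8(name: bytes) -> bool:
    """Return True if the bytes decode as strict UTF-8."""
    try:
        name.decode("utf-8")  # strict error handling is the default
    except UnicodeDecodeError:
        return False
    return True


# The same hypothetical name, encoded two different ways.
print(is_valid_utf8("r\u00e9sum\u00e9.txt".encode("utf-8")))
print(is_valid_utf8("r\u00e9sum\u00e9.txt".encode("latin-1")))
```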

Yours,
Ilya
 
