Detecting Unicode encoding


Aryeh M. Friedman

If I have an arbitrary character array (the primitive type char, not
Character), is it possible to detect the encoding used for any given
character in the array? Specifically, I am writing a parser that accepts
arbitrary strings of Unicode, and it needs to know what character set to
parse against (decorator pattern and reflection). Please note I am brand
new to Unicode, so if I asked the wrong way or whatever, forgive me.
 

Mickey Segal

I am having trouble figuring out what your data looks like. Is it straight
Unicode with two bytes for every character? Or is it a variable-length
Unicode encoding such as UTF-8? Or is it a restricted one-byte encoding
where you are trying to guess the encoding?
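
To make the three cases concrete, here is a minimal sketch of how the same text occupies a different number of bytes in each kind of encoding. The class name EncodingSizes and the sample string are only illustrative assumptions; UTF-16BE stands in for "two bytes for every character" and ISO-8859-1 for a one-byte encoding.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingSizes {
    // Number of bytes the text occupies once encoded with the given charset.
    static int encodedLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        // "naive" with a diaeresis on the i: five characters, one outside ASCII.
        String text = "na\u00EFve";

        // Fixed two bytes per character (for characters in the basic plane).
        System.out.println("UTF-16BE:   " + encodedLength(text, StandardCharsets.UTF_16BE));
        // Variable length: ASCII takes one byte, the accented letter takes two.
        System.out.println("UTF-8:      " + encodedLength(text, StandardCharsets.UTF_8));
        // Restricted one-byte encoding: one byte per character.
        System.out.println("ISO-8859-1: " + encodedLength(text, StandardCharsets.ISO_8859_1));
    }
}
```

The byte counts differ even though the characters are identical, which is why the answer depends on which of these forms the data actually arrives in.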
 

HK

Aryeh said:
If I have an arbitrary character array (the primitive type char, not
Character), is it possible to detect the encoding used for any given
character in the array?

If you are talking about a char[], there is no encoding
involved. It contains just characters.
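
A small sketch of what that means in practice: each element of a char[] is already a 16-bit UTF-16 code unit, so its numeric value can be read directly with no charset in sight. The class and method names here are made up for illustration.

```java
public class CharValues {
    // Widen each char to an int; no charset is needed because a Java
    // char already holds a UTF-16 code unit, not encoded bytes.
    static int[] codeUnits(char[] chars) {
        int[] values = new int[chars.length];
        for (int i = 0; i < chars.length; i++) {
            values[i] = chars[i];
        }
        return values;
    }

    public static void main(String[] args) {
        // Latin capital A, e with acute accent, euro sign.
        char[] chars = {'A', '\u00E9', '\u20AC'};
        for (int value : codeUnits(chars)) {
            System.out.printf("U+%04X%n", value);
        }
    }
}
```

Whatever encoding the text had on disk or on the wire was already resolved when the bytes were turned into chars.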

If you are talking about a byte[] or an InputStream,
then character encoding is indeed involved. But you
cannot tell from a single byte which encoding was
used to encode the characters. Looking at several
bytes, you may be able to get a hunch about which
encoding is involved, but only if you know beforehand
that only a limited and known set of encodings is
possible. The reason is that, in principle, I can
define my own encoding with completely obtuse
mappings between bytes and characters.
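
As a rough sketch of the "limited and known set" approach, assuming you can list the candidate charsets up front: try a strict decode with each one and keep those that do not fail. The class and method names are hypothetical, and a clean decode only means the bytes are valid in that charset, not that it is the one the sender used.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CandidateEncodings {
    // True if the bytes decode without malformed or unmappable input.
    static boolean decodesCleanly(byte[] data, Charset charset) {
        try {
            charset.newDecoder()
                   .onMalformedInput(CodingErrorAction.REPORT)
                   .onUnmappableCharacter(CodingErrorAction.REPORT)
                   .decode(ByteBuffer.wrap(data));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    // Keep only the candidates that survive a strict decode, in order.
    static List<Charset> plausible(byte[] data, List<Charset> candidates) {
        List<Charset> result = new ArrayList<>();
        for (Charset candidate : candidates) {
            if (decodesCleanly(data, candidate)) {
                result.add(candidate);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Charset> candidates = Arrays.asList(
                StandardCharsets.US_ASCII,
                StandardCharsets.UTF_8,
                StandardCharsets.ISO_8859_1);

        // "naive" with a diaeresis on the i, encoded two different ways.
        byte[] asLatin1 = "na\u00EFve".getBytes(StandardCharsets.ISO_8859_1);
        byte[] asUtf8 = "na\u00EFve".getBytes(StandardCharsets.UTF_8);

        System.out.println(plausible(asLatin1, candidates)); // [ISO-8859-1]
        System.out.println(plausible(asUtf8, candidates));   // [UTF-8, ISO-8859-1]
    }
}
```

The second result shows why this is only a hunch: ISO-8859-1 assigns a character to every byte value, so it never rejects anything, and the UTF-8 bytes pass as both.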

Harald.
 
