Would you mind elaborating? java.lang.Character's javadoc seems to
indicate that chars are UTF-16, and therefore that this is enforced by
the char type itself.
It seems to me that allowing it to be otherwise would cause rather
serious regressions in Java applications that handle characters outside
the BMP, where you at least need the java.lang.Character methods that
depend on chars and Strings being UTF-16, or something close enough.
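For instance (a minimal sketch; U+1D541 is just an arbitrary
supplementary character), even counting the characters in a String
correctly forces you through those UTF-16-aware methods:

String s = "\uD835\uDD41";                           // U+1D541, outside the BMP
System.out.println(s.length());                      // 2 -- UTF-16 code units
System.out.println(s.codePointCount(0, s.length())); // 1 -- actual code points
System.out.println(Integer.toHexString(s.codePointAt(0))); // "1d541"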
From the JLS, s. 4.2.1:
The values of the integral types are integers in the following ranges:
... For char, from '\u0000' to '\uffff' inclusive, that is, from 0 to 65535
A 'char', in other words, is an unsigned 16-bit integer. The primitive
type enforces nothing character-ish; only the methods of 'Character'
and 'String' do that. The type itself is numeric, not
inherently UTF-16.
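A quick sketch of that purely numeric behavior (nothing here is
Unicode-aware):

char c = '\uffff'; // 65535, the maximum char value
c++;               // wraps to 0, like any unsigned 16-bit integer
int n = 'A' + 1;   // 66 -- char promotes to int in arithmetic
char b = (char) n; // 'B', only because the code chart happens to be ordered that way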
Consider:
char schuss = 0xDF;              // 'ß', U+00DF, Latin small letter sharp s
char div = 0xF7;                 // '÷', U+00F7, division sign
char x = (char)( schuss + div ); // legal: plain numeric addition, yielding 0x1D6
That makes no sense in terms of Unicode, but it is perfectly legal:
0xDF + 0xF7 is 0x1D6. What is the value of 'ß' + '÷' in Unicode? Would
you expect it to be 'ǖ' (U+01D6, Latin small letter U with diæresis and
macron)?
'char' is a *numeric* type. Its use to represent code points
(including those outside the BMP, where one code point is a surrogate
pair of two 'char's) is a matter of correspondence between the numeric
value and the character it represents, and is not intrinsic to the type.
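To make the surrogate-pair point concrete, here is a minimal sketch
(U+1F600 is just an arbitrary code point that takes two 'char's):

char[] pair = Character.toChars(0x1F600);               // builds the surrogate pair
System.out.println(pair.length);                        // 2
System.out.println(Character.isHighSurrogate(pair[0])); // true (0xD83D)
System.out.println(Character.isLowSurrogate(pair[1]));  // true (0xDE00)
System.out.println(Integer.toHexString(
        Character.toCodePoint(pair[0], pair[1])));      // "1f600"

Taken individually, each element of 'pair' is just a 16-bit number;
only Character's methods give the pair its code-point meaning.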