JAXB and Arabic encoding


Brahim Machkour

Hello,

I'm using JAXB to export/import data from/to a DB through XML files.
Everything is working well. I would also like to be able to edit
the file using an editor, typically XMLSpy. The problem is that
within the XML the Arabic text is encoded as character references such as
"&#1578;&#1575;&#1585;"... and in the editor the Arabic does not
appear, only the sequence of ASCII strings I just mentioned. If I edit
manually and replace them with Arabic characters, then they show up in
XMLSpy. I guess it's an encoding problem at marshalling?

I've tried everything I can at the marshalling step, using:
Marshaller m = jc.createMarshaller();
m.setProperty(Marshaller.JAXB_FORMATTED_OUTPUT, Boolean.TRUE);
m.setProperty(Marshaller.JAXB_ENCODING, "ISO-8859-6");
// Close the stream once marshalling is done.
try (FileOutputStream out = new FileOutputStream(xmlfile)) {
    m.marshal(xmldata, out);
}

Still the same thing. I also tried UTF-8, CP1256, Windows-1256, ...

Is there a way to have Arabic characters show up directly in
the XML?

Thank you for any help

Brahim.
 

Mark Thornton

You have to use a 'transcoder' that knows which characters can be
represented directly in the selected encoding. Many transcoders
take the easy way out and write everything outside ASCII as a
character reference. Note that before the Charset classes
(java.nio.charset) were added in Java 1.4, it was tedious to determine
whether a given character set supported a character.

Mark Thornton
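
To illustrate the kind of transcoder described above, here is a minimal sketch built on CharsetEncoder.canEncode from java.nio.charset. The class name ReferenceEscaper and the method escapeFor are hypothetical names chosen for this example; this is not what JAXB does internally. It keeps every character the target encoding can represent and falls back to a decimal character reference for the rest. It does not escape markup characters such as < or &, so it is only meant to show the encoding decision.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

class ReferenceEscaper {
    // Hypothetical helper: keeps each code point the target charset can encode
    // and replaces every other one with a decimal reference such as &#1578;.
    static String escapeFor(String charsetName, String text) {
        CharsetEncoder encoder = Charset.forName(charsetName).newEncoder();
        StringBuilder out = new StringBuilder();
        text.codePoints().forEach(cp -> {
            String s = new String(Character.toChars(cp));
            if (encoder.canEncode(s)) {
                out.append(s);                            // representable: write it directly
            } else {
                out.append("&#").append(cp).append(';');  // not representable: use a reference
            }
        });
        return out.toString();
    }

    public static void main(String[] args) {
        // U+062A, U+0627, U+0631 are the code points 1578, 1575, 1585 from the question.
        String arabic = "\u062A\u0627\u0631";
        System.out.println(escapeFor("US-ASCII", arabic)); // references only
        System.out.println(escapeFor("UTF-8", arabic));    // literal Arabic characters
    }
}
```

With US-ASCII as the target, all three characters come out as references; with UTF-8, they are written as-is. A serializer that skips the canEncode check and escapes everything outside ASCII would produce the first form regardless of the encoding requested.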
 

Jon A. Cruz

Roedy said:
If they did, it would not be XML any more. XML is designed to make
handling un-American characters difficult.


Not at all.

The one thing an XML parser must support to claim it is an XML parser is
Unicode.

Just use UTF-8 for the XML file and view it in Notepad. You can even have
Arabic element and attribute names.
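
As a small check of the point that an XML parser treats both forms the same way, the following sketch parses one document that uses character references and one that contains the literal Arabic characters in UTF-8, then compares the resulting text. It uses only the JAXP DOM parser shipped with the JDK; the class name SameTextDemo and the element name word are made up for this example.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;

class SameTextDemo {
    // Parses the given XML (encoded as UTF-8 bytes) and returns the text of the root element.
    static String textOf(String xml) {
        try {
            byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(bytes))
                    .getDocumentElement()
                    .getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse: " + xml, e);
        }
    }

    public static void main(String[] args) {
        String header = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>";
        String withRefs = header + "<word>&#1578;&#1575;&#1585;</word>";
        String literal = header + "<word>\u062A\u0627\u0631</word>";
        // Both documents carry the same three characters once parsed.
        System.out.println(textOf(withRefs).equals(textOf(literal)));
    }
}
```

So the file with references is not wrong as far as a parser is concerned; the difference only matters for a human reading the raw file in an editor.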
 
