unicode + xml


Laurent Luce

Hello,

I am trying to do the following:

- read the list of folders in a specific directory with os.listdir(); some folders have Japanese characters in their names
- post the list of folders as XML to a web server: I use the content-type 'text/xml' and start the XML data with '<?xml version="1.0" encoding="utf-8"?>'
- on the server side (Django), I get the data using post_data and parse it with minidom.parseString(). I get an exception because of the following in the XML for one of the folder names:
'/ufffdX/ufffd^/ufffd[/ufffdg /ufffd/ufffd/ufffdj/ufffd/ufffd/ufffd['

The weird thing is that I see 5 bytes for each Unicode character, e.g. /ufffdX
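
For reference, U+FFFD is the Unicode replacement character, which a decoder emits in place of bytes it cannot decode. The sketch below shows one way a string of this shape can arise. It assumes Python 3 and assumes the folder names reached the decoder as cp932 (Shift-JIS) bytes that were read as UTF-8; neither assumption is confirmed by the post, and the folder name is a hypothetical sample.

```python
# Hypothetical folder name, used only for illustration.
name = "スタート メニュー"

# The bytes a Japanese Windows code page (cp932, a Shift-JIS variant)
# would use for this name. This encoding is an assumption.
raw = name.encode("cp932")

# Reading those bytes as UTF-8 is invalid. With errors="replace" each
# undecodable byte becomes U+FFFD, while bytes in the ASCII range survive.
garbled = raw.decode("utf-8", errors="replace")
print(ascii(garbled))

# Decoding with the encoding the bytes were actually written in
# recovers the original name.
recovered = raw.decode("cp932")
print(recovered == name)
```

If that is what is happening here, the original characters are already lost by the time the replacement characters appear, so the fix would belong at the point where the names are first decoded, not in the XML.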

Should I format the data differently inside the XML so that minidom is happy?
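
A minimal sketch of the round trip I am aiming for, assuming Python 3 and folder names that are already proper Unicode strings. The element names "folders" and "folder" and both helper functions are hypothetical, chosen only for this example; the Django side is reduced to a plain function call.

```python
from xml.dom import minidom


def folders_to_xml(names):
    """Build the XML payload and return it as UTF-8 encoded bytes."""
    doc = minidom.Document()
    root = doc.createElement("folders")  # hypothetical element name
    doc.appendChild(root)
    for name in names:
        element = doc.createElement("folder")  # hypothetical element name
        element.appendChild(doc.createTextNode(name))
        root.appendChild(element)
    # Passing an encoding makes toxml() return bytes and write a matching
    # encoding attribute into the XML declaration.
    return doc.toxml(encoding="utf-8")


def xml_to_folders(payload):
    """Parse the posted bytes and return the folder names as strings."""
    doc = minidom.parseString(payload)
    return [el.firstChild.data for el in doc.getElementsByTagName("folder")]


# In the real client the names would come from os.listdir(); these are samples.
names = ["docs", "スタート メニュー"]
payload = folders_to_xml(names)
parsed = xml_to_folders(payload)
print(parsed == names)
```

The point of the sketch is that the bytes sent over the wire must really be in the encoding that the XML declaration names; the declaration alone does not convert anything.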

Laurent
 
