xmlrpclib and carriage return (\r)

Jonathan Ballet

Hello,

I have developed an XMLRPC server, which runs under GNU/Linux with
Python 2.3.
This server receives method calls from Windows clients. Some of the
parameters it receives are strings that contain a carriage return character
just after the line feed character, like "bla\n\rbla".


The problem is that xmlrpclib "eats" those carriage return characters when
loading the XMLRPC request and replaces them with "\n". So I get "bla\n\nbla".

When I send those parameters back to other Windows clients (they are
doing some kind of synchronisation through the XMLRPC server), I send
them only "\n\n", which causes problems when rendering the strings.



It seems that the XMLRPC spec doesn't call for eating carriage return
characters (from http://www.xmlrpc.com/spec):
"""
What characters are allowed in strings? Non-printable characters?
Null characters? Can a "string" be used to hold an arbitrary chunk
of binary data?

Any characters are allowed in a string except < and &, which are
encoded as &lt; and &amp;. A string can be used to encode binary
data.
"""



Here is an example that shows the problem:
$ python
Python 2.3.5 (#2, Sep 4 2005, 22:01:42)
[GCC 3.3.5 (Debian 1:3.3.5-13)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
<methodName>testmethod</methodName>\n
<params>\n <param>\n <value>\n

... whereas I expected to get (('bla\n\rbla\n\r\tbla',), u'testmethod')
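The effect described above can be reproduced with a plain marshal/unmarshal round trip. This is a minimal sketch, assuming Python 3, where xmlrpclib lives on as xmlrpc.client (the thread itself uses Python 2.3); the method name testmethod comes from the session above, and the shorter test string is only an illustration.

```python
# Python 3 sketch: xmlrpc.client is the successor of Python 2's xmlrpclib.
import xmlrpc.client

original = "bla\n\rbla"

# Marshal a call the way a client would. The carriage return is written
# into the XML as a literal character, not as an escape.
request = xmlrpc.client.dumps((original,), "testmethod")

# Unmarshal it the way the server does. The XML parser normalizes the
# lone "\r" to "\n" while reading the character data.
params, method = xmlrpc.client.loads(request)

print(repr(params[0]))        # 'bla\n\nbla'
print(params[0] == original)  # False
```

The string that comes back differs from the one that went in, even though nothing in the application code touched it.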


This seems to be rather strange behaviour from xmlrpclib. Is it known?
So, what is happening here? How can I solve this problem?

Thanks for any answers,
Jonathan
 
Fredrik Lundh

Jonathan said:
The problem is that xmlrpclib "eats" those carriage return characters when
loading the XMLRPC request and replaces them with "\n". So I get "bla\n\nbla".

When I send those parameters back to other Windows clients (they are
doing some kind of synchronisation through the XMLRPC server), I send
them only "\n\n", which causes problems when rendering the strings.

It seems that the XMLRPC spec doesn't call for eating carriage return
characters (from http://www.xmlrpc.com/spec):

XMLRPC is XML, and XML normalizes line feeds:

http://www.w3.org/TR/2004/REC-xml-20040204/#sec-line-ends

relying on non-standard line terminators in text is fragile and horridly
outdated; if the data you're sending qualifies as text, send it as text, and
do the conversion at the end points (if at all necessary).

if it's not text, use the binary wrapper.

Jonathan said:
I send them only "\n\n", which causes problems when rendering the strings.

how are you rendering things if you have problems treating a line feed
as a line feed? that's rather unusual; most Windows applications have
no problems handling line feeds properly, so I'm not sure why it has to
be a problem in your application...

</F>
 
Ben Cartwright

Jonathan said:
The problem is that xmlrpclib "eats" those carriage return characters when
loading the XMLRPC request and replaces them with "\n". So I get "bla\n\nbla".

When I send those parameters back to other Windows clients (they are
doing some kind of synchronisation through the XMLRPC server), I send
them only "\n\n", which causes problems when rendering the strings.

Did you develop the Windows client, too? If so, the client-side fix is
trivial: replace \n with \r\n in all renderable strings. Or update
both the client and the server to encode the strings, also trivial
using the base64 module.
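The client-side replacement mentioned above could look like the following. It is only a sketch of the logic: the clients in this thread are Windows applications rather than Python programs, and the helper name to_windows_newlines is hypothetical.

```python
def to_windows_newlines(text):
    """Convert any mix of line endings to CRLF for display on Windows."""
    # Normalize to "\n" first so that existing "\r\n" pairs are not doubled.
    normalized = text.replace("\r\n", "\n").replace("\r", "\n")
    return normalized.replace("\n", "\r\n")


received = "bla\n\nbla"  # what arrives after XML line-end normalization
print(repr(to_windows_newlines(received)))  # 'bla\r\n\r\nbla'
```

Normalizing before expanding makes the helper safe to call on text that already contains CRLF pairs.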

If not, and you're in the unfortunate position of being forced to
support buggy third-party clients, read on.

Jonathan said:
It seems that the XMLRPC spec doesn't call for eating carriage return
characters (from http://www.xmlrpc.com/spec): (snip)
This seems to be rather strange behaviour from xmlrpclib. Is it known?
So, what is happening here? How can I solve this problem?

The XMLRPC spec doesn't say anything about CRs one way or the other.
Newline handling is necessarily left to the XML parser implementation.
In Python's case, xmlrpclib uses the xml.parsers.expat module, which
reads universal newlines and writes Unix-style newlines (\n). There's
no option to disable this feature.

You could modify xmlrpclib to use a different parser, but it would be
much easier to just hack the XML response right before it's sent out.
I'm assuming you used the SimpleXMLRPCServer module. Example:

from SimpleXMLRPCServer import SimpleXMLRPCServer, SimpleXMLRPCDispatcher

class MyServer(SimpleXMLRPCServer):
    def _marshaled_dispatch(self, data, dm=None):
        # Let the stock dispatcher build the XML response...
        response = SimpleXMLRPCDispatcher._marshaled_dispatch(self,
                                                              data, dm)
        # ...then rewrite every line feed in it just before it is sent out.
        return response.replace('\n', '\r\n')

server = MyServer(('localhost', 8000))
server.register_introspection_functions()
server.register_function(lambda x: x, 'echo')
server.serve_forever()

Hope that helps,
--Ben
 
Ben Cartwright

Jonathan said:
The problem is that xmlrpclib "eats" those carriage return characters when
loading the XMLRPC request and replaces them with "\n". So I get "bla\n\nbla".

When I send those parameters back to other Windows clients (they are
doing some kind of synchronisation through the XMLRPC server), I send
them only "\n\n", which causes problems when rendering the strings.


Whoops, just realized we're talking about "\n\r" here, not "\r\n".
Most of my previous reply doesn't apply to your situation, then.

As far as Python's expat parser is concerned, "\n\r" is two newlines:
one Unix-style and one Mac-style. It correctly (per XML specs)
normalizes both to Unix-style.
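The normalization can be observed on the parser directly, without going through xmlrpclib at all. A minimal sketch, assuming Python 3's xml.parsers.expat; the element name v and the sample text are made up for illustration.

```python
import xml.parsers.expat

chunks = []
parser = xml.parsers.expat.ParserCreate()
# Collect every piece of character data the parser reports.
parser.CharacterDataHandler = chunks.append

# One element containing "\r\n", a lone "\r", and the thread's "\n\r".
parser.Parse("<v>a\r\nb\rc\n\rd</v>", True)

text = "".join(chunks)
print(repr(text))  # 'a\nb\nc\n\nd'
```

The "\r\n" pair collapses to a single "\n", while "\n\r" is read as two separate line ends and therefore yields two "\n" characters.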

Is "\n\r" being used as a newline by your Windows clients, or is it a
control code? If the former, I'd sure like to know why. If the
latter, then you're submitting binary data and you shouldn't be using
<string> to begin with. Try <base64>.
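For the <base64> route, the payload travels as base64 text, so line-end normalization cannot touch it. A minimal sketch, assuming Python 3's xmlrpc.client and its Binary wrapper (the Python 2 xmlrpclib module in the thread has a Binary class as well); the UTF-8 encoding is an assumption about what both ends agree on.

```python
import xmlrpc.client

# Encode the text to bytes and wrap it so that it is sent as <base64>.
payload = "bla\n\rbla".encode("utf-8")
request = xmlrpc.client.dumps((xmlrpc.client.Binary(payload),), "testmethod")

# On the receiving side the parameter arrives as a Binary object whose
# .data attribute holds the original bytes.
params, method = xmlrpc.client.loads(request)
restored = params[0].data.decode("utf-8")

print(repr(restored))  # 'bla\n\rbla'
```

The cost is that both the client and the server have to agree to treat the parameter as bytes and decode it themselves.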

If worst comes to worst and you have to stick with sending "\n\r"
intact in a <string> param, you'll need to modify xmlrpclib to use a
different (and technically noncompliant) XML parser. Here's an ugly
hack to do that out of the box:

# In your server code:
import xmlrpclib
# This forces xmlrpclib to fall back on the obsolete xmllib module:
xmlrpclib.ExpatParser = None

xmllib doesn't normalize newlines, so it's noncompliant. But this is
actually what you want.

--Ben
 
Jonathan Ballet

On Sat, 18 Mar 2006 02:17:36 -0800, Ben Cartwright wrote:

[Just replying to this message. See F. Lundh reply too]
Whoops, just realized we're talking about "\n\r" here, not "\r\n".
Most of my previous reply doesn't apply to your situation, then.

As far as Python's expat parser is concerned, "\n\r" is two newlines:
one Unix-style and one Mac-style. It correctly (per XML specs)
normalizes both to Unix-style.

Is "\n\r" being used as a newline by your Windows clients, or is it a
control code? If the former, I'd sure like to know why.

We are developing the Windows client. I think my teammates kept \n\r
because it is the default line terminator they got when reading strings from
text input, and because "the .NET framework does the right thing for you,
etc. etc."

Ben said:
If the latter, then you're submitting binary data and you shouldn't be using
<string> to begin with. Try <base64>.

If worst comes to worst and you have to stick with sending "\n\r"
intact in a <string> param, you'll need to modify xmlrpclib to use a
different (and technically noncompliant) XML parser. Here's an ugly
hack to do that out of the box:

# In your server code:
import xmlrpclib
# This forces xmlrpclib to fall back on the obsolete xmllib module:
xmlrpclib.ExpatParser = None

xmllib doesn't normalize newlines, so it's noncompliant. But this is
actually what you want.

Well, I thought about using something like that. Currently, we are stuck
with the kind of ugly hack you sent in your previous message
(replace("\n", "\n\r")).
However, now that I know that xmlrpclib handles line terminators correctly
(according to the XML spec), I will see whether we can handle line feeds
correctly in our Windows application.
I think that would be the better solution.

Anyway, thanks a lot for your answers (both of them ;)

J.
 
Jonathan Ballet

On Sat, 18 Mar 2006 08:54:49 +0100, Fredrik Lundh wrote:
Jonathan Ballet wrote:
[snip]

XMLRPC is XML, and XML normalizes line feeds:

http://www.w3.org/TR/2004/REC-xml-20040204/#sec-line-ends

relying on non-standard line terminators in text is fragile and horridly
outdated; if the data you're sending qualifies as text, send it as text, and
do the conversion at the end points (if at all necessary).

Ah, you sent me the right link. So xmlrpclib handles those line
terminators correctly. That's good.

Fredrik said:
if it's not text, use the binary wrapper.


how are you rendering things if you have problems treating a line feed
as a line feed? that's rather unusual; most Windows applications have
no problems handling line feeds properly, so I'm not sure why it has to
be a problem in your application...

</F>

The problem was only pointed out recently, so we were trying to find out
where those carriage returns disappeared.
However, if we can throw them away and render a line feed as a line feed,
that would be the best solution imho. I'll look into this.

Thanks for your answer,
J.
 
