xmlrpclib and carriage return (\r)

Discussion in 'Python' started by Jonathan Ballet, Mar 17, 2006.

  1. Hello,

    I have developed an XMLRPC server, which runs under GNU/Linux with
    Python 2.3.
    This server receives method calls from Windows clients. Some of the
    parameters it receives are strings containing a carriage return
    character just after a line feed character, like "bla\n\rbla".


    The problem is that xmlrpclib "eats" those carriage return characters when
    loading the XMLRPC request and replaces them with "\n". So I get "bla\n\nbla".

    When I send those parameters back to other Windows clients (they are
    doing some kind of synchronisation through the XMLRPC server), I send
    them only "\n\n", which causes problems when rendering the strings.



    It seems that the XMLRPC spec doesn't say that carriage return
    characters should be eaten (from http://www.xmlrpc.com/spec):
    """
    What characters are allowed in strings? Non-printable characters?
    Null characters? Can a "string" be used to hold an arbitrary chunk
    of binary data?

    Any characters are allowed in a string except < and &, which are
    encoded as &lt; and &amp;. A string can be used to encode binary
    data.
    """



    Here is an example which demonstrates the problem:
    $ python
    Python 2.3.5 (#2, Sep 4 2005, 22:01:42)
    [GCC 3.3.5 (Debian 1:3.3.5-13)] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import xmlrpclib
    >>>
    >>> data = """<?xml version="1.0" encoding="utf-8"?>\n<methodCall>\n
    ... <methodName>testmethod</methodName>\n
    ... <params>\n <param>\n <value>\n
    ... <string>bla\n\rbla\n\r\tbla</string>\n </value>\n </param>\n
    ... </params>\n</methodCall>"""
    >>>
    >>> xmlrpclib.loads(data)
    (('bla\n\nbla\n\n\tbla',), u'testmethod')
    >>>


    ... whereas I expected to have (('bla\n\rbla\n\r\tbla',), u'testmethod')


    This seems to be rather strange behaviour from xmlrpclib. Is it known?
    So, what is happening here? How can I solve this problem?

    Thanks for any answers,
    Jonathan
    Jonathan Ballet, Mar 17, 2006
    #1

  2. Jonathan Ballet wrote:

    > The problem is, xmlrpclib "eats" those carriage return characters when
    > loading the XMLRPC request, and replace it by "\n". So I got "bla\n\nbla".
    >
    > When I sent back those parameters to others Windows clients (they are
    > doing some kind of synchronisation throught the XMLRPC server), I send
    > to them only "\n\n", which makes problems when rendering strings.
    >
    > It seems that XMLRPC spec doesn't propose to eat carriage return
    > characters : (from http://www.xmlrpc.com/spec)


    XMLRPC is XML, and XML normalizes line feeds:

    http://www.w3.org/TR/2004/REC-xml-20040204/#sec-line-ends

    relying on non-standard line terminators in text is fragile and horridly
    outdated; if the data you're sending qualifies as text, send it as text, and
    do the conversion at the end points (if at all necessary).

    if it's not text, use the binary wrapper.
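
    A minimal sketch of what the binary wrapper looks like with xmlrpclib's
    own Binary class. The method name "testmethod" and the payload are taken
    from the example earlier in the thread; the try/except import and the
    b"..." literal are assumptions made so the snippet also runs on later
    Python versions, where the module is called xmlrpc.client (the b prefix
    needs Python 2.6 or newer).

```python
try:
    import xmlrpclib  # Python 2 name, as used in this thread
except ImportError:
    import xmlrpc.client as xmlrpclib  # same module under its Python 3 name

# Payload whose "\n\r" sequence must survive the round trip untouched.
payload = b"bla\n\rbla"

# Binary marshals the payload as a <base64> element instead of <string>,
# so the XML parser never sees the raw CR and LF characters.
request = xmlrpclib.dumps((xmlrpclib.Binary(payload),), "testmethod")

# Unmarshal the request the way a server would.
params, method = xmlrpclib.loads(request)
received = params[0].data  # raw payload back out of the Binary wrapper
print(repr(received))
```

    The receiving side has to read the .data attribute of the Binary object,
    so both client and server need to agree on this parameter type.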

    > I send to them only "\n\n", which makes problems when rendering strings.


    how are you rendering things if you have problems treating a line feed
    as a line feed? that's rather unusual; most Windows applications have
    no problems handling line feeds properly, so I'm not sure why it has to
    be a problem in your application...

    </F>
    Fredrik Lundh, Mar 18, 2006
    #2

  3. Jonathan Ballet wrote:
    > The problem is, xmlrpclib "eats" those carriage return characters when
    > loading the XMLRPC request, and replace it by "\n". So I got "bla\n\nbla".
    >
    > When I sent back those parameters to others Windows clients (they are
    > doing some kind of synchronisation throught the XMLRPC server), I send
    > to them only "\n\n", which makes problems when rendering strings.


    Did you develop the Windows client, too? If so, the client-side fix is
    trivial: replace \n with \r\n in all renderable strings. Or update
    both the client and the server to encode the strings, also trivial
    using the base64 module.
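
    For illustration only, here is roughly what that conversion looks like,
    written in Python rather than in whatever language the Windows client
    uses. The helper name to_windows_newlines is hypothetical. It collapses
    existing CRLF pairs and lone CRs first, so that running it twice does not
    double the carriage returns.

```python
def to_windows_newlines(text):
    """Return text with every line ending rewritten as CRLF."""
    # Collapse CRLF and lone CR to LF first so existing pairs are not doubled.
    normalized = text.replace("\r\n", "\n").replace("\r", "\n")
    # Then expand every LF to the CRLF pair that Windows controls expect.
    return normalized.replace("\n", "\r\n")


print(repr(to_windows_newlines("bla\nbla\r\nbla")))
```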

    If not, and you're in the unfortunate position of being forced to
    support buggy third-party clients, read on.

    > It seems that XMLRPC spec doesn't propose to eat carriage return
    > characters : (from http://www.xmlrpc.com/spec)

    (snip)
    > It seems to be a rather strange comportement from xmlrpclib. Is it known ?
    > So, what happens here ? How could I solve this problem ?


    The XMLRPC spec doesn't say anything about CRs one way or the other.
    Newline handling is necessarily left to the XML parser implementation.
    In Python's case, xmlrpclib uses the xml.parsers.expat module, which
    reads universal newlines and writes Unix-style newlines (\n). There's
    no option to disable this feature.

    You could modify xmlrpclib to use a different parser, but it would be
    much easier to just hack the XML response right before it's sent out.
    I'm assuming you used the SimpleXMLRPCServer module. Example:

    from SimpleXMLRPCServer import SimpleXMLRPCServer, SimpleXMLRPCDispatcher

    class MyServer(SimpleXMLRPCServer):
        def _marshaled_dispatch(self, data, dm=None):
            # Let the stock dispatcher build the XML response first.
            response = SimpleXMLRPCDispatcher._marshaled_dispatch(self, data, dm)
            # Rewrite every LF in the outgoing XML as CRLF.
            return response.replace('\n', '\r\n')

    server = MyServer(('localhost', 8000))
    server.register_introspection_functions()
    server.register_function(lambda x: x, 'echo')
    server.serve_forever()

    Hope that helps,
    --Ben
    Ben Cartwright, Mar 18, 2006
    #3
  4. Jonathan Ballet wrote:
    > The problem is, xmlrpclib "eats" those carriage return characters when
    > loading the XMLRPC request, and replace it by "\n". So I got "bla\n\nbla".
    >
    > When I sent back those parameters to others Windows clients (they are
    > doing some kind of synchronisation throught the XMLRPC server), I send
    > to them only "\n\n", which makes problems when rendering strings.



    Whoops, just realized we're talking about "\n\r" here, not "\r\n".
    Most of my previous reply doesn't apply to your situation, then.

    As far as Python's expat parser is concerned, "\n\r" is two newlines:
    one Unix-style and one Mac-style. It correctly (per XML specs)
    normalizes both to Unix-style.
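
    A small sketch that shows this directly with xml.parsers.expat, outside
    of xmlrpclib. The element name <s> and the sample text are made up for
    the demonstration; "\n\r" comes out as two line feeds, while "\r\n"
    comes out as one.

```python
import xml.parsers.expat

chunks = []
parser = xml.parsers.expat.ParserCreate()
# Collect character data exactly as expat reports it to the application.
parser.CharacterDataHandler = chunks.append

# The input holds one "\n\r" sequence and one "\r\n" sequence.
parser.Parse("<s>a\n\rb\r\nc</s>", True)

text = "".join(chunks)
print(repr(text))  # 'a\n\nb\nc'
```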

    Is "\n\r" being used as a newline by your Windows clients, or is it a
    control code? If the former, I'd sure like to know why. If the
    latter, then you're submitting binary data and you shouldn't be using
    <string> to begin with. Try <base64>.

    If worst comes to worst and you have to stick with sending "\n\r"
    intact in a <string> param, you'll need to modify xmlrpclib to use a
    different (and technically noncompliant) XML parser. Here's an ugly
    hack to do that out of the box:

    # In your server code:
    import xmlrpclib
    # This forces xmlrpclib to fall back on the obsolete xmllib module:
    xmlrpclib.ExpatParser = None

    xmllib doesn't normalize newlines, so it's noncompliant. But this is
    actually what you want.

    --Ben
    Ben Cartwright, Mar 18, 2006
    #4
  5. On Sat, 18 Mar 2006 02:17:36 -0800, Ben Cartwright wrote:

    > Jonathan Ballet wrote:
    >> The problem is, xmlrpclib "eats" those carriage return characters when
    >> loading the XMLRPC request, and replace it by "\n". So I got "bla\n\nbla".
    >>
    >> When I sent back those parameters to others Windows clients (they are
    >> doing some kind of synchronisation throught the XMLRPC server), I send
    >> to them only "\n\n", which makes problems when rendering strings.

    >


    [Just replying to this message. See F. Lundh's reply too]

    >
    > Whoops, just realized we're talking about "\n\r" here, not "\r\n".
    > Most of my previous reply doesn't apply to your situation, then.
    >
    > As far as Python's expat parser is concerned, "\n\r" is two newlines:
    > one Unix-style and one Mac-style. It correctly (per XML specs)
    > normalizes both to Unix-style.
    >
    > Is "\n\r" being used as a newline by your Windows clients, or is it a
    > control code? If the former, I'd sure like to know why.


    We are developing the Windows client. I think my teammates keep \n\r
    because it is the default line terminator they get when reading strings
    from text input, and because ".net framework does the right things for
    you, etc. etc."

    > If the
    > latter, then you're submitting binary data and you shouldn't be using
    > <string> to begin with. Try <base64>.
    >
    > If worst comes to worst and you have to stick with sending "\n\r"
    > intact in a <string> param, you'll need to modify xmlrpclib to use a
    > different (and technically noncompliant) XML parser. Here's an ugly
    > hack to do that out of the box:
    >
    > # In your server code:
    > import xmlrpclib
    > # This forces xmlrpclib to fall back on the obsolete xmllib module:
    > xmlrpclib.ExpatParser = None
    >
    > xmllib doesn't normalize newlines, so it's noncompliant. But this is
    > actually what you want.


    Well, I thought about using something like that. Currently, we are stuck
    with the kind of ugly hack you sent in your previous message
    (replace("\n", "\n\r")).
    However, now that I know that xmlrpclib handles line terminators correctly
    (with regard to the XML spec), I will try to see if we can handle line
    feeds correctly in our Windows application.
    I think that would be the better solution.

    Anyway, thanks a lot for your answers (both of them ;)

    >
    > --Ben


    J.
    Jonathan Ballet, Mar 18, 2006
    #5
  6. On Sat, 18 Mar 2006 08:54:49 +0100, Fredrik Lundh wrote:

    > Jonathan Ballet wrote:
    >

    [snip]
    >
    > XMLRPC is XML, and XML normalizes line feeds:
    >
    > http://www.w3.org/TR/2004/REC-xml-20040204/#sec-line-ends
    >
    > relying on non-standard line terminators in text is fragile and horridly out-
    > dated; if the data you're sending qualifies as text, send it as text, and do
    > the conversion at the end points (if at all necessary).


    Ah, you sent me the right link. So xmlrpclib handles those line
    terminators correctly. That's good.

    >
    > if it's not text, use the binary wrapper.
    >
    >> I send to them only "\n\n", which makes problems when rendering strings.

    >
    > how are you rendering things if you have problems treating a line feed
    > as a line feed? that's rather unusual; most Windows applications have
    > no problems handling line feeds properly, so I'm not sure why it has to
    > be a problem in your application...
    >
    > </F>


    The problem was only pointed out recently, so we were searching for where
    those carriage returns disappeared.
    However, if we can throw them away and render a line feed as a line feed,
    it would be the best solution imho. I'll look into this.

    Thanks for your answer,
    J.
    Jonathan Ballet, Mar 18, 2006
    #6