unicode and socket

Discussion in 'Python' started by zyqnews@163.net, Feb 18, 2005.

  1. Guest

    hello all,
    I am new to Python, and I have a problem with unicode.
    I have a unicode string; when I tried to send it out through a
    socket with send(), I got an exception. How can I send the unicode string
    to the remote end of the socket as it is, without any encoding
    conversion, so that the remote end of the socket receives a unicode string?

    Thanks
     
    , Feb 18, 2005
    #1

  2. aurora Guest

    You cannot. Unicode is an abstract data type. It must be encoded into
    octets in order to be sent via a socket, and the other end must decode the
    octets to retrieve the unicode string. Needless to say, the encoding scheme
    must be consistent and understood by both ends.
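
    A minimal sketch of this point, assuming Python 3 (where str is Unicode
    text and bytes holds octets) rather than the Python 2 of this thread.
    socket.socketpair() stands in for a real connection, and the variable
    names are only illustrative:

```python
import socket

# socketpair() returns two already-connected sockets, standing in for the
# two ends of a real network connection.
left, right = socket.socketpair()

text = "eggs and ham"

# In Python 3, passing a str (Unicode text) to send() is rejected.
try:
    left.send(text)
    send_error = None
except TypeError as exc:
    send_error = exc

# Encoding first produces bytes (octets), which the socket accepts.
octets = text.encode("utf-8")
left.sendall(octets)

# The receiving end gets octets and must decode them itself.
# A single recv() is enough here only because the message is tiny.
received = right.recv(1024)
decoded = received.decode("utf-8")

left.close()
right.close()
```

    Both ends name the same encoding ("utf-8" here); that is the
    "consistent and understood by both ends" part.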


    On 18 Feb 2005 11:03:46 -0800, <> wrote:

    > hello all,
    > I am new to Python, and I have a problem with unicode.
    > I have a unicode string; when I tried to send it out through a
    > socket with send(), I got an exception. How can I send the unicode string
    > to the remote end of the socket as it is, without any encoding
    > conversion, so that the remote end of the socket receives a unicode string?
    >
    > Thanks
    >
     
    aurora, Feb 18, 2005
    #2

  3. aurora wrote:
    > You cannot. Unicode is an abstract data type. It must be encoded
    > into octets in order to be sent via a socket, and the other end must decode
    > the octets to retrieve the unicode string. Needless to say, the encoding
    > scheme must be consistent and understood by both ends.


    So use pickle.

    --Irmen
     
    Irmen de Jong, Feb 18, 2005
    #3
  4. Irmen de Jong wrote:
    > aurora wrote:
    >
    >> You cannot. Unicode is an abstract data type. It must be encoded
    >> into octets in order to be sent via a socket, and the other end must
    >> decode the octets to retrieve the unicode string. Needless to say, the
    >> encoding scheme must be consistent and understood by both ends.

    >
    >
    > So use pickle.
    >
    > --Irmen


    Well, on second thought: don't use pickle.
    If all you want to transfer is unicode strings (or normal strings)
    it's safer to just encode them to, say, UTF-8, transfer
    that octet stream across, and on the other side, decode the
    UTF-8 octets back into a unicode string.
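
    One way this could look, as a rough sketch in Python 3. The helper names
    send_text, recv_exactly and recv_text are hypothetical, and the 4-byte
    length prefix is an assumption added here so the receiver knows where
    each string ends; the thread itself does not prescribe any framing:

```python
import socket
import struct

def send_text(sock, text):
    """Send one Unicode string as a 4-byte length prefix plus UTF-8 octets."""
    data = text.encode("utf-8")
    sock.sendall(struct.pack("!I", len(data)) + data)

def recv_exactly(sock, count):
    """Read exactly `count` bytes, since recv() may return fewer."""
    buf = b""
    while len(buf) < count:
        chunk = sock.recv(count - len(buf))
        if not chunk:
            raise ConnectionError("socket closed before message was complete")
        buf += chunk
    return buf

def recv_text(sock):
    """Read one length-prefixed message and decode it from UTF-8."""
    (length,) = struct.unpack("!I", recv_exactly(sock, 4))
    return recv_exactly(sock, length).decode("utf-8")

# socketpair() stands in for a real client/server connection.
a, b = socket.socketpair()
send_text(a, "gr\u00fc\u00dfe")
send_text(a, "\u65e5\u672c\u8a9e")
first = recv_text(b)
second = recv_text(b)
a.close()
b.close()
```

    Unlike unpickling, decoding UTF-8 only ever yields a string (or raises
    UnicodeDecodeError), which is why it is the safer choice for plain text.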


    --Irmen
     
    Irmen de Jong, Feb 18, 2005
    #4
  5. Lion Kimbro Guest

    You probably want to use UTF-16 or UTF-8 on both sides of the socket.

    See http://www.python.org/moin/Unicode for more information.

    So, we have a Unicode string...
    >>> mystring = u'eggs and ham'
    >>> mystring
    u'eggs and ham'

    Now, we want to send it over:
    >>> to_send = mystring.encode('utf-8')
    >>> to_send
    'eggs and ham'

    It's encoded in UTF-8 now.

    On the other side (where received holds the same octets as to_send), we decode:

    >>> received = to_send
    >>> result = received.decode('utf-8')
    >>> result
    u'eggs and ham'

    You have transferred a unicode string. {:)}=
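
    With pure ASCII text the encoded form looks identical to the original,
    which can hide what encoding does. A small sketch with non-ASCII
    characters, written for Python 3 (an assumption; the session above is
    Python 2), makes the difference visible:

```python
mystring = "sm\u00f6rg\u00e5sbord"       # the text 'smörgåsbord', 11 characters

to_send = mystring.encode("utf-8")       # octets: the two accented letters take 2 bytes each
char_count = len(mystring)
byte_count = len(to_send)

result = to_send.decode("utf-8")         # back to the original text
```

    Here char_count is 11 while byte_count is 13, so the text and its UTF-8
    octets are clearly different things.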
     
    Lion Kimbro, Feb 18, 2005
    #5
  6. Guest

    It's really funny: I cannot send a unicode stream through a socket with
    Python, while all the other languages such as Perl, C and Java can do it.
    Then how about converting the unicode string to a binary stream? Is it
    possible to send binary data through a socket with Python?
     
    , Feb 19, 2005
    #6
  7. Serge Orlov Guest

    wrote:
    > It's really funny: I cannot send a unicode stream through a socket with
    > Python, while all the other languages such as Perl, C and Java can do it.


    You may really start laughing loudly <wink> after you find out that you
    can send arbitrary Python objects over sockets. If you want a
    language-specific way of sending objects, see Irmen's first answer: use pickle.
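
    For illustration only, a sketch of the pickle route in Python 3, with
    socket.socketpair() standing in for a real connection and a made-up
    record as the payload. Loading a pickle can execute arbitrary code, so
    this is only reasonable between peers that trust each other:

```python
import pickle
import socket

a, b = socket.socketpair()

# Any picklable Python object can be turned into bytes and sent.
record = {"name": "caf\u00e9", "sizes": [1, 2, 3]}
a.sendall(pickle.dumps(record))
a.close()                      # closing marks the end of the stream

# Read until the sender closes, then rebuild the object.
data = b""
while True:
    chunk = b.recv(4096)
    if not chunk:
        break
    data += chunk
b.close()

# Only unpickle data from a trusted peer.
restored = pickle.loads(data)
```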

    > Then how about converting the unicode string to a binary stream?


    Sure, there are already three answers in this thread that suggest doing
    exactly that. Use the encode method of unicode strings.

    > Is it possible to send binary data through a socket with Python?


    Sure. If it weren't possible to send bytes through sockets with Python,
    what else do you think could be sent? Perhaps you're confused by the fact
    that bytes are stored in byte strings in Python, which are often just called
    strings in documentation and conversations? That will be fixed in Python 3.0,
    but these days you have to store bytes in the str type.

    Serge.
     
    Serge Orlov, Feb 19, 2005
    #7
  8. aurora Guest

    On 18 Feb 2005 19:10:36 -0800, <> wrote:

    > It's really funny: I cannot send a unicode stream through a socket with
    > Python, while all the other languages such as Perl, C and Java can do it.
    > Then how about converting the unicode string to a binary stream? Is it
    > possible to send binary data through a socket with Python?
    >


    I was answering your specific question:

    "How can I send the unicode string to the remote end of the socket as it
    is, without any encoding conversion"

    The answer is that you cannot. It is not that you cannot send unicode, but
    that you have to encode it first. The same applies to Perl, C or Java. The
    only difference is the detail of how strings get encoded.

    A few posts here suggest various means. Or you can check out
    codecs.getwriter(), which more closely resembles Java's way.
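
    A rough sketch of the codecs.getwriter() approach, assuming Python 3 and
    using socket.socketpair() in place of a real connection. The writer wraps
    a binary file object obtained from the socket and encodes each string as
    it is written; the receiving side here simply reads octets and decodes:

```python
import codecs
import socket

a, b = socket.socketpair()

# makefile("wb") gives a binary file object for the socket; the UTF-8
# StreamWriter wraps it and encodes every string passed to write().
raw_out = a.makefile("wb")
writer = codecs.getwriter("utf-8")(raw_out)
writer.write("na\u00efve caf\u00e9\n")
writer.write("second line\n")

raw_out.close()                # flushes buffered octets to the socket
a.close()                      # closing marks the end of the stream

# Receive all octets until the sender closes, then decode them.
data = b""
while True:
    chunk = b.recv(4096)
    if not chunk:
        break
    data += chunk
b.close()

received = data.decode("utf-8")
```

    The encoding is chosen once, when the writer is created, instead of at
    every send call.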
     
    aurora, Feb 19, 2005
    #8
  9. anonymous coward <> wrote:

    > It's really funny: I cannot send a unicode stream through a socket with
    > Python, while all the other languages such as Perl, C and Java can do it.


    Are you sure you understand what Unicode is, and how sockets work?

    Sockets are used to transfer byte streams. If you want to transfer
    a python-level object, you have to decide how to encode it as a
    byte stream. For integers, you have to decide whether to use a single
    byte, a string of decimal ascii characters, netstring syntax, etc. For
    text, you have to decide what character encoding to use. For arbitrary
    objects, you have to decide what serialisation protocol to use. etc.
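
    To make the integer case concrete, here is a small Python 3 sketch of
    the three byte-level representations mentioned above for the same
    number. The value 200 and the variable names are only illustrative:

```python
import struct

n = 200

# 1. A single unsigned byte: compact, but only values 0..255 fit.
as_single_byte = struct.pack("B", n)

# 2. A string of decimal ASCII characters.
as_decimal_ascii = str(n).encode("ascii")

# 3. Netstring syntax: "<length>:<data>," wrapped around the ASCII digits.
digits = str(n).encode("ascii")
as_netstring = str(len(digits)).encode("ascii") + b":" + digits + b","

# Decoding each form back to the same integer.
from_single_byte = struct.unpack("B", as_single_byte)[0]
from_decimal_ascii = int(as_decimal_ascii.decode("ascii"))
length_part, _, rest = as_netstring.partition(b":")
from_netstring = int(rest[:int(length_part)].decode("ascii"))
```

    All three byte strings differ, and the receiver can only recover 200 if
    it knows which convention the sender used. Text encodings work the same
    way.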

    (and yes, the same applies to all other languages. Java sockets and C
    sockets are no different from Python sockets...)

    </F>
     
    Fredrik Lundh, Feb 19, 2005
    #9
  10. On 18 Feb 2005 19:10:36 -0800, rumours say that might have
    written:

    >It's really funny: I cannot send a unicode stream through a socket with
    >Python, while all the other languages such as Perl, C and Java can do it.


    I don't know about Perl. What you mean by unicode in C is most probably
    wchar_t, which is Unicode encoded as 'ucs-2' or 'utf-16' (little or big
    endian, depending on your platform) or maybe a 4-byte int, for which I don't
    know a Python equivalent. And I /assume/ that in Java, Unicode is equivalent to
    'utf-16' encoded strings on input/output.

    Perhaps Unicode encoded as 'utf-16' is what you're after. However, Unicode
    encoded as 'utf-8' (like others also suggested) might be what you /should/ be
    using, given that this encoding has some attractive properties (no null bytes,
    no spurious control characters etc).
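
    A quick check of those properties, sketched in Python 3 (an assumption;
    the thread predates it). 'utf-16-le' is used so that no byte order mark
    is prepended:

```python
text = "eggs and ham"

utf8 = text.encode("utf-8")
utf16 = text.encode("utf-16-le")     # explicit byte order, so no BOM

# ASCII text stays one byte per character in UTF-8 and contains no NUL
# bytes; in UTF-16 every such character carries a zero high byte.
utf8_has_nul = b"\x00" in utf8
utf16_has_nul = b"\x00" in utf16

# Non-ASCII characters in UTF-8 use only bytes >= 0x80, so they can never
# be mistaken for ASCII control characters.
non_ascii = "\u00e9\u65e5".encode("utf-8")
all_high = all(byte >= 0x80 for byte in non_ascii)
```

    Here utf8 is 12 bytes with no NUL, while utf16 is 24 bytes and every
    second byte is NUL.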

    Don't interpret the explicitness that Python requires as a weakness.
    --
    TZOTZIOY, I speak England very best.
    "Be strict when sending and tolerant when receiving." (from RFC1958)
    I really should keep that in mind when talking with people, actually...
     
    Christos TZOTZIOY Georgiou, Mar 3, 2005
    #10