Email with a non-ASCII charset in Python 3?

Discussion in 'Python' started by Helmut Jarausch, Aug 15, 2012.

  1. Hi,

    I'm sorry to ask such a FAQ but still I couldn't find an answer - neither in the docs nor the web.

    What's wrong with the following script?

    Many thanks for a hint,
    Helmut.

    #!/usr/bin/python3
    #_*_ coding: latin1 _*_

    import smtplib
    from email.message import Message
    import datetime

    msg= Message()
    msg.set_charset('latin-1')
    msg['Subject'] = "*** Email Test ***"
    msg['From'] = "-aachen.de"
    msg['To'] = "-aachen.de"
    msg['Date'] = datetime.datetime.utcnow().strftime('%m/%d/%Y %I:%M:%S %p')

    server= smtplib.SMTP("igpm.igpm.rwth-aachen.de")
    msg.set_payload("Gedanken über einen Test","iso-8859-1")

    ## I have tried msg.set_payload("Gedanken über einen Test".encode("iso-8859-1"),"iso-8859-1")
    ## which fails, as well

    server.send_message(msg)
    server.quit()

    Traceback (most recent call last):
      File "./Test_EMail_Py3.py", line 17, in <module>
        server.send_message(msg)
      File "/usr/lib64/python3.2/smtplib.py", line 812, in send_message
        g.flatten(msg_copy, linesep='\r\n')
      File "/usr/lib64/python3.2/email/generator.py", line 91, in flatten
        self._write(msg)
      File "/usr/lib64/python3.2/email/generator.py", line 137, in _write
        self._dispatch(msg)
      File "/usr/lib64/python3.2/email/generator.py", line 163, in _dispatch
        meth(msg)
      File "/usr/lib64/python3.2/email/generator.py", line 396, in _handle_text
        super(BytesGenerator,self)._handle_text(msg)
      File "/usr/lib64/python3.2/email/generator.py", line 201, in _handle_text
        self.write(payload)
      File "/usr/lib64/python3.2/email/generator.py", line 357, in write
        self._fp.write(s.encode('ascii', 'surrogateescape'))
    UnicodeEncodeError: 'ascii' codec can't encode character '\xfc' in position 9: ordinal not in range(128)



    This is Python 3.2.4 (GIT 20120805)
     
    Helmut Jarausch, Aug 15, 2012
    #1

  2. On Wed, 15 Aug 2012 14:48:40 +0200, Christian Heimes wrote:

    > Am 15.08.2012 14:16, schrieb Helmut Jarausch:
    >> Hi,
    >>
    >> I'm sorry to ask such a FAQ but still I couldn't find an answer -
    >> neither in the docs nor the web.
    >>
    >> What's wrong with the following script?
    >>
    >> Many thanks for a hint,
    >> Helmut.
    >>
    >> #!/usr/bin/python3
    >> #_*_ coding: latin1 _*_
    >>
    >> import smtplib
    >> from email.message import Message
    >> import datetime
    >>
    >> msg= Message()
    >> msg.set_charset('latin-1')
    >> msg['Subject'] = "*** Email Test ***"
    >> msg['From'] = "-aachen.de"
    >> msg['To'] = "-aachen.de"
    >> msg['Date'] = datetime.datetime.utcnow().strftime('%m/%d/%Y %I:%M:%S %p')
    >>
    >> server= smtplib.SMTP("igpm.igpm.rwth-aachen.de")
    >> msg.set_payload("Gedanken über einen Test","iso-8859-1")

    >
    > You mustn't combine set_charset() with set_payload() with a charset.
    > That results in invalid output:
    >
    >>>> msg = Message()
    >>>> msg.set_payload("Gedanken über einen Test", "iso-8859-1")
    >>>> msg.as_string()

    > 'MIME-Version: 1.0\nContent-Type: text/plain;
    > charset="iso-8859-1"\nContent-Transfer-Encoding:
    > quoted-printable\n\nGedanken =FCber einen Test'
    >
    >>>> msg2 = Message()
    >>>> msg2.set_charset("iso-8859-1")
    >>>> msg2.set_payload("Gedanken über einen Test", "iso-8859-1")
    >>>> msg2.as_string()

    > 'MIME-Version: 1.0\nContent-Type: text/plain;
    > charset="iso-8859-1"\nContent-Transfer-Encoding:
    > quoted-printable\n\nGedanken über einen Test'
    >



    Thanks!
    Note, though, that one mustn't use
    server.send_message(msg.as_string())
    since send_message() expects a Message object, not a string.

    But what if msg['From'] contains a non-ASCII character?

    I wonder what the purpose of msg.set_charset('latin-1') is.

    With
    msg.set_charset('latin-1')
    msg.set_payload("Gedanken über einen Test") # is accepted BUT
    server.send_message(msg)

    gives
    Traceback (most recent call last):
      File "Test_EMail_Py3_2.py", line 21, in <module>
        server.send_message(msg)
      File "/usr/lib64/python3.2/smtplib.py", line 812, in send_message
        g.flatten(msg_copy, linesep='\r\n')
      File "/usr/lib64/python3.2/email/generator.py", line 91, in flatten
        self._write(msg)
      File "/usr/lib64/python3.2/email/generator.py", line 137, in _write
        self._dispatch(msg)
      File "/usr/lib64/python3.2/email/generator.py", line 163, in _dispatch
        meth(msg)
      File "/usr/lib64/python3.2/email/generator.py", line 396, in _handle_text
        super(BytesGenerator,self)._handle_text(msg)
      File "/usr/lib64/python3.2/email/generator.py", line 201, in _handle_text
        self.write(payload)
      File "/usr/lib64/python3.2/email/generator.py", line 357, in write
        self._fp.write(s.encode('ascii', 'surrogateescape'))
    UnicodeEncodeError: 'ascii' codec can't encode character '\xfc' in position 9: ordinal not in range(128)

    Helmut.
     
    Helmut Jarausch, Aug 15, 2012
    #2
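
    For reference, a minimal sketch of the approach Christian describes: pass the charset to set_payload() only, without a prior set_charset() call. The sender name and the example.invalid addresses are made up for illustration, and the use of email.utils.formataddr() with a charset argument (Python 3.3 or later) and formatdate() are assumptions not taken from the thread. Nothing is sent here; the sketch only builds the message and flattens it.

```python
from email.message import Message
from email.utils import formataddr, formatdate

BODY = "Gedanken über einen Test"
CHARSET = "iso-8859-1"


def build_message(body=BODY, charset=CHARSET):
    """Build a text message whose body and From header contain non-ASCII text."""
    msg = Message()
    msg["Subject"] = "*** Email Test ***"
    # Hypothetical sender: formataddr() RFC 2047-encodes the display name
    # when it cannot be represented in ASCII.
    msg["From"] = formataddr(("Jürgen Müller", "sender@example.invalid"), charset)
    msg["To"] = "recipient@example.invalid"
    # formatdate() produces an RFC 2822 style date string.
    msg["Date"] = formatdate(localtime=True)
    # Give the charset here only; do not call set_charset() beforehand.
    msg.set_payload(body, charset)
    return msg


msg = build_message()
wire_text = msg.as_string()
```

    With a message built this way, server.send_message(msg) should have only ASCII to flatten, because the body is quoted-printable encoded and the display name is an encoded word.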

  3. On 15/08/2012 13:16, Helmut Jarausch wrote:
    > Hi,
    >
    > I'm sorry to ask such a FAQ but still I couldn't find an answer - neither in the docs nor the web.
    >
    > What's wrong with the following script?
    >
    > Many thanks for a hint,
    > Helmut.
    >
    > #!/usr/bin/python3
    > #_*_ coding: latin1 _*_
    >

    As well as the other replies, the "coding" line should be:

    #-*- coding: latin1 -*-
     
    MRAB, Aug 15, 2012
    #3
  4. On Wed, 15 Aug 2012 17:57:47 +0100, MRAB wrote:

    >> #!/usr/bin/python3
    >> #_*_ coding: latin1 _*_
    >>

    > As well as the other replies, the "coding" line should be:
    >
    > #-*- coding: latin1 -*-



    I don't believe that actually matters to Python. It may matter to Emacs
    or some other editors, but Python simply matches on this regex:

    coding[=:]\s*([-\w.]+)

    http://docs.python.org/py3k/reference/lexical_analysis.html#encoding-declarations


    --
    Steven
     
    Steven D'Aprano, Aug 16, 2012
    #4
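
    Steven's point can be checked with a small sketch that applies the regex quoted above to both spellings of the coding line. The helper name declared_encoding is made up for illustration; the real tokenizer applies further rules (the line must be a comment on the first or second line of the file), which this sketch ignores.

```python
import re

# The pattern quoted in the post above.
CODING_RE = re.compile(r"coding[=:]\s*([-\w.]+)")


def declared_encoding(line):
    """Return the encoding name declared in a source line, or None."""
    match = CODING_RE.search(line)
    return match.group(1) if match else None


underscore_form = declared_encoding("#_*_ coding: latin1 _*_")
emacs_form = declared_encoding("#-*- coding: latin1 -*-")
no_declaration = declared_encoding("import smtplib")
```

    Both spellings yield the same encoding name, so the decoration around the declaration only matters to editors that look for the -*- markers.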
