Characters aren't displayed correctly

Discussion in 'Python' started by Hussein B, Mar 1, 2009.

  1. Hussein B

    Hussein B Guest

    Hey,
    I'm retrieving records from a MySQL database that contains non-English
    characters.
    Then I create a string that contains HTML markup and column values
    from the result set.
    +++++
    markup = u'''<table>.....'''
    for row in rows:
        markup = markup + '<tr><td>' + row['id']
    markup = markup + '</table>'
    +++++
    Then I'm sending the email according to this tip:
    http://code.activestate.com/recipes/473810/
    Well, the email shows ????? in place of the non-English characters.
    Any ideas?
    Ubuntu 8.04
    Python 2.5.2
    Evolution Mail Client
    Thanks.
    Hussein B, Mar 1, 2009
    #1

  2. On Mar 1, 2009, at 8:31 AM, Hussein B wrote:

    > Hey,
    > I'm retrieving records from MySQL database that contains non english
    > characters.
    > Then I create a String that contains HTML markup and column values
    > from the previous result set.
    > +++++
    > markup = u'''<table>.....'''
    > for row in rows:
    > markup = markup + '<tr><td>' + row['id']
    > markup = markup + '</table>
    > +++++
    > Then I'm sending the email according to this tip:
    > http://code.activestate.com/recipes/473810/
    > Well, the email contains ????? characters for each non english ones.
    > Any ideas?


    There are so many places where this could go wrong, and you haven't
    narrowed down the problem.

    Are the characters stored in the database correctly?

    Are they stored consistently (i.e. all using the same encoding, not
    some using utf-8 and others using iso-8859-1)?

    What are you getting out of the database? Is it being converted to
    Unicode correctly, or at all?

    Are you sure that the program you're using to view the email
    understands the encoding?

    Isolate those questions one at a time. Add some debugging breakpoints.
    Ensure that you have what you think you have. You might not fix your
    problem, but you will make it much smaller and more specific.


    Good luck
    Philip
    Philip Semanchuk, Mar 1, 2009
    #2

  3. On Sun, 2009-03-01 at 09:51 -0500, Philip Semanchuk wrote:
    >
    > There's so many places where this could go wrong and you haven't
    > narrowed down the problem.
    >
    > Are the characters stored in the database correctly?
    >
    > Are they stored consistently (i.e. all using the same encoding, not
    > some using utf-8 and others using iso-8859-1)?
    >
    > What are you getting out of the database? Is it being converted to
    > Unicode correctly, or at all?
    >
    > Are you sure that the program you're using to view the email
    > understands the encoding?
    >
    > Isolate those questions one at a time. Add some debugging breakpoints.
    > Ensure that you have what you think you have. You might not fix your
    > problem, but you will make it much smaller and more specific.
    >
    >
    > Good luck
    > Philip
    >
    >


    Let me add to that checklist:

    Are you sure the email you are creating has the encoding declared
    properly in the headers?


    Cheers,
    Cliff
    J. Clifford Dyer, Mar 1, 2009
    #3
  4. Hussein B

    Hussein B Guest

    On Mar 1, 4:51 pm, Philip Semanchuk <> wrote:
    >
    > There's so many places where this could go wrong and you haven't  
    > narrowed down the problem.
    >
    > Are the characters stored in the database correctly?

    Yes they are.

    > Are they stored consistently (i.e. all using the same encoding, not  
    > some using utf-8 and others using iso-8859-1)?

    Yes.

    > What are you getting out of the database? Is it being converted to  
    > Unicode correctly, or at all?

    I don't know. How can I check this?

    > Are you sure that the program you're using to view the email  
    > understands the encoding?

    Yes.

    > Isolate those questions one at a time. Add some debugging breakpoints.  
    > Ensure that you have what you think you have. You might not fix your  
    > problem, but you will make it much smaller and more specific.
    >
    > Good luck
    > Philip
    Hussein B, Mar 2, 2009
    #4
  5. Hussein B

    Hussein B Guest

    On Mar 1, 11:27 pm, "J. Clifford Dyer" <> wrote:
    > Let me add to that checklist:
    >
    > Are you sure the email you are creating has the encoding declared
    > properly in the headers?

    My HTML markup contains only table tags (you know, table, tr and td)
    Hussein B, Mar 2, 2009
    #5
  6. On Mon, 2009-03-02 at 00:33 -0800, Hussein B wrote:
    >
    > My HTML markup contains only table tags (you know, table, tr and td)


    Ah. The issue is not with the HTML markup, but the email headers. For
    example, the email you sent me has a header that says:

    Content-type: text/plain; charset="iso-8859-1"

    Guessing from the recipe you linked to, you probably need something
    like:

    msgRoot['Content-type'] = 'text/plain; charset="utf-16"'

    replacing utf-16 with whatever encoding you have encoded your email
    with.

    Or it may be that the header has to be attached to the individual mime
    parts. I'm not as familiar with MIME.
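
    A minimal sketch of the header point, assuming Python 3's standard
    email package rather than the Python 2.5 setup in this thread, and
    assuming UTF-8 as the encoding. The sample body and the variable names
    are made up for illustration; the recipe's own code is not reproduced
    here. Passing the charset to MIMEText declares it on the HTML part
    itself, which covers the "individual mime parts" case:

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

# Hypothetical sample body; the escapes spell a short Arabic word.
html = "<table><tr><td>\u0639\u0645\u0631</td></tr></table>"

msg_root = MIMEMultipart("related")
msg_root["Subject"] = "Report"

# The third argument declares the charset for this part and makes
# MIMEText encode the text with it.
html_part = MIMEText(html, "html", "utf-8")
msg_root.attach(html_part)

# Inspect what the part actually declares before sending anything.
print(html_part.get_content_type())
print(html_part.get_content_charset())
```

    Checking the declared charset on the part, rather than on the
    multipart container, shows what a mail client will be told about the
    text it has to decode.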


    Cheers,
    Cliff
    J. Clifford Dyer, Mar 2, 2009
    #6
  7. Hussein B

    Hussein B Guest

    On Mar 2, 4:03 pm, "J. Clifford Dyer" <> wrote:
    >
    > Ah.  The issue is not with the HTML markup, but the email headers.  For
    > example, the email you sent me has a header that says:
    >
    > Content-type: text/plain; charset="iso-8859-1"
    >
    > Guessing from the recipe you linked to, you probably need something
    > like:
    >
    > msgRoot['Content-type'] = 'text/plain; charset="utf-16"'
    >
    > replacing utf-16 with whatever encoding you have encoded your email
    > with.
    >
    > Or it may be that the header has to be attached to the individual mime
    > parts.  I'm not as familiar with MIME.
    >
    > Cheers,
    > Cliff


    Hey Cliff,
    I tried your tip and I still get the same thing (?????).
    I added a print statement to print each value of the result set to the
    console, which also prints ???? instead of the real characters.
    Maybe a conversion happens when the data is fetched from the
    database?
    (The values are stored correctly in the database.)
    Hussein B, Mar 2, 2009
    #7
  8. John Machin

    John Machin Guest

    On Mar 2, 7:30 pm, Hussein B <> wrote:
    > On Mar 1, 4:51 pm, Philip Semanchuk <> wrote:
    >
    > > On Mar 1, 2009, at 8:31 AM, Hussein B wrote:

    >
    > > > Hey,
    > > > I'm retrieving records from MySQL database that contains non english
    > > > characters.


    Can you reveal which language???

    >
    > > Are the characters stored in the database correctly?

    >
    > Yes they are.


    How do you KNOW that they are stored correctly? What makes you so
    sure?

    >
    > > Are they stored consistently (i.e. all using the same encoding, not  
    > > some using utf-8 and others using iso-8859-1)?

    >
    > Yes.


    So what is the encoding used to store them?

    >
    > > What are you getting out of the database? Is it being converted to  
    > > Unicode correctly, or at all?

    >
    > I don't know, how to make sure of this point?


    You could show us some of the output from the database query. As well
    as
        print the_output
    you should
        print repr(the_output)
    and show us both, and also tell us what you *expect* to see.

    And let's get the database output sorted out before we worry about the
    email message.
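
    A small sketch of why both forms matter, written for Python 3, where
    ascii() plays the role that repr() plays in the 2.x code above. The
    value of the_output is a made-up stand-in for a column fetched from
    the database. print shows whatever the terminal manages to render,
    while ascii() shows exactly which object you hold, using only ASCII
    characters:

```python
# Hypothetical value standing in for a column fetched from the database.
the_output = "\u0639\u0645\u0631"         # three Arabic letters, as text
as_bytes = the_output.encode("utf-8")     # the same letters as UTF-8 bytes

# ascii() does not depend on the terminal: non-ASCII becomes escapes,
# and the b prefix reveals whether you hold bytes or text.
text_form = ascii(the_output)
bytes_form = ascii(as_bytes)

print(text_form)    # '\u0639\u0645\u0631'
print(bytes_form)   # b'\xd8\xb9\xd9\x85\xd8\xb1'
```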
    John Machin, Mar 2, 2009
    #8
  9. Hussein B

    Hussein B Guest

    On Mar 2, 4:31 pm, John Machin <> wrote:
    > On Mar 2, 7:30 pm, Hussein B <> wrote:
    >
    > > On Mar 1, 4:51 pm, Philip Semanchuk <> wrote:

    >
    > > > On Mar 1, 2009, at 8:31 AM, Hussein B wrote:

    >
    > > > > Hey,
    > > > > I'm retrieving records from MySQL database that contains non english
    > > > > characters.

    >
    > Can you reveal which language???
    >
    >
    >

    Arabic

    >
    > > > Are the characters stored in the database correctly?

    >
    > > Yes they are.

    >
    > How do you KNOW that they are stored correctly? What makes you so
    > sure?
    >
    >

    Because MySQL Query Browser displays them correctly, in addition I use
    BIRT as the reporting system and it shows them correctly.

    >
    > > > Are they stored consistently (i.e. all using the same encoding, not  
    > > > some using utf-8 and others using iso-8859-1)?

    >
    > > Yes.

    >
    > So what is the encoding used to store them?
    >
    >
    >

    The tables were created with the UTF-8 encoding option.

    > > > What are you getting out of the database? Is it being converted to  
    > > > Unicode correctly, or at all?

    >
    > > I don't know, how to make sure of this point?

    >
    > You could show us some of the output from the database query. As well
    > as
    >    print the_output
    > you should
    >    print repr(the_output)
    > and show us both, and also tell us what you *expect* to see.
    >


    The result of print repr(row['name']) is '??? ??????'
    The '?' characters are supposed to be Arabic characters.

    > And let's get the database output sorted out before we worry about the
    > email message.


    Thanks all for help.
    Hussein B, Mar 2, 2009
    #9
  10. On Mar 2, 2009, at 9:50 AM, Hussein B wrote:

    > On Mar 2, 4:31 pm, John Machin <> wrote:
    >> On Mar 2, 7:30 pm, Hussein B <> wrote:
    >>
    >>> On Mar 1, 4:51 pm, Philip Semanchuk <> wrote:
    >>>> What are you getting out of the database? Is it being converted to
    >>>> Unicode correctly, or at all?

    >>
    >>> I don't know, how to make sure of this point?


    Personally, I'd add a debug breakpoint just after extracting the
    characters from the database, like so:

    import pdb
    pdb.set_trace()

    When you're stopped at the breakpoint, examine the string you get
    back. Is it what you expect? For instance, is it Unicode?

    isinstance(my_string, unicode)

    Or maybe you're expecting a utf-8 encoded string, so examine one of
    the non-ASCII characters. Is it really utf-8 encoded?

    >>> my_string = u"snö".encode("utf-8")
    >>> my_string[0]
    's'
    >>> my_string[1]
    'n'
    >>> my_string[2]
    '\xc3'
    >>> my_string[3]
    '\xb6'
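
    The same check can be sketched in Python 3, where encode() returns a
    bytes object. The helper name looks_like_utf8 is invented for this
    illustration. Note that pure-ASCII data also passes the check, so it
    is only informative for values that contain non-ASCII characters:

```python
def looks_like_utf8(raw: bytes) -> bool:
    """Return True if raw decodes cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


my_string = "sn\u00f6".encode("utf-8")    # "snö" as UTF-8 bytes

print(my_string)                           # b'sn\xc3\xb6'
print(looks_like_utf8(my_string))          # True

# The same word in ISO 8859-1 ends in a lone 0xF6 byte, which is not
# valid UTF-8.
print(looks_like_utf8("sn\u00f6".encode("iso-8859-1")))   # False
```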


    Since you feel pretty confident that you're getting what you expect
    out of the database, maybe you want to eliminate that from
    consideration. As a test, construct "by hand" a string that represents
    the email message you're trying to send. If you send that with the
    proper content-type header and you still don't get the results you
    want, then we can all stop discussing the database. Make sense?

    Forget about the HTML markup, too. That's just a distraction. Start
    with the simplest problem first, and then add pieces on.

    See if you can successfully construct and send an email that says
    "Hello world" in English/ASCII. If that works, change it to Arabic. If
    that works, change the email format to HTML. If that works, start
    pulling the content from the database. If that works, then you're
    done. =)

    bye
    Philip
    Philip Semanchuk, Mar 2, 2009
    #10
  11. John Machin

    John Machin Guest

    On Mar 3, 1:50 am, Hussein B <> wrote:
    > The result of print repr(row['name']) is '??? ??????'
    > The '?' characters are supposed to be Arabic characters.


    Are you expecting 3 Arabic characters, a space, and then 6 Arabic
    characters?

    We now have some interesting evidence: row['name'] is NOT a unicode
    object -- otherwise the print would show u'??? ??????'; it's a str
    object.

    So: A utf8-encoded string is being decoded to unicode, and then
    re-encoded to some other encoding, using the "replace" (with "?")
    error-handling method. That shouldn't be hard to spot! It's about time
    you showed us the code you are using to extract the data from the
    database, including the print statements you have put in.
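
    The mechanism described here can be reproduced in a few lines. This is
    a Python 3 sketch with a made-up sample value (three Arabic letters, a
    space, six Arabic letters) and with ISO 8859-1 assumed as the "other
    encoding"; which encoding is actually involved in the OP's setup is
    not known from the thread:

```python
# Hypothetical name: 3 Arabic letters, a space, 6 Arabic letters.
name = "\u0639\u0645\u0631 \u0627\u0644\u062e\u0637\u064a\u0628"

# Step 1: the database holds the text as UTF-8 bytes.
stored = name.encode("utf-8")

# Step 2: some layer decodes those bytes to text correctly...
decoded = stored.decode("utf-8")

# Step 3: ...then re-encodes into a charset that has no Arabic letters,
# substituting "?" for every character it cannot represent.
mangled = decoded.encode("iso-8859-1", "replace")

print(mangled)   # b'??? ??????'
```

    Once step 3 has run, the original letters are gone for good: every
    unrepresentable character has become the same "?" byte, so nothing
    downstream (the HTML, the email headers) can bring them back.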
    John Machin, Mar 2, 2009
    #11
  12. John Machin

    John Machin Guest

    On Mar 3, 2:22 am, Philip Semanchuk <> wrote:
    > See if you can successfully construct and send an email that says  
    > "Hello world" in English/ASCII. If that works, change it to Arabic. If  
    > that works, change the email format to HTML. If that works, starts  
    > pulling the content from the database. If that works, then you're  
    > done. =)


    Yuk. You are asking him to write extra speculative code when he's
    having extreme difficulty debugging the code he's already got! He's
    already said he's getting ?????? soon after the database retrieval ---
    you want him to work on the downstream problem when the upstream is
    still very muddy???

    Sheeesh.
    John Machin, Mar 2, 2009
    #12
  13. On Mar 2, 2009, at 10:50 AM, John Machin wrote:

    > On Mar 3, 2:22 am, Philip Semanchuk <> wrote:
    >> See if you can successfully construct and send an email that says
    >> "Hello world" in English/ASCII. If that works, change it to Arabic.
    >> If that works, change the email format to HTML. If that works, start
    >> pulling the content from the database. If that works, then you're
    >> done. =)

    >
    > Yuk. You are asking him to write extra speculative code when he's
    > having extreme difficulty debugging the code he's already got! He's
    > already said he's getting ?????? soon after the database retrieval ---
    > you want him to work on the downstream problem when the upstream is
    > still very muddy???


    First of all, I preceded that paragraph with a detailed example of how
    to verify that he's getting what he expects out of the database. So
    no, I am not asking the OP to write extra speculative code. I'm giving
    him another tool with which to work at his problem.

    He claims to have done what I asked him to do in the first place --
    break the problem into steps and verify the database steps. He says
    they're working OK. I chose to take him at his word.

    If he's right, then we can move on to the next step of troubleshooting
    the email. If he's wrong and the problem is indeed with the database
    code, then we'll eventually discover that and he'll have learned a
    valuable lesson. It will be time-consuming and therefore painful for
    him, but then he'll be more likely to remember it.

    There's more than one way to attack this problem/set of problems, yes?

    This is all kind of OT since it is about general debugging and not
    about Python. The only Python-specific aspect I see is that debugging
    non-ASCII problems with print is a little tricky since it introduces
    yet another variable -- the terminal's encoding settings. If, for
    instance, the OP's terminal is set to ISO 8859-6 or some such (I don't
    know anything about encodings to handle Arabic) and he's feeding it
    UTF-8, then ??????? might indeed be the result.
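
    To make the terminal-encoding variable concrete, here is a Python 3
    sketch. The helper printed_bytes is hypothetical: it sends print
    output to an in-memory stream with a chosen encoding instead of a real
    terminal, and it assumes the stream uses the "replace" error handler
    (a stream using "strict" would raise UnicodeEncodeError instead):

```python
import io
import sys

# The encoding print() uses for this process's standard output.
print(sys.stdout.encoding)


def printed_bytes(text, encoding, errors="replace"):
    """Return the bytes print() emits on a stream with this encoding."""
    buffer = io.BytesIO()
    stream = io.TextIOWrapper(buffer, encoding=encoding, errors=errors)
    print(text, end="", file=stream)
    stream.flush()
    return buffer.getvalue()


word = "\u0639\u0645\u0631"   # hypothetical sample: three Arabic letters

print(printed_bytes(word, "utf-8"))   # b'\xd8\xb9\xd9\x85\xd8\xb1'
print(printed_bytes(word, "ascii"))   # b'???'
```

    The same text object yields different bytes depending only on the
    stream it is printed to, which is why plain print output says little
    about the data itself.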
    Philip Semanchuk, Mar 2, 2009
    #13
  14. John Machin

    John Machin Guest

    On Mar 3, 3:27 am, Philip Semanchuk <> wrote:
    > On Mar 2, 2009, at 10:50 AM, John Machin wrote:
    >
    > > On Mar 3, 2:22 am, Philip Semanchuk <> wrote:
    > >> See if you can successfully construct and send an email that says
    > >> "Hello world" in English/ASCII. If that works, change it to Arabic.
    > >> If that works, change the email format to HTML. If that works, start
    > >> pulling the content from the database. If that works, then you're
    > >> done. =)

    >
    > > Yuk. You are asking him to write extra speculative code when he's
    > > having extreme difficulty debugging the code he's already got! He's
    > > already said he's getting ?????? soon after the database retrieval ---
    > > you want him to work on the downstream problem when the upstream is
    > > still very muddy???

    >
    > First of all, I preceded that paragraph with a detailed example of how  
    > to verify that he's getting what he expects out of the database. So  
    > no, I am not asking the OP to write extra speculative code. I'm giving  
    > him another tool with which to work at his problem.
    >
    > He claims to have done what I asked him to do in the first place --  
    > break the problem into steps and verify the database steps. He says  
    > they're working OK. I chose to take him at his word.


    Rule number 1: Don't believe anything an OP says that is not
    corroborated by output that looks like it was produced using the
    repr() function (2.x) or ascii() function (3.x).

    Rule number 2: Don't ignore rule number 1, especially when not
    corroborated by any output at all.

    Rule number 3: [added since the Great Renaming aka the Mad Hatter's
    Tea Party] Ask the OP what version of Python they are using so that
    they can be told to use ascii() instead of repr() if using 3.X

    >
    > If he's right, then we can move on to the next step of troubleshooting  
    > the email. If he's wrong and the problem is indeed with the database  
    > code, then we'll eventually discover that


    He has *already* demonstrated, at my request, that there is a problem
    with, or soon after, the database extraction:

    """
    The result of print repr(row['name']) is '??? ??????'
    The '?' characters are supposed to be Arabic characters.
    """

    > and he'll have learned a  
    > valuable lesson. It will be time-consuming and therefore painful for  
    > him, but then he'll be more likely to remember it.
    >
    > There's more than one way to attack this problem/set of problems, yes?
    >
    > This is all kind of OT since it is about general debugging and not  
    > about Python. The only Python-specific aspect I see is that debugging  
    > non-ASCII problems with print is a little tricky since it introduces  
    > yet another variable -- the terminal's encoding settings. If, for  
    > instance, the OP's terminal is set to ISO 8859-6 or some such (I don't  
    > know anything about encodings to handle Arabic) and he's feeding it  
    > UTF-8, then ??????? might indeed be the result.


    and that is the rationale for Rule #1
    John Machin, Mar 2, 2009
    #14
  15. On Mar 2, 2009, at 5:26 PM, John Machin wrote:

    > On Mar 3, 3:27 am, Philip Semanchuk <> wrote:
    >> He claims to have done what I asked him to do in the first place --
    >> break the problem into steps and verify the database steps. He says
    >> they're working OK. I chose to take him at his word.

    >
    > Rule number 1: Don't believe anything an OP says that is not
    > corroborated by output that looks like it was produced using the
    > repr() function (2.x) or ascii() function (3.x)


    Saying "I don't believe you" has never worked well for me as a
    conversation opener. Sometimes taking someone at his word is another
    name for giving him enough rope to...make a mistake that he'll remember.

    And for many people, trust breeds trust. I trust him, maybe he'll
    trust me when I say (for the second time), "You need to break this
    problem down into discrete, debuggable units."

    I (mostly) agree with your rule. But as I said, there's more than one
    way to solve this problem. Or perhaps I should say that there's more
    than one way to lead the OP to a solution to this problem. We teach
    differently, you and I. I believe there's room in the world for *both*
    styles -- perhaps even a third or fourth! =)


    Cheers
    Philip
    Philip Semanchuk, Mar 3, 2009
    #15
  16. Hussein B

    Hussein B Guest

    On Mar 2, 5:40 pm, John Machin <> wrote:
    > On Mar 3, 1:50 am, Hussein B <> wrote:
    >
    >
    >
    > > On Mar 2, 4:31 pm, John Machin <> wrote:
    > > > On Mar 2, 7:30 pm, Hussein B <> wrote:

    >
    > > > > On Mar 1, 4:51 pm, Philip Semanchuk <> wrote:

    >
    > > > > > On Mar 1, 2009, at 8:31 AM, Hussein B wrote:

    >
    > > > > > > Hey,
    > > > > > > I'm retrieving records from MySQL database that contains non english
    > > > > > > characters.

    >
    > > > Can you reveal which language???

    >
    > > Arabic

    >
    > > > > > > Then I create a String that contains HTML markup and column values
    > > > > > > from the previous result set.
    > > > > > > +++++
    > > > > > > markup = u'''<table>.....'''
    > > > > > > for row in rows:
    > > > > > >     markup = markup + '<tr><td>' + row['id']
    > > > > > > markup = markup + '</table>
    > > > > > > +++++
    > > > > > > Then I'm sending the email according to this tip:
    > > > > > >http://code.activestate.com/recipes/473810/
    > > > > > > Well, the email contains ????? characters for each non english ones.
    > > > > > > Any ideas?

    >
    > > > > > There's so many places where this could go wrong and you haven't  
    > > > > > narrowed down the problem.

    >
    > > > > > Are the characters stored in the database correctly?

    >
    > > > > Yes they are.

    >
    > > > How do you KNOW that they are stored correctly? What makes you so
    > > > sure?

    >
    > > Because MySQL Query Browser displays them correctly, in addition I use
    > > BIRT as the reporting system and it shows them correctly.

    >
    > > > > > Are they stored consistently (i.e. all using the same encoding, not  
    > > > > > some using utf-8 and others using iso-8859-1)?

    >
    > > > > Yes.

    >
    > > > So what is the encoding used to store them?

    >
    > > Tables are created with UTF-8 encoding option

    >
    > > > > > What are you getting out of the database? Is it being converted to  
    > > > > > Unicode correctly, or at all?

    >
    > > > > I don't know, how to make sure of this point?

    >
    > > > You could show us some of the output from the database query. As well
    > > > as
    > > >    print the_output
    > > > you should
    > > >    print repr(the_output)
    > > > and show us both, and also tell us what you *expect* to see.

    >
    > > The result of print repr(row['name']) is '??? ??????'
    > > The '?' characters are supposed to be Arabic characters.

    >
    > Are you expecting 3 Arabic characters, a space, and then 6 Arabic
    > characters?
    >
    > We now have some interesting evidence: row['name'] is NOT a unicode
    > object -- otherwise the print would show u'??? ??????'; it's a str
    > object.
    >
    > So: A utf8-encoded string is being decoded to unicode, and then re-
    > encoded to some other encoding, using the "replace" (with "?") error-
    > handling method. That shouldn't be hard to spot! It's about time you
    > showed us the code you are using to extract the data from the
    > database, including the print statements you have put in.
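
    The decode-then-re-encode path described in the quote above can be sketched as follows. This assumes Python 3, and latin-1 is only a stand-in for "some other encoding"; the thread never identifies which one was actually involved:

```python
# Hypothetical illustration: UTF-8 bytes decoded correctly, then re-encoded
# into a character set that has no Arabic letters, with errors="replace".
stored = "\u062f\u062e\u0648\u0644 \u0633\u0631\u064a\u0639".encode("utf-8")

decoded = stored.decode("utf-8")                          # correct text
reencoded = decoded.encode("latin-1", errors="replace")   # each letter becomes "?"

print(ascii(decoded))
print(reencoded)
```

    Each unencodable character is replaced by a single "?", and the space survives, which matches the shape of the output reported above.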


    This is how I retrieve the data:

    import MySQLdb
    import MySQLdb.cursors

    db = MySQLdb.connect(host="127.0.0.1", port=3306, user="username",
                         passwd="passwd", db="reporting")
    cr = db.cursor(MySQLdb.cursors.DictCursor)
    cr.execute(sql)
    rows = cr.fetchall()

    Thanks all for your nice help.
    Hussein B, Mar 3, 2009
    #16
  17. Hussein B

    Hussein B Guest

    On Mar 3, 11:05 am, Hussein B <> wrote:
    > On Mar 2, 5:40 pm, John Machin <> wrote:
    >
    >
    >
    > > On Mar 3, 1:50 am, Hussein B <> wrote:

    >
    > > > On Mar 2, 4:31 pm, John Machin <> wrote:
    > > > > On Mar 2, 7:30 pm, Hussein B <> wrote:

    >
    > > > > > On Mar 1, 4:51 pm, Philip Semanchuk <> wrote:

    >
    > > > > > > On Mar 1, 2009, at 8:31 AM, Hussein B wrote:

    >
    > > > > > > > Hey,
    > > > > > > > I'm retrieving records from MySQL database that contains non english
    > > > > > > > characters.

    >
    > > > > Can you reveal which language???

    >
    > > > Arabic

    >
    > > > > > > > Then I create a String that contains HTML markup and column values
    > > > > > > > from the previous result set.
    > > > > > > > +++++
    > > > > > > > markup = u'''<table>.....'''
    > > > > > > > for row in rows:
    > > > > > > >     markup = markup + '<tr><td>' + row['id']
    > > > > > > > markup = markup + '</table>
    > > > > > > > +++++
    > > > > > > > Then I'm sending the email according to this tip:
    > > > > > > >http://code.activestate.com/recipes/473810/
    > > > > > > > Well, the email contains ????? characters for each non english ones.
    > > > > > > > Any ideas?

    >
    > > > > > > There's so many places where this could go wrong and you haven't  
    > > > > > > narrowed down the problem.

    >
    > > > > > > Are the characters stored in the database correctly?

    >
    > > > > > Yes they are.

    >
    > > > > How do you KNOW that they are stored correctly? What makes you so
    > > > > sure?

    >
    > > > Because MySQL Query Browser displays them correctly, in addition I use
    > > > BIRT as the reporting system and it shows them correctly.

    >
    > > > > > > Are they stored consistently (i.e. all using the same encoding, not  
    > > > > > > some using utf-8 and others using iso-8859-1)?

    >
    > > > > > Yes.

    >
    > > > > So what is the encoding used to store them?

    >
    > > > Tables are created with UTF-8 encoding option

    >
    > > > > > > What are you getting out of the database? Is it being converted to  
    > > > > > > Unicode correctly, or at all?

    >
    > > > > > I don't know, how to make sure of this point?

    >
    > > > > You could show us some of the output from the database query. As well
    > > > > as
    > > > >    print the_output
    > > > > you should
    > > > >    print repr(the_output)
    > > > > and show us both, and also tell us what you *expect* to see.

    >
    > > > The result of print repr(row['name']) is '??? ??????'
    > > > The '?' characters are supposed to be Arabic characters.

    >
    > > Are you expecting 3 Arabic characters, a space, and then 6 Arabic
    > > characters?

    >
    > > We now have some interesting evidence: row['name'] is NOT a unicode
    > > object -- otherwise the print would show u'??? ??????'; it's a str
    > > object.

    >
    > > So: A utf8-encoded string is being decoded to unicode, and then re-
    > > encoded to some other encoding, using the "replace" (with "?") error-
    > > handling method. That shouldn't be hard to spot! It's about time you
    > > showed us the code you are using to extract the data from the
    > > database, including the print statements you have put in.

    >
    > This is how I retrieve the data:
    >
    > db = MySQLdb.connect(host = "127.0.0.1", port = 3306, user =
    > "username",
    >                          passwd = "passwd", db = "reporting")
    > cr = db.cursor(MySQLdb.cursors.DictCursor)
    > cr.execute(sql)
    > rows = cr.fetchall()
    >
    > Thanks all for your nice help.


    Hey,
    I added use_unicode and charset keyword params to the connect() method
    and I got the following:
    u'\u062f\u062e\u0648\u0644 \u0633\u0631\u064a\u0639 \u0634\u0647\u0631'
    So characters are getting converted successfully.
    Well, using the previous recipe for sending the mail:
    http://code.activestate.com/recipes/473810/
    I got the following error:

    Traceback (most recent call last):
      File "HtmlMail.py", line 52, in <module>
        s.sendmail(sender, receiver , msg.as_string())
      File "/usr/lib/python2.5/email/message.py", line 131, in as_string
        g.flatten(self, unixfrom=unixfrom)
      File "/usr/lib/python2.5/email/generator.py", line 84, in flatten
        self._write(msg)
      File "/usr/lib/python2.5/email/generator.py", line 109, in _write
        self._dispatch(msg)
      File "/usr/lib/python2.5/email/generator.py", line 135, in _dispatch
        meth(msg)
      File "/usr/lib/python2.5/email/generator.py", line 201, in _handle_multipart
        g.flatten(part, unixfrom=False)
      File "/usr/lib/python2.5/email/generator.py", line 84, in flatten
        self._write(msg)
      File "/usr/lib/python2.5/email/generator.py", line 109, in _write
        self._dispatch(msg)
      File "/usr/lib/python2.5/email/generator.py", line 135, in _dispatch
        meth(msg)
      File "/usr/lib/python2.5/email/generator.py", line 201, in _handle_multipart
        g.flatten(part, unixfrom=False)
      File "/usr/lib/python2.5/email/generator.py", line 84, in flatten
        self._write(msg)
      File "/usr/lib/python2.5/email/generator.py", line 109, in _write
        self._dispatch(msg)
      File "/usr/lib/python2.5/email/generator.py", line 135, in _dispatch
        meth(msg)
      File "/usr/lib/python2.5/email/generator.py", line 178, in _handle_text
        self._fp.write(payload)
    UnicodeEncodeError: 'ascii' codec can't encode characters in position 115-118: ordinal not in range(128)


    Again, any ideas, guys? :)
    Thanks to you all, you rock!
    Hussein B, Mar 3, 2009
    #17
  18. John Machin

    John Machin Guest

    On Mar 3, 8:49 pm, Hussein B <> wrote:
    > On Mar 3, 11:05 am, Hussein B <> wrote:
    >
    >
    >
    > > On Mar 2, 5:40 pm, John Machin <> wrote:

    >
    > > > On Mar 3, 1:50 am, Hussein B <> wrote:

    >
    > > > > On Mar 2, 4:31 pm, John Machin <> wrote:
    > > > > > On Mar 2, 7:30 pm, Hussein B <> wrote:

    >
    > > > > > > On Mar 1, 4:51 pm, Philip Semanchuk <> wrote:

    >
    > > > > > > > On Mar 1, 2009, at 8:31 AM, Hussein B wrote:

    >
    > > > > > > > > Hey,
    > > > > > > > > I'm retrieving records from MySQL database that contains non english
    > > > > > > > > characters.

    >
    > > > > > Can you reveal which language???

    >
    > > > > Arabic

    >
    > > > > > > > > Then I create a String that contains HTML markup and column values
    > > > > > > > > from the previous result set.
    > > > > > > > > +++++
    > > > > > > > > markup = u'''<table>.....'''
    > > > > > > > > for row in rows:
    > > > > > > > >     markup = markup + '<tr><td>' + row['id']
    > > > > > > > > markup = markup + '</table>
    > > > > > > > > +++++
    > > > > > > > > Then I'm sending the email according to this tip:
    > > > > > > > >http://code.activestate.com/recipes/473810/
    > > > > > > > > Well, the email contains ????? characters for each non english ones.
    > > > > > > > > Any ideas?

    >
    > > > > > > > There's so many places where this could go wrong and you haven't  
    > > > > > > > narrowed down the problem.

    >
    > > > > > > > Are the characters stored in the database correctly?

    >
    > > > > > > Yes they are.

    >
    > > > > > How do you KNOW that they are stored correctly? What makes you so
    > > > > > sure?

    >
    > > > > Because MySQL Query Browser displays them correctly, in addition I use
    > > > > BIRT as the reporting system and it shows them correctly.

    >
    > > > > > > > Are they stored consistently (i.e. all using the same encoding, not  
    > > > > > > > some using utf-8 and others using iso-8859-1)?

    >
    > > > > > > Yes.

    >
    > > > > > So what is the encoding used to store them?

    >
    > > > > Tables are created with UTF-8 encoding option

    >
    > > > > > > > What are you getting out of the database? Is it being converted to  
    > > > > > > > Unicode correctly, or at all?

    >
    > > > > > > I don't know, how to make sure of this point?

    >
    > > > > > You could show us some of the output from the database query. As well
    > > > > > as
    > > > > >    print the_output
    > > > > > you should
    > > > > >    print repr(the_output)
    > > > > > and show us both, and also tell us what you *expect* to see.

    >
    > > > > The result of print repr(row['name']) is '??? ??????'
    > > > > The '?' characters are supposed to be Arabic characters.

    >
    > > > Are you expecting 3 Arabic characters, a space, and then 6 Arabic
    > > > characters?

    >
    > > > We now have some interesting evidence: row['name'] is NOT a unicode
    > > > object -- otherwise the print would show u'??? ??????'; it's a str
    > > > object.

    >
    > > > So: A utf8-encoded string is being decoded to unicode, and then re-
    > > > encoded to some other encoding, using the "replace" (with "?") error-
    > > > handling method. That shouldn't be hard to spot! It's about time you
    > > > showed us the code you are using to extract the data from the
    > > > database, including the print statements you have put in.

    >
    > > This is how I retrieve the data:

    >
    > > db = MySQLdb.connect(host = "127.0.0.1", port = 3306, user =
    > > "username",
    > >                          passwd = "passwd", db = "reporting")
    > > cr = db.cursor(MySQLdb.cursors.DictCursor)
    > > cr.execute(sql)
    > > rows = cr.fetchall()

    >
    > > Thanks all for your nice help.

    >
    > Hey,
    > I added use_unicode and charset keyword params to the connect() method


    Hey, that was a brilliant idea -- I was just about to ask you to try
    use_unicode=True, charset="utf8" ... what were the actual values that
    you used?

    Let's suppose that you used charset="XXXX" ... as far as I can tell,
    not being a mysqldb user myself, this means that your data tables and/
    or your default connection don't use XXXX as an encoding. If so, this
    might be an issue you might like to take up with whoever created the
    database that you are using.

    > and I got the following:
    > u'\u062f\u062e\u0648\u0644 \u0633\u0631\u064a\u0639
    > \u0634\u0647\u0631'
    > So characters are getting converted successfully.


    I guess so -- U+06nn sure are Arabic characters :)
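
    One way to check that claim without relying on any terminal or font is to look up the Unicode character names. A small sketch, assuming Python 3 and using the value reported above:

```python
import unicodedata

# The value reported by the OP after adding use_unicode and charset.
value = "\u062f\u062e\u0648\u0644 \u0633\u0631\u064a\u0639 \u0634\u0647\u0631"

# Look up the official Unicode name of every non-space character.
names = [unicodedata.name(ch) for ch in value if not ch.isspace()]
all_arabic = all(name.startswith("ARABIC LETTER") for name in names)

for name in names:
    print(name)
print(all_arabic)
```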

    However as suggested above, "converted from what?" might be worth
    pursuing if you like to understand what is going on instead of just
    applying magic recipes ;-)


    > Well, using the previous recipe for sending the mail: http://code.activestate.com/recipes/473810/
    > I got the following error:
    >
    > Traceback (most recent call last):
    >   File "HtmlMail.py", line 52, in <module>
    >     s.sendmail(sender, receiver , msg.as_string())


    [big snip]

    > _handle_text
    >     self._fp.write(payload)
    > UnicodeEncodeError: 'ascii' codec can't encode characters in position
    > 115-118: ordinal not in range(128)
    >
    > Again, any ideas guys? :)


    That recipe appears to have been written by an ascii bigot for ascii
    bigots :-(

    Try reading the docs for email.charset (that's the charset module in
    the email package).
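
    A rough sketch of what those docs lead to, assuming Python 3's email package (the 2.5 module discussed here differs in detail) and a made-up markup string. It shows that the text cannot be written out as ASCII, and what the utf-8 Charset would use for the message body instead:

```python
from email.charset import Charset

# Hypothetical markup containing Arabic text.
markup = "<td>\u062f\u062e\u0648\u0644</td>"

# The implicit ASCII encoding in the traceback fails in the same way as this.
try:
    markup.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# The email package's description of utf-8: bodies are base64-encoded.
utf8 = Charset("utf-8")
body_encoding = utf8.get_body_encoding()

print(ascii_ok, body_encoding)
```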

    Cheers,
    John
    John Machin, Mar 3, 2009
    #18
  19. Hussein B

    Hussein B Guest

    On Mar 3, 12:21 pm, John Machin <> wrote:
    > On Mar 3, 8:49 pm, Hussein B <> wrote:
    >
    >
    >
    > > On Mar 3, 11:05 am, Hussein B <> wrote:

    >
    > > > On Mar 2, 5:40 pm, John Machin <> wrote:

    >
    > > > > On Mar 3, 1:50 am, Hussein B <> wrote:

    >
    > > > > > On Mar 2, 4:31 pm, John Machin <> wrote:
    > > > > > > On Mar 2, 7:30 pm, Hussein B <> wrote:

    >
    > > > > > > > On Mar 1, 4:51 pm, Philip Semanchuk <> wrote:

    >
    > > > > > > > > On Mar 1, 2009, at 8:31 AM, Hussein B wrote:

    >
    > > > > > > > > > Hey,
    > > > > > > > > > I'm retrieving records from MySQL database that contains non english
    > > > > > > > > > characters.

    >
    > > > > > > Can you reveal which language???

    >
    > > > > > Arabic

    >
    > > > > > > > > > Then I create a String that contains HTML markup and column values
    > > > > > > > > > from the previous result set.
    > > > > > > > > > +++++
    > > > > > > > > > markup = u'''<table>.....'''
    > > > > > > > > > for row in rows:
    > > > > > > > > >     markup = markup + '<tr><td>' + row['id']
    > > > > > > > > > markup = markup + '</table>
    > > > > > > > > > +++++
    > > > > > > > > > Then I'm sending the email according to this tip:
    > > > > > > > > >http://code.activestate.com/recipes/473810/
    > > > > > > > > > Well, the email contains ????? characters for each non english ones.
    > > > > > > > > > Any ideas?

    >
    > > > > > > > > There's so many places where this could go wrong and you haven't  
    > > > > > > > > narrowed down the problem.

    >
    > > > > > > > > Are the characters stored in the database correctly?

    >
    > > > > > > > Yes they are.

    >
    > > > > > > How do you KNOW that they are stored correctly? What makes you so
    > > > > > > sure?

    >
    > > > > > Because MySQL Query Browser displays them correctly, in addition I use
    > > > > > BIRT as the reporting system and it shows them correctly.

    >
    > > > > > > > > Are they stored consistently (i.e. all using the same encoding, not  
    > > > > > > > > some using utf-8 and others using iso-8859-1)?

    >
    > > > > > > > Yes.

    >
    > > > > > > So what is the encoding used to store them?

    >
    > > > > > Tables are created with UTF-8 encoding option

    >
    > > > > > > > > What are you getting out of the database? Is it being converted to  
    > > > > > > > > Unicode correctly, or at all?

    >
    > > > > > > > I don't know, how to make sure of this point?

    >
    > > > > > > You could show us some of the output from the database query. As well
    > > > > > > as
    > > > > > >    print the_output
    > > > > > > you should
    > > > > > >    print repr(the_output)
    > > > > > > and show us both, and also tell us what you *expect* to see.

    >
    > > > > > The result of print repr(row['name']) is '??? ??????'
    > > > > > The '?' characters are supposed to be Arabic characters.

    >
    > > > > Are you expecting 3 Arabic characters, a space, and then 6 Arabic
    > > > > characters?

    >
    > > > > We now have some interesting evidence: row['name'] is NOT a unicode
    > > > > object -- otherwise the print would show u'??? ??????'; it's a str
    > > > > object.

    >
    > > > > So: A utf8-encoded string is being decoded to unicode, and then re-
    > > > > encoded to some other encoding, using the "replace" (with "?") error-
    > > > > handling method. That shouldn't be hard to spot! It's about time you
    > > > > showed us the code you are using to extract the data from the
    > > > > database, including the print statements you have put in.

    >
    > > > This is how I retrieve the data:

    >
    > > > db = MySQLdb.connect(host = "127.0.0.1", port = 3306, user =
    > > > "username",
    > > >                          passwd = "passwd", db = "reporting")
    > > > cr = db.cursor(MySQLdb.cursors.DictCursor)
    > > > cr.execute(sql)
    > > > rows = cr.fetchall()

    >
    > > > Thanks all for your nice help.

    >
    > > Hey,
    > > I added use_unicode and charset keyword params to the connect() method

    >
    > Hey, that was a brilliant idea -- I was just about to ask you to try
    >  use_unicode=True, charset="utf8" ... what were the actual values that
    > you used?


    I didn't supply values for them the first time.

    > Let's suppose that you used charset="XXXX" ... as far as I can tell,
    > not being a mysqldb user myself, this means that your data tables and/
    > or your default connection don't use XXXX as an encoding. If so, this
    > might be an issue you might like to take up with whoever created the
    > database that you are using.
    >
    > > and I got the following:
    > > u'\u062f\u062e\u0648\u0644 \u0633\u0631\u064a\u0639
    > > \u0634\u0647\u0631'
    > > So characters are getting converted successfully.

    >
    > I guess so -- U+06nn sure are Arabic characters :)
    >
    > However as suggested above, "converted from what?" might be worth
    > pursuing if you like to understand what is going on instead of just
    > applying magic recipes ;-)
    >
    > > Well, using the previous recipe for sending the mail: http://code.activestate.com/recipes/473810/
    > > I got the following error:

    >
    > > Traceback (most recent call last):
    > >   File "HtmlMail.py", line 52, in <module>
    > >     s.sendmail(sender, receiver , msg.as_string())

    >
    > [big snip]
    >
    > > _handle_text
    > >     self._fp.write(payload)
    > > UnicodeEncodeError: 'ascii' codec can't encode characters in position
    > > 115-118: ordinal not in range(128)

    >
    > > Again, any ideas guys? :)

    >
    > That recipe appears to have been written by an ascii bigot for ascii
    > bigots :-(
    >
    > Try reading the docs for email.charset (that's the charset module in
    > the email package).


    Everything is working now; I did the following:
    t = MIMEText(markup.encode('utf-8'), 'html', 'utf-8')
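
    For readers on Python 3, a comparable sketch is shown here. It is an adaptation, not the code from the thread: it passes the text as str and lets MIMEText do the encoding, and the markup value is invented for illustration:

```python
from email.mime.text import MIMEText

# Hypothetical HTML fragment with Arabic text, standing in for `markup`.
markup = "<table><tr><td>\u062f\u062e\u0648\u0644</td></tr></table>"

# Declare both the subtype and the charset so the mail client can decode it.
part = MIMEText(markup, "html", "utf-8")

content_type = part.get_content_type()
charset = part.get_content_charset()
round_trip = part.get_payload(decode=True).decode("utf-8")

print(part["Content-Type"])
print(round_trip == markup)
```

    The key point in both versions is the same: the part carries an explicit charset, so nothing falls back to ASCII when the message is flattened.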

    > Cheers,
    > John


    Thank you all, guys, and especially you, John. I owe you a HUGE bottle of
    beer :D
    Hussein B, Mar 3, 2009
    #19
  20. John Machin

    John Machin Guest

    On Mar 3, 10:22 pm, Hussein B <> wrote:
    > > > Hey,
    > > > I added use_unicode and charset keyword params to the connect() method

    >
    > > Hey, that was a brilliant idea -- I was just about to ask you to try
    > >  use_unicode=True, charset="utf8" ... what were the actual values that
    > > you used?

    >
    > I didn't supply values for them the first times.


    I guessed that! I was referring to the fact that you didn't tell us
    what values you did eventually supply that made it generate seemingly
    reasonable Arabic letters in unicode!! Was it charset="utf8" that did
    the trick?

    >
    >
    >
    > > Let's suppose that you used charset="XXXX" ... as far as I can tell,
    > > not being a mysqldb user myself, this means that your data tables and/
    > > or your default connection don't use XXXX as an encoding. If so, this
    > > might be an issue you might like to take up with whoever created the
    > > database that you are using.

    >
    > > > and I got the following:
    > > > u'\u062f\u062e\u0648\u0644 \u0633\u0631\u064a\u0639
    > > > \u0634\u0647\u0631'
    > > > So characters are getting converted successfully.

    >
    > > I guess so -- U+06nn sure are Arabic characters :)

    >
    > > However as suggested above, "converted from what?" might be worth
    > > pursuing if you like to understand what is going on instead of just
    > > applying magic recipes ;-)

    >
    > > > Well, using the previous recipe for sending the mail: http://code.activestate.com/recipes/473810/
    > > > I got the following error:

    >
    > > > Traceback (most recent call last):
    > > >   File "HtmlMail.py", line 52, in <module>
    > > >     s.sendmail(sender, receiver , msg.as_string())

    >
    > > [big snip]

    >
    > > > _handle_text
    > > >     self._fp.write(payload)
    > > > UnicodeEncodeError: 'ascii' codec can't encode characters in position
    > > > 115-118: ordinal not in range(128)

    >
    > > > Again, any ideas guys? :)

    >
    > > That recipe appears to have been written by an ascii bigot for ascii
    > > bigots :-(

    >
    > > Try reading the docs for email.charset (that's the charset module in
    > > the email package).

    >
    > Every thing is working now, I did the following:
    > t = MIMEText(markup.encode('utf-8'), 'html', 'utf-8')


    > Thank you all guys and especially you John, I owe you a HUGE bottle of
    > beer :D


    Thanks for the kind thought, but beer decreases grey-cell count and
    increases girth ... I don't need any assistance with those matters :)

    Cheers,
    John
    John Machin, Mar 3, 2009
    #20