portable unicode literals

Discussion in 'Python' started by Ulrich Eckhardt, Oct 15, 2012.

  1. Hi!

    I need a little nudge in the right direction, as I'm misunderstanding
    something concerning string literals in Python 2 and 3. In Python 2.7,
    b'' and '' are byte strings, while u'' is a unicode literal. In Python
    3.2, b'' is a byte string and '' is a unicode literal, while u'' is a
    syntax error.

    This actually came as a surprise to me: I assumed that using b'' I could
    portably create a byte string (which is true) and that using u'' I could
    portably create a unicode string (which is not true). Such a feature would
    help when porting code between the two versions. While this is a state I
    can live with, I wonder what the rationale for it is.

    !puzzled thanks

    Uli
     
    Ulrich Eckhardt, Oct 15, 2012
    #1

  2. It was a mistake that was corrected in Python 3.3.

    You can now use u'' to create Unicode literals in both 2.x and 3.3 or
    later. This feature is designed only for porting code, though: you
    shouldn't use u'' in new code that isn't intended to run on 2.x.
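
    A minimal sketch of what that looks like in practice, assuming the file
    only has to run on Python 2.6/2.7 and on 3.3 or later (on 3.0 to 3.2 the
    u'' prefix is still a syntax error). The names text, data and text_type
    are made up for illustration:

```python
# Both prefixes are accepted by Python 2.6/2.7 and by Python 3.3+.
text = u"caf\u00e9"        # unicode text: 4 code points
data = b"caf\xc3\xa9"      # byte string: the same text encoded as UTF-8

# type(u"") is `unicode` on 2.x and `str` on 3.x, so this check is portable.
text_type = type(u"")

assert isinstance(text, text_type)
assert not isinstance(data, text_type)
assert text.encode("utf-8") == data
```

    On 3.x the u prefix changes nothing: u"abc" and "abc" are the same str
    object type, which is why the prefix is only useful for shared 2.x/3.x
    code.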
     
    Steven D'Aprano, Oct 15, 2012
    #2

  3. Python 3.3 added that syntax, for easier porting. You can now use
    u"xyz" for a unicode string in both 2.x and 3.3 or later.
     
    Dave Angel, Oct 15, 2012
    #3
  4. u'' is legal in 3.3 again.
     
    Alex Strickland, Oct 15, 2012
    #4
  5. from __future__ import unicode_literals

    And now you can portably use b'' for a byte string and '' for a unicode
    string. When you drop Python 2 support, just remove the import from
    __future__.
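
    A short sketch of this approach, assuming Python 2.6 or later (where
    unicode_literals is available). The __future__ import has to come before
    any other code in the module; the helper is_text is hypothetical and only
    there to show the effect:

```python
# Must be at the top of the module, before any other statements.
from __future__ import unicode_literals

text = "na\u00efve"   # unprefixed literal: unicode on 2.x (with the import) and on 3.x
data = b"raw"         # b'' stays a byte string on both


def is_text(value):
    """Return True for unicode text; type("") is `unicode` on 2.x here, `str` on 3.x."""
    return isinstance(value, type(""))


assert is_text(text)
assert not is_text(data)
```

    The import is a no-op on 3.x, so deleting it later does not change the
    behaviour there.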
     
    Serhiy Storchaka, Oct 15, 2012
    #5