portable unicode literals

Discussion in 'Python' started by Ulrich Eckhardt, Oct 15, 2012.

  1. Hi!

    I need a little nudge in the right direction, as I'm misunderstanding
    something concerning string literals in Python 2 and 3. In Python 2.7,
    b'' and '' are byte strings, while u'' is a unicode literal. In Python
    3.2, b'' is a byte string and '' is a unicode literal, while u'' is a
    syntax error.

    This actually came as a surprise to me: I assumed that using b'' I could
    portably create a byte string (which is true) and that using u'' I could
    portably create a unicode string (which is not true). That feature would
    help with porting code between the two versions. While this is a state I
    can live with, I wonder what the rationale for it is.

    !puzzled thanks

    Ulrich Eckhardt, Oct 15, 2012

  2. It was a mistake that is corrected in Python 3.3.

    You can now use u'' to create Unicode literals in both 2.x and 3.3 or
    better. This is a feature only designed for porting code though: you
    shouldn't use u'' in new code not intended for 2.x.
    Steven D'Aprano, Oct 15, 2012

  3. Python 3.3 added that syntax, for easier porting. You can now use
    u"xyz" for a unicode string in both 2.x and 3.3.
    Dave Angel, Oct 15, 2012
  4. u'' is legal in 3.3 again.
    Alex Strickland, Oct 15, 2012
  5. from __future__ import unicode_literals

    And now you can portably use b'' for a byte string and '' for a unicode
    string. When you drop Python 2 support, just remove the import.
    Serhiy Storchaka, Oct 15, 2012
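
The last suggestion in the thread can be sketched as below. This is only a minimal illustration, assuming Python 2.6 or later (where `unicode_literals` is available) or any Python 3 release; the names `text`, `data`, `text_type`, `bytes_type` and `describe` are hypothetical and do not come from the posts above. It deliberately avoids the u'' prefix, so the same file should also compile on Python 3.0 to 3.2, where u'' is a syntax error.

```python
# The __future__ import has to come before any other statement in the module.
from __future__ import unicode_literals

# With the import active, an unprefixed literal is text on Python 2.6+ and on
# Python 3, while b'' is a byte string on both.
text = "caf\xe9"        # 4 code points; \xe9 is U+00E9
data = b"caf\xc3\xa9"   # 5 bytes: the UTF-8 encoding of the same text

# Hypothetical helper names, introduced only for this illustration.
text_type = type("")    # `unicode` on Python 2, `str` on Python 3
bytes_type = type(b"")  # `str` on Python 2, `bytes` on Python 3


def describe(value):
    """Return 'text', 'bytes' or 'other' depending on the value's type."""
    if isinstance(value, text_type):
        return "text"
    if isinstance(value, bytes_type):
        return "bytes"
    return "other"


print("%s %s" % (describe(text), describe(data)))
```

On Python 3 the import has no effect, which is why it can simply be deleted once Python 2 support is dropped.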