portable unicode literals

Discussion in 'Python' started by Ulrich Eckhardt, Oct 15, 2012.

  1. Hi!

    I need a little nudge in the right direction, as I'm misunderstanding
    something concerning string literals in Python 2 and 3. In Python 2.7,
    b'' and '' are byte strings, while u'' is a unicode literal. In Python
    3.2, b'' is a byte string and '' is a unicode literal, while u'' is a
    syntax error.

    This actually came as a surprise to me, I assumed that using b'' I could
    portably create a byte string (which is true) and using u'' I could
    portably create a unicode string (which is not true). This feature would
    help porting code between both versions. While this is a state I can
    live with, I wonder what the rationale for this is.
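
    To make the difference concrete, here is a small sketch that reports
    which runtime type each literal form produces on the running
    interpreter. The names PY2, text_type and describe are only
    illustrative and do not come from any library. The u'' form is left
    out on purpose, because a file containing it does not even compile
    on 3.0 through 3.2.

```python
import sys

PY2 = sys.version_info[0] == 2

# The text type is named `unicode` on Python 2 and `str` on Python 3.
# The name `unicode` is only looked up when PY2 is true.
text_type = unicode if PY2 else str  # noqa: F821


def describe(value):
    """Return 'bytes' or 'text' depending on the literal's runtime type."""
    if isinstance(value, bytes):
        return 'bytes'
    if isinstance(value, text_type):
        return 'text'
    return type(value).__name__


bytes_kind = describe(b'')   # 'bytes' on 2.7 and on 3.x
plain_kind = describe('')    # 'bytes' on 2.7, 'text' on 3.x
```

    On 2.7 `bytes` is just an alias for `str`, which is why a plain ''
    is reported as bytes there.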

    !puzzled thanks

    Uli
    Ulrich Eckhardt, Oct 15, 2012
    #1

  2. On Mon, 15 Oct 2012 15:05:01 +0200, Ulrich Eckhardt wrote:

    > Hi!
    >
    > I need a little nudge in the right direction, as I'm misunderstanding
    > something concerning string literals in Python 2 and 3. In Python 2.7,
    > b'' and '' are byte strings, while u'' is a unicode literal. In Python
    > 3.2, b'' is a byte string and '' is a unicode literal, while u'' is a
    > syntax error.
    >
    > This actually came as a surprise to me, I assumed that using b'' I could
    > portably create a byte string (which is true) and using u'' I could
    > portably create a unicode string (which is not true). This feature would
    > help porting code between both versions. While this is a state I can
    > live with, I wonder what the rationale for this is.


    It was a mistake that is corrected in Python 3.3.

    You can now use u'' to create Unicode literals in both 2.x and 3.3 or
    later. This feature exists only to ease porting, though: you shouldn't
    use u'' in new code that doesn't need to run on 2.x.
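
    If you want to check what a given interpreter does with the prefix,
    something like the following should work. It is only a sketch: the
    helper name accepts_u_prefix is made up for illustration. It hands
    the literal to the built-in compile() as a string, so the script
    itself still parses on versions that reject u''.

```python
import sys


def accepts_u_prefix():
    """Return True if this interpreter parses u'' string literals."""
    try:
        # Compile from a string so this file stays valid on 3.0-3.2.
        compile("u'abc'", "<u-prefix-check>", "eval")
    except SyntaxError:
        return False
    return True


supported = accepts_u_prefix()

# Expected per this thread: accepted on 2.x and on 3.3 or later.
expected = sys.version_info[0] == 2 or sys.version_info >= (3, 3)
```

    On 3.3 or later the prefix changes nothing: u'abc' and 'abc' are
    the same str value.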


    --
    Steven
    Steven D'Aprano, Oct 15, 2012
    #2

  3. On 10/15/2012 09:05 AM, Ulrich Eckhardt wrote:
    > Hi!
    >
    > I need a little nudge in the right direction, as I'm misunderstanding
    > something concerning string literals in Python 2 and 3. In Python 2.7,
    > b'' and '' are byte strings, while u'' is a unicode literal. In Python
    > 3.2, b'' is a byte string and '' is a unicode literal, while u'' is a
    > syntax error.
    >
    > This actually came as a surprise to me, I assumed that using b'' I
    > could portably create a byte string (which is true) and using u'' I
    > could portably create a unicode string (which is not true). This
    > feature would help porting code between both versions. While this is a
    > state I can live with, I wonder what the rationale for this is.
    >
    > !puzzled thanks
    >
    > Uli


    Python 3.3 added that syntax back, for easier porting. You can now use
    u"xyz" for a unicode string in both 2.x and 3.3 or later.

    --

    DaveA
    Dave Angel, Oct 15, 2012
    #3
  4. On 2012/10/15 03:05 PM, Ulrich Eckhardt wrote:

    > This actually came as a surprise to me, I assumed that using b'' I could
    > portably create a byte string (which is true) and using u'' I could
    > portably create a unicode string (which is not true). This feature would
    > help porting code between both versions. While this is a state I can
    > live with, I wonder what the rationale for this is.
    >
    > !puzzled thanks


    u'' is legal in 3.3 again.

    --
    Regards
    Alex
    Alex Strickland, Oct 15, 2012
    #4
  5. On 15.10.12 16:05, Ulrich Eckhardt wrote:
    > I need a little nudge in the right direction, as I'm misunderstanding
    > something concerning string literals in Python 2 and 3. In Python 2.7,
    > b'' and '' are byte strings, while u'' is a unicode literal. In Python
    > 3.2, b'' is a byte string and '' is a unicode literal, while u'' is a
    > syntax error.
    >
    > This actually came as a surprise to me, I assumed that using b'' I could
    > portably create a byte string (which is true) and using u'' I could
    > portably create a unicode string (which is not true). This feature would
    > help porting code between both versions. While this is a state I can
    > live with, I wonder what the rationale for this is.


    from __future__ import unicode_literals

    Now you can portably use b'' for a byte string and '' for a unicode
    string. When you drop Python 2 support, just remove the __future__
    import.
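
    A minimal sketch of that approach, assuming the module should run
    unchanged on 2.6+ and on 3.x. The variable names and the sample
    string are only for illustration. Note that the __future__ import
    has to come before any other statement in the module.

```python
from __future__ import unicode_literals

# With the import above, an unprefixed literal is text on 2.x as well.
text = 'caf\xe9'          # four code points, the last one is U+00E9
data = b'caf\xc3\xa9'     # the same word as five UTF-8 encoded bytes

is_bytes = isinstance(data, bytes)
is_text = not isinstance(text, bytes)
round_trip = data.decode('utf-8') == text
```

    On 3.x the import is accepted and has no effect, so it can stay in
    place until 2.x support is dropped.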
    Serhiy Storchaka, Oct 15, 2012
    #5
