portable unicode literals

Ulrich Eckhardt

Hi!

I need a little nudge in the right direction, as I'm misunderstanding
something concerning string literals in Python 2 and 3. In Python 2.7,
b'' and '' are byte strings, while u'' is a unicode literal. In Python
3.2, b'' is a byte string and '' is a unicode literal, while u'' is a
syntax error.

This actually came as a surprise to me: I assumed that using b'' I could
portably create a byte string (which is true) and that using u'' I could
portably create a unicode string (which is not true). A portable u'' would
help when porting code between both versions. While this is a state I can
live with, I wonder what the rationale for it is.

!puzzled thanks

Uli
 
Steven D'Aprano

It was a mistake that is corrected in Python 3.3.

You can now use u'' to create Unicode literals in both 2.x and 3.3 or
later. This feature is designed only for porting code, though: you
shouldn't use u'' in new code that is not intended to run on 2.x.
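
To make that concrete, here is a minimal sketch of a snippet that should run unchanged on Python 2.7 and on Python 3.3 or later. The variable names and the sample string are only illustrative assumptions, not anything from the original question.

```python
# Runs unchanged on Python 2.7 and on Python 3.3 or later.
# Python 3.0 through 3.2 reject the u prefix with a SyntaxError.
text = u"caf\u00e9"      # unicode string: 4 code points
data = b"caf\xc3\xa9"    # byte string: the same text encoded as UTF-8, 5 bytes

# The two types stay distinct; conversion is always explicit.
assert len(text) == 4
assert len(data) == 5
assert text.encode("utf-8") == data
assert data.decode("utf-8") == text
```

On 3.3 and later the u prefix changes nothing: u"x" and "x" produce the same str type.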
 
Dave Angel

Python 3.3 added that syntax back, for easier porting. You can now use
u"xyz" for a unicode string in both 2.x and 3.3 or later.
 
Alex Strickland

u'' is legal in 3.3 again.
 
Serhiy Storchaka

from __future__ import unicode_literals

And now you can portably use b'' for a byte string and '' for a unicode
string. When you drop Python 2 support, just remove the import from
__future__.
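
A small sketch of what such a module could look like, assuming Python 2.6 or later on the 2.x side (that is where unicode_literals and the b prefix are available). The names text, data and text_type are made up for the illustration.

```python
# The __future__ import has to come before any other statement in the module.
from __future__ import unicode_literals

text = "caf\u00e9"       # unprefixed literal: a unicode string on 2.x and 3.x
data = b"caf\xc3\xa9"    # b prefix: a byte string on 2.x and 3.x

text_type = type(text)   # unicode on 2.x, str on 3.x

assert isinstance(data, bytes)
assert not isinstance(text, bytes)
assert data.decode("utf-8") == text
```

On Python 3 the import is accepted but has no effect, which is why it can simply be deleted later.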
 
