New Python 3.0 string formatting - really necessary?

MRAB · Dec 21, 2008

Aaron said:
Hi, not to take sides, but, there is a possibility.

This behavior is currently legal:

'0 1'

So, just extend it. (Unproduced.)

'(2, 3, 4) %i'

Which is quite clever and way ahead of its (possessive) time.

A couple of problems:

1. How do you handle a literal '%'? If you just double up then you'll
need to fix the string after all your substitutions.

2. What if a substitution introduces a '%'?
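Both problems can be seen with today's `%` operator by applying it twice. This is only a small sketch; the variable names and sample values are made up for illustration:

```python
# Doubling the percent sign survives the first pass as '%i',
# so a second '%' can fill it: '0 %i' and then '0 1'.
two_pass = "%i %%i" % 0 % 1

# If the first substitution itself carries a '%d', the second pass sees
# two placeholders ('50%d %i') but only one value, and fails.
try:
    hijacked = "%s %%i" % "50%d" % 1
except TypeError as exc:
    hijacked = None
    hijack_error = str(exc)
```

The second case is the one that makes naive chaining fragile: the result of the first pass is an ordinary string, so nothing records which percent signs were meant as placeholders.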

I suppose a possible solution would be to introduce a special format
string, including a literal, eg:

f"%r %i" % (2, 3, 4) % 1

and then convert the result to a true string:

print(str(f"%r %i" % (2, 3, 4) % 1))

(although print() would call __str__ anyway).

The format string would track where the last substitution occurred.

Hmm... I think I'll just learn the new method.

walterbyrd · Dec 21, 2008

On Dec 20 said:
He got really hung up on the % syntax.

I guess it's good to know that there is, at least, one person in the
world who doesn't like the % formatting. At least the move was not
entirely pointless.

But, you must admit, of all the things people complain about with
Python, the % formatting is probably one of the least common
complaints. Complaints about Python's speed seem much more common.

Yet, 3.0 makes the speed worse, and "fixes" a non-problem.

I can see where the new formatting might be helpful in some cases.
But, I am not sure it's worth the cost.

Kay Schluehr · Dec 21, 2008

Debated by who? The entire Python-using community? Every single Python
programmer? Or just the small proportion of Python developers who are
also core developers?

"If I'd asked people what they wanted, they would have asked for a
better horse."

Henry Ford

r · Dec 21, 2008

I guess it's good to know that there is, at least, one person in the
world who doesn't like the % formatting. At least the move was not
entirely pointless.

But, you must admit, of all the things people complain about with
Python, the % formatting is probably one of the least common
complaints. Complaints about Python's speed seem much more common.

Yet, 3.0 makes the speed worse, and "fixes" a non-problem.

I can see where the new formatting might be helpful in some cases.
But, I am not sure it's worth the cost.

This all really comes down to the new Python users. Yeah, I said it.
Not rabid fanboys like Steven and myself. (I can't speak for walter, but
I think he would agree.) Are we going to make sure joe-blow Python
newbie likes the language, and doesn't get turned off and run over to
Ruby or whoever? Like it or not, without new users Python is doomed to
the same fate as all the other "great" languages that had their 15 minutes
of fame.

We must proactively seek out the wants of these new users and make
sure python stays alive. But we also must not sell are pythonic souls
in the process.

It would be nice to get a vote together and see what the average
pythoneer wants. What do they like? What do they dislike? What is the
state of the Python Union? Does anybody know? Does anybody care? I
think Python is slipping away from its dominant foothold on the
world. Google's use of Python may be the only thing holding this house
of cards together. Ruby's "hype" is definitely growing and unless we
strive for greatness, Python may fail. I think Ruby may have their act
together a little better than us right now. And since Ruby is such a
hodge-podge of different languages, the __init__ hold is there for
many.

what does joe-python want???

Marc 'BlackJack' Rintsch · Dec 21, 2008

It would be nice to get a vote together and see what the average
pythoneer wants. What do they like? What do they dislike? What is the
state of the Python Union? Does anybody know? Does anybody care? I think
Python is slipping away from its dominant foothold on the world.
Google's use of Python may be the only thing holding this house of cards
together. Ruby's "hype" is definitely growing and unless we strive for
greatness, Python may fail. I think Ruby may have their act together a
little better than us right now. And since Ruby is such a hodge-podge of
different languages, the __init__ hold is there for many.

what does joe-python want???

That's not completely irrelevant, but I think one of Python's strengths is
that we have a BDFL who decides carefully what he thinks is best for the
language instead of integrating every random idea some newbie comes up
with and which might sound useful at first sight.

Python has its quirks, but even with things I don't like I often realize
later that it was a wise decision that integrates well into the language,
whereas my ideas for "fixes" of the quirks wouldn't. "joe-python" most
often doesn't see the whole picture and demands changes that look easy at
first sight, but are hard to implement correctly and efficiently, or just
shift the problem somewhere else where the next "joe-python" trips over it.

Ciao,
Marc 'BlackJack' Rintsch

Marc 'BlackJack' Rintsch · Dec 21, 2008

Really? You know, it's funny, but when I read problems that people have
with python, I don't remember seeing that. Loads of people complain
about the white space issue. Some people complain about the speed. Lots
of complaints about certain quirky behavior, but I have not come across
any complaints about the string formatting.

Much newbie code I have seen avoids it by string concatenation:

greeting = 'Hello, my name is ' + name + ' and I am ' + str(age) + ' old.'

That's some kind of indirect complaint.
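For comparison, here is the same greeting written three ways. The values of `name` and `age` are placeholders chosen for this sketch; the last form uses the `str.format` method that the thread is debating:

```python
name = "Alice"  # hypothetical sample value
age = 30        # hypothetical sample value

# Concatenation: every non-string has to be converted by hand.
by_concat = 'Hello, my name is ' + name + ' and I am ' + str(age) + ' old.'

# Classic % interpolation.
by_percent = 'Hello, my name is %s and I am %d old.' % (name, age)

# The newer str.format() method (Python 2.6 and 3.0 onwards).
by_format = 'Hello, my name is {0} and I am {1} old.'.format(name, age)
```

All three produce the same string; the difference is only in how much bookkeeping the author has to do.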

In fact, from what I have seen, many of the "problems" being "fixed"
seem to be non-problems.

And even if nobody has problems with the limitations of ``%`` string
formatting why shouldn't they add a more flexible and powerful way!?
Python 3.0 is not a bug fix release.

Ciao,
Marc 'BlackJack' Rintsch

Steven D'Aprano · Dec 21, 2008

2) In my experience, major version changes tend to be slower than
before. When a lot of things change, especially if very low-level
things change, as happened in python 3.0, the new code has not yet gone
through many years of revision and optimization that the old code has.

I was around for the change from Python 1.5 -> 2.x. By memory, I skipped
a couple of versions... I think I didn't make the move until Python 2.2
or 2.3 was released. Python 2.0 was significantly slower than 1.5 in a
number of critical areas, but not for long.

Actually, it's quite possible that Python 1.5 is still faster than Python
2.x in some areas -- but of course it misses a lot of features, and at
the end of the day, the difference between your script completing in 0.03
seconds or in 0.06 seconds is meaningless.

In my opinion, python 3 was rushed out the door a bit. It could have
done with a few more months of optimization and polishing. However, on
the other hand, it is going to take so long for python infrastructure to
convert to python 3, that an earlier release makes sense, even if it
hasn't been excessively polished. The biggest reason for the speed
change is the rewritten stdio and unicode-everything. Hopefully this
stuff can be improved in future updates. I don't think anyone WANTS
cpython to be slower.

I understand that the 3.0.1 release due out around Christmas will have
some major speed-ups in stdio.

Aaron Brady · Dec 21, 2008

Errors should never pass silently, unless explicitly silenced. You have
implicitly silenced the TypeError you get from not having enough
arguments for the first format operation. That means that you will
introduce ambiguity and bugs.

No, it's not part of the (new) '%' operation. '%' handles one flag at
a time. It's not an error if the author intends it.

"%i %i %i %i" % 5 % 3 %7

Here I have four slots and only three numbers. Which output did I expect?

'%i 5 3 7'
'5 %i 3 7'
'5 3 %i 7'
'5 3 7 %i'

Anything, so long as it's (contraction) consistent and general.

Or more likely, the three numbers is a mistake, there is supposed to be a
fourth number there somewhere, only now instead of the error being caught
immediately, it won't be discovered until much later.

Leave that to unit testing and your QA team.

To make the change, the burden of proof (which is large) would fall to
me. However, in the abstract case, it's not clear that either one is
favorable, more obvious, or a simpler extrapolation.

Bug-proneness is an argument against a construction, just not a
conclusive one. How heavy is it in this case?

Aaron Brady · Dec 21, 2008

A couple of problems:

1. How do you handle a literal '%'? If you just double up then you'll
need to fix the string after all your substitutions.

2. What if a substitution introduces a '%'?

I suppose a possible solution would be to introduce a special format
string, including a literal, eg:

f"%r %i" % (2, 3, 4) % 1

and then convert the result to a true string:

print(str(f"%r %i" % (2, 3, 4) % 1))

(although print() would call __str__ anyway).

The format string would track where the last substitution occurred.

Hmm... I think I'll just learn the new method.

Now that I'm fighting 'r's war for him/her...

Um, here's one possibility. On the first interpolation, flags are
noted and stored apart from subsequent interpolations. Then, use a
sentinel to terminate the interpolation. (Unproduced.)

'%dss0'

The first %s is replaced with %d, but doesn't hijack the '0'. If you
want to interpolate the %d, use the sentinel. The sentinel is what
causes '%%' to be handled.

Traceback (most recent call last):

'1ss0'

Treating tuples as a special case appears to be the simpler solution,
but this, 'chaining', to adopt the term, is still feasible.

Marc 'BlackJack' Rintsch · Dec 21, 2008

You seem to have made an unwarranted assumption, namely that a binary
operator has to compile to a function with two operands. There is no
particular reason why this has to always be the case: for example, I
believe that C# when given several strings to add together optimises
this into a single call to a concatenation method.

Python *could* do something similar if the appropriate opcodes/methods
supported more than two arguments:

a+b+c+d might execute a.__add__(b,c,d) allowing more efficient string
concatenations or matrix operations, and a%b%c%d might execute as
a.__mod__(b,c,d).

But that needs special casing strings and ``%`` in the compiler, because
it might not be always safe to do this on arbitrary objects. Only in
cases where the type of `a` is known at compile time and ``a % b``
returns an object of ``type(a)``.

Ciao,
Marc 'BlackJack' Rintsch

Steven D'Aprano · Dec 21, 2008

Steven D'Aprano said:

Errors should never pass silently, unless explicitly silenced. You have
implicitly silenced the TypeError you get from not having enough
arguments for the first format operation. That means that you will
introduce ambiguity and bugs.

"%i %i %i %i" % 5 % 3 %7

Here I have four slots and only three numbers. Which output did I
expect?

'%i 5 3 7'
'5 %i 3 7'
'5 3 %i 7'
'5 3 7 %i'

Or more likely, the three numbers is a mistake, there is supposed to be
a fourth number there somewhere, only now instead of the error being
caught immediately, it won't be discovered until much later.


You seem to have made an unwarranted assumption, namely that a binary
operator has to compile to a function with two operands. There is no
particular reason why this has to always be the case: for example, I
believe that C# when given several strings to add together optimises
this into a single call to a concatenation method.
[...]

Python *could* do something similar if the appropriate opcodes/methods
supported more than two arguments:

a+b+c+d might execute a.__add__(b,c,d) allowing more efficient string
concatenations or matrix operations, and a%b%c%d might execute as
a.__mod__(b,c,d).

That's only plausible if the operations are associative. Addition is
associative, but string interpolation is not:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: not all arguments converted during string formatting

Since string interpolation isn't associative, your hypothetical __mod__
method might take multiple arguments, but it would have to deal with them
two at a time, unlike concatenation where the compiler could do them all
at once. So whether __mod__ takes two arguments or many is irrelevant:
its implementation must rely on some other function which takes two
arguments and must succeed or fail on that.
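The non-associativity can be checked directly in current Python. The format string below is just one example chosen for this sketch; it produces the same TypeError message as the traceback above when grouped from the left:

```python
fmt = "%%%s"

# Right-grouped: the inner interpolation runs first and yields 'b',
# which is then placed after the literal percent sign.
right_grouped = fmt % ("%s" % "b")

# Left-grouped, which is how Python evaluates a % b % c:
# the first step yields '%%s', a string with no placeholder left,
# so the second step has an argument it cannot use.
try:
    left_grouped = (fmt % "%s") % "b"
except TypeError as exc:
    left_grouped = None
    error_text = str(exc)
```

Because the two groupings disagree, a compiler could not safely collapse a chain of `%` operations into one call without knowing it was dealing with this exact case.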

Either that, or we change the design of % interpolation, and allow it to
silently ignore errors. I assumed that is what Aaron wanted.

In that alternate universe your example:

"%i %i %i %i" % 5 % 3 %7

simply throws "TypeError: not enough arguments for format string"

That has a disturbing consequence.

Consider that for most (all?) operations, we can use temporary values:

x = 1 + 2 + 3 + 4
=> x == 10

gives the same value for x as:

temp = 1 + 2 + 3
x = temp + 4

I would expect that the same should happen for % interpolation:

# using Aaron's hypothetical syntax
s = "%s.%s.%s.%s" % 1 % 2 % 3 % 4
=> "1.2.3.4"

should give the same result as:

temp = "%s.%s.%s.%s" % 1 % 2 % 3
s = temp % 4

But you're arguing that the first version should succeed and the second
version, using a temporary value, should fail. And that implies that if
you group part of the expression in parentheses, it will fail as well:

s = ("%s.%s.%s.%s" % 1 % 2 % 3) % 4

Remove the parentheses, and it succeeds. That's disturbing. That makes
the % operator behave very differently from other operators.

Note that with the current syntax, we don't have that problem: short-
supplying arguments leads to an exception no matter what.
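A quick check of that claim under the current rules. The helper `interpolate` is a name invented for this sketch; it only wraps the `%` operator so the failure can be captured as text:

```python
def interpolate(fmt, args):
    """Return fmt % args, or the TypeError message if interpolation fails."""
    try:
        return fmt % args
    except TypeError as exc:
        return "TypeError: " + str(exc)


fmt = "%s.%s.%s.%s"

complete = interpolate(fmt, (1, 2, 3, 4))

# Three values for four slots, written inline...
short_direct = interpolate(fmt, (1, 2, 3))

# ...and the same three values passed through a temporary name.
temp_args = (1, 2, 3)
short_via_temp = interpolate(fmt, temp_args)
```

Both short-supplied calls fail the same way, so introducing a temporary does not change the outcome.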

"%s" % (1,2,3)

just converts the tuple as a single argument. It also provides the
answer to how you put a percent in the format string (double it)

I trust you know that already works, but just in case:

'12.5%'

and what happens if a substitution inserts a percent (it doesn't
interact with the formatting operators).
Ditto:

'%g'
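For reference, a sketch of how the existing operator handles these three cases. The variable names are made up here, and the one-element tuple in the last line is the current idiom for passing a tuple as a single value, not part of the proposal being discussed:

```python
# A doubled percent sign in the format string becomes a literal '%'.
literal_percent = "%.1f%%" % 12.5

# A '%' that arrives inside a substituted value is left alone.
substituted_percent = "%s" % "%g"

# Today, a bare tuple is unpacked as the argument list, so a tuple meant
# as one value has to be wrapped in a one-element tuple.
tuple_as_value = "%s" % ((1, 2, 3),)
```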

Steve Holden · Dec 21, 2008

r said:
This all really comes down to the new Python users. Yeah, I said it.
Not rabid fanboys like Steven and myself. (I can't speak for walter, but
I think he would agree.) Are we going to make sure joe-blow Python
newbie likes the language, and doesn't get turned off and run over to
Ruby or whoever? Like it or not, without new users Python is doomed to
the same fate as all the other "great" languages that had their 15 minutes
of fame.

We must proactively seek out the wants of these new users and make
sure python stays alive. But we also must not sell are pythonic souls

that's "our" (possessive), r, not "are" (verb)

in the process.

It would be nice to get a vote together and see what the average
pythoneer wants. What do they like? What do they dislike? What is the
state of the Python Union? Does anybody know? Does anybody care? I
think Python is slipping away from its dominant foothold on the
world. Google's use of Python may be the only thing holding this house
of cards together. Ruby's "hype" is definitely growing and unless we
strive for greatness, Python may fail. I think Ruby may have their act
together a little better than us right now. And since Ruby is such a
hodge-podge of different languages, the __init__ hold is there for
many.

what does joe-python want???

Don't make the mistake of assuming there is a "Joe Python" whose needs
neatly encapsulate the sum of all Python users' needs. There's plenty of
evidence from this group that different people like, want or need
different things from Python, and attempting to measure user
requirements by democratic means is not likely to produce much useful
information.

There is no such thing as "the average Python programmer": an average
can only be measured for one-dimensional values on some sort of linear
continuum. Python users live in a multi-dimensional space where the
concept of an average has little meaning and less use.

As for your assertion that Google's use of Python may be the only thing
maintaining Python's popularity, it's complete twaddle. Take a look
around at who's involved in using Python. I suspect Industrial Light and
Magic may have more Python programmers than Google, who also make
extensive use of Java and one other language (C++?), as well as a bevy
of others as justified by project needs. Rackspace, NASA, Canonical and
many others are keen supporters of the language, and they put their
money where their mouths are by incorporating it into their products.

regards
Steve

skip · Dec 21, 2008

Marc> Many newbie code I have seen avoids it by string concatenation:

Marc> greeting = 'Hello, my name is ' + name + ' and I am ' + str(age) + ' old.'

Marc> That's some kind of indirect complaint.

I see Python code like that written by people with a C/C++ background. I
don't think you can necessarily chalk that up to %-string avoidance. They
learn that + will concatenate two strings and don't look further.

Aaron Brady · Dec 21, 2008

But that needs special casing strings and ``%`` in the compiler, because
it might not be always safe to do this on arbitrary objects. Only in
cases where the type of `a` is known at compile time and ``a % b``
returns an object of ``type(a)``.

'x+y' makes no guarantees whatsoever. It could return an object of
type(x), type(y), or neither. 'a%b' in the case of strings is just,
str.__mod__, returning string.

In a+b+c, 'a' gets dibs over what the rest see, so there's no more
danger in the multi-ary case, than in binary; and that hasn't stopped
us before.

You might be confusing the cases of arbitrary operators vs. uniform
operators. 'a' does not get dibs in 'a+b*c'; 'b*c' are allowed to
carry out their affairs. But in 'a+b+c', 'a*b*c', 'a%b%c', and so on,
'a' has final say on b's and c's behaviors via its return value, so
loses nothing by combining such a call.

In short, you can force it anyway, so it's syntactic sugar after that.

Aaron Brady · Dec 21, 2008

r wrote: snip

that's "our" (possessive), r, not "are" (verb)

Don't make the mistake of assuming there is a "Joe Python" whose needs
neatly encapsulate the sum of all Python users' needs. There's plenty of
evidence from this group that different people like, want or need
different things from Python, and attempting to measure user
requirements by democratic means is not likely to produce much useful
information.

There is no such thing as "the average Python programmer": an average
can only be measured for one-dimensional values on some sort of linear
continuum. Python users live in a multi-dimensional space where the
concept of an average has little meaning and less use.

You've confused dimensions with modes. There is such a thing as the
center of a bivariate distribution--- it is merely the most common of
the individual variables, the top of a 3-D hill, or the center of
mass.

However, an average only makes sense for unimodal distributions. If
the distribution is bi-modal, there's no "average" in the neat sense.

Dollars earned per hour spent writing in Python is a good candidate.
There are two modes in that distribution. One at 0, the other in the
tens or hundreds. And the global average is less common than either
mode individually. So in this case, we have one "Joe Py" for every
mode: Joe Free Py, and Joe Paid Py. (You might actually get multi-
modal on that one-- Joe Salary Py, Joe Wage Py, Joe Stipend Py, Joe
Free Py, but $0.01/hr. is less common than 0, and less common than
$50.)

You might also argue that the standard deviation is so high as to make
any one data point unrepresentative of many others. But if you have
variables in two dimensions, they're independent by definition (or
there exists a basis set that is).
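A toy illustration of the bimodal point. The hourly rates below are invented numbers, not survey data; they only show how the mean can land on a value nobody actually has:

```python
from collections import Counter
from statistics import mean

# Hypothetical dollars-per-hour figures: six unpaid users, four paid ones.
hourly_rates = [0, 0, 0, 0, 0, 0, 50, 50, 50, 50]

average_rate = mean(hourly_rates)  # 200 / 10
counts = Counter(hourly_rates)

# How many people actually earn the "average" rate?
people_at_average = counts[average_rate]
```

Here the average is 20, yet no one in the sample earns 20, while each mode (0 and 50) describes a real group.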

Mel · Dec 21, 2008

Duncan said:
I don't see that. What I suggested was that a % b % c would map to
a.__mod__(b,c). (a % b) % c would also map to that, but a % (b % c) could
only possibly map to a.__mod__(b.__mod__(c))

There's a compiling problem here, no? You don't want a%b%c to implement as
a.__mod__(b,c) if a is a number.

Mel.
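The numeric case can be sketched as follows. The operands are arbitrary integers picked for the example; the last part only confirms that `int.__mod__` as it exists today takes a single right-hand operand:

```python
# For integers, % is the remainder operator and groups from the left.
chained = 17 % 7 % 2        # (17 % 7) % 2 -> 3 % 2
regrouped = 17 % (7 % 2)    # 17 % 1

# int.__mod__ accepts exactly one other operand, so a hypothetical
# multi-argument call is rejected.
try:
    int.__mod__(17, 7, 2)
    multi_arg_rejected = False
except TypeError:
    multi_arg_rejected = True
```

So a compiler that rewrote every `a % b % c` into a single multi-argument call would have to know at compile time that `a` is not a number, which it generally cannot.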

Marc 'BlackJack' Rintsch · Dec 21, 2008

I could be wrong, but I don't see that would be the case.

I think it would be safe (in this hypothetical universe) any time that
'a' had a method __mod__ which accepted more than one argument.

And returns an object of ``type(a)`` or at least a "duck type" so that it
is guaranteed that ``a.__mod__(b, c)`` really has the same semantics as
``a.__mod__(b).__mod__(c)``. For arbitrary objects `a`, `b`, and `c`
that are not known at compile time, how could the compiler decide if it
is safe to emit code that calls `a.__mod__()` with multiple arguments?

Ciao,
Marc 'BlackJack' Rintsch

MRAB · Dec 21, 2008

Aaron said:
Now that I'm fighting 'r's war for him/her...

Um, here's one possibility. On the first interpolation, flags are
noted and stored apart from subsequent interpolations. Then, use a
sentinel to terminate the interpolation. (Unproduced.)

'%dss0'

The first %s is replaced with %d, but doesn't hijack the '0'. If you
want to interpolate the %d, use the sentinel. The sentinel is what
causes '%%' to be handled.

Traceback (most recent call last):

'1ss0'

Treating tuples as a special case appears to be the simpler solution,
but this, 'chaining', to adopt the term, is still feasible.

A possible solution occurred to me shortly after I posted, but I decided
that sleep was more important.

The original format is a string. The result of '%' is a string if
there's only 1 placeholder to fill, or a (partial) format object (class
"Format"?) if there's more than one. Similarly, the format object
supports '%'. The result of '%' is a string if there's only 1
placeholder to fill, or a new (partial) format object if there's more
than one.
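
>>> f = "%r %i"
>>> type(f)
<type 'str'>
>>> f = f % (2, 3, 4)
>>> type(f)
<type 'Format'>
>>> f = f % 1
>>> type(f)
<type 'str'>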
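A rough sketch of how such a partial format object might behave. Only the class name `Format` comes from the description above; everything else is an assumption made for illustration. In particular, the format has to be wrapped in `Format` explicitly because `str.__mod__` cannot be changed from user code, the pattern covers only simple conversion specs, and each value fills the next unfilled slot from the left:

```python
import re

# Simplified conversion-spec pattern: flags, width, precision, type, or '%%'.
# Mapping keys ('%(name)s') and '*' widths are deliberately left out.
_SPEC = re.compile(r"%(?:[#0\- +]*\d*(?:\.\d+)?[diouxXeEfFgGcrsa]|%)")


def _next_placeholder(text):
    """Return the match for the first real placeholder, skipping '%%'."""
    for match in _SPEC.finditer(text):
        if match.group() != "%%":
            return match
    return None


class Format:
    """Hypothetical partial format object; not part of any Python release."""

    def __init__(self, template, done=""):
        self.done = done      # text that is final and never rescanned
        self.rest = template  # text that may still hold placeholders

    def __mod__(self, value):
        match = _next_placeholder(self.rest)
        if match is None:
            raise TypeError("not all arguments converted")
        # Literal text before the slot: collapse '%%' to '%'.
        head = self.rest[:match.start()].replace("%%", "%")
        # Wrap the value so a tuple is formatted as one argument.
        filled = match.group() % (value,)
        done = self.done + head + filled
        tail = self.rest[match.end():]
        if _next_placeholder(tail) is None:
            return done + tail.replace("%%", "%")  # last slot: plain str
        return Format(tail, done)


partial = Format("%r %i") % (2, 3, 4)   # still a Format
finished = partial % 1                  # now a str
with_percent = Format("%s and %s") % "%d" % "x"
```

Because already-substituted text is kept apart from the unprocessed remainder, a '%' introduced by a value is never mistaken for a placeholder, and '%%' is resolved exactly once.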

Aaron Brady · Dec 21, 2008

A possible solution occurred to me shortly after I posted, but I decided
that sleep was more important.

The original format is a string. The result of '%' is a string if
there's only 1 placeholder to fill, or a (partial) format object (class
"Format"?) if there's more than one. Similarly, the format object
supports '%'. The result of '%' is a string if there's only 1
placeholder to fill, or a new (partial) format object if there's more
than one.

>>> f = "%r %i"
>>> type(f)
<type 'str'>
>>> f = f % (2, 3, 4)
>>> type(f)
<type 'Format'>
>>> f = f % 1
>>> type(f)
<type 'str'>

Alright, so how are you handling:
?

In other words, are you slipping '1' in to the very first available
slot, or the next, after the location of the prior?

MRAB · Dec 21, 2008

Aaron said:
Alright, so how are you handling:

?

In other words, are you slipping '1' in to the very first available
slot, or the next, after the location of the prior?

Let's assume that Format objects display their value like the equivalent
string format:
'%%i 1'
