PyWart: The problem with "print"


Dave Angel

Strikes me it's a bit like problems when prototyping circuit boards. The
card doesn't work, so you mount it on an extender card, problem goes
away, remove extender card, problem reappears. Wash, rinse, repeat :)

That's when you use a little kappy-zapper spray.
 

Ian Kelly

I'm actually with RR in terms of eliminating the overhead involved with
'dead' function calls, since there are instances when optimizing in Python
is desirable. I actually recently adjusted one of my own scripts to
eliminate branching and improve data layout to achieve a 1000-fold
improvement in efficiency (~45 minutes to 0.42 s. for one example) --- all
in pure Python. The first approach was unacceptable, the second is fine.
For comparison, if I add a 'deactivated' debugprint call into the inner loop
(executed 243K times in this particular test), then the time of the
double-loop step that I optimized takes 0.73 seconds (nearly doubling the
duration of the whole step).

It seems to me that your problem here wasn't that the time needed for
the deactivated debugprint was too great. Your problem was that a
debugprint that executes 243K times in 0.73 seconds is going to
generate far too much output to be useful, and it had no business
being there in the first place. *Reasonably* placed debugprints are
generally not going to be a significant time-sink for the application
when disabled.
The easiest way to eliminate these 'dead' calls is to simply comment-out the
print call, but I would be quite upset if the interpreter tried to outsmart
me and do it automagically as RR seems to be suggesting.

Indeed, the print function is for general output, not specifically for
debugging. If you have the global print deactivation that RR is
suggesting, then what you have is no longer a print function, but a
misnamed debug function.
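The overhead of a "dead" call is easy to measure directly. A minimal timeit sketch (the `debugprint` here is an illustrative stand-in, not the implementation from the thread; the iteration count echoes the 243K figure above):

```python
import timeit

DEBUG = False

def debugprint(*args):
    # A 'deactivated' debug print: returns immediately when DEBUG is
    # false, so the only cost is the function call itself.
    if DEBUG:
        print(*args)

def loop_with_call(n):
    total = 0
    for i in range(n):
        debugprint("i =", i)  # dead call, executed n times
        total += i
    return total

def loop_without_call(n):
    total = 0
    for i in range(n):
        total += i
    return total

n = 243_000
t_with = timeit.timeit(lambda: loop_with_call(n), number=3)
t_without = timeit.timeit(lambda: loop_without_call(n), number=3)
print(f"with dead call: {t_with:.3f}s  without: {t_without:.3f}s")
```

Both loops compute the same result; the difference between the two timings is the pure cost of the disabled call.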
 

Jason Swails

It seems to me that your problem here wasn't that the time needed for
the deactivated debugprint was too great. Your problem was that a
debugprint that executes 243K times in 0.73 seconds is going to
generate far too much output to be useful, and it had no business
being there in the first place. *Reasonably* placed debugprints are
generally not going to be a significant time-sink for the application
when disabled.


Well in 'debug' mode I wouldn't use an example that executed the loop 200K
times -- I'd find one that executed a manageable couple dozen, maybe.
When 'disabled,' the print statement won't do anything except consume
clock cycles and potentially displace useful cache (the latter being the
more harmful, since most applications are bound by the memory bus). It's
better to eliminate this dead call when you're not in 'debugging' mode.
(When active, it certainly would've taken more than 0.73
seconds.) Admittedly such loops should be tight enough that debugging
statements inside the inner loop are generally unnecessary, but perhaps not
always.

But unlike RR, who suggests some elaborate interpreter-wide, ambiguous
ignore-rule to squash out all of these functions, I'm simply suggesting
that sometimes it's worth commenting-out debug print calls instead of 'just
leaving them there because you won't notice the cost' :).
The easiest way to eliminate these 'dead' calls is to simply comment-out
the print call.

Indeed, the print function is for general output, not specifically for
debugging. If you have the global print deactivation that RR is
suggesting, then what you have is no longer a print function, but a
misnamed debug function.

Exactly. I was just trying to make the point that it is -occasionally-
worth spending the time to comment-out certain debug calls rather than
leaving 'dead' function calls in certain places.

All the best,
Jason

--
Jason M. Swails
Quantum Theory Project,
University of Florida
Ph.D. Candidate
352-392-4032
 
 

Steven D'Aprano

But unlike RR, who suggests some elaborate interpreter-wide, ambiguous
ignore-rule to squash out all of these functions, I'm simply suggesting
that sometimes it's worth commenting-out debug print calls instead of
'just leaving them there because you won't notice the cost' :).

+1

Further to this idea, many command line apps have a "verbose" mode, where
they print status messages as the app runs. Some of these include
multiple levels, so you can tune just how many messages you get, commonly:

- critical messages only
- important or critical messages
- warnings, important or critical messages
- status, warnings, important or critical messages
- all of the above, plus debugging messages
- all of the above, plus even more debugging messages

Since this verbosity level is selectable at runtime, the code itself must
include many, many calls to some equivalent to print, enough calls to
print to cover the most verbose case, even though most of the time most
such calls just return without printing.
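This runtime-selectable verbosity is exactly what the standard `logging` module provides. A minimal sketch mapping a hypothetical `-v`/`-vv`/`-vvv` count onto its levels:

```python
import logging

# Map a hypothetical -v count onto standard logging levels,
# mirroring the tiers listed above.
VERBOSITY = {
    0: logging.CRITICAL,  # critical messages only
    1: logging.WARNING,   # warnings and up
    2: logging.INFO,      # status messages and up
    3: logging.DEBUG,     # everything, including debug messages
}

def configure(verbosity):
    logging.getLogger().setLevel(VERBOSITY.get(verbosity, logging.DEBUG))

configure(1)
log = logging.getLogger()
log.debug("only shown at -vvv")    # filtered out: a cheap level check
log.warning("shown at -v and up")  # emitted
```

Disabled calls cost only an integer level comparison, which is the trade-off being discussed here.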

This is a feature. And like all features, it has a cost. If (generic)
your application does not benefit from verbose print statements scattered
all throughout it, *don't put them in*. But if it will, then there is a
certain amount of overhead to this feature. Deal with it, either by
accepting the cost, or by writing more code that trades off complexity
for efficiency. It's 2013, not 1975, and computers have more than 32K of
RAM and the slowest CPU on the market is a million times faster than the
ones that took us to the moon, and quite frankly I have no sympathy for
the view that CPU cycles are so precious that we mustn't waste them. If
that were the case, Python is the wrong language.
 

Chris Angelico

... quite frankly I have no sympathy for
the view that CPU cycles are so precious that we mustn't waste them. If
that were the case, Python is the wrong language.

CPU cycles *are* valuable still, though. The efficiency of your code
determines how well it scales - but we have to be talking 100tps vs
1000tps here. There needs to be a huge difference for it to be at all
significant.

ChrisA
 

Rick Johnson

Oh Steven, you've really outdone yourself this time with the
theatrics. I hope you scored some "cool points" with your
minions. Heck, you almost had me convinced until i slapped
myself and realized your whole argument is just pure BS. For
the sake of the lemmings, i must dissect your BS and expose
its methane-emitting innards for all to smell.
Many
languages provide a function, method, or statement by which users can
write easily to stdout, and Python is no exception with its own "print"
function. However, whilst writing to stdout via "print" is slightly less
verbose than calling the "write" method of "sys.stdout", we don't really
gain much from this function except a few keystrokes... is this ALL
print should be? A mere syntactical sugar?

Perhaps you should read the docs before asking rhetorical questions,
because the actual answer is, No, print is not mere syntactical sugar
saving a few keystrokes.
[...]

And perhaps you should read a dictionary and obtain (at
minimum) a primary school level education in English before
making such foolish statements, because, OBVIOUSLY you don't
know the definition of "syntactical sugar"... shall i
educate you?


############################################################
# Wikipedia: "syntactic sugar"                             #
############################################################
# In computer science, syntactic sugar is syntax within a  #
# programming language that is designed to make things     #
# easier to read or to express. It makes the language      #
# "sweeter" for human use: things can be expressed more    #
# clearly, more concisely, or in an alternative style that #
# some may prefer[...]                                     #
############################################################

The print function is the very definition of a "syntactic sugar".

For example:
print("some string")

is much more readable than:

sys.stdout.write("some string"+"\n")

or:

sys.stderr.write("some string"+"\n")

or:

streamFoo.write("blah")

But wait, there's more!

############################################################
# Wikipedia: "syntactic sugar" (continued)                 #
############################################################
# [...]Specifically, a construct in a language is called   #
# syntactic sugar if it can be removed from the language   #
# without any effect on what the language can do:          #
# functionality and expressive power will remain the same. #
############################################################

Again, the removal of a print function (or print statement)
will not prevent users from calling the write method on
sys.stdout or sys.stderr (or ANY "stream object" for that matter!)

The only mistake i made was to specify stdout.write
specifically instead of generally referring to the print
function as a sugar for "stream.write()".
I've found that many subtle bugs are caused by not limiting the inputs
to sane values (or types). And with Python's duct typing [...]
and implicit
casting to Boolean, you end up with all sorts of misleading things
happening! Maybe you're testing for truth values and get a string
instead; which screws everything up!!!
Only if you're a lousy programmer who doesn't understand Python's truth
model.

I understand the Python truth model quite well, i just don't
happen to like it. Implicit conversion to Boolean breaks the
law of "least astonishment".

Many times you'll get a result (or an input) that you expect
to be a Boolean, but instead is a string. A good example of
poor coding is "dialog box return values". Take your
standard yes/no/cancel dialog, i would expect it to return
True|False|None respectively, HOWEVER, some *idiot* decided
to return the strings 'yes'|'no'|'cancel'.

If you tried to "bool test" a string (In a properly designed
language that does NOT support implicit Boolean conversion)
you would get an error if you tried this:

py> string = " "
py> if string:
... do_something()
ERROR: Cannot convert string to Boolean!

However, with Python's implicit conversion to Boolean, the
same conditional will ALWAYS be True: because any string
that is not the null string is True (as far as Python is
concerned). This is an example of Python devs breaking TWO
Zens at once:

"explicit is better than implicit"
"errors should NEVER pass silently"

And even though Python does not raise an error, it should!
No no no, that's not how you do it. It should be:
if DEBUG == True == True:
Wait, no, I got that wrong. It should be:
if DEBUG == True == True == True:
Hang on, I've nearly got it!
if DEBUG == True == True == True == True:
Or, you could program like a professional, and say:
if DEBUG:

Obviously you don't appreciate the value of "explicit enough".

if VALUE:

is not explicit enough, however

if bool(VALUE)

or at least:

if VALUE == True

is "explicit enough". Whereas:

if VALUE == True == True

is just superfluous. But that's just one example. What about this:

if lst:

I don't like that because it's too implicit. What exactly
about the list are we wanting to test? Do we want to know if
we have list object or a None object, OR, do we want to know
if we have a list object AND the list has members? I prefer
to be explicit at the cost of a few keystrokes:

if len(lst) > 0:

Covers the "has members" test and:

if lst is not None

covers the "existence" test.

I know Python allows me to be implicit here, however, i am
choosing to be explicit for the sake of anyone who will read
my code in the future but also for myself, because being
explicit when testing for truth can catch subtle bugs.

Consider the following:

What if the symbol `value` is expected to be a list,
however, somehow it accidentally got reassigned to another
type. If i choose to be implicit and use: "if value", the
code could silently work for a type i did not intend,
therefore the program could go on for quite some time before
failing suddenly on attribute error, or whatever.

However, if i choose to be explicit and use:

if len(VALUE) > 0:

then the code will fail when it should: at the comparison
line. Because any object that does not provide a __len__
method would cause Python to raise NameError.

By being "explicit enough" i will inject readability and
safety into my code base. (that's twice you've been schooled
in one reply BTW!)
By the way, why is DEBUG a constant? Doesn't that defeat the purpose?

Hmm, I agree! You're actually correct here. We should not
be reassigning constants should we? (<--rhetorical) In
correcting me you've exposed yet another design flaw with
Python. Sadly Python DOES allow reassignment of CONSTANTS.
And of course you have profiled your application, and determined that the
bottleneck in performance is the calls to debugprint, because otherwise
you are wasting your time and ours with premature optimization.
Life is hard. Sometimes you have to choose between performance and
debugging.

Only if your language does not provide a proper debugprint
function or provide the tools to create a proper debug print
function. I detest true global variables, however, there are
some legitimate reasons for true globals in every language.
This "debugprint" problem is one of those reasons.
If you really care about this premature optimization, you can do this:
if __debug__:
    print("whatever")

That's hideous! Two lines of code to make a single debug
message, are you joking? I realize Python will allow me to
place the entire statement on one line, however i refuse to
do that also. I am very strict about my block structure and
styles, and even the consistent inconsistency of the great
GvR will not sway me away from adherence to my consistent
readable style.

You then globally disable these print calls by running Python with the -O
switch.
Python does not work like that. Perhaps you should learn how to program
in Python before telling us how it should be improved?

And perhaps you should listen to diverse ideas and be open
to change instead of clinging to your guns and religion.
So, it will be utterly useless then, since __LOCALDEBUG__ has no effect,
and __GLOBALDEBUG__ overrides it. Great.

Of course global debug overrides local debug, what's the
purpose of global switching if it cannot override local
switching? "__GLOBALDEBUG__ = False" would disable ALL
debug messages EVERYWHERE. Yes, you are correct on this
issue. It would be the same as setting a command line switch.
However, you misunderstand __LOCALDEBUG__. When global
debugging is "on" "__LOCALDEBUG__ = False" will disable
debug messages ONLY in the module for which it was declared.
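For reference, the `if __debug__:` pattern suggested above in runnable form. Under `python -O`, `__debug__` is False and CPython discards the whole block at compile time, so the disabled call costs nothing (the `compute` function is a made-up example):

```python
def compute(x):
    if __debug__:
        # This block is removed entirely by the compiler when the
        # script is run with "python -O" (__debug__ becomes False).
        print("compute called with", x)
    return x * 2

print(compute(21))
```

This is the one print-suppression mechanism Python actually provides "interpreter-wide," and it is opt-in per call site rather than a blanket ignore-rule.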
 

Rick Johnson

you clearly mean True / False / FileNotFound.

No, i clearly meant what i said :). FileDialogs only return
one of two values; either a valid path or a value
representing "failure". I suppose FileNotFound is a custom
exception? That will work however i wonder if exception
handling is overkill for this?

try:
    path = filedialog.open("path")
except FileNotFound:
    return
do_something(path)

As opposed to:

path = filedialog.open("path")
if path:
    do_something(path)

Or, if Python was really cool!

if filedialog.open("path") as path:
    do_something(path)

However, i think True|False|None is the best return values
for a yes|no|cancel choice. Consider:

result = yesnocancel("save changes?")
if result:
    # Try to save changes and close.
    if self.fileSave():
        app.close()
    else:
        show_error()
elif result is False:
    # Close without saving changes.
    app.close()
else:
    # Canceled: Do nothing.
    return
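The dispatch above relies on the three return values being distinguishable. A self-contained sketch (the dialog and app objects are hypothetical, so a plain function stands in) showing why the `is False` test matters:

```python
def handle_dialog(result):
    # result is expected to be True (yes), False (no) or None (cancel).
    # Note the "is False" test: a bare "elif not result:" would match
    # None as well as False, silently merging "no" and "cancel".
    if result:
        return "save and close"
    elif result is False:
        return "close without saving"
    else:
        return "do nothing"

for choice in (True, False, None):
    print(choice, "->", handle_dialog(choice))
```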
 

Steven D'Aprano

you clearly mean True / False / FileNotFound.

( http://thedailywtf.com/Articles/What_Is_Truth_0x3f_.aspx )


No no, he actually means

return True
return False
raise an exception


Or perhaps

0
1
2

Or perhaps:

'yes'
'no'
'cancel'

like all right-thinking people expect *wink*

Of course the one thing that a programmer should never, ever do, under
pain of maybe having to learn something, is actually check the
documentation of an unfamiliar library or function before making
assumptions of what it will return. If you follow this advice, you too
can enjoy the benefits of writing buggy code.
 

Steven D'Aprano

Obviously you don't appreciate the value of "explicit enough".

if VALUE:

is not explicit enough, however

Consider a simple thought experiment. Suppose we start with a sequence of
if statements that begin simple and get more complicated:

if a == 1: ...

if a == 1 and b > 2*c: ...

if a == 1 and b > 2*c or d%4 == 1: ...

if a == 1 and b > 2*c or d%4 == 1 and not (d**3//7)%3 == 0: ...


I don't believe that any of these tests are improved by adding an
extraneous "== True" at the end:

if (a == 1) == True: ...

if (a == 1 and b > 2*c) == True: ...

if (a == 1 and b > 2*c or d%4 == 1) == True: ...

if (a == 1 and b > 2*c or d%4 == 1 and not (d**3//7)%3 == 0) == True: ...

At some point your condition becomes so complicated that you may wish to
save it as a separate variable, or perhaps you need to check the flag in
a couple of places and so calculate it only once. Moving the flag out
into a separate variable doesn't make "== True" any more useful or
helpful.

flag = a == 1
if flag == True: ...


But even if it did, well, you've just entered the Twilight Zone, because
of course "flag == True" is just a flag, so it too needs to be tested
with "== True":

flag = (a == 1) == True
if flag == True: ...

but that too is just a flag so it needs more "explicitness"... and so on
forever. This conclusion is of course nonsense. Adding "== True" to your
boolean tests isn't helpful, so there's no need for even one, let alone
an infinite series of "== True".

"if flag" is as explicit as it needs to be. There's no need to
artificially inflate the "explicitness" as if being explicit was good in
and of itself. We don't normally write code like this:

n += int(1)

just to be explicit about 1 being an int. That would be redundant and
silly. In Python, 1 *is* an int.


[...]
if lst:

I don't like that because it's too implicit. What exactly about the list
are we wanting to test?

If you are unfamiliar with Python, then you have to learn what the
semantics of "if lst" means. Just as you would have to learn what
"if len(lst) > 0" means.

I prefer to be explicit at the cost of a few keystrokes:

if len(lst) > 0:

This line of code is problematic, for various reasons:

- you're making assumptions about the object which are unnecessary;

- which breaks duck-typing;

- and risks doing too much work, or failing altogether.

You're looking up the length of the lst object, but you don't really care
about the length. You only care about whether there is something there or
not, whether lst is empty or not. It makes no difference whether lst
contains one item or one hundred million items, and yet you're asking to
count them all. Only to throw that count away immediately!

Looking at the length of a built-in list is cheap, but why assume it is a
built-in list? Perhaps it is a linked list where counting the items
requires a slow O(N) traversal of the entire list. Or some kind of lazy
sequence that has no way of counting the items remaining, but knows
whether it is exhausted or not.

The Python way is to duck-type, and to let the lst object decide for
itself whether it's empty or not:

if lst: ...


not to make assumptions about the specific type and performance of the
object.
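A sketch of the kind of object described above. The class is invented for illustration: its `__len__` is O(N), but its truth test is constant-time, so `if lst:` stays cheap while `len(lst) > 0` walks the whole chain:

```python
class LinkedList:
    """Toy singly linked list: len() walks every node, but the
    truth test only has to look at the head."""

    def __init__(self, items=()):
        self.head = None
        for item in reversed(list(items)):
            self.head = (item, self.head)

    def __len__(self):
        # O(N): traverses the entire chain to count the nodes.
        n, node = 0, self.head
        while node is not None:
            n, node = n + 1, node[1]
        return n

    def __bool__(self):
        # O(1): the list is empty exactly when there is no head node.
        return self.head is not None

lst = LinkedList(range(1000))
if lst:              # cheap: one attribute check
    print("non-empty")
if len(lst) > 0:     # same answer, but walks 1000 nodes first
    print("still non-empty")
```

Duck-typing lets the object choose the cheap path; `len(lst) > 0` forces the expensive one.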

Consider the following:

What if the symbol `value` is expected to be a list, however, somehow
it accidentally got reassigned to another type. If i choose to be
implicit and use: "if value", the code could silently work for a type i
did not intend, therefore the program could go on for quite some time
before failing suddenly on attribute error, or whatever.

`if len(lst) > 0` also works for types you don't intend. Any type that
defines a __len__ method which returns an integer will do it.

Tuples, sets and dicts are just the most obvious examples of things that
support len() but do not necessarily support all the things you might
wish to do to a list.


However, if i choose to be explicit and use:

if len(VALUE) > 0:

then the code will fail when it should: at the comparison line.

Except of course when it doesn't.

Because
any object that does not provide a __len__ method would cause Python to
raise NameError.
TypeError.



By being "explicit enough" i will inject readability and safety into my
code base.

Unnecessary verbosity and redundancy, unnecessary restrictions on the
type of the object, and unjustifiable assumptions about the cost of
calling len().
 

Chris Angelico

The print function is the very definition of a "syntactic sugar".

For example:
print("some string")

is much more readable than:

sys.stdout.write("some string"+"\n")
...
Again, the removal of a print function (or print statement)
will not prevent users from calling the write method on
sys.stdout or sys.stderr (or ANY "stream object" for that matter!)

And you could abolish ALL of the builtins by requiring that you import
ctypes and implement them all yourself. That is not the point of the
term. If print() is mere syntactic sugar, then everything is syntactic
sugar for Brainf* code.

The point of syntactic sugar is that there is a trivially-equivalent
underlying interpretation. For instance, in C, array subscripting is
trivially equivalent to addition and dereferencing:

a[i] <-> *(a+i)

This is syntactic sugar. The Python print() function does much more
than write(), so it is NOT syntactic sugar.
Many times you'll get a result (or an input) that you expect
to be a Boolean, but instead is a string. A good example of
poor coding is "dialog box return values". Take your
standard yes/no/cancel dialog, i would expect it to return
True|False|None respectively, HOWEVER, some *idiot* decided
to return the strings 'yes'|'no'|'cancel'.

Why True|False|None? Why should they represent Yes|No|Cancel?
Especially, *why None*? What has None to do with Cancel?
However, with Python's implicit conversion to Boolean, the
same conditional will ALWAYS be True: because any string
that is not the null string is True (as far as Python is
concerned). This is an example of Python devs breaking TWO
Zens at once:

"explicit is better than implicit"
"errors should NEVER pass silently"

Right, because it's Python's fault that you can't use implicit boolean
conversion to sanely test for something that has three possible
outcomes. I think there's something in the nature of a boolean test
that makes this awkward, but I can't quite see it... hmm, some kind of
integer issue, I think...
Obviously you don't appreciate the value of "explicit enough".

if VALUE:

is not explicit enough, however

if bool(VALUE)

or at least:

if VALUE == True

is "explicit enough".

Why? The 'if' implies a boolean context. In C, it's common to compare
integers for nonzeroness with a bare if; it's also common, though far
from universal, to compare strings for nullness - effectively
equivalent to "is not None". You don't need to be any more explicit
than that.

Granted, the definitions of truthiness differ from language to
language. In C, a NULL pointer is false and any actual pointer is
true, so an empty string is true (to the extent that C even has the
concept of strings, but leave that aside). In Pike, any array is true,
but the absence of an array can be indicated with (effectively) a
null, whereas Python deems that an empty list is false. Still, most
languages do have some system of coercion-to-boolean. (Notable
exception: REXX. An IF statement will accept *only* the two permitted
boolean values, anything else is an error.)
However, if i choose to be explicit and use:

if len(VALUE) > 0:

then the code will fail when it should: at the comparison
line. Because any object that does not provide a __len__
method would cause Python to raise NameError.

I thought you were dead against wasting CPU cycles! Your code here has
to calculate the actual length of the object, then compare it with
zero; the simple boolean check merely has to announce the presence or
absence of content. This is a HUGE difference in performance, and you
should totally optimize this down for the sake of that. Don't bother
measuring it, this will make more difference to your code than
replacing bubble sort with bogosort!

ChrisA
 

jmfauth

I never purposely inject ANY superfluous cycles in my code except in
the case of testing or development. To me it's about professionalism.
Let's consider a thought exercise shall we?


--------


The flexible string representation is the perfect example
of this lack of professionalism.
Wrong by design, a non-understanding of the mathematical logic,
of the coding of characters, of Unicode and of the usage of
characters (everything is tied together).

How is it possible to arrive at such a situation?
The answer is far beyond my understanding (although
I have my opinion on the subject).

jmf
 

rusi

--------

The flexible string representation is the perfect example
of this lack of professionalism.
Wrong by design, a non-understanding of the mathematical logic,
of the coding of characters, of Unicode and of the usage of
characters (everything is tied together).

How is it possible to arrive at such a situation?
The answer is far beyond my understanding (although
I have my opinion on the subject).

jmf

The Clash of the Titans

Lé jmf chârgeth with mightƴ might
And le Mond underneath trembleth
Now RR mounts his sturdy steed
And the windmill yonder turneth
 

Mark Lawrence

The Clash of the Titans

Lé jmf chârgeth with mightƴ might
And le Mond underneath trembleth
Now RR mounts his sturdy steed
And the windmill yonder turneth

+1 funniest poem of the week :)

--
"Steve is going for the pink ball - and for those of you who are
watching in black and white, the pink is next to the green." Snooker
commentator 'Whispering' Ted Lowe.

Mark Lawrence
 

Rick Johnson

Consider a simple thought experiment. Suppose we start with a sequence of
if statements that begin simple and get more complicated:
if a == 1: ...
if a == 1 and b > 2*c: ...
if a == 1 and b > 2*c or d%4 == 1: ...
if a == 1 and b > 2*c or d%4 == 1 and not (d**3//7)%3 == 0: ...
I don't believe that any of these tests are improved by adding an
extraneous "== True" at the end:
if (a == 1) == True: ...
if (a == 1 and b > 2*c) == True: ...
if (a == 1 and b > 2*c or d%4 == 1) == True: ...
if (a == 1 and b > 2*c or d%4 == 1 and not (d**3//7)%3 == 0) == True: ...

And i agree!

You are misunderstanding my very valid point. Post-fixing a
"== True" when truth testing a *real* Boolean (psst: that's
a True or False object) is superfluous, I'm referring to
truth testing non-Boolean values. So with that in mind, the
following is acceptably "explicit enough" for me:

a = True
if a:
    do_something()

However, since Python allows implicit conversion to Boolean
for ALL types, unless we know for sure, beyond any
reasonable doubt, that the variable we are truth testing is
pointing to a True or False object, we are taking too many
chances and will eventually create subtle bugs.

a = " "
if a:
    do_something()

When i write code that "truth tests", i expect that the
value i'm testing is a True or False object, not an empty
list that *magically* converts to False when i place an "if"
in front of it, or a list with one or more members that magically
converts to True when i place an "if" in front of it.

This implicit conversion seems like a good idea at first,
and i was caught up in the hype myself for some time: "Hey,
i can save a few keystrokes, AWESOME!". However, i can tell
you with certainty that this implicit conversion is folly.
It is my firm belief that truth testing a value that is not
a Boolean should raise an exception. If you want to convert
a type to Boolean then pass it to the bool function:

lst = [1, 2, 3]
if bool(lst):
    do_something()

This would be "explicit enough"

If you are unfamiliar with Python, then you have to learn what the
semantics of "if lst" means. Just as you would have to learn what
"if len(lst) > 0" means.

Again, i understand the folly of "implicit Boolean
conversion" just fine.
This line of code is problematic, for various reasons:
- you're making assumptions about the object which are unnecessary;
- which breaks duck-typing;
- and risks doing too much work, or failing altogether.
You're looking up the length of the lst object, but you don't really care
about the length.

Yes i do care about the length or i would not have asked.
I'm asking Python to tell me if the iterable has members,
and if it does, i want to execute a block of code, if it
does not, i want to do nothing. But i'm also informing the
reader of my source code that the symbol i am truth testing
is expected to be an iterable with a __len__ method.

"if lst" does not give me the same answer (or imply the same
meaning to a reader), it merely tells me that the implicit
conversion has resulted in a True value, but what if the lst
symbol is pointing to a string? Then i will falsely believe
i have a list with members when i actually have a string
with length greater than zero.
You only care about whether there is something there or
not, whether lst is empty or not. It makes no difference whether lst
contains one item or one hundred million items, and yet you're asking to
count them all. Only to throw that count away immediately!

I agree. Counting the list members just to guarantee that the
iterable has members is foolish, however, python gives me no
other choice IF i want to be "explicit enough". In a
properly designed language, the base iterable object would
supply a "hasLength" or "hasMembers" method that would
return a much faster check of:

try:
    iterable[0]
except IndexError:
    return False
else:
    return True

That check would guarantee the iterable contained at least
one member without counting them all.
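The check sketched above already has an idiomatic spelling in Python: peek for a first item with a sentinel default (the `has_members` name is made up here). One caveat, which the `iterable[0]` version avoids for sequences but shares for true iterators: peeking a one-shot iterator consumes the item it fetched.

```python
_SENTINEL = object()

def has_members(iterable):
    # Constant-time for any iterable: fetch at most one item and
    # compare it against a private sentinel.
    # Caution: on a one-shot iterator this consumes that first item.
    return next(iter(iterable), _SENTINEL) is not _SENTINEL

print(has_members([]))             # False
print(has_members([1, 2, 3]))      # True
print(has_members(c for c in ""))  # False
```

Unlike `len(x) > 0`, this never traverses the whole collection and also works on objects with no `__len__` at all.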
Looking at the length of a built-in list is cheap, but why assume it is a
built-in list? Perhaps it is a linked list where counting the items
requires a slow O(N) traversal of the entire list. Or some kind of lazy
sequence that has no way of counting the items remaining, but knows
whether it is exhausted or not.

Yes, but the problem is not "my approach", rather the lack
of proper language design (my apologies to the "anointed
one". ;-)
The Python way is to duck-type, and to let the lst object decide for
itself whether it's empty or not:
if lst: ...
not to make assumptions about the specific type and performance of the
object.
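As an illustration of that last point, here is a minimal sketch (the class name LazySequence is made up for this example) of a lazy sequence that can answer `if lst:` via __bool__ even though it cannot cheaply support len():

```python
class LazySequence:
    """A lazy wrapper that knows whether it is exhausted,
    without being able to count its remaining items."""

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._peeked = []

    def __bool__(self):
        # True if at least one item remains; peeks one
        # item ahead rather than counting everything.
        if self._peeked:
            return True
        try:
            self._peeked.append(next(self._it))
        except StopIteration:
            return False
        return True

seq = LazySequence(x * x for x in range(3))
print(bool(seq))               # True: at least one item remains
print(bool(LazySequence([])))  # False: empty
```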

Well Steven, in the real world sometimes you have no other
choice. I don't have time to read and comprehend thousands
of lines of code just to use a simple interface. We are all
aware that:

"Look Before You Leap"

is always a slower method than:

"It's Easier to Ask Forgiveness Than Permission"

When i am writing code i prefer to be "explicit enough" so
that IF my assumptions about the exact type of an object are
incorrect, the code will fail quickly enough that i can
easily find and correct the problem. In this manner i can
develop code much faster because i do not need to understand
the minutiae of an API in order to wield it. In contrast,
Implicit Conversion to Boolean is a bug-producing nightmare
that requires too much attention to minutiae.
`if len(lst) > 0` also works for types you don't intend. Any type that
defines a __len__ method which returns an integer will do it.
Tuples, sets and dicts are just the most obvious examples of things that
support len() but do not necessarily support all the things you might
wish to do to a list.

Agreed.

The "if len(var) > 0" test will return True for ANY object
whose __len__ method returns a positive integer. This test
is fine if you want to
test iterables generically, however, if you want to be
specific about testing lists you could not rely on that code
because strings and all other iterables would return the
same "truthy" or "falsey" value. But how do we solve this
issue? I don't want to see this:

if isinstance(var, list) and len(var) > 0:
    do_something()
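Concretely, len() accepts any sized container, which is why the generic test cannot pin down a list by itself:

```python
# len() works on any sized type, not just lists:
print(len((1, 2, 3)))   # 3  (tuple)
print(len({1, 2, 3}))   # 3  (set)
print(len({"a": 1}))    # 1  (dict)
print(len("hello"))     # 5  (str)
# ...so "len(var) > 0" alone cannot tell a list apart from these.
```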

But we are really ignoring the elephant in the room. Implicit
conversion to Boolean is just a drop in the bucket compared
to the constant "shell game" we are subjected to when
reading source code. We so naively believe that a symbol
named "lst" is a list object or a symbol "age" is an integer,
when we could be totally wrong! This is the source of many
subtle bugs!!!

There must be some method by which we can truth test an
iterable object and verify it has members, but do so in a
manner that is valid for all types AND exposes the "expected
type" in the method name. hmm...

Adding a method like "is_valid" to every object can seem
logical; however, it can fail just as miserably as
Python's current implicit bool. And, more disastrously, an
"is_valid" method is not going to raise an error (where it
should) because it works for all types.

What we need is a method by which we can validate a symbol
and simultaneously do the validation in a manner that will
cast light on the type that is expected. In order for this
to work, you would need validators with unique "type names":
if var.is_validList():
elif var.is_validString():
elif var.is_validTuple():
elif var.is_validInteger():
elif var.is_validFloat():
elif var.is_validDict():
etc...

By this manner, we can roll three common tests into one
method:

* boolean conversion
* member truthiness for iterables
* type checking

But most importantly, we destroy implicitness and become
"explicit enough", but not so explicit that our fingers
hurt.

*school-bell-rings*

PS: Damn i'm good! I believe the BDFL owes me a Thank You
email for this gold i just dropped on the Python community.
Flattery is welcome. Pucker up!
 
C

Chris Angelico

But we are really ignoring the elephant in the room. Implicit
conversion to Boolean is just a drop in the bucket compared
to the constant "shell game" we are subjected to when
reading source code. We so naively believe that a symbol
named "lst" is a list object or a symbol "age" is an integer,
when we could be totally wrong! This is the source of many
subtle bugs!!!

You know, if you want a language with strict type declarations and
extreme run-time efficiency, there are some around. I think one of
them might even be used to make the most popular Python. Give it a
try, you might like it! There's NO WAY that you could accidentally
pass a list to a function that's expecting a float, NO WAY to
unexpectedly call a method on the wrong type of object. It would suit
you perfectly!

ChrisA
 
R

Rick Johnson

What we need is a method by which we can validate a symbol
and simultaneously do the validation in a manner that will
cast light on the type that is expected. In order for this
to work, you would need validators with unique "type names"

    if var.is_validList():
    elif var.is_validString():
    elif var.is_validTuple():
    elif var.is_validInteger():
    elif var.is_validFloat():
    elif var.is_validDict():
    etc...

Actually, instead of forcing all types to have many "specific"
methods, one builtin could solve the entire issue. The function would
be similar to isinstance() taking two arguments "object" and "type",
however, it will not only guarantee type but also handle the
conversion to Boolean:

if is_valid(var, list):
    # if this block executes we know
    # that var is of <type list> and
    # its length is greater than zero.
else:
    # if this block executes we know
    # that var is not of <type list>
    # or its length equals zero.

The is_valid function would replace implicit Boolean conversion for
all types in a manner that is "explicit enough" whilst maintaining
finger longevity. This is how you design a language for consistency
and readability.
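For what it's worth, the proposed builtin can be approximated today in one line. This is a sketch of the proposal (is_valid is not an existing builtin), combining isinstance() with the usual truth test:

```python
def is_valid(obj, expected_type):
    """Hypothetical helper: True only if obj is an instance
    of expected_type AND is truthy (e.g. a non-empty list)."""
    return isinstance(obj, expected_type) and bool(obj)

print(is_valid([1, 2], list))  # True
print(is_valid([], list))      # False (right type, but empty)
print(is_valid("ab", list))    # False (wrong type)
```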

Again. PUCKER UP WHO-VILLE!
 
F

Fábio Santos

You know, if you want a language with strict type declarations and
extreme run-time efficiency, there are some around. I think one of
them might even be used to make the most popular Python. Give it a
try, you might like it! There's NO WAY that you could accidentally
pass a list to a function that's expecting a float, NO WAY to
unexpectedly call a method on the wrong type of object. It would suit
you perfectly!

I agree. I have never had this kind of issue in a dynamic language. Except
when passing stuff to Django's fields. And in JavaScript. It seems like the
thing was made to create references to `undefined`. And make them easily
convertible to numbers and strings so that our calculations mysteriously
fail when we're missing a function argument somewhere.
 
