PyWart: The problem with "print"

Rick Johnson

Note to those of you who may be new to Python: I will refer to "print" as a function -- just be aware that "print" was a statement before Python 3000 was introduced.

------------------------------------------------------------
Introduction:
------------------------------------------------------------
Many languages provide a function, method, or statement by which users can write easily to stdout, and Python is no exception with its own "print" function. However, whilst writing to stdout via "print" is slightly less verbose than calling the "write" method of "sys.stdout", we don't really gain much from this function except a few keystrokes... is this ALL print should be? Mere syntactic sugar? For me, I think the "print" function CAN and SHOULD be so much more!
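
For comparison, here is roughly what "print" buys you over the raw stream call (a minimal sketch; print's real formatting rules are a little richer than this):

import sys

# the verbose way: convert, join, and terminate by hand
sys.stdout.write(' '.join(map(str, ('x =', 42))) + '\n')

# the convenient way: print handles str(), sep, and end for you
print('x =', 42)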

------------------------------------------------------------
Print's Role in Debugging:
------------------------------------------------------------
A print function can be helpful in many ways, and one very important role print plays is to inform the programmer of internal states of objects via debug messages written to stdout.

Sure, there are fancy debuggers by which internal state and object identity can be inspected on the fly; however, "print" is always going to be there no matter what libraries or add-ons you have available. And let's face it, folks: print is the simplest and most intuitive interface you'll ever find for debugging purposes. Sure, "print" may not be the most productive for large-scale debugging, but for the majority of debugging tasks, it's a good fit.

I know some of you will cringe at the idea of using print for debugging; however, I will argue that using a debugger can weaken your detective skills. If an exception message and traceback are not giving you enough information to find the bug, well then, the language OR the code you've written is not worth a monkey's toss!

I've found that many subtle bugs are caused by not limiting the inputs to sane values (or types). And with Python's duct typing and implicit casting to Boolean, you end up with all sorts of misleading things happening! Maybe you're testing for truth values and get a string instead, which screws everything up!!!
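
To make that concrete, here is the kind of guard that prevents it (a hypothetical function; the name and check are purely illustrative):

def set_flag(value):
    # a non-empty string is truthy, so a bare "if value:" would happily
    # accept the *string* "False" and mask the caller's type error
    if not isinstance(value, bool):
        raise TypeError('expected bool, got %r' % (value,))
    ...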

Anyhoo, I digress...

------------------------------------------------------------
Inadequacies of "print" Debugging.
------------------------------------------------------------
In its current implementation, print is helpful, but in many ways print falls short of its true potential. Many of the problems that crop up when using print as a debugger stem from the fact that you cannot easily switch the debug messages "on" or "off".

Sure, you can comment out all calls to print, but if you need to see the messages again you will be forced to uncomment all those lines again... hey, that's no fun!

A "wise programmer" may think he's solved the problem by writing a functioncalled "debugprint" that looks like this:

def debugprint(*args):
    if DEBUG == True:
        print(*args)

However, that solution is at best woefully inadequate and at worst a cycle burner!

* Woefully inadequate because: switching the debug
  messages on or off is only valid in the module into
  which the function was imported. What if you want to
  kill all debugprint messages EVERYWHERE? Do you really
  want to edit "DEBUG = BOOLEAN" in every source file, OR
  do something stupid like import debugprint and edit the
  DEBUG constant, OR, even dumber, edit the debugprint
  source code? GAWD NO!

* But even if you are willing to cope with all the "switch-
  on-and-off" nonsense, are you willing to have your code
  slowed by numerous calls to a dead function containing a
  comparison that will always be false?

## START INTERACTIVE SESSION ##
py> from __future__ import print_function
py> DEBUG = True
py> def debugprint(*args):
...     if not DEBUG:
...         return
...     print(*args)
py> debugprint("foo", "bar")
foo bar
py> DEBUG = False
py> code = compile('debugprint("foo", "bar")', '<string>', 'exec')
py> import dis
py> dis.disassemble(code)
  1           0 LOAD_NAME                0 (debugprint)
              3 LOAD_CONST               0 ('foo')
              6 LOAD_CONST               1 ('bar')
              9 CALL_FUNCTION            2
             12 POP_TOP
             13 LOAD_CONST               2 (None)
             16 RETURN_VALUE
## END INTERACTIVE SESSION ##
After a few million executions of this superfluous
comparison, your CPU is losing faith in your ability to
write logical code!

py> function.call() + false_comparison() == 'cycle burner'
"YOU BET YOU A$$ IT DOES!!"

------------------------------------------------------------
Solution.
------------------------------------------------------------
This realization has brought me to the conclusion that Python (and other languages) need a "scoped print function". What is a "scoped print function" anyway? Well, what I am proposing is that Python include the following "debug switches" in the language:

------------------------------
Switch: "__GLOBALDEBUG__"
------------------------------
Global switching allows a programmer to instruct the
interpreter to IGNORE all print functions or to EVALUATE
all print functions by assigning a Boolean value of False
or True respectively to the global switch (Note: the
global switch always defaults to True!).

Any script that includes the assignment "__GLOBALDEBUG__ =
False" will disable ALL debug print messages across the
entire interpreter namespace. In effect, all print
statements will be treated as comments and ignored by the
interpreter. No dead functions will be called and no false
comparisons will be made!

(Note: __GLOBALDEBUG__ should not be available in any
local namespace but accessed only from the topmost
namespace. Something like: __main__.__GLOBALDEBUG__ =
Boolean)

------------------------------
Switch: "__LOCALDEBUG__"
------------------------------
Local switching allows a programmer to turn debug
messages on (or off) in the module in which the switch
was declared. Not sure if this should be more specific
than modules; like classes, blocks, or functions??? Of
course this declaration will be overridden by any global
switch.
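
For what it's worth, both switches can be approximated in today's Python with nothing but a shared helper module -- a minimal sketch (the module and switch names here are illustrative, not an existing API):

## dbg.py -- shared switch module (hypothetical)
GLOBALDEBUG = True              # flip this once; affects every importer

def debugprint(*args, local_switch=True, **kwargs):
    # a message prints only if the global and local switches both agree
    if GLOBALDEBUG and local_switch:
        print(*args, **kwargs)

## some_module.py
import dbg

LOCALDEBUG = True               # per-module switch
dbg.debugprint("loaded", local_switch=LOCALDEBUG)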
 
Chris Angelico

* Woefully inadequate because: switching the debug
  messages on or off is only valid in the module into
  which the function was imported. What if you want to
  kill all debugprint messages EVERYWHERE? Do you really
  want to edit "DEBUG = BOOLEAN" in every source file, OR
  do something stupid like import debugprint and edit the
  DEBUG constant, OR, even dumber, edit the debugprint
  source code? GAWD NO!

Easy fix to this one. Instead of copying and pasting debugprint into
everything, have it in a module and import it everywhere. Then the
debug flag will be common to them all.

Oh, and you probably want to add **kwargs to debugprint, because the
print function does a lot more than sys.stdout.write does:
py> debugprint(1, 2, 3, 4, sep='#')
1#2#3#4

* But even if you are willing to cope with all the "switch-
  on-and-off" nonsense, are you willing to have your code
  slowed by numerous calls to a dead function containing a
  comparison that will always be false?

Hmm. Could be costly. Hey, you know, Python has something for testing that.

py> import timeit
py> timeit.timeit('debugprint("asdf")', 'def debugprint(*args):\n\tif not DEBUG: return\n\tprint(*args)\nDEBUG=False', number=1000000)
0.5838018519113444

That's roughly half a second for a million calls to debugprint().
That's a 580ns cost per call. Rather than fiddle with the language,
I'd rather just take this cost. Oh, and there's another way, too: If
you make the DEBUG flag have effect only on startup, you could write
your module thus:

if DEBUG:
    debugprint = print
else:
    def debugprint(*args, **kwargs):
        pass

So you can eliminate part of the cost there, if it matters to you. If
a half-microsecond cost is going to be significant to you, you
probably should be looking at improving other areas, maybe using
ctypes/cython, or possibly playing with the new preprocessor tricks
that have been being discussed recently. There's really no need to
change the language to solve one specific instance of this "problem".

ChrisA
 
Andrew Berg

I don't think you go far enough. Obviously we need way more flexibility. A simple on/off is okay for some things, but a finer granularity
would be really helpful because some things are more important than others. And why stop at stdout/stderr? We need to add a consistent way
to output these messages to files too in case we need to reference them again. The messages should have a consistent format as well. Why add
the same information to each message when it would be much simpler to simply define a default format and insert the real meat of the message
into it? It really seems like we should already have something like this. Hmm.....
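
(Everything wished for above is, of course, a description of the standard library's logging module -- a minimal sketch:)

import logging

logging.basicConfig(
    level=logging.DEBUG,    # finer granularity than a simple on/off
    format='%(asctime)s %(levelname)s %(message)s',  # consistent format
    filename='app.log',     # optional: omit this to write to stderr
)
logging.debug('detail for developers')
logging.warning('more important than the message above')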
 
Chris Angelico

I don't think you go far enough. Obviously we need way more flexibility. A simple on/off is okay for some things, but a finer granularity
would be really helpful because some things are more important than others. And why stop at stdout/stderr? We need to add a consistent way
to output these messages to files too in case we need to reference them again. The messages should have a consistent format as well. Why add
the same information to each message when it would be much simpler to simply define a default format and insert the real meat of the message
into it? It really seems like we should already have something like this. Hmm.....

You have a really good point there. I'm sure I could think of a really
good way to do all this, but I'm stuck... it's like there's a log jam
in my head...

(Okay, maybe I should go to bed now, my puns are getting worse.
Considering how late it is, I'll probably sleep like a log.)

ChrisA
 
Rick Johnson

Easy fix to this one. Instead of copying and pasting debugprint into
everything, have it in a module and import it everywhere. Then the
debug flag will be common to them all.

Ignoring the fact that you have to "import everywhere": what if you want
to stop ALL debug messages? If you "import everywhere" to get them,
you then have to "edit everywhere" to stop them.
Oh, and you probably want to add **kwargs to debugprint, because the
print function does a lot more than sys.stdout.write does:

The kwargs to print are not germane to the issue; however, noobs may be
watching, so I'm glad you pointed that one out.
[...]
py> timeit.timeit('debugprint("asdf") [...]
0.5838018519113444

That's roughly half a second for a million calls to debugprint().
That's a 580ns cost per call. Rather than fiddle with the language,
I'd rather just take this cost.

I never purposely inject ANY superfluous cycles in my code except in
the case of testing or development. To me, it's about professionalism.
Let's consider a thought exercise, shall we?

Imagine you're an auto mechanic. Your customer brings in
his car and asks you to make some repairs. You make the
repairs, but you also adjust the air/fuel ratio to run
"rich" (meaning the vehicle will get fewer MPG). Do you
still pat yourself on the back and consider that you've
done a professional job?

I would not! However, you're doing the same thing as the mechanic when
your code executes superfluous calls and burns cycles for no other
reason than pure laziness. CPUs are not immortal, you know; they have
a lifetime. Maybe you don't care about destroying someone's CPU;
however, I do!

I just wonder how many of your "creations" (aka: monstrosities!) are
burning cycles this very minute!
[...]
So you can eliminate part of the cost there, if it matters to you. If
a half-microsecond cost is going to be significant to you, you
probably should be looking at improving other areas, maybe using
ctypes/cython, or possibly playing with the new preprocessor tricks
that have been being discussed recently. There's really no need to
change the language to solve one specific instance of this "problem".

That's like suggesting to the poor fella whose MPG is suffering
(because of your incompetent adjustments!) that he buy fuel additives
to compensate for the loss of MPG. Why should he incur costs because
you are incompetent?
 
Rick Johnson

[...]
Or use the logging module. It's easy to get going quickly
(just call logging.basicConfig at startup time), and with
a little care and feeding, you can control the output in
more ways than can fit into the margin. Oh, yeah, I'm sure
it introduces some overhead. So does everything else.

I hate log files, at least during development or testing. I prefer to debug on the command line or using my IDE. Log files are for release time, not development.
 
Steven D'Aprano

Many languages provide a function, method, or statement by which users
can write easily to stdout, and Python is no exception with its own
"print" function. However, whilst writing to stdout via "print" is
slightly less verbose than calling the "write" method of "sys.stdout",
we don't really gain much from this function except a few keystrokes...
is this ALL print should be? Mere syntactic sugar?

Perhaps you should read the docs before asking rhetorical questions,
because the actual answer is: No, print is not mere syntactic sugar
saving a few keystrokes.


Help on built-in function print in module builtins:

print(...)
    print(value, ..., sep=' ', end='\n', file=sys.stdout, flush=False)

    Prints the values to a stream, or to sys.stdout by default.
    Optional keyword arguments:
    file:  a file-like object (stream); defaults to the current sys.stdout.
    sep:   string inserted between values, default a space.
    end:   string appended after the last value, default a newline.
    flush: whether to forcibly flush the stream.


Still not powerful enough for you? Easily fixed:

import builtins

# The higher the verbosity, the more messages are printed.
verbosity = 2

def print(*args, level=1, **kwargs):
    if level <= verbosity:
        builtins.print(*args, **kwargs)


print("debug message", level=4) # Only prints if verbosity >= 4
print("info message", level=3)
print("warning message", level=2)
print("critical message", level=1) # Only prints if verbosity >= 1


Trivial enough to embed in each module that needs it, in which case each
module can have its own verbosity global variable. Or you can put it in a
helper module, with a single, application-wide global variable, and use
it like this:

import verbose_print
print = verbose_print.print
verbose_print.verbosity = 3
print("some message", level=4)

Of course, in practice you would set the verbosity according to a command
line switch, or an environment variable, or a config file, and not hard
code it in your source code.
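
A minimal sketch of that last suggestion, assuming the verbose_print helper module above (the flag name is illustrative):

import argparse
import verbose_print

parser = argparse.ArgumentParser()
parser.add_argument('-v', '--verbosity', type=int, default=1,
                    help='higher values print more messages')
args = parser.parse_args()
verbose_print.verbosity = args.verbosity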

I've found that many subtle bugs are caused by not limiting the inputs
to sane values (or types). And with Python's duct typing

Nothing worse than having pythons roaming through your ducts, eating your
ducks.

and implicit
casting to Boolean, you end up with all sorts of misleading things
happening! Maybe you're testing for truth values and get a string
instead; which screws everything up!!!

Only if you're a lousy programmer who doesn't understand Python's truth
model.

A "wise programmer" may think he's solved the problem by writing a
function called "debugprint" that looks like this:

def debugprint(*args):
    if DEBUG == True:
        print(*args)

No no no, that's not how you do it. It should be:

if DEBUG == True == True:

Wait, no, I got that wrong. It should be:

if DEBUG == True == True == True:

Hang on, I've nearly got it!

if DEBUG == True == True == True == True:

Or, you could program like a professional, and say:

if DEBUG:

By the way, why is DEBUG a constant? Doesn't that defeat the purpose?

However, that solution is at best woefully inadequate and at worst a
cycle burner!

Certainly you don't want to be burning cycles. Imagine the pollution from
the burning rubber tyres!

* Woefully inadequate because: switching the debug
messages on or off is only valid in the module into which the
function was imported. What if you want to kill all debugprint
messages EVERYWHERE?

You start by learning how Python works, and not making woefully incorrect
assumptions.

Do you really want to edit "DEBUG = BOOLEAN" in every
source file

Python does not work that way.

OR do something stupid like import debugprint and edit
the DEBUG constant

Heaven forbid that people do something that actually works the way Python
is designed to work.

OR even dumber, edit the debugprint source code?
GAWD NO!

* But even if you are willing to cope with all the "switch-
on-and-off" nonsense, are you willing to have your code slowed by
numerous calls to a dead function containing a comparison that will
always be false?

And of course you have profiled your application, and determined that the
bottleneck in performance is the calls to debugprint, because otherwise
you are wasting your time and ours with premature optimization.

Life is hard. Sometimes you have to choose between performance and
debugging.

This realization has brought me to the conclusion that Python (and
other languages) need a "scoped print function". What is a "scoped
print function" anyway? Well, what I am proposing is that Python
include the following "debug switches" in the language:

------------------------------
Switch: "__GLOBALDEBUG__"
------------------------------
Global switching allows a programmer to instruct the interpreter to
IGNORE all print functions or to EVALUATE all print functions by
assigning a Boolean value of False or True respectively to the global
switch (Note: the global switch always defaults to True!).

If you really care about this premature optimization, you can do this:

if __debug__:
    print("whatever")


You then globally disable these print calls by running Python with the -O
switch.
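
A quick demonstration of the effect (a sketch; "demo.py" is a made-up file name):

## demo.py
if __debug__:
    print("debug message")
print("always printed")

$ python demo.py
debug message
always printed
$ python -O demo.py
always printed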

Any script that includes the assignment "__GLOBALDEBUG__ = False" will
disable ALL debug print messages across the entire interpreter
namespace. In effect, all print statements will be treated as comments
and ignored by the interpreter. No dead functions will be called and
no false comparisons will be made!

(Note: __GLOBALDEBUG__ should not be available in any local namespace
but accessed only from the topmost namespace. Something like:
__main__.__GLOBALDEBUG__ = Boolean)

Python does not work like that. Perhaps you should learn how to program
in Python before telling us how it should be improved?

------------------------------
Switch: "__LOCALDEBUG__"
------------------------------
Local switching allows a programmer to turn debug messages on (or
off) in the module in which the switch was declared. Not sure if this
should be more specific than modules; like classes, blocks, or
functions??? Of course this declaration will be overridden by any
global switch.

So, it will be utterly useless then, since __LOCALDEBUG__ has no effect,
and __GLOBALDEBUG__ overrides it. Great.

My only concern is that some programmers may be confused why their
print calls are not working, and may not have the capacity to work out
that the function named "print" is tied to the global and local
switches named "__GLOBALDEBUG__" and "__LOCALDEBUG__". To prevent any
cognitive dissonance it may be desirable to introduce a new function
called "debugprint".

That's not what cognitive dissonance means. The word you are looking for
is "confusion".

Cognitive dissonance is the mental stress and anguish a person feels when
deep down they know that they are the best, most intelligent, most expert
Python programmer on the planet, better even than Python's creator, and
yet every time they open their mouth to tell the world how Python gets it
wrong and how to fix it, they just get shot down in flames.
 
Steven D'Aprano

Maybe you don't care about destroying someone's CPU; however, I do!

And yet here you are, destroying millions of people's CPUs by sending
them email or usenet messages filled with garbage.
 
Chris Angelico

Ignoring the fact that you have to "import everywhere": what if you want
to stop ALL debug messages? If you "import everywhere" to get them,
you then have to "edit everywhere" to stop them.

Example:

## debugprint.py
DEBUG = True

def debugprint(*a, **kw):
    if DEBUG:
        return print(*a, **kw)

## every other module
from debugprint import debugprint

debugprint("I got imported!")

def foo():
    debugprint("I got foo'd!")

See how many places you need to edit to change the DEBUG flag? You can
even do it at run time with this version of the code:

## toggle debugging
import debugprint
debugprint.DEBUG = not debugprint.DEBUG

And, as several others have pointed out, this is kinda sorta what the
logging module does, only it does it better. Same method; you import
the same module everywhere. It is THE SAME module.
I never purposely inject ANY superfluous cycles in my code except in
the case of testing or development. To me, it's about professionalism.

Why do you use Python? Clearly the only professional option is to use
raw assembler. Or possibly you could justify C on the grounds of
portability.
Let's consider a thought exercise, shall we?

Imagine you're an auto mechanic. Your customer brings in
his car and asks you to make some repairs. You make the
repairs, but you also adjust the air/fuel ratio to run
"rich" (meaning the vehicle will get fewer MPG). Do you
still pat yourself on the back and consider that you've
done a professional job?

I would not! However, you're doing the same thing as the mechanic when
your code executes superfluous calls and burns cycles for no other
reason than pure laziness. CPUs are not immortal, you know; they have
a lifetime. Maybe you don't care about destroying someone's CPU;
however, I do!

Better analogy: When you build a car, you incorporate a whole bunch of
gauges and indicators. They clutter things up, and they're extra
weight to carry; wouldn't the car get more MPG (side point: can I have
my car get more OGG instead? I like open formats) if you omit them?
I just wonder how many of your "creations" (aka: monstrosities!) are
burning cycles this very minute!

Every one that's written in a high level language. So that's Yosemite,
Minstrel Hall, Tisroc, KokoD, RapidSend/RapidRecv, and Vizier. And
that's just the ones that I've personally created and that I *know*
are currently running (and that I can think of off-hand). They're
wasting CPU cycles dealing with stuff that I, the programmer, now
don't have to. Now let's see. According to my server, right now, load
average is 0.21 - of a single-core Intel processor that was mid-range
back in 2009. And that's somewhat higher-than-normal load, caused by
some sort of usage spike a few minutes ago (and is dropping);
normally, load average is below 0.10.

At what point would it be worth my effort to rewrite all that code to
eliminate waste? Considering that I could build a new server for a few
hundred (let's say $1000 to be generous, though the exact price will
depend on where you are), or rent one in a top-quality data center for
$40-$55/month and not have to pay for electricity or internet, any
rewrite would need to take less than two days of my time to be
worthwhile. Let 'em burn cycles; we can always get more.
That's like suggesting to the poor fella whose MPG is suffering
(because of your incompetent adjustments!) that he buy fuel additives
to compensate for the loss of MPG. Why should he incur costs because
you are incompetent?

He's welcome to push a wheelbarrow if he doesn't want the overhead of
a car. The car offers convenience, but at a cost. This is an eternal
tradeoff.

ChrisA
 
Chris Angelico

Nothing worse than having pythons roaming through your ducts, eating your
ducks.

Steven, you misunderstand. It's more about using your ducts to type
code. Have you seen the Mythbusters episode where they're trying to
enter a building surreptitiously? ("Crimes and Mythdemeanors 1", I
think, if you want to look it up.) At one point, we can CLEARLY hear
one of the hosts, moving along a duct, *typing*. We can hear the
click-click-click of giant keys.

Hah. Knew I could trust YouTube.

ChrisA
 
Ned Batchelder

[...]
Or use the logging module. It's easy to get going quickly
(just call logging.basicConfig at startup time), and with
a little care and feeding, you can control the output in
more ways than can fit into the margin. Oh, yeah, I'm sure
it introduces some overhead. So does everything else.
I hate log files, at least during development or testing. I prefer to debug on the command line or using my IDE. Log files are for release time, not development.
Rick, you should give the logging module a try. The default
configuration from basicConfig is that the messages all go to stderr, so
no log files to deal with. And it's configurable in the ways you want,
plus a lot more.
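
Something like this is all it takes (a minimal sketch):

import logging

logging.basicConfig(level=logging.DEBUG)  # no filename: output goes to stderr
logging.debug("visible on the console; no log file involved")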

--Ned.
 
Jason Swails

Hmm. Could be costly. Hey, you know, Python has something for testing that.

py> timeit.timeit('debugprint("asdf")', 'def debugprint(*args):\n\tif not DEBUG: return\n\tprint(*args)\nDEBUG=False', number=1000000)
0.5838018519113444

That's roughly half a second for a million calls to debugprint().
That's a 580ns cost per call. Rather than fiddle with the language,
I'd rather just take this cost. Oh, and there's another way, too: If
you make the DEBUG flag have effect only on startup, you could write
your module thus:

This is a slightly contrived demonstration... The time lost in a function
call is not just the time it takes to execute that function. If it
consistently increases the frequency of cache misses then the cost is much
greater -- possibly by orders of magnitude if the application is truly
bound by the bandwidth of the memory bus and the CPU pipeline is almost
always saturated.

I'm actually with RR in terms of eliminating the overhead involved with
'dead' function calls, since there are instances when optimizing in Python
is desirable. I actually recently adjusted one of my own scripts to
eliminate branching and improve data layout to achieve a 1000-fold
improvement in efficiency (~45 minutes to 0.42 s. for one example) --- all
in pure Python. The first approach was unacceptable, the second is fine.
For comparison, if I add a 'deactivated' debugprint call into the inner
loop (executed 243K times in this particular test), then the time of the
double-loop step that I optimized takes 0.73 seconds (nearly doubling the
duration of the whole step). The whole program is too large to post here,
but the relevant code portion is shown below:

i = 0
begin = time.time()
for mol in owner:
    for atm in mol:
        blankfunc("Print i %d" % i)
        new_atoms = self.atom_list[atm]
        i += 1
self.atom_list = new_atoms
print "Done in %f seconds." % (time.time() - begin)

from another module:

DEBUG = False

[snip]

def blankfunc(instring):
    if DEBUG:
        print instring

Also, you're often not passing a constant literal to the debug print --
you're doing some type of string formatting or substitution if you're
really inspecting the value of a particular variable, and this also takes
time. In the test I gave the timings for above, I passed a string with
the counter substituted in to the 'dead' debug function. Copy-and-pasting
your timeit experiment on my machine yields different timings (Python 2.7):
py> timeit.timeit('debugprint("asdf")', 'def debugprint(*args):\n\tif not DEBUG: return\n\tsys.stdout.write(*args)\nDEBUG=False', number=1000000)
0.15644001960754395

which is ~150 ns/function call, versus ~1300 ns/function call. And there
may be even more extreme examples, this is just one I was able to cook up
quickly.

This is, I'm sad to say, where my alignment with RR ends. While I use
prints in debugging all the time, it can also become a 'crutch', just like
reliance on valgrind or gdb. If you don't believe me, you've never hit a
bug that 'magically' disappears when you add a debugging print statement
;-).

The easiest way to eliminate these 'dead' calls is to simply comment-out
the print call, but I would be quite upset if the interpreter tried to
outsmart me and do it automagically as RR seems to be suggesting. And if
you're actually debugging, then you typically only add a few targeted print
statements -- not too hard to comment-out. If it is, and you're really
that lazy, then by all means add a new, greppable function call and use a
sed command to comment those lines out for you.
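
A throwaway version of that idea in Python, for anyone allergic to sed (a sketch; "debugprint" is just the greppable marker suggested above):

## comment out every debugprint(...) call, rewriting the files in place
import re
import sys

for path in sys.argv[1:]:
    with open(path) as f:
        lines = f.readlines()
    with open(path, 'w') as f:
        for line in lines:
            # prefix lines whose first code token is debugprint( with '# '
            f.write(re.sub(r'^(\s*)(debugprint\()', r'\1# \2', line))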

BTW: *you* in the above message refers to a generic person -- none of my
comments were aimed at anybody in particular.

All the best,
Jason

P.S. All that said, I would agree with ChrisA's suggestion that the
overhead is negligible in most cases...
 
Chris Angelico

Copy-and-pasting your timeit experiment on my machine yields different
timings (Python 2.7):

0.15644001960754395

which is ~150 ns/function call, versus ~1300 ns/function call. And there
may be even more extreme examples, this is just one I was able to cook up
quickly.

The exact time will of course vary enormously. My point still would
stand at 1300ns; it's still extremely low compared to many other
overheads.
This is, I'm sad to say, where my alignment with RR ends. While I use
prints in debugging all the time, it can also become a 'crutch', just like
reliance on valgrind or gdb. If you don't believe me, you've never hit a
bug that 'magically' disappears when you add a debugging print statement
;-).

Yes, I've had situations like that. They are, however, EXTREMELY rare
compared to the times when a bug magically becomes shallow when you
add a debugging print!
The easiest way to eliminate these 'dead' calls is to simply comment-out the
print call, but I would be quite upset if the interpreter tried to outsmart
me and do it automagically as RR seems to be suggesting. And if you're
actually debugging, then you typically only add a few targeted print
statements -- not too hard to comment-out. If it is, and you're really that
lazy, then by all means add a new, greppable function call and use a sed
command to comment those lines out for you.

Yes. I also have high hopes for some of the cool AST manipulation
that's being developed at the moment; it should be relatively easy to
have a simple flag that controls whether debugprint (btw, I'd shorten
the name) represents code or no-code.

But consider all the other things that you probably do in your Python
programs. Every time you reference something as "module.name", you
require a lookup into the current module's namespace to find the
module name, then into that module's namespace to find the object you
want. Snagging names as locals is a common optimization, but is far
from universally applied. Why? Because the readability cost just isn't
worth it. We use Python because it is "fast enough", not because it
lets us squeeze every CPU cycle out of the code.
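
For instance, the locals trick mentioned above looks like this (a minimal sketch):

import math

def slow(xs):
    # 'math.sin' costs two name lookups on every iteration
    return [math.sin(x) for x in xs]

def fast(xs, sin=math.sin):
    # the default argument binds the lookup once, at definition time
    return [sin(x) for x in xs]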

That said, I can often smoke Python with Pike, thanks to a few rather
cool optimizations (including looking up module names at compile time,
which reduces what I just said above). Maybe in the future some of
these optimizations will be done, I don't know. But for 99.9% of
Python scripts, you will *never* see the performance difference.

ChrisA
 
Jason Swails

Ah, yes. The Heisenbug. ;-)

Indeed. Being in the field of computational chemistry/physics, I was
almost happy to have found one just to say I was hunting a Heisenbug. It
seems to be a term geared more toward the physics-oriented programming
crowd.

We used to run into those back in the days of C and assembly language.
They're much harder to see in the wild with Python.

Yeah, I've only run into Heisenbugs with Fortran or C/C++. Every time I've
seen one it's been due to an uninitialized variable somewhere -- something
valgrind is quite good at pinpointing. (And yes, a good portion of our
code is -still- in Fortran -- but at least it's F90+ :).
 
Chris Angelico

Aside from an I/O caching bug directly affected by calling print or
somefile.write (where somefile is stdout), how else could I create a
Heisenbug in pure Python?

If you have a @property, merely retrieving it could affect things. It
shouldn't happen, but bugs can be anywhere. Basically, ANY Python code
can trigger ANY Python code.
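
A contrived illustration (a hypothetical class; the point is only that attribute access can run arbitrary code):

class Thing:
    @property
    def state(self):
        # merely *reading* this attribute mutates the object, so a
        # debugging print that displays thing.state changes behaviour
        self._reads = getattr(self, '_reads', 0) + 1
        return self._reads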

ChrisA
 
Michael Torrie

[...] Or use the logging module. It's easy to get going quickly
(just call logging.basicConfig at startup time), and with a little
care and feeding, you can control the output in more ways than can
fit into the margin. Oh, yeah, I'm sure it introduces some
overhead. So does everything else.

I hate log files, at least during development or testing. I prefer to
debug on the command line or using my IDE. Log files are for release
time, not development.

Except that it's not. Have you even looked at what the logging module
is? It most certainly can log to stderr if you provide no logging
handler to write to a file.
 
Devin Jeanpierre

Aside from an I/O caching bug directly affected by calling print or
somefile.write (where somefile is stdout), how else could I create a
Heisenbug in pure Python?

The garbage collector can do this, but I think in practice it's
ridiculously rare, since __del__ is almost never useful due to its
unreliability*. The problem is that the garbage collector can do
whatever it wants. For example, in CPython it is called after so many
cycles have been created. This allows code and user actions to
inadvertently affect the garbage collector, and therefore, the
invocation of __del__.

If your __del__ does anything that accesses mutable global state also
used elsewhere, it's conceivable that the order of someone else's
access and __del__'s invocation depends on the GC. One order or the
other might be the wrong one which causes a failure. As it happens,
the "bt" command in pdb creates a cycle and might trigger the garbage
collector, causing __del__ to run immediately, and potentially hiding
the failure.
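
A sketch of the shape of the problem (CPython-specific; exactly when __del__ runs here depends on the collector):

import gc

EVENTS = []    # mutable global state, shared with other code

class Saver:
    def __del__(self):
        EVENTS.append('saved')    # runs whenever the GC gets around to it

a = Saver()
b = Saver()
a.partner, b.partner = b, a    # reference cycle: refcounting can't free it
del a, b
print(EVENTS)    # [] -- nothing has been finalized yet
gc.collect()     # any code that creates enough cycles might trigger this
print(EVENTS)    # ['saved', 'saved'] on CPython 3.4+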

This isn't really "pure python" in that Python doesn't even guarantee
__del__ is ever called at all, let alone why. It's completely
implementation-specific, and not a property of Python the language.

-- Devin

... [*] Some people use it as an "unreliable fallback"; this turns
their magical autosaving code into an attractive and yet horribly
dangerous nuisance. Friends don't let friends use __del__.
 
Chris Angelico

[...] Or use the logging module. It's easy to get going quickly
(just call logging.basicConfig at startup time), and with a little
care and feeding, you can control the output in more ways than can
fit into the margin. Oh, yeah, I'm sure it introduces some
overhead. So does everything else.

I hate log files, at least during development or testing. I prefer to
debug on the command line or using my IDE. Log files are for release
time, not development.

Except that it's not. Have you even looked at what the logging module
is? It most certainly can log to stderr if you provide no logging
handler to write to a file.

Plus, writing to a file actually makes a lot of sense for development
too. It's far easier to run the program the same way in dev and
release, which often means daemonized. I like to have Upstart manage
all my services, for instance.

ChrisA
 
Mark Lawrence

Ah, yes. The Heisenbug. ;-)

We used to run into those back in the days of C and assembly language.
They're much harder to see in the wild with Python.

Strikes me it's a bit like problems when prototyping circuit boards.
The card doesn't work, so you mount it on an extender card and the
problem goes away; remove the extender card and the problem reappears.
Wash, rinse, repeat :)

--
"Steve is going for the pink ball - and for those of you who are
watching in black and white, the pink is next to the green." Snooker
commentator 'Whispering' Ted Lowe.

Mark Lawrence
 
Robert Kern

I am a huge proponent of using the right tool for the job. There is
nothing wrong with some well-placed FORTRAN, as long as the PSF

No, no. It's the PSU that you have to worrNO CARRIER
 
