Odd behavior with imp.reload and logging

A

Andrew Berg

When using a logger in a module and then using imp.reload to reload the
module, logger messages are repeated in direct proportion to the number
of times the modules was loaded. That is, on the first import, the
message is written once, but on the second run, each message is written
twice, three times on the third run, and so on.

With this code:
import logging

test_logger = logging.getLogger(__name__)
console_handler = logging.StreamHandler()
console_formatter = logging.Formatter('{asctime} - {module} - {funcName} - line {lineno} - {levelname} - {message}', style='{')
console_handler.setFormatter(console_formatter)
test_logger.addHandler(console_handler)
test_logger.setLevel(logging.DEBUG)

test_logger.info('Test info')

I get this in the interpreter:
Python 3.2.2 (default, Sep 4 2011, 09:07:29) [MSC v.1500 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.2011-09-21 20:51:10,665 - test2 - <module> - line 11 - INFO - Test info
2011-09-21 20:51:10,665 - test2 - <module> - line 11 - INFO - Test info
<module 'test2' from 'test2.py'>


What causes this, and how can I fix it (or at least work around it)? Due
to the nature of the program, it's much more convenient to reload a
module than to restart the entire program (especially when testing).
 
S

Steven D'Aprano

When using a logger in a module and then using imp.reload to reload the
module, logger messages are repeated in direct proportion to the number
of times the modules was loaded. That is, on the first import, the
message is written once, but on the second run, each message is written
twice, three times on the third run, and so on. [...]
What causes this, and how can I fix it (or at least work around it)? Due
to the nature of the program, it's much more convenient to reload a
module than to restart the entire program (especially when testing).

You unconditionally add a handler every time you reload the module,
regardless of whether or not the old handler is still there. You could
try something like this (untested):


import logging
test_logger = logging.getLogger(__name__)
if not test_logger.handlers:
console_handler = logging.StreamHandler()
console_formatter = logging.Formatter(
'{asctime} - {module} - {funcName} - line {lineno} - '
'{levelname} - {message}', style='{'
)
console_handler.setFormatter(console_formatter)
test_logger.addHandler(console_handler)
test_logger.setLevel(logging.DEBUG)

test_logger.info('Test info')
 
A

Andrew Berg

You could
try something like this (untested):
That works. Thanks!
This makes me wonder what else stays around after a reload and what side
effects there are, though. I would really like to purge everything from
the previous import. The main program has no dependence on the module
whatsoever.
 
C

Chris Angelico

This makes me wonder what else stays around after a reload and what side
effects there are, though. I would really like to purge everything from
the previous import. The main program has no dependence on the module
whatsoever.

On-the-fly reloading of modules isn't really one of Python's
strengths. Everyone who asks about it seems to be doing rapid
development/debugging and wanting to save on startup time (as opposed
to, say, running a server and updating code in it while it's active
and serving clients), so the question becomes: Which is more of a
problem, startup delay or the risk that it's not the same as a clean
start? Python doesn't guarantee that your debugging session is going
to be useful - if you reload that module and weird things happen, it
could be because of reload(), not because of a module bug.

Ranting Rick will probably expect me to mention Pike here, but I
won't. Muahahahaha..... oh. I just did. Oh well!

ChrisA
 
A

Andrew Berg

On-the-fly reloading of modules isn't really one of Python's
strengths. Everyone who asks about it seems to be doing rapid
development/debugging and wanting to save on startup time (as opposed
to, say, running a server and updating code in it while it's active
and serving clients), so the question becomes: Which is more of a
problem, startup delay or the risk that it's not the same as a clean
start? Python doesn't guarantee that your debugging session is going
to be useful - if you reload that module and weird things happen, it
could be because of reload(), not because of a module bug.
The main program is an IRC bot, which could potentially be in use by
many people in several channels on a network. As it is, the bot can only
connect to one server, but it could probably be set up to connect to any
number of networks. Making a number of quick fixes or changes to one
module could be very irritating to users if the bot has to terminate
each time, especially if those users don't know or care about that
specific module.
Startup time is an issue because it must connect to a network before it
can take any input. Also, many disconnects/reconnects could easily cause
problems (like the network refusing the connection as a DoS prevention
measure).
I'm not tied to any particular solution, and it's quite possible I'm
missing something since I am still a beginner.
 
C

Chris Angelico

The main program is an IRC bot, which could potentially be in use by
many people in several channels on a network. As it is, the bot can only
connect to one server, but it could probably be set up to connect to any
number of networks. Making a number of quick fixes or changes to one
module could be very irritating to users if the bot has to terminate
each time, especially if those users don't know or care about that
specific module.
Startup time is an issue because it must connect to a network before it
can take any input. Also, many disconnects/reconnects could easily cause
problems (like the network refusing the connection as a DoS prevention
measure).

Playing with networking and the desire to reload without restarting? I
think Pike may be a good choice for you. It has a C-like syntax but
Python-like guts; you use braces to delimit blocks of code, but arrays
and mappings and such are first-class objects that you can pass around
and use. It's a favorite of mine, but quite off-topic for this mailing
list; I'd be happy to help you get started with it.

Chris Angelico
 
A

Andrew Berg

I think Pike may be a good choice for you.
That's quite unappealing for a few reasons. First, that would likely
require writing an entirely new bot (I'm not even that familiar with the
current one; I've only been writing a module for it). Also, I don't
really enjoy programming (I'm aware I'm likely in the minority on this
list); I tolerate it enough to get certain things done, so learning
another language, especially when I'm still learning Python, is not
something I want to do. Python is probably not the best tool for this
particular job, but I am not nearly dedicated to this project enough to
learn another programming language.

So, is there any way to at least monitor what happens after a reload? I
haven't noticed anything odd until I came across this logging issue.
 
C

Chris Angelico

That's quite unappealing for a few reasons. First, that would likely
require writing an entirely new bot (I'm not even that familiar with the
current one; I've only been writing a module for it).

Ah, then yeah, it's probably not a good idea to change languages. But
you may end up finding other issues with reload() as well.

I wonder whether this would work...

modules=[]

# instead of 'import mymodule' use:
modules.append(__import__('mymodule')); mymodule=modules[-1]

In theory, this should mean that you load it fresh every time - I
think. If not, manually deleting entries from sys.modules might help,
either with or without the list of modules.

ChrisA
 
S

Steven D'Aprano

That works. Thanks!
This makes me wonder what else stays around after a reload

Practically everything. A reload doesn't delete anything, except as a
side-effect of running the module again.

Don't think of reloading as:

* Revert anything the module is responsible for.
* Delete the module object from the import cache.
* Import the module in a fresh environment.

Instead, think of it as:

* Re-import the module in the current environment.


In practice, you won't often see such side-effects, because most modules
don't store state outside of themselves. If they store state *inside*
themselves, then they will (almost always) overwrite that state. E.g.
this will work as expected:


state = [something]


But this leaves state hanging around in other modules and will be
surprising:

import another_module
another_module.state.append(something)


My guess is that the logging module uses a cache to save the logger,
hence there is state inadvertently stored outside your module.


Another place where reload() doesn't work as expected:
False


Objects left lying around from before the reload will keep references
open to the way things were before the reload. This often leads to
confusion when modules are edited, then reloaded. (Been there, done that.)
 
A

Andrew Berg

Objects left lying around from before the reload will keep references
open to the way things were before the reload. This often leads to
confusion when modules are edited, then reloaded. (Been there, done that.)
I'll keep that in mind. My module does have a class, but instances are
kept inside dictionaries, which are explicitly set to {} at the
beginning (can't use the update() method for dictionaries that don't
exist). Also, class instances get pickled after creation and unpickled
when the module is imported.
 
A

Andrew Berg

In theory, this should mean that you load it fresh every time - I
think. If not, manually deleting entries from sys.modules might help,
either with or without the list of modules.
I've played around with sys.modules, and it seems there are issues with
logging being reloaded (not sure if it's fair to call it a bug), not my
module. The only reliable way I've found to start fresh is to delete at
least logging from sys.modules (I can even do this from within my
module). Even if the module that imports logging is deleted, things in
logging persist (I can confirm at least handlers and handles to log
files; the latter requiring garbage collection as well to free up).
However, this would probably wreak havoc on other code that uses logging
(although, I could always import logging as something else).

On a side note, this actually gives me a good way to detect that the
module is reloaded (I can't think of any other way to detect it, at
least not off the top of my head).
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,743
Messages
2,569,478
Members
44,899
Latest member
RodneyMcAu

Latest Threads

Top