Modifying a built-in function for logging purposes

qwweeeit

Hi all,
I wonder if it is possible to change (temporarily) a built-in function
for logging purposes.
Let me explain:
I want to log all the 'open' operations, recording the file to be
opened, the "mode" (r/w/a...) and (possibly) the module which made the
call.
After that the 'open' can carry on following the standard built-in
path.
If possible, I would rather not modify the source of the application
to be logged.
If there is another solution, I am open to any suggestion...

Bye.
 
Greg Krohn

Would this work?

_file = file
def file(path, mode):
    print path, mode
    return _file(path, mode)

-greg
 
qwweeeit

Hi Greg,
thanks for your reply, but I didn't succeed in any way. You must
consider, however, that I'm not a Python "expert"...
IMHO, it must be a script that changes part of the interpreter and
substitutes a new module (py) in place of the standard one (py or
pyc). The standard module must be saved in order to be able to undo the
changes and go back to the normal behaviour.
The problem is that I don't know whether built-in functions like open
(or file) are written in Python or in C and, besides that, whether they
can be modified.
Other solutions that modify the source to be logged are not
solutions, because it is far simpler to introduce print statements
here and there...
Bye.
 
Robert Kern

Short answer: if you don't know stuff like this, then you probably
shouldn't mess around with the builtins in production code.

Depending on your operating system, there are probably programs that let
you list all of the files that have been opened on your system. For
example, on OS X, lsof(1) does the trick.

--
Robert Kern
(e-mail address removed)

"In the fields of hell where the grass grows high
Are the graves of dreams allowed to die."
-- Richard Harter
 
Fredrik Lundh

import sys
import __builtin__ # note: no plural s

old_open = __builtin__.open

def myopen(*args):
    code = sys._getframe(1).f_code
    print "OPEN", args, "FROM", code.co_name, "IN", code.co_filename
    return old_open(*args)

__builtin__.open = myopen

this only handles file opens that go via the "open" function, of course.
to handle all opens, including internal operations (e.g. imports), you're
probably better off using an external tool (e.g. strace, filemon, or
something similar).
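
For anyone reading this on Python 3, where `__builtin__` became `builtins` and print is a function, the same idea can be sketched as a context manager so that the patch is undone automatically (the original question asked for a temporary change). This is only a sketch under assumptions: the names `logged_open` and `records` are made up for illustration, and `sys._getframe` is a CPython implementation detail.

```python
import builtins
import contextlib
import os
import sys
import tempfile


@contextlib.contextmanager
def logged_open(records):
    """Temporarily replace builtins.open with a logging wrapper.

    `records` is any list; one tuple is appended per open() call.
    """
    real_open = builtins.open

    def wrapper(file, mode="r", *args, **kwargs):
        # Frame 1 is whoever called open(); frame 0 is this wrapper.
        caller = sys._getframe(1)
        module = caller.f_globals.get("__name__", "?")
        records.append((file, mode, module, caller.f_lineno))
        return real_open(file, mode, *args, **kwargs)

    builtins.open = wrapper
    try:
        yield records
    finally:
        # Always restore the original, even if the body raised.
        builtins.open = real_open


# Small demonstration on a throwaway file.
fd, demo_path = tempfile.mkstemp()
os.close(fd)

records = []
with logged_open(records):
    with open(demo_path, "w") as f:
        f.write("x")

os.remove(demo_path)
for file, mode, module, lineno in records:
    print("OPEN", file, mode, "FROM", module, "LINE", lineno)
```

As with the version above, this only sees calls that go through the `open` name; code that calls `io.open` or `os.open` directly bypasses it.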

</F>
 
Bengt Richter

Playing with this a little:

----< logopen.py >---------------------------------------
import sys, time

class OpenLogger(object):
    def __init__(self, logfile=sys.stdout):
        self.logfile = logfile
        self.old_open = __builtins__.open
    def open(self, path, *mode):
        openargs = ', '.join(map(repr, mode and [path, mode[0]] or [path]))
        tt = time.time()
        thdr = '%s.%02d' % (time.strftime('%Y-%m-%d %H:%M:%S', time.localtime(tt)), int(tt*100)%100)
        print >> self.logfile, '%s: open(%s) called from %r in "%s" line %s' % ((thdr, openargs) +
            [(f.f_code.co_name, f.f_code.co_filename, f.f_lineno) for f in [sys._getframe(1)]][0])
        return self.old_open(path, *mode)
    def on(self):
        __builtins__.open = self.open
    def off(self):
        __builtins__.open = self.old_open

def main(*logfile):
    logger = OpenLogger(*logfile)
    try:
        logger.on()
        script_to_monitor = sys.argv[1]
        sys.argv[0:] = sys.argv[1:]
        xdict = dict(__builtins__=__builtins__, __name__='__main__')
        execfile(script_to_monitor, xdict)
    finally:
        logger.off()

if __name__ == '__main__':
    if not sys.argv[2:]: raise SystemExit, """
Usage: [python] logopen.py [-log logfile] script_to_monitor [script_to_monitor args]"""
    logfile = sys.argv[1] == '-log' and sys.argv.pop(1) and [open(sys.argv.pop(1), 'a')] or []
    main(*logfile)
---------------------------------------------------------
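
The script above is Python 2 (`execfile`, `print >>`, `raise SystemExit, ...`). A rough Python 3 counterpart of the same "run another script with open() logged" idea might look like the sketch below. It is an assumption-laden rewrite, not Bengt's code: `run_with_open_log` is a made-up name, `runpy.run_path` stands in for `execfile`, and on older Python 3 versions `runpy` itself may read the script through `open`, which would add an extra log line.

```python
import builtins
import runpy
import sys
import time


def run_with_open_log(script, script_args=(), logfile=sys.stderr):
    """Run `script` as __main__ while logging every builtins.open call."""
    real_open = builtins.open

    def logging_open(*args, **kwargs):
        # Frame 1 is the code that called open().
        f = sys._getframe(1)
        stamp = time.strftime('%Y-%m-%d %H:%M:%S')
        shown = ', '.join([repr(a) for a in args] +
                          ['%s=%r' % kv for kv in kwargs.items()])
        print('%s: open(%s) called from %r in "%s" line %s'
              % (stamp, shown, f.f_code.co_name,
                 f.f_code.co_filename, f.f_lineno),
              file=logfile)
        return real_open(*args, **kwargs)

    saved_argv = sys.argv[:]
    # Make the monitored script see itself as argv[0], as logopen.py does.
    sys.argv = [script] + list(script_args)
    builtins.open = logging_open
    try:
        runpy.run_path(script, run_name='__main__')
    finally:
        # Undo both changes no matter how the script exits.
        builtins.open = real_open
        sys.argv = saved_argv
```

A call such as `run_with_open_log('pnlines.py', ['pnlines.py'])` would be the analogue of the command line used in the transcript, assuming `pnlines.py` has first been ported to Python 3.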

A script whose opens we can monitor, that opens a file from module and function scope:
----< pnlines.py >---------------------------------------
import sys
print '----< %r' % sys.argv[1]
for i, line in enumerate(open(sys.argv[1], 'r')):
    print '%4s: %s' %(i+1, line.rstrip())

def foo(nlines=1):
    for i, line in enumerate(open(sys.argv[1])): # test default 'r'
        print '%4s: %s' %(i+1, line.rstrip())
        if i+1 >= nlines: break
print '----< 3 lines of %r' % sys.argv[1]
foo(3)
---------------------------------------------------------

Result (tested only as far as you see here):

[ 0:23] C:\pywk\clp>del logopen.txt
Could Not Find C:\pywk\clp\logopen.txt

[ 0:23] C:\pywk\clp>py24 logopen.py

Usage: [python] logopen.py [-log logfile] script_to_monitor [script_to_monitor args]

Ok, so we pass pnlines.py as the file for it itself to print:

[ 0:23] C:\pywk\clp>py24 logopen.py -log logopen.txt pnlines.py pnlines.py
----< 'pnlines.py'
   1: import sys
   2: print '----< %r' % sys.argv[1]
   3: for i, line in enumerate(open(sys.argv[1], 'r')):
   4:     print '%4s: %s' %(i+1, line.rstrip())
   5:
   6: def foo(nlines=1):
   7:     for i, line in enumerate(open(sys.argv[1])): # test default 'r'
   8:         print '%4s: %s' %(i+1, line.rstrip())
   9:         if i+1 >= nlines: break
  10: print '----< 3 lines of %r' % sys.argv[1]
  11: foo(3)
  12:
----< 3 lines of 'pnlines.py'
   1: import sys
   2: print '----< %r' % sys.argv[1]
   3: for i, line in enumerate(open(sys.argv[1], 'r')):

[ 0:24] C:\pywk\clp>type logopen.txt
2005-05-15 00:24:06.55: open('pnlines.py', 'r') called from '?' in "pnlines.py" line 3
2005-05-15 00:24:06.55: open('pnlines.py') called from 'foo' in "pnlines.py" line 7

Maybe the OP can build on this and contribute back something more useful and tested ;-)

Regards,
Bengt Richter
 
Bengt Richter

this only handles file opens that go via the "open" function, of course.
to handle all opens, including internal operations (e.g. imports), you're
probably better off using an external tool (e.g. strace, filemon, or
something similar).
I should have mentioned that for my version of the same thing.

I wonder what the chances are for a hook to catch all opens.
My feeling is that file system access should be virtualized
to look like a unix directory tree, with all manner of duck-typed
file-system-like things mountable in the tree, and built-in open
would refer to the open of a particular virtually mounted file system
that might be configured to default as now. Anyway, lots of stuff
would become possible... e.g., msys, the MinGW-related shell provides
some of this capability, virtualizing windows partitions as /c/* /d/*
and so forth, as well as having virtual mounts of various subdir trees.
Good night...
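
On the question of a hook that catches all opens: much later Python versions (3.8 and up) added audit hooks, and an "open" audit event is raised by the built-in open, `io.open` and `os.open`, so it also sees opens that never go through the `open` name. The sketch below assumes Python 3.8+; `audit_hook` and `opened` are illustrative names. Note that an audit hook cannot be removed once installed, so it is not a temporary patch in the sense of the original question.

```python
import os
import sys
import tempfile

opened = []  # paths (or file descriptors) seen by the hook


def audit_hook(event, args):
    # For the "open" event, args is (path, mode, flags).
    if event == "open":
        opened.append(args[0])


# Hooks stay installed for the life of the interpreter.
sys.addaudithook(audit_hook)

# Demonstration: both mkstemp (via os.open) and open() trigger the event.
fd, demo_path = tempfile.mkstemp()
os.close(fd)
with open(demo_path) as f:
    f.read()
os.remove(demo_path)

print(demo_path in opened)
```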

Regards,
Bengt Richter
 
qwweeeit

Hi Robert,
Short answer: if you don't know stuff like this, then you probably
shouldn't mess around with the builtins in production code.
I'm beginning to get fed up with being treated like a child who can
only do damage...
But this time you are right...
So let's change the point of view: instead of trying to modify built-in
functions, it is better to write a small wrapper around the application
you want to log.
For example:
# app_wrapper.py
....insert here the excellent routine of Fredrik Lundh (thank you
Fredrik for your contribution!)
import app #name of the application to be logged.
In the test I ran, the output was:
OPEN ('pippo2',) FROM ? IN /home/qwweeeit/app.py
OPEN ('pippo3', 'w') FROM ? IN /home/qwweeeit/app.py
OPEN ('pippolong', 'w') FROM ? IN /home/qwweeeit/app.py
I have not yet tested whether multi-module applications report the
calling module.
I must also thank Bengt Richter, even if his suggestion is far too
complicated for me...
Concerning his second reply and the virtualization of file system
access (or the use of an external tool, as suggested by Fredrik), I
leave the matter to the experts.
Bye.
 
Steven D'Aprano

Hi Greg,
thanks for your reply, but I didn't succeed in any way. You must
consider, however, that I'm not a Python "expert"...

Can you post what you did and what results you got? Because Greg's trick
worked for me. See below.
IMHO, it must be a script that changes part of the interpreter and
substitutes a new module (py) in place of the standard one (py or
pyc). The standard module must be saved in order to be able to undo the
changes and go back to the normal behaviour. The problem is that I don't
know whether built-in functions like open (or file) are written in Python
or in C and, besides that, whether they can be modified.

Why do you think it matters if they are written in Python or C or any
other language for that matter? Almost everything in Python is a
first-class object. That means you can rebind functions, methods, classes,
and any other object.

You can't rebind statements like print. But open is just an object:

py> open
<type 'file'>
py> print open("Something.txt", "r").read()
some text in a file
py>
py> save_open = open
py>
py> def open(pathname, mode):
....     print "The pathname is: " + pathname
....     print "The mode is: " + mode
....     return save_open(pathname, mode)
....
py> contents = open("Something.txt", "r").read()
The pathname is: Something.txt
The mode is: r
py> contents
'some text in a file'
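
One caveat about rebinding at the prompt like this: the new `open` lives only in the namespace where it was defined, so other modules still find the original through the builtins. That may be why the first attempt appeared not to work, and it is why the `__builtin__` approach patches the builtins module itself. The sketch below illustrates the difference in Python 3 (where the module is called `builtins`); the two stand-in modules and all names in it are made up for the demonstration.

```python
import builtins
import os
import tempfile
import types

SOURCE = (
    "def read_text(path):\n"
    "    with open(path) as f:\n"
    "        return f.read()\n"
)


def make_module(name):
    # Build a throwaway module whose function calls open() by name.
    mod = types.ModuleType(name)
    exec(SOURCE, mod.__dict__)
    return mod


main_mod = make_module("main_mod")    # plays the role of the patched script
other_mod = make_module("other_mod")  # plays the role of an imported module

seen = []
real_open = builtins.open


def logged_open(*args, **kwargs):
    seen.append(args[0])
    return real_open(*args, **kwargs)


fd, path = tempfile.mkstemp()
os.close(fd)

# 1. Shadowing: rebinding the name in one module's globals only.
main_mod.open = logged_open
main_mod.read_text(path)   # logged
other_mod.read_text(path)  # NOT logged, falls through to builtins.open
shadow_count = len(seen)
del main_mod.open

# 2. Patching the builtins module: every module sees the wrapper.
builtins.open = logged_open
try:
    main_mod.read_text(path)
    other_mod.read_text(path)
finally:
    builtins.open = real_open
patched_count = len(seen) - shadow_count

os.remove(path)
print(shadow_count, patched_count)
```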

Other solutions
that modify the source to be logged are not solutions, because it is
far simpler to introduce print statements here and there... Bye.

Introducing print statements is good for quick-and-dirty debugging. For
more serious work, you should investigate the pdb debugger or the
logging module.
 
