Hey, get this! [was: import from database]

Steve Holden

This is even stranger: it makes it if I import the module a second time:

import dbimp as dbimp
import sys

if __name__ == "__main__":
    dbimp.install()
    #k = sys.modules.keys()
    #k.sort()
    #for kk in k:
        #print kk
    #import bsddb.db
    import a.b.c.d
    import smtplib
    import ftplib
    import fileinput
    try:
        print "first import"
        import bsddb
    except:
        print "second import"
        import bsddb
    print "Done!"

$ python -i test.py
dbimporter: item: *db* args: () keywords: {}
Accepted *db*
dbimporter: item: /c/steve/Projects/Python/dbimp args: () keywords: {}
dbimporter: item: /c/steve/Projects/Python/dbimp/c args: () keywords: {}
dbimporter: item: /c/steve/Projects/Python/dbimp/\code args: () keywords: {}
dbimporter: item: /usr/lib/python24.zip args: () keywords: {}
dbimporter: item: /usr/lib/python2.4 args: () keywords: {}
dbimporter: item: /usr/lib/python2.4/plat-cygwin args: () keywords: {}
dbimporter: item: /usr/lib/python2.4/lib-tk args: () keywords: {}
dbimporter: item: /usr/lib/python2.4/lib-dynload args: () keywords: {}
dbimporter: item: /usr/lib/python2.4/site-packages args: () keywords: {}
dbimporter: item: /usr/lib/python2.4/site-packages/a args: () keywords: {}
dbimporter: item: /usr/lib/python2.4/site-packages/a/b args: () keywords: {}
dbimporter: item: /usr/lib/python2.4/site-packages/a/b/c args: () keywords: {}
found smtplib in db
load_module: smtplib
found socket in db
load_module: socket
socket loaded: <module 'socket' from 'db:socket'> pkg: 0
found rfc822 in db
load_module: rfc822
rfc822 loaded: <module 'rfc822' from 'db:rfc822'> pkg: 0
found base64 in db
load_module: base64
base64 loaded: <module 'base64' from 'db:base64'> pkg: 0
found hmac in db
load_module: hmac
hmac loaded: <module 'hmac' from 'db:hmac'> pkg: 0
dbimporter: item: /usr/lib/python2.4/email args: () keywords: {}
found random in db
load_module: random
random loaded: <module 'random' from 'db:random'> pkg: 0
found quopri in db
load_module: quopri
quopri loaded: <module 'quopri' from 'db:quopri'> pkg: 0
smtplib loaded: <module 'smtplib' from 'db:smtplib'> pkg: 0
found ftplib in db
load_module: ftplib
dbimporter: item: /usr/lib/python2.4/site-packages/PIL args: () keywords: {}
dbimporter: item: /usr/lib/python2.4/site-packages/piddle args: () keywords: {}
ftplib loaded: <module 'ftplib' from 'db:ftplib'> pkg: 0
found fileinput in db
load_module: fileinput
fileinput loaded: <module 'fileinput' from 'db:fileinput'> pkg: 0
first import
found bsddb in db
load_module: bsddb
found weakref in db
load_module: weakref
weakref loaded: <module 'weakref' from 'db:weakref'> pkg: 0
second import
Done!

So it's clearly something pretty funky. It now "works" (for some value
of "work") with both MySQL and sqlite. I hope I have this sorted out
before PyCon ... I'm currently a bit confused!

regards
Steve
 
Peter Otten

Steve said:
This is even stranger: it makes it if I import the module a second time:

[second import seems to succeed]

Maybe you are experiencing some version confusion? What you describe looks
much like the normal Python 2.3 behaviour (with no import hook involved)
whereas you seem to operate on the 2.4 library.
A partially initialized module object is left behind in sys.modules and seen
by further import attempts.

$ cat arbitrary.py

import not_there

def f():
    print "you ain't gonna see that"

$ python
Python 2.3.3 (#1, Apr 6 2004, 01:47:39)
[GCC 3.3.3 (SuSE Linux)] on linux2
Type "help", "copyright", "credits" or "license" for more information.

Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "arbitrary.py", line 2, in ?
    import not_there
ImportError: No module named not_there

Traceback (most recent call last):
  File "<stdin>", line 1, in ?
AttributeError: 'module' object has no attribute 'f'

I have no experience with import hooks, but for normal imports that has been
fixed in Python 2.4:

$ py24
Python 2.4 (#5, Jan 4 2005, 10:14:01)
[GCC 3.3.3 (SuSE Linux)] on linux2
Type "help", "copyright", "credits" or "license" for more information.

Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "arbitrary.py", line 2, in ?
    import not_there
ImportError: No module named not_there

Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "arbitrary.py", line 2, in ?
    import not_there
ImportError: No module named not_there

Peter
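
For readers on a current interpreter, here is a minimal sketch of the check Peter describes. It is written for modern Python 3, not the 2.3/2.4 sessions in this thread, and the module name `arbitrary_demo` and the temporary directory are made up for illustration. It reports whether a module whose body raises during import is left behind in sys.modules.

```python
import importlib
import os
import sys
import tempfile


def failed_import_leaves_entry(modname="arbitrary_demo"):
    """Return True if a failed import leaves `modname` in sys.modules."""
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical module that fails part-way through, like arbitrary.py.
        with open(os.path.join(tmp, modname + ".py"), "w") as fh:
            fh.write("import not_there_at_all\n\n"
                     "def f():\n"
                     "    print('unreachable')\n")
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            try:
                importlib.import_module(modname)
            except ImportError:
                pass  # expected: the nested import cannot be satisfied
            return modname in sys.modules
        finally:
            sys.path.remove(tmp)
            sys.modules.pop(modname, None)


if __name__ == "__main__":
    print(failed_import_leaves_entry())
```

On interpreters that clean up after a failed import (the behaviour Peter attributes to 2.4 and later) the function returns False, so a second import re-runs the module body rather than picking up a half-initialized module.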
 
Steve Holden

Peter said:
Steve Holden wrote:

This is even stranger: it makes it if I import the module a second time:


[second import seems to succeed]

Maybe you are experiencing some version confusion? What you describe looks
much like the normal Python 2.3 behaviour (with no import hook involved)
whereas you seem to operate on the 2.4 library.
A partially initialized module object is left behind in sys.modules and seen
by further import attempts.
I agree that this is 2.3-like behavior, but Python cannot lie ...

sholden@dellboy ~/Projects/Python/dbimp
$ python
Python 2.4 (#1, Dec 4 2004, 20:10:33)
[GCC 3.3.3 (cygwin special)] on cygwin
Type "help", "copyright", "credits" or "license" for more information.
$ cat arbitrary.py

import not_there

def f():
    print "you ain't gonna see that"

$ python
Python 2.3.3 (#1, Apr 6 2004, 01:47:39)
[GCC 3.3.3 (SuSE Linux)] on linux2
Type "help", "copyright", "credits" or "license" for more information.

Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "arbitrary.py", line 2, in ?
    import not_there
ImportError: No module named not_there

Traceback (most recent call last):
  File "<stdin>", line 1, in ?
AttributeError: 'module' object has no attribute 'f'


I have no experience with import hooks, but for normal imports that has been
fixed in Python 2.4:

$ py24
Python 2.4 (#5, Jan 4 2005, 10:14:01)
[GCC 3.3.3 (SuSE Linux)] on linux2
Type "help", "copyright", "credits" or "license" for more information.

Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "arbitrary.py", line 2, in ?
    import not_there
ImportError: No module named not_there

Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "arbitrary.py", line 2, in ?
    import not_there
ImportError: No module named not_there


Peter

$ cat arbitrary.py
import not_there

def f():
    print "you ain't gonna see that"


sholden@dellboy ~/Projects/Python/dbimp
$ python
Python 2.4 (#1, Dec 4 2004, 20:10:33)
[GCC 3.3.3 (cygwin special)] on cygwin
Type "help", "copyright", "credits" or "license" for more information.

Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "arbitrary.py", line 1, in ?
    import not_there
ImportError: No module named not_there

Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "arbitrary.py", line 1, in ?
    import not_there
ImportError: No module named not_there

Yup, looks like 2.4 (despite this funny cygwin stuff; could that make a
difference?).

Let me try it under Windows [ferkle, ferkle ...]

Does the same thing there.

This problem also seems to depend what's already loaded. I wrote a
program to write a test program that looks like this:

import dbimp
dbimp.install()

print "Trying aifc"
try:
    import aifc
except:
    print "%Failed: aifc"


print "Trying anydbm"
try:
    import anydbm
except:
    print "%Failed: anydbm"


print "Trying asynchat"
try:
    import asynchat
except:
    print "%Failed: asynchat"

...
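
A generator for a test program of this shape could look roughly like the sketch below. It is not Steve's actual generator, which is not shown in the thread: the function name `make_test_program` is hypothetical, the list of module names is assumed to come from wherever the database was populated, and the emitted code uses Python 3 `print()` syntax rather than the 2.4 print statement above.

```python
def make_test_program(modnames):
    """Return source text that tries to import each module in turn."""
    lines = ["import dbimp", "dbimp.install()", ""]
    for name in modnames:
        lines += [
            'print("Trying %s")' % name,
            "try:",
            "    import %s" % name,
            "except Exception:",
            # "%%" produces a literal "%" so the marker matches the logs above.
            '    print("%%Failed: %s")' % name,
            "",
        ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(make_test_program(["aifc", "anydbm", "asynchat"]))
```

Catching `Exception` instead of using a bare `except:` keeps Ctrl-C usable during a long run; grepping the output for `%Failed:` then gives a list like the one further down.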

The two platforms give expectedly close results. I'm storing compiled
code, so a version incompatibility would be a problem, I agree, but I
have checked the program that loaded the database, and loaded it again
from the Windows source rather than the Cygwin source, just to see
whether there were any unnoticed platform dependencies. The results were
exactly the same using either library, and Windows and Cygwin showed
only minor variations.
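
On the version-incompatibility worry: marshalled code objects are only valid for the interpreter version that wrote them, so one option is to store a version tag next to each blob and check it on load. The sketch below is for modern Python 3 and is not part of dbimp; `pack_code` and `unpack_code` are hypothetical helper names, and `importlib.util.MAGIC_NUMBER` is the same tag the interpreter writes at the start of .pyc files.

```python
import importlib.util
import marshal


def pack_code(source, filename):
    """Compile `source` and return bytes tagged with the bytecode magic number."""
    code = compile(source, filename, "exec")
    return importlib.util.MAGIC_NUMBER + marshal.dumps(code)


def unpack_code(blob):
    """Return the stored code object, refusing blobs from another Python version."""
    magic = importlib.util.MAGIC_NUMBER
    if blob[:len(magic)] != magic:
        raise ImportError("stored bytecode was written by a different Python version")
    return marshal.loads(blob[len(magic):])


if __name__ == "__main__":
    namespace = {}
    exec(unpack_code(pack_code("x = 6 * 7", "db:example")), namespace)
    print(namespace["x"])
```

With a check like this a mismatched database fails with a clear message at load time instead of producing obscure errors later.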

exhaustCyg24.txt:%Failed: bsddb.dbtables
exhaustCyg24.txt:%Failed: bsddb.test.test_all
exhaustCyg24.txt:%Failed: bsddb.test.test_associate
exhaustCyg24.txt:%Failed: bsddb.test.test_basics
exhaustCyg24.txt:%Failed: bsddb.test.test_compat
exhaustCyg24.txt:%Failed: bsddb.test.test_dbobj
exhaustCyg24.txt:%Failed: bsddb.test.test_dbshelve
exhaustCyg24.txt:%Failed: bsddb.test.test_dbtables
exhaustCyg24.txt:%Failed: bsddb.test.test_env_close
exhaustCyg24.txt:%Failed: bsddb.test.test_get_none
exhaustCyg24.txt:%Failed: bsddb.test.test_join
exhaustCyg24.txt:%Failed: bsddb.test.test_lock
exhaustCyg24.txt:%Failed: bsddb.test.test_misc
exhaustCyg24.txt:%Failed: bsddb.test.test_queue
exhaustCyg24.txt:%Failed: bsddb.test.test_recno
exhaustCyg24.txt:%Failed: bsddb.test.test_thread
exhaustCyg24.txt:%Failed: tzparse
exhaustWin24.txt:%Failed: bsddb.dbtables
exhaustWin24.txt:%Failed: bsddb.test.test_all
exhaustWin24.txt:%Failed: bsddb.test.test_associate
exhaustWin24.txt:%Failed: bsddb.test.test_basics
exhaustWin24.txt:%Failed: bsddb.test.test_compat
exhaustWin24.txt:%Failed: bsddb.test.test_dbobj
exhaustWin24.txt:%Failed: bsddb.test.test_dbshelve
exhaustWin24.txt:%Failed: bsddb.test.test_dbtables
exhaustWin24.txt:%Failed: bsddb.test.test_env_close
exhaustWin24.txt:%Failed: bsddb.test.test_get_none
exhaustWin24.txt:%Failed: bsddb.test.test_join
exhaustWin24.txt:%Failed: bsddb.test.test_lock
exhaustWin24.txt:%Failed: bsddb.test.test_misc
exhaustWin24.txt:%Failed: bsddb.test.test_queue
exhaustWin24.txt:%Failed: bsddb.test.test_recno
exhaustWin24.txt:%Failed: bsddb.test.test_thread
exhaustWin24.txt:%Failed: pty
exhaustWin24.txt:%Failed: rlcompleter
exhaustWin24.txt:%Failed: tzparse

As a workaround I can for the moment just omit everything that shows the
least suspicion of not working from the database, but I'd really rather
know what's failing.

If you have any clues I'd be grateful.

regards
Steve
 
Steve Holden

Steve said:
Peter said:
Steve Holden wrote:

This is even stranger: it makes it if I import the module a second time:



[second import seems to succeed]

Maybe you are experiencing some version confusion? What you describe looks
much like the normal Python 2.3 behaviour (with no import hook involved)
whereas you seem to operate on the 2.4 library.
A partially initialized module object is left behind in sys.modules and seen
by further import attempts.
I agree that this is 2.3-like behavior, but Python cannot lie ...

sholden@dellboy ~/Projects/Python/dbimp
$ python
Python 2.4 (#1, Dec 4 2004, 20:10:33)
[GCC 3.3.3 (cygwin special)] on cygwin
Type "help", "copyright", "credits" or "license" for more information.

Just to make things simpler, and (;-) to appeal to a wider audience,
here is a program that doesn't use a database at all (it loads the entire
standard library into a dict) and still shows the error.

What *I* would like to know is: who is allowing the import of bsddb.os,
thereby somehow causing the code of the os library module to be run a
second time.

#
# Establish standard library in dict
#
import os
import glob
import sys
import marshal
import new

def importpy(dct, path, modname, package):
    c = compile(file(path).read(), path, "exec")
    dct[modname] = (marshal.dumps(c), package, path)
    if package:
        print "Package", modname, path
    else:
        print "Module", modname, path

def importall(dct, path, modlist):
    os.chdir(path)
    for f in glob.glob("*"):
        if os.path.isdir(f):
            fn = os.path.join(path, f, "__init__.py")
            if os.path.exists(fn):
                ml = modlist + [f]
                importpy(dct, fn, ".".join(ml), 1)
                importall(dct, os.path.join(path, f), ml)
        elif f.endswith('.py') and '.' not in f[:-3] and f != "__init__.py":
            importpy(dct, os.path.join(path, f),
                     ".".join(modlist+[f[:-3]]), 0)

class dbimporter(object):

    def __init__(self, item, *args, **kw):
        ##print "dbimporter: item:", item, "args:", args, "keywords:", kw
        if item != "*db*":
            raise ImportError
        print "Accepted", item

    def find_module(self, fullname, path=None):
        print "find_module:", fullname, "from", path
        if fullname not in impdict:
            #print "Bailed on", fullname
            return None
        else:
            print "found", fullname, "in db"
            return self

    def load_module(self, modname):
        print "load_module:", modname
        if modname in sys.modules:
            return sys.modules[modname]
        try:
            row = impdict[modname]
        except KeyError:
            #print modname, "not found in db"
            raise ImportError, "DB module %s not found in modules" % modname
        code, package, path = row
        code = marshal.loads(code)
        module = new.module(modname)
        sys.modules[modname] = module
        module.__name__ = modname
        module.__file__ = path # "db:%s" % modname
        module.__loader__ = dbimporter
        if package:
            module.__path__ = sys.path
        exec code in module.__dict__
        print modname, "loaded:", repr(module), "pkg:", package
        return module

def install():
    sys.path_hooks.append(dbimporter)
    sys.path_importer_cache.clear() # probably not necessary
    sys.path.insert(0, "*db*") # probably not needed with a meta-path hook?

if __name__ == "__main__":
    impdict = {}
    for path in sys.argv[1:]:
        importall(impdict, path, [])
    install()
    import bsddb

Running this against a copy of the Python 2.4 standard library in C:\Lib
gives me

[...]
Module _strptime C:\Lib\_strptime.py
Module _threading_local C:\Lib\_threading_local.py
Module __future__ C:\Lib\__future__.py
Accepted *db*
find_module: bsddb from None
found bsddb in db
load_module: bsddb
find_module: bsddb._bsddb from None
find_module: bsddb.sys from None
find_module: bsddb.os from None
find_module: bsddb.nt from None
find_module: bsddb.ntpath from None
find_module: bsddb.stat from None
Traceback (most recent call last):
  File "C:\Steve\Projects\Python\dbimp\dictload.py", line 79, in ?
    import bsddb
  File "C:\Steve\Projects\Python\dbimp\dictload.py", line 65, in load_module
    exec code in module.__dict__
  File "C:\Lib\bsddb\__init__.py", line 62, in ?
    import sys, os
  File "C:\Python24\lib\os.py", line 133, in ?
    from os.path import (curdir, pardir, sep, pathsep, defpath, extsep, altsep,
ImportError: No module named path

The 2.3 bsddb library doesn't cause the same problems (and even loads
into 2.4 quite nicely). Lots of modules *will* import, and most packages
don't seem to cause problems. Anyone give me a pointer here?

regards
Steve
 
Bernhard Herzog

Steve Holden said:
What *I* would like to know is: who is allowing the import of bsddb.os,
thereby somehow causing the code of the os library module to be run a
second time.

I would guess (without actually running the code) that this part is
responsible:

if package:
    module.__path__ = sys.path

You usually should initialize a package's __path__ to an empty list.
The __init__ module will take care of modifying it if necessary.

Bernhard
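
The effect Bernhard suspects can be reproduced in a small sketch. In Python 2, an `import os` inside a package first tried the package-relative name (here `bsddb.os`); with `__path__` set to `sys.path` that lookup finds the standard library's os.py and runs it a second time under the dotted name. Modern Python 3 no longer does implicit relative imports, so the sketch below asks for the dotted name explicitly. The package name `fakepkg_demo` and the choice of `colorsys` as a small pure-Python stdlib module are assumptions made for illustration.

```python
import importlib
import sys
import types


def import_stdlib_module_as_submodule(pkgname="fakepkg_demo", child="colorsys"):
    """Show that a package whose __path__ is sys.path 'contains' every module."""
    pkg = types.ModuleType(pkgname)
    pkg.__path__ = list(sys.path)  # the mistake discussed in the thread
    sys.modules[pkgname] = pkg
    try:
        original = importlib.import_module(child)
        # The submodule search walks pkg.__path__, i.e. all of sys.path,
        # finds the same source file and executes it again as a new module.
        duplicate = importlib.import_module(pkgname + "." + child)
        return original, duplicate
    finally:
        sys.modules.pop(pkgname + "." + child, None)
        sys.modules.pop(pkgname, None)


if __name__ == "__main__":
    original, duplicate = import_stdlib_module_as_submodule()
    print(original.__name__, duplicate.__name__, original is duplicate)
```

The two module objects are distinct even though they come from the same file, which is consistent with the `find_module: bsddb.os` lines in Steve's output.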
 
Steve Holden

Bernhard said:
I would guess (without actually running the code) that this part is
responsible:

if package:
    module.__path__ = sys.path

You usually should initialize a package's __path__ to an empty list.
The __init__ module will take care of modifying it if necessary.

Bernhard

Thanks, Bernhard, that appeared to fix the problem. The error certainly
went away, anyway ... now I can do more experimentation.

regards
Steve
 
Bernhard Herzog

Bernhard Herzog said:
You usually should initialize a package's __path__ to an empty list.

Actually, normally it's a list that contains the name of the package
directory as its only item. I'm not sure what you should do when you do
not import from a file system.

Bernhard
 
Just

Bernhard Herzog said:
Actually, normally it's a list that contains the name of the package
directory as its only item. I'm not sure what you should do when you do
not import from a file system.

If it's a path importer, it could be a cookie, specific to the importer.
I think in Steve's case initializing __path__ to ["*db*"] should work.

Just
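
A minimal sketch of Just's suggestion, translated to modern Python 3 (where path hooks return finders with `find_spec` rather than the PEP 302 `find_module`/`load_module` pair used in the thread). Everything here is illustrative: `DictFinder`, the `SOURCES` dict standing in for the database, and the `dbpkg` package are hypothetical names. The point is that a package loaded through the hook gets `__path__ = ["*db*"]`, so submodule lookups go back to the same importer and nowhere else.

```python
import importlib
import importlib.util
import sys

COOKIE = "*db*"  # path entry that only this hook accepts (name from the thread)

# Hypothetical in-memory "database": fullname -> (source, is_package)
SOURCES = {
    "dbpkg": ("VALUE = 1\n", True),
    "dbpkg.sub": ("from dbpkg import VALUE\nDOUBLE = VALUE * 2\n", False),
}


class DictFinder(object):
    """Path-entry finder and loader that serves modules from SOURCES."""

    def __init__(self, entry):
        if entry != COOKIE:
            raise ImportError("not a dict-importer path entry")

    def find_spec(self, fullname, target=None):
        if fullname not in SOURCES:
            return None
        is_pkg = SOURCES[fullname][1]
        spec = importlib.util.spec_from_loader(
            fullname, self, origin="db:" + fullname, is_package=is_pkg)
        if is_pkg:
            # The cookie, not sys.path: submodules are searched only here.
            spec.submodule_search_locations = [COOKIE]
        return spec

    def create_module(self, spec):
        return None  # use the default module object

    def exec_module(self, module):
        source = SOURCES[module.__name__][0]
        code = compile(source, module.__spec__.origin, "exec")
        exec(code, module.__dict__)


def install():
    sys.path_hooks.append(DictFinder)
    sys.path_importer_cache.pop(COOKIE, None)  # forget any cached miss
    sys.path.insert(0, COOKIE)


if __name__ == "__main__":
    install()
    sub = importlib.import_module("dbpkg.sub")
    print(sub.DOUBLE, sys.modules["dbpkg"].__path__)
```

Because the cookie is not a directory, the built-in file and zip hooks decline it and only `DictFinder` claims it.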
 
Steve Holden

Just said:
Actually, normally it's a list that contains the name of the package
directory as its only item. I'm not sure what you should do when you do
not import from a file system.


If it's a path importer, it could be a cookie, specific to the importer.
I think in Steve's case initializing __path__ to ["*db*"] should work.

Just

And that's exactly the conclusion I came to when import of the package's
submodules didn't work as anticipated.

Coming to the question of writing a custom importer from the
documentation I discovered there is a huge amount of layered cruft in
the import scheme going all the way back to the days of the "ni" module.
It took me two aborted attempts just to realize I should be using PEP
302 and not ihooks or some wrapper around __import__().

While this may be interesting history it's very confusing, and I'm
encouraging Alex Martelli to describe the current PEP-302-based scheme a
little more fully in his forthcoming revision to the Nutshell. The PEP
is just a little terse in places, I feel.

I'm also wondering if the inspect module shouldn't have a facility to
hook into custom importers, since its code is pretty much filestore
based at present. This should probably be via an *optional* API to avoid
breakage in existing custom importers.

regards
Steve
 
Just

Steve Holden said:
Just said:
If it's a path importer, it could be a cookie, specific to the importer.
I think in Steve's case initializing __path__ to ["*db*"] should work.

Just

And that's exactly the conclusion I came to when import of the package's
submodules didn't work as anticipated.

Coming to the question of writing a custom importer from the
documentation I discovered there is a huge amount of layered cruft in
the import scheme going all the way back to the days of the "ni" module.
It took me two aborted attempts just to realize I should be using PEP
302 and not ihooks or some wrapper around __import__().

Yes. PEP 302 came about when I tried to reimplement PEP 273 (zip import)
in a "sane" way ("saner" is probably as far as I got...).

Import is indeed very confusing, especially because of packages. I tried
to convince Guido at some point to simplify package imports by getting
rid of __path__ altogether (and then simply search for
"packagename/submodule.py" on sys.path) but he disliked the idea of
widening submodule imports that much. On the other hand, __path__ is a
mutable list so people can get the same effect by adding stuff to it.

Steve Holden said:
While this may be interesting history it's very confusing, and I'm
encouraging Alex Martelli to describe the current PEP-302-based scheme a
little more fully in his forthcoming revision to the Nutshell. The PEP
is just a little terse in places, I feel.

Yeah, it's a PEP, not official documentation, but since there isn't any
official documentation, all you've got is the PEP :(

Steve Holden said:
I'm also wondering if the inspect module shouldn't have a facility to
hook into custom importers, since its code is pretty much filestore
based at present. This should probably be via an *optional* API to avoid
breakage in existing custom importers.

Probably.

Just
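
Just's remark that __path__ is a mutable list can be sketched as follows, again for modern Python 3. The package name `mutpkg_demo`, the `extra` directory and the `plugin` module are all made up for the example: appending a directory to an already-imported package's `__path__` makes modules in that directory importable as submodules.

```python
import importlib
import os
import sys
import tempfile


def demo_mutable_path():
    """Import a submodule from a directory added to a package's __path__."""
    with tempfile.TemporaryDirectory() as root:
        pkg_dir = os.path.join(root, "mutpkg_demo")
        extra = os.path.join(root, "extra")
        os.makedirs(pkg_dir)
        os.makedirs(extra)
        open(os.path.join(pkg_dir, "__init__.py"), "w").close()
        with open(os.path.join(extra, "plugin.py"), "w") as fh:
            fh.write("ANSWER = 42\n")
        sys.path.insert(0, root)
        importlib.invalidate_caches()
        try:
            pkg = importlib.import_module("mutpkg_demo")
            pkg.__path__.append(extra)  # widen the submodule search
            plugin = importlib.import_module("mutpkg_demo.plugin")
            return plugin.ANSWER
        finally:
            sys.path.remove(root)
            for name in ("mutpkg_demo.plugin", "mutpkg_demo"):
                sys.modules.pop(name, None)


if __name__ == "__main__":
    print(demo_mutable_path())
```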
 
