Proposal: Inline Import

Shane Hathaway

Here's a heretical idea.

I'd like a way to import modules at the point where I need the
functionality, rather than remember to import ahead of time. This might
eliminate a step in my coding process. Currently, my process is I
change code and later scan my changes to make matching changes to the
import statements. The scan step is error prone and time consuming.
By importing inline, I'd be able to change code without the extra scan step.

Furthermore, I propose that the syntax for importing inline should be an
expression containing a dot followed by an optionally dotted name. For
example:

name_expr = .re.compile('[a-zA-Z]+')

The expression on the right causes "re.compile" to be imported before
calling the compile function. It is similar to:

from re import compile as __hidden_re_compile
name_expr = __hidden_re_compile('[a-zA-Z]+')

The example expression can be present in any module, regardless of
whether the module imports the "re" module or assigns a different
meaning to the names "re" or "compile".

I also propose that inline import expressions should have no effect on
local or global namespaces, nor should inline import be affected by
local or global namespaces. If users want to affect a namespace, they
must do so with additional syntax that explicitly assigns a name, such as:

compile = .re.compile

In the interest of catching errors early, it would be useful for the
Python parser to produce byte code that performs the actual import upon
loading modules containing inline import expressions. This would catch
misspelled module names early. If the module also caches the imported
names in a dictionary, there would be no speed penalty for importing
inline rather than importing at the top of the module.

I believe this could help many aspects of the language:

- The coding workflow will improve, as I mentioned.

- Code will become more self-contained. Self-contained code is easier
to move around or post as a recipe.

- There will be less desire for new builtins, since modules will be just
as accessible as builtins.

Thoughts?

Shane
 
Mike Meyer

Shane Hathaway said:
Here's a heretical idea.

Not really.

I'd like a way to import modules at the point where I need the
functionality, rather than remember to import ahead of time. This
might eliminate a step in my coding process. Currently, my process is
I change code and later scan my changes to make matching changes to
the import statements. The scan step is error prone and time
consuming. By importing inline, I'd be able to change code without the
extra scan step.

As others have pointed out, you can fix your process. That your
process doesn't work well with Python isn't a good reason for changing
Python.

Furthermore, I propose that the syntax for importing inline should be
an expression containing a dot followed by an optionally dotted name.
For example:

name_expr = .re.compile('[a-zA-Z]+')

The expression on the right causes "re.compile" to be imported before
calling the compile function. It is similar to:

from re import compile as __hidden_re_compile
name_expr = __hidden_re_compile('[a-zA-Z]+')

The example expression can be present in any module, regardless of
whether the module imports the "re" module or assigns a different
meaning to the names "re" or "compile".

It's actually an intriguing idea, but I'm going to have to give it a
-1.

The thing is, it's really only useful in a corner case - where you
want to refer to a module exactly once in your code. I'd hate to see
code that skips doing the import to do .re.compile a half-dozen times
instead of importing the name properly. In fact, avoiding this
situation just makes your process even worse: you'd have to go back
and scan for multiple uses of .module and fix the import statements.

I also propose that inline import expressions should have no effect on
local or global namespaces, nor should inline import be affected by
local or global namespaces. If users want to affect a namespace, they
must do so with additional syntax that explicitly assigns a name, such
as:
compile = .re.compile

You can do this now, with "from re import compile".

In the interest of catching errors early, it would be useful for the
Python parser to produce byte code that performs the actual import
upon loading modules containing inline import expressions. This would
catch misspelled module names early. If the module also caches the
imported names in a dictionary, there would be no speed penalty for
importing inline rather than importing at the top of the module.

No. No, no and no. That's -4, for those keeping count.

This is different from the semantics of the import statement, which
means your examples above are broken. This is a bad thing.

This means that some expressions are "magic", in that they are
automatically evaluated at compile time instead of execution
time. This is a bad thing.

I'm not sure what your cache would do. Modules are already cached in
sys.modules. Your implicit import would have to look up the module
name just like a regular import. The name following then has to be
looked up in that module - both of which are dictionary lookups. What
would you cache - and where - that would noticeably improve on that?

I believe this could help many aspects of the language:
- The coding workflow will improve, as I mentioned.

I think it'll get worse.

- Code will become more self-contained. Self-contained code is easier
to move around or post as a recipe.

If you really want to do this, use __import__.
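As a minimal sketch of that suggestion: the built-in __import__ returns the module object without binding any name in the current namespace, so the regular expression from the proposal can be compiled inline as follows.

```python
# __import__ returns the module object; no name such as "re" is bound here.
name_expr = __import__("re").compile("[a-zA-Z]+")

words = name_expr.findall("inline import, 2005")
```

Note that for dotted names __import__ returns the top-level package, so this spelling only works directly for top-level modules.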
- There will be less desire for new builtins, since modules will be
just as accessible as builtins.

I don't see people asking for new builtins often enough to think that
this is a problem that needs fixing.

<mike
 
Erik Max Francis

Shane said:
I'd like a way to import modules at the point where I need the
functionality, rather than remember to import ahead of time.

You can already do this; import statements don't have to be at the top
of a Python script. This proposal is pretty much dead on arrival.
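A minimal sketch of an import placed at the point of use (the function name extract_words is made up for illustration):

```python
def extract_words(text):
    # The import statement runs when the function is called, not when the
    # enclosing module is loaded. It binds "re" as a local name.
    import re
    return re.findall(r"[a-zA-Z]+", text)

result = extract_words("inline import")
```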
 
Shane Hathaway

Xavier said:
Shane said:
Thoughts?
import re; name_expr = re.compile('[a-zA-Z]+')
name_expr

the import statement can be called anywhere in the code, why would you
add strange syntactic sugar that doesn't actually bring anything?

That syntax is verbose and avoided by most coders because of the speed
penalty. It doesn't replace the idiom of importing everything at the
top of the module.

What's really got me down is the level of effort required to move code
between modules. After I cut 100 lines from a 500 line module and paste
them to a different 500 line module, I have to examine every import in
both modules as well as examine the code I moved for missing imports.
And I still miss a lot of cases. My test suite catches a lot of the
mistakes, but it can't catch everything.

If I could just avoid import statements altogether, moving code would be
easier, regardless of extra typing. But I can't avoid import statements
unless there's a different way to import that lots of people like.

Shane
 
Shane Hathaway

Mike said:
As others have pointed out, you can fix your process. That your
process doesn't work well with Python isn't a good reason for changing
Python.

Do you have any ideas on how to improve the process of maintaining
imports? Benji's suggestion of jumping around doesn't work for moving
code and it interrupts my train of thought. Sprinkling the code with
import statements causes a speed penalty and a lot of clutter.

I'm actually quite surprised that others aren't bothered by the process
of maintaining imports. Perhaps the group hasn't spent time in Eclipse
to see what a relief it is to have imports managed for you. The
difference isn't enough to make anyone jump ship to Java, but it's a
real improvement.

Shane
 
Mike Meyer

Shane Hathaway said:
Xavier said:
Shane said:
Thoughts?
import re; name_expr = re.compile('[a-zA-Z]+')
name_expr
the import statement can be called anywhere in the code, why would
you add strange syntactic sugar that doesn't actually bring anything?
That syntax is verbose and avoided by most coders because of the speed
penalty.

What speed penalty? "import re" is a cheap operation, every time but
the first one in a program.

What's really got me down is the level of effort required to move code
between modules. After I cut 100 lines from a 500 line module and
paste them to a different 500 line module, I have to examine every
import in both modules as well as examine the code I moved for missing
imports.

Has it ever occurred to you that if you're cutting and pasting 500 line
blocks, you're doing something fundamentally wrong? One of the points
of modules and OO is that you don't *have* to do things like
that. Cut-n-paste means you wind up with two copies of the code to
maintain, so that bug fixes in one will have to be propagated to the
other "by hand". Rather than spend time fixing what you broke by
yanking the code out of its context, you'd be better off refactoring
the code so you could use it in context. That'll cut down on the
maintenance in the future, and may well mean that the next time
someone needs the code, it'll already be properly refactored so they
can use it directly, without having to cut-n-paste-n-fix it again.

<mike
 
Shane Hathaway

Mike said:
What speed penalty? "import re" is a cheap operation, every time but
the first one in a program.

I'm talking about using imports *everywhere*. The penalty would be
appreciable.

Has it ever occurred to you that if you're cutting and pasting 500 line
blocks, you're doing something fundamentally wrong? One of the points
of modules and OO is that you don't *have* to do things like
that. Cut-n-paste means you wind up with two copies of the code to
maintain, so that bug fixes in one will have to be propagated to the
other "by hand". Rather than spend time fixing what you broke by
yanking the code out of its context, you'd be better off refactoring
the code so you could use it in context. That'll cut down on the
maintenance in the future, and may well mean that the next time
someone needs the code, it'll already be properly refactored so they
can use it directly, without having to cut-n-paste-n-fix it again.

I said cut and paste, not copy and paste. I'm moving code, not copying
it. Your advice is correct but doesn't apply to this problem.

Shane
 
Kent Johnson

Shane said:
I'm talking about using imports *everywhere*. The penalty would be
appreciable.

Have you tried it?

D:\Projects\CB>python -m timeit -s "import re" "import re"
1000000 loops, best of 3: 1.36 usec per loop

You need a lot of imports before 1 usec becomes "appreciable". And your
proposal is doing the import anyway, just under the hood. How will you
avoid the same penalty?

Kent
 
Mike Meyer

Shane Hathaway said:
I'm talking about using imports *everywhere*. The penalty would be
appreciable.

As Kent shows, it wouldn't. Are you sure you understand what import
really does?

I said cut and paste, not copy and paste. I'm moving code, not
copying it. Your advice is correct but doesn't apply to this problem.

In that case, dealing with imports is a minor part of your
problem. You have to check for every name in the global name space in
both the old and new files to make sure they get defined properly.

<mike
 
Stephen Prinster

Shane said:
Do you have any ideas on how to improve the process of maintaining
imports? Benji's suggestion of jumping around doesn't work for moving
code and it interrupts my train of thought. Sprinkling the code with
import statements causes a speed penalty and a lot of clutter.

I'm actually quite surprised that others aren't bothered by the process
of maintaining imports. Perhaps the group hasn't spent time in Eclipse
to see what a relief it is to have imports managed for you. The
difference isn't enough to make anyone jump ship to Java, but it's a
real improvement.

Shane

Have you looked at py lib? Particularly the py.std hook?

http://codespeak.net/py/current/doc/misc.html

It's not exactly what you want, but it might help you. I must agree
with everyone else, though. I have never felt a need for what you are
describing.

Steve Prinster
 
Shane Hathaway

Kent said:
Have you tried it?

D:\Projects\CB>python -m timeit -s "import re" "import re"
1000000 loops, best of 3: 1.36 usec per loop

You need a lot of imports before 1 usec becomes "appreciable".

Let me fully elaborate the heresy I'm suggesting: I am talking about
inline imports on every other line of code. The obvious implementation
would drop performance by a double digit percentage.

And your
proposal is doing the import anyway, just under the hood. How will you
avoid the same penalty?

The more complex implementation, which I suggested in the first message,
is to maintain a per-module dictionary of imported objects (distinct
from the global namespace.) This would make inline imports have almost
exactly the same runtime cost as a global namespace lookup.

But never mind, this proposal is a distraction from the real issue. See
the next thread I'm starting.

Shane
 
Mike Meyer

Shane Hathaway said:
Let me fully elaborate the heresy I'm suggesting: I am talking about
inline imports on every other line of code. The obvious
implementation would drop performance by a double digit percentage.

No, it wouldn't. The semantics of import pretty much require that the
drop in performance would most likely be negligible.

The more complex implementation, which I suggested in the first
message, is to maintain a per-module dictionary of imported objects
(distinct from the global namespace.) This would make inline imports
have almost exactly the same runtime cost as a global namespace lookup.

If you put an import near every reference to a module, then each
import would "have almost exactly the same runtime cost as a global
namespace lookup." Your per-module dictionary of imported objects
doesn't represent a significant improvement in module lookup time.
The extra cost comes from having to look up the module in the
namespace after you import it. However, the actual import has at most
the same runtime cost as looking up the module name, and may cost
noticeably less. These costs will be swamped by the lookup cost for
non-module symbols in most code. If looking up some symbols is a
noticeable part of your run-time, the standard fix is to bind the
objects you are finding into your local namespace. Import already
allows this, with "from foo import bar". That will make references to
the name run as much faster than your proposed inline import as that
runs faster than doing an import before every line that references a
module.
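A small sketch of the binding options mentioned above, using math.sqrt purely as a stand-in for "foo" and "bar" (the function names are made up for illustration, and no timings are claimed here):

```python
import math
from math import sqrt  # binds the function directly as a module-level name

def with_attribute_lookup(values):
    # Each call looks up the global "math", then the attribute "sqrt" on it.
    return [math.sqrt(v) for v in values]

def with_bound_name(values):
    # Each call looks up only the global name "sqrt".
    return [sqrt(v) for v in values]

def with_local_name(values):
    # The name is bound in the function's local namespace.
    from math import sqrt as local_sqrt
    return [local_sqrt(v) for v in values]
```

All three return the same results; they differ only in how many name lookups each reference to the function requires.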

In summary, the performance hit from doing many imports may be
significant compared to the cost of only doing one import, but that still
represents only a small fraction of the total runtime of most code. In
the cases where that isn't the case, we already have a solution
available with better performance than any of the previously discussed
methods.

<mike
 
Thomas Heller

Shane Hathaway said:
Xavier said:
Shane said:
Thoughts?
import re; name_expr = re.compile('[a-zA-Z]+')
name_expr
the import statement can be called anywhere in the code, why would
you add strange syntactic sugar that doesn't actually bring anything?

That syntax is verbose and avoided by most coders because of the speed
penalty. It doesn't replace the idiom of importing everything at the
top of the module.

What's really got me down is the level of effort required to move code
between modules. After I cut 100 lines from a 500 line module and
paste them to a different 500 line module, I have to examine every
import in both modules as well as examine the code I moved for missing
imports. And I still miss a lot of cases. My test suite catches a lot
of the mistakes, but it can't catch everything.

I understand this use case.

You can use pychecker to find NameErrors without actually running the
code. Unfortunately, it doesn't (at least not always) find imports
which are not needed.
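The sketch below is not pychecker and does not reproduce its behaviour. It is a rough illustration, using only the standard ast module, of how imports that are never referenced could be listed without running the code. The function name unused_imports is made up, and the check ignores scoping, star imports and names re-exported via __all__.

```python
import ast

def unused_imports(source):
    """Return names bound by import statements that are never read."""
    tree = ast.parse(source)
    imported = set()
    used = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # "import a.b" binds the top-level name "a".
                imported.add(alias.asname or alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            for alias in node.names:
                imported.add(alias.asname or alias.name)
        elif isinstance(node, ast.Name) and isinstance(node.ctx, ast.Load):
            used.add(node.id)
    return sorted(imported - used)

sample = "import os\nimport re\nname_expr = re.compile('[a-zA-Z]+')\n"
leftover = unused_imports(sample)
```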

Thomas
 
Erik Max Francis

Shane said:
Let me fully elaborate the heresy I'm suggesting: I am talking about
inline imports on every other line of code. The obvious implementation
would drop performance by a double digit percentage.

Module importing is already idempotent. If you try to import an
already-imported module, inline or not, the second (or subsequent)
imports are no-operations.
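A minimal sketch of that behaviour, using the standard json module as an arbitrary example: a repeated import returns the module object already stored in sys.modules instead of executing the module again.

```python
import sys
import json as first

def reimport():
    import json  # already loaded, so the module code is not executed again
    return json

second = reimport()
same_object = second is first and sys.modules["json"] is first
```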
 
Alex Martelli

Erik Max Francis said:
Module importing is already idempotent. If you try to import an
already-imported module, inline or not, the second (or subsequent)
imports are no-operations.

Hmmm, yes, but they're rather SLOW no-operations...:

Helen:~ alex$ python -mtimeit -s'import sys' 'import sys'
100000 loops, best of 3: 3.52 usec per loop

Now this is just a humble ultralight laptop, to be sure, but still, to
put the number in perspective...:

Helen:~ alex$ python -mtimeit -s'import sys' 'sys=23'
10000000 loops, best of 3: 0.119 usec per loop

....we ARE talking about a factor of 30 or so slower than elementary
assignments (I'm wondering whether this may depend on import hooks, or,
what else...).
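The same comparison can be run from a script with the standard timeit module. This is only a sketch of the measurement; the loop count is an arbitrary choice and the numbers will vary by machine and Python version.

```python
import timeit

LOOPS = 100000

# Both statements run after "sys" has already been imported in the setup step.
reimport = timeit.timeit("import sys", setup="import sys", number=LOOPS)
assign = timeit.timeit("sys = 23", setup="import sys", number=LOOPS)

print("re-import:  %.3f usec per loop" % (reimport / LOOPS * 1e6))
print("assignment: %.3f usec per loop" % (assign / LOOPS * 1e6))
```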


Alex
 
Bengt Richter

Here's a heretical idea.

I'd like a way to import modules at the point where I need the
functionality, rather than remember to import ahead of time. This might
eliminate a step in my coding process. Currently, my process is I
change code and later scan my changes to make matching changes to the
import statements. The scan step is error prone and time consuming.
By importing inline, I'd be able to change code without the extra scan step.

Furthermore, I propose that the syntax for importing inline should be an
expression containing a dot followed by an optionally dotted name. For
example:

name_expr = .re.compile('[a-zA-Z]+')

The expression on the right causes "re.compile" to be imported before
calling the compile function. It is similar to:

from re import compile as __hidden_re_compile
name_expr = __hidden_re_compile('[a-zA-Z]+')

The example expression can be present in any module, regardless of
whether the module imports the "re" module or assigns a different
meaning to the names "re" or "compile".

I also propose that inline import expressions should have no effect on
local or global namespaces, nor should inline import be affected by
local or global namespaces. If users want to affect a namespace, they
must do so with additional syntax that explicitly assigns a name, such as:

compile = .re.compile

Are you willing to type a one-letter prefix to your .re ? E.g.,

>>> class I(object):
...     def __getattr__(self, attr):
...         return __import__(attr)
...
>>> I = I()
>>> name_expr = I.re.compile('[a-zA-Z+]')
>>> name_expr
>>> compile = I.re.compile
>>> compile
>>> pi = I.math.pi
>>> pi
3.1415926535897931
>>> I.math.sin(pi/6)
0.49999999999999994

Of course it does cost you some overhead that you could avoid.

In the interest of catching errors early, it would be useful for the
Python parser to produce byte code that performs the actual import upon
loading modules containing inline import expressions. This would catch
misspelled module names early. If the module also caches the imported
names in a dictionary, there would be no speed penalty for importing
inline rather than importing at the top of the module.

I believe this could help many aspects of the language:

- The coding workflow will improve, as I mentioned.

- Code will become more self-contained. Self-contained code is easier
to move around or post as a recipe.

- There will be less desire for new builtins, since modules will be just
as accessible as builtins.

Thoughts?

There are special caveats re imports in threads, but otherwise
I don't know of any significant downsides to importing at various
points of need in the code. The actual import is only done the first time,
so it's effectively just a lookup in sys.modules from there on.
Am I missing something?

Regards,
Bengt Richter
 
Robert Kern

Bengt said:
Are you willing to type a one-letter prefix to your .re ? E.g.,
>>> class I(object):
...     def __getattr__(self, attr):
...         return __import__(attr)
[snip]

There are special caveats re imports in threads, but otherwise
I don't know of any significant downsides to importing at various
points of need in the code. The actual import is only done the first time,
so it's effectively just a lookup in sys.modules from there on.
Am I missing something?

Packages.
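A short sketch of why packages are a problem for a bare __import__(attr), using the standard xml.dom package as an arbitrary example: for a dotted name, __import__ returns the top-level package, whereas importlib.import_module returns the submodule itself.

```python
import importlib

top = __import__("xml.dom")                 # returns the top-level package "xml"
leaf = importlib.import_module("xml.dom")   # returns the submodule "xml.dom"

top_name = top.__name__
leaf_name = leaf.__name__
```

So an attribute-based wrapper has to walk from the top-level package down to the submodule with getattr, or use importlib.import_module instead.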

--
Robert Kern

"In the fields of hell where the grass grows high
Are the graves of dreams allowed to die."
-- Richard Harter
 
Bengt Richter

Bengt said:
Are you willing to type a one-letter prefix to your .re ? E.g.,
>>> class I(object):
...     def __getattr__(self, attr):
...         return __import__(attr)
[snip]

There are special caveats re imports in threads, but otherwise
I don't know of any significant downsides to importing at various
points of need in the code. The actual import is only done the first time,
so it's effectively just a lookup in sys.modules from there on.
Am I missing something?

Packages.

Ok, if you're willing to add a trailing '._' to indicate the end of a package path,
and start it with a P instead of an I, you could try the following (just a hack, not tested
beyond what you see (again ;-) ):

----< impexpr.py >--------------------
class I(object):
    __cache = {}
    def __getattr__(self, attr, cache = __cache):
        try: return cache[attr]
        except KeyError:
            cache[attr] = ret = __import__(attr)
            return ret
    getdotted = __getattr__

class P(I):
    def __init__(self):
        self.elems = []
    def __getattr__(self, attr):
        if attr == '_':
            dotted = '.'.join(self.elems)
            mod = self.getdotted(dotted)
            for attr in self.elems[1:]:
                mod = getattr(mod, attr)
            self.elems = []
            return mod
        else:
            self.elems.append(attr)
            return self

P, I = P(), I()
--------------------------------------
>>> from ut.impexpr import I, P
>>> I.math.pi
3.1415926535897931
>>> I.os.path.isfile
>>> P.ut.miscutil._.prb
>>> type(I)._I__cache.keys()
['ut.miscutil', 'os', 'math']
>>> P.ut.miscutil._.disex
>>> type(I)._I__cache.keys()
['ut.miscutil', 'os', 'math']
<module 'ut' from 'c:\pywk\ut\__init__.pyc'>

I am not recommending this particularly. I just like to see how close
python already is to allowing the spelling folks initially think requires
a language change ;-)

Regards,
Bengt Richter
 
bonono

Shane said:
Do you have any ideas on how to improve the process of maintaining
imports? Benji's suggestion of jumping around doesn't work for moving
code and it interrupts my train of thought. Sprinkling the code with
import statements causes a speed penalty and a lot of clutter.

Is using import near where you need it really such a hit to speed? Maybe
it is critical inside a function (which may be called in a loop), but what
about something like this:

import <my needed modules for this_func>
def this_func():
    use it here

This way, it is no different from a @decorator in terms of "moving
code around": just copy the extra import/@decorator lines along with the
function if you move it to a new module. In terms of speed, the import is
part of the module rather than the function, so the penalty is paid only
when other code imports this module (once or several times, but still not
the same as calling a function in a loop). That is no different from
putting everything at the top of the file (well, a bit, but it should be
negligible).
 
