Making code run in both source tree and installation path

Javier Collado

Hello,

I would like to be able to run the main script of a Python project
from both the source tree and the path in which it's installed on
Ubuntu. The script, among other things, imports a package which in
turn makes use of some data files containing metadata that is needed
in XML format.

The source tree has a structure such as this one:
setup.py
debian/ (packaging files)
src/ (source code)
src/lib (package files)
src/data (data files)
src/bin (main script)

However, when the project is installed using setup.py install, the
directory structure is approximately as follows:
/usr/local/bin (main script)
/usr/local/share/<project_name> (data files)
/usr/local/lib/python2.x/dist-packages/<project_name> (library files)

When the code is installed through a package, the structure is the
same, but without "local".

Hence, the data files aren't always in the same relative directories;
it depends on whether we're executing code from the source tree or from
the installation. To make it possible to run the code from both places,
I've seen different approaches:
- distutils trick in setup.py to modify the installed script (i.e.
changing a global variable value) so that it has a reference to the
data files location.
- Heuristic in the package code to detect when it's being executed
from the source tree and when it has been installed
- Just using an environment variable that the user must set according
to his needs

I guess that there are other options, for example, maybe using
buildout. What would you say is the best/most elegant option to
solve this problem?

Best regards,
Javier
 
Lawrence D'Oliveiro

Javier said:
- distutils trick in setup.py to modify the installed script (i.e.
changing a global variable value) so that it has a reference to the
data files location.

This seems to me to be the cleanest solution, at least as a default.
- Heuristic in the package code to detect when it's being executed
from the source tree and when it has been installed

By definition, a "heuristic" can never be fully reliable.
- Just using an environment variable that the user must set according
to his needs

This can be useful as a way to override default settings. But requiring it
means extra trouble for the user.

For configurable settings, best to observe a hierarchy like the following
(from highest to lowest priority):

* command-line options
* environment-variable settings
* user prefs in ~/.whatever
* system configs in /etc
* common data in /usr/share
* hard-coded
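
As a rough illustration of that hierarchy, here is a minimal sketch of a lookup that walks the layers from highest to lowest priority. The function name resolve_setting, the MYAPP_ prefix, and the idea of passing the already-parsed config layers as dictionaries are all assumptions made for the example, not anything from an existing library.

```python
import os


def resolve_setting(name, cli_value=None, environ=None, layers=(), default=None):
    """Return the highest-priority value available for a setting.

    Priority (highest first): command-line option, environment variable,
    then each mapping in `layers` in order (e.g. user prefs, /etc config,
    /usr/share data), and finally the hard-coded default.
    """
    environ = os.environ if environ is None else environ

    # 1. Command-line option wins if it was given.
    if cli_value is not None:
        return cli_value

    # 2. Environment variable; the MYAPP_ prefix is a hypothetical convention.
    env_key = "MYAPP_" + name.upper()
    if env_key in environ:
        return environ[env_key]

    # 3. Config layers, already parsed into dicts by the caller.
    for layer in layers:
        if name in layer:
            return layer[name]

    # 4. Hard-coded fallback.
    return default
```

A caller would typically build layers as [user_prefs, system_config, shared_data] after reading each file, and pass the value parsed from the command line (or None) as cli_value.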
 
ryles

It's kludgy, but one option may be to use __file__ to figure out where
the script is installed. Something like
os.path.dirname(os.path.abspath(__file__)) could tell you whether it's
in src/ or in the bin/ directory, and then the data files could be found
in the appropriate place.

I like the distutils/variable option better. Your script is more
likely to still behave correctly when copied to another directory.
Plus its code definitely remains cleaner.
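
For reference, the __file__ approach could look roughly like the sketch below. It assumes the layout from the original post (src/bin next to src/data in the source tree, and <prefix>/bin with <prefix>/share/<project_name> when installed); the function name guess_data_dir and the default project name "myproject" are made up for the example.

```python
import os


def guess_data_dir(script_path, project_name="myproject"):
    """Guess the data directory from the location of the main script.

    Heuristic only: if a "data" directory sits next to the script's
    directory (as src/data sits next to src/bin), assume a source tree.
    Otherwise assume an installed layout, <prefix>/share/<project_name>.
    """
    script_dir = os.path.dirname(os.path.abspath(script_path))
    parent = os.path.dirname(script_dir)

    # Source tree case: src/bin/<script> with a sibling src/data directory.
    source_candidate = os.path.join(parent, "data")
    if os.path.isdir(source_candidate):
        return source_candidate

    # Installed case: /usr/local/bin/<script> -> /usr/local/share/<project_name>.
    return os.path.join(parent, "share", project_name)
```

The main script would call guess_data_dir(__file__). As noted, this breaks as soon as the script is copied somewhere else, since the guess depends entirely on where the file lives.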
 
Carl Banks

Javier said:
I've seen different approaches:
- distutils trick in setup.py to modify the installed script (i.e.
changing a global variable value) so that it has a reference to the
data files location.


One of my biggest complaints about distutils is that it doesn't do
this, a limitation made worse by the fact that distutils allows you to
specify an alternate data file directory, but your scripts have no way
to know where that alternate directory is. That really limits the
usefulness of the functionality.

The most common way I've seen people work around this issue is to
throw their data files into the package directories. Yuck.

At one point I hacked up a setup.py file to look in the distutils data
structure and pull out the data install location, and wrote out a
trivial Python file listing that location. (The distutils build data
structure is helpfully returned by the setup function.) I never felt
good about it, and it wouldn't work if the install was done in steps
(such as build as user, install as superuser).

If you care to navigate the murky internals of distutils to do this in
a more elegant way, more power to you, and if so I'd recommend doing
it that way.

- Heuristic in the package code to detect when it's being executed
from the source tree and when it has been installed
- Just using an environment variable that the user must set according
to his needs

I guess I'd combine these two. Make a sensible guess about where the
data files are by checking out the environment, but if the data files
aren't there (such as if the user installs to a different data
location) then they are responsible for setting an env variable or
config option.
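
A minimal sketch of that combination, assuming a hypothetical environment variable MYPROJECT_DATA and a caller-supplied list of guessed locations (neither name comes from distutils or any other library):

```python
import os


def find_data_dir(candidates, environ=None, env_var="MYPROJECT_DATA"):
    """Locate the data directory by guessing, with an env-var escape hatch.

    If the user has set the (hypothetical) MYPROJECT_DATA variable, trust
    it. Otherwise return the first guessed candidate that exists, and
    fail with a message telling the user how to fix things.
    """
    environ = os.environ if environ is None else environ

    # An explicit setting from the user overrides any guess.
    override = environ.get(env_var)
    if override:
        return override

    # Sensible guesses, tried in order.
    for candidate in candidates:
        if os.path.isdir(candidate):
            return candidate

    raise RuntimeError(
        "data files not found; set %s to their location" % env_var
    )
```

The candidates list might hold the source-tree path first and then /usr/local/share/<project_name> and /usr/share/<project_name>, so users only need the variable when they install the data somewhere unusual.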


I guess that there are other options, for example, maybe using
buildout. What would you say is the best/most elegant option to
solve this problem?

Another option is not to use distutils at all. If you are writing an
application, I think that would be a good idea. I don't think
applications really need to be located in site-packages.

If you have any C-extensions it might complicate this.


Carl Banks
 
Robert Kern

One of my biggest complaints about distutils is that it doesn't do
this, a limitation made worse by the fact that distutils allows you to
specify an alternate data file directory, but your scripts have no way
to know where that alternate directory is. That really limits the
usefulness of the functionality.

The most common way I've seen people work around this issue is to
throw their data files into the package directories. Yuck.

Huh. I always found that to be a more elegant solution than hardcoding the data
location into the program at install-time.
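
To illustrate what keeping data inside the package directory buys you: the standard library's pkgutil.get_data can read a file relative to an importable package, wherever that package happens to live. This is only a sketch; the wrapper name read_package_data is made up, and the package and resource names in the usage note are placeholders.

```python
import pkgutil


def read_package_data(package, resource):
    """Read a data file stored inside an importable package.

    `resource` is a path relative to the package directory, using "/"
    as the separator. Works the same from a source tree or an installed
    copy, because the lookup goes through the import system.
    """
    data = pkgutil.get_data(package, resource)
    if data is None:
        # The package's loader does not support reading data files.
        raise IOError("cannot read %r from package %r" % (resource, package))
    return data  # raw bytes; decode or parse as needed
```

With the metadata shipped inside the package, something like read_package_data("<project_name>", "data/metadata.xml") would replace any install-time path bookkeeping, provided the package is importable.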

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
-- Umberto Eco
 
Lawrence D'Oliveiro

Robert said:
Huh. I always found that to be a more elegant solution than hardcoding the
data location into the program at install-time.

Think in terms of a portable OS, written to run on multiple architectures.
Read-only data is architecture-independent, while the code may not be;
that's why it pays to separate it from the code.

Otherwise you end up with a system that needs a reinstall every time the
hardware changes.
 
