How to refer to data files without hardcoding paths?

Discussion in 'Python' started by Matthew Wilson, Sep 6, 2009.

  1. When a python package includes data files like templates or images,
    what is the orthodox way of referring to these in code?

    I'm working on an application installable through the Python package
    index. Most of the app is just python code, but I use a few jinja2
    templates. Today I realized that I'm hardcoding paths in my app. They
    are relative paths based on os.getcwd(), but at some point, when I'm
    running scripts that use this code, these open(...) calls will fail.

    I found several posts that talk about using __file__ and then walking
    to nearby directories.

    I also came across pkg_resources, and that seems to work, but I don't
    think I understand it all yet.

    Matt
    Matthew Wilson, Sep 6, 2009
    #1

  2. Dave Angel

    Ben Finney wrote:
    > Matthew Wilson writes:
    >
    >> Today I realized that I'm hardcoding paths in my app. They are
    >> relative paths based on os.getcwd(), but at some point, when I'm
    >> running scripts that use this code, these open(...) calls will fail.
    >
    > The conventional solution to this is:
    >
    > * Read configuration settings, whether directory paths or anything else,
    > from a configuration file of declarative options.
    >
    > * Have the program read that configuration file from one location (or a
    > small number of locations), and make those locations well-known in the
    > documentation of the program.
    >
    > Python's standard library has the ‘configparser’ module, which is one
    > possible implementation of this.
    >
    >

    Before you can decide what libraries to use, you need to determine your
    goal. Usually, you can separate the data files your application uses
    into two groups. One is the read-only files. Those ship with the
    application, and won't be edited after installation, or if they are,
    they would be deliberate changes by the administrator of the machine,
    not the individual user. Those should be located with the shipped .py
    and .pyc files.

    The other group (which might in turn be subdivided) is files that are
    either created by the application for configuration purposes (config
    files), or for the user (documents), or temp files (temp).

    The first files can/should be found by looking up the full path to a
    module at run time. Use the module's __file__ to get the full path, and
    os.path.dirname() to parse it.
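
    A minimal sketch of the __file__ approach, assuming a hypothetical
    layout in which a templates/ directory sits next to the module. The
    helper name data_path and the file names are made up for illustration:

```python
import os


def data_path(module_file, *parts):
    """Build an absolute path to a file shipped next to a module.

    module_file is the module's __file__; parts are path components
    relative to the directory that holds the module.
    """
    module_dir = os.path.dirname(os.path.abspath(module_file))
    return os.path.join(module_dir, *parts)


# Inside a real module you would normally pass __file__ itself:
#     template = data_path(__file__, "templates", "index.html")
# Here a hypothetical module path stands in for __file__.
example = data_path(os.path.join("myapp", "views.py"), "templates", "index.html")
```

    Because the path is anchored to the module rather than to
    os.getcwd(), it does not depend on where the program was started.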

    The second group of files can be located by various methods, such as
    using the HOMEPATH environment variable. But if there is more than one
    such location, one
    should generally create a config file first, and have it store the
    locations of the other files, after consulting with the end-user.

    Once you've thought about your goals, you should then look at supporting
    libraries to help with it. configparser is one such library, though both
    its name and specs have changed over the years.
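
    As a rough sketch of the config-file idea, assuming Python 3 (where
    the module is spelled configparser) and a hypothetical INI file named
    .myapp.ini with a [paths] section and a data_dir option. None of
    these names come from the thread:

```python
import configparser
import os


def read_data_dir(config_path, default_dir):
    """Return the data directory named in config_path, or default_dir.

    Assumes a hypothetical INI layout:
        [paths]
        data_dir = /some/where
    """
    parser = configparser.ConfigParser()
    parser.read(config_path)  # files that do not exist are skipped silently
    return parser.get("paths", "data_dir", fallback=default_dir)


# One well-known, documented location per user (hypothetical file name).
CONFIG_PATH = os.path.join(os.path.expanduser("~"), ".myapp.ini")
data_dir = read_data_dir(CONFIG_PATH, default_dir=os.path.expanduser("~"))
```

    os.path.expanduser("~") is used here instead of reading HOMEPATH
    directly, since HOMEPATH is only set on Windows.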

    DaveA
    Dave Angel, Sep 6, 2009
    #2

  3. Matthew Wilson wrote:
    > When a python package includes data files like templates or images,
    > what is the orthodox way of referring to these in code?
    >
    > I'm working on an application installable through the Python package
    > index. Most of the app is just python code, but I use a few jinja2
    > templates. Today I realized that I'm hardcoding paths in my app. They
    > are relative paths based on os.getcwd(), but at some point, when I'm
    > running scripts that use this code, these open(...) calls will fail.
    >
    > I found several posts that talk about using __file__ and then walking
    > to nearby directories.
    >
    > I also came across pkg_resources, and that seems to work, but I don't
    > think I understand it all yet.
    >
    > Matt
    >


    sys.path[0] should give you the path to your script. By reading the
    documentation I would say it would give the path to the first script
    passed to the interpreter at launch, but after using it I find it also
    gives the current script path inside an imported file. So I use it to
    group the script files in my application into subdirectories, and import
    them as necessary from there.

    My app works regardless of the current working directory, and can import
    scripts and load icons from its various subdirectories.

    Still, I would like to know why it works in imported scripts, since the
    doc page says sys.path[0] is the path to the script that caused the
    interpreter to launch. What would that mean?


    Timothy Madden
    Timothy Madden, Sep 6, 2009
    #3
  4. On Sun, 06 Sep 2009 10:44:38 -0300, Timothy Madden wrote:
    > Matthew Wilson wrote:


    >> When a python package includes data files like templates or images,
    >> what is the orthodox way of referring to these in code?
    >> I also came across pkg_resources, and that seems to work, but I don't
    >> think I understand it all yet.

    >
    > sys.path[0] should give you the path to your script. By reading the
    > documentation I would say it would give the path to the first script
    > passed to the interpreter at launch, but after using it I find it also
    > gives the current script path inside an imported file. So I use it to
    > group the script files in my application into subdirectories, and import
    > them as necessary from there.


    No, I think you got it wrong. sys.argv[0] is the name of the script being
    executed; you can get its full path using os.path.abspath(sys.argv[0]).
    sys.path[0] is the directory containing that script, but only as of the
    moment the program starts. Later, any module is free to add and remove
    entries from sys.path, so one should not rely on sys.path[0] still being
    that specific directory.
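
    A small sketch of that difference. The directory name
    some_plugin_dir is hypothetical and does not need to exist:

```python
import os
import sys


def main_script_dir():
    """Directory of the script named on the command line, from sys.argv[0]."""
    return os.path.dirname(os.path.abspath(sys.argv[0]))


before = main_script_dir()

# Any module may do something like this at import time.
sys.path.insert(0, os.path.abspath("some_plugin_dir"))  # hypothetical directory
changed_first = sys.path[0]  # no longer the script's directory

after = main_script_dir()    # unaffected by the sys.path change
sys.path.pop(0)              # undo the demonstration change
```

    Here before and after are equal, while sys.path[0] pointed at the
    inserted directory in between.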

    What you refer to as "script files" are actually modules, and they're
    imported, not executed. There is only one script being executed, the one
    named on the command line (either as `python scriptname.py` or
    `scriptname.py` or just `scriptname` or by double-clicking scriptname.py).

    > My app works regardless of the current working directory, and can import
    > scripts and load icons from its various subdirectories.
    >
    > Still, I would like to know why it works in imported scripts, since the
    > doc page says sys.path[0] is the path to the script that caused the
    > interpreter to launch. What would that mean?


    The script that is being executed, scriptname.py in the example above.
    Even if you later import module `foo` from package `bar`, sys.argv[0]
    doesn't change.

    To determine the directory containing the main script being executed, put
    these lines near the top of it:

    import os
    import sys
    # Directory containing the main script (the one named on the command line)
    main_directory = os.path.dirname(os.path.abspath(sys.argv[0]))

    You may locate other files relative to that directory. But that doesn't
    work if some components aren't actually on the filesystem (egg files,
    zipped libraries, or programs deployed using py2exe or similar). I prefer
    to use pkgutil.get_data(packagename, resourcename) because it can handle
    those cases too.
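
    A self-contained sketch of pkgutil.get_data, assuming Python 3. It
    builds a throwaway package in a temporary directory so it can run
    anywhere; the package name demo_data_pkg, the file templates/hello.txt
    and the helper names are all made up for the demonstration:

```python
import importlib
import os
import pkgutil
import sys
import tempfile

PACKAGE = "demo_data_pkg"  # hypothetical package name, created only for this demo


def build_demo_package(root):
    """Create a throwaway package with one template file under root."""
    template_dir = os.path.join(root, PACKAGE, "templates")
    os.makedirs(template_dir)
    # An empty __init__.py is enough to make the directory a package.
    open(os.path.join(root, PACKAGE, "__init__.py"), "w").close()
    with open(os.path.join(template_dir, "hello.txt"), "w", encoding="utf-8") as f:
        f.write("Hello, {{ name }}!")


def load_text(package, resource):
    """Read a package resource as text; resource uses '/' as separator."""
    data = pkgutil.get_data(package, resource)  # returns bytes, or None
    if data is None:
        raise FileNotFoundError(f"{package}: cannot load {resource}")
    return data.decode("utf-8")


with tempfile.TemporaryDirectory() as root:
    build_demo_package(root)
    sys.path.insert(0, root)       # make the demo package importable
    importlib.invalidate_caches()
    try:
        text = load_text(PACKAGE, "templates/hello.txt")
    finally:
        sys.path.remove(root)
        sys.modules.pop(PACKAGE, None)
```

    In an installed application the setup code disappears and only the
    call remains, for example pkgutil.get_data("myapp",
    "templates/hello.txt") with your own package name. The resource name
    is always written with forward slashes, whatever the platform.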

    --
    Gabriel Genellina
    Gabriel Genellina, Sep 8, 2009
    #4
  5. On Mon 07 Sep 2009 10:57:01 PM EDT, Gabriel Genellina wrote:
    > I prefer
    > to use pkgutil.get_data(packagename, resourcename) because it can handle
    > those cases too.


    I didn't know about pkgutil until now. I thought I had to use setuptools
    to do that kind of stuff. Thanks!

    Matt
    Matthew Wilson, Sep 9, 2009
    #5