ConfigParser shootout, preliminary entry

Michael Chermside

A few weeks ago, the suggestion was made on Python-Dev that it might be time
to consider replacing the ConfigParser module and that we should hold a
"shootout" (i.e., ask for implementations and see what we get).

Since then I've been playing around with this... not the parsing part (which
so far I have completely ignored) but the programmer interface. There needs
to be a well-thought-out data model for the information stored, and the user
interface needs to be very easy to use, yet not so "magical" that it becomes
difficult to understand.

I have put together what I think is probably my best proposal. It is based
on a superset of ini config files and java .property files. There is a
convenient access mechanism ("config.my_app.some_value") as well as more
general approaches ("config.values['my_app.serviceByPort.80']"). I have
tried to consider issues like unicode (I permit fairly lenient mixing of
unicode and str), and unit testing ("... call config.clear_all() in the
tearDown() method of any unittests that use the config module..."). I have
even considered carefully what to leave OUT (converting to non-string data
types, interpolating values, things like that).

I think that I am now at the point where I could really use some input from
others. So I'd like to invite people to review my design and send me your
suggestions. I'm not expecting this as a *useful* module yet (it doesn't
yet parse files!), but it seemed like a good stage at which to ask for
feedback. I'm attaching two files, config.py and configTest.py, and they
are also available from these urls:

http://www.mcherm.com/publish/2004-10-17/config.py
http://www.mcherm.com/publish/2004-10-17/configTest.py

Thanks in advance for reviewing this.

-- Michael Chermside
 
Tim Daneliuk

Michael said:
A few weeks ago, the suggestion was made on Python-Dev that it might be time
to consider replacing the ConfigParser module and that we should hold a
"shootout" (i.e., ask for implementations and see what we get).

Since then I've been playing around with this... not the parsing part (which
so far I have completely ignored) but the programmer interface. There needs
to be a well-thought-out data model for the information stored, and the user
interface needs to be very easy to use, yet not so "magical" that it becomes
difficult to understand.

I have put together what I think is probably my best proposal. It is based
on a superset of ini config files and java .property files. There is a
convenient access mechanism ("config.my_app.some_value") as well as more
general approaches ("config.values['my_app.serviceByPort.80']"). I have
tried to consider issues like unicode (I permit fairly lenient mixing of
unicode and str), and unit testing ("... call config.clear_all() in the
tearDown() method of any unittests that use the config module..."). I have
even considered carefully what to leave OUT (converting to non-string data
types, interpolating values, things like that).

I think that I am now at the point where I could really use some input from
others. So I'd like to invite people to review my design and send me your
suggestions. I'm not expecting this as a *useful* module yet (it doesn't
yet parse files!), but it seemed like a good stage at which to ask for
feedback. I'm attaching two files, config.py and configTest.py, and they
are also available from these urls:

http://www.mcherm.com/publish/2004-10-17/config.py
http://www.mcherm.com/publish/2004-10-17/configTest.py

Thanks in advance for reviewing this.

-- Michael Chermside

I doubt it conforms to your thinking on the matter, but some time ago I
wrote and released such a creature of my own invention:

https://www.tundraware.com/Software/tconfpy/
 
Istvan Albert

From the docs:
> "The config module can read config files in Microsoft's ini file format,
> java's properties file format, or its own python config format -- these
> can even be mixed."

To me this does not sound appealing. People might just end
up being confused about what the actual file format is.
All these formats are so simple that supporting them all
only makes the usage more complicated.
> "as long as the components of the path are valid Python identifiers,
> there is a more convenient attribute syntax available:"

This means that some features can only be used if the parameter
names are valid python identifiers, right? When I put it that way,
it is a bit less attractive.

Istvan.
 
Michael Foord

Config file reading is an area where python is 'well served' with
various options.
For sheer simplicity of use you can't beat my ConfigObj. It reads the
file and presents the values as a dictionary (keyed by keyword, of
course). It supports writing the file as well. The trouble with
using attribute names is that you will have problems with keywords
that are reserved - like 'print' and 'pass'.

Of course the obligatory URL
http://www.voidspace.org.uk/atlantibots/configobj.html

Regards,

Fuzzyman

http://www.voidspace.org.uk/atlantibots/pythonutils.html
 
Michael Foord

Istvan Albert said:
From the docs:


To me this does not sound appealing. People might just end
up being confused about what the actual file format is.
All these formats are so simple that supporting them all
only makes the usage more complicated.


I don't think this is a problem. ini file format is simple enough - if
people just want a simple config file format the following is
straightforward enough :

[section name]
keyword = value

However the fact that it can support alternative formats is a good
thing. The important thing then becomes how good is the documentation.
If you document the simple use case first with alternatives afterwards
it should be easy enough.
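
For reference, the simple format shown above is one the standard library can already read. This is only a minimal sketch, using the Python 3 module name `configparser` (the module was spelled `ConfigParser` when this thread was written); the section and keyword names are simply the ones from the example above.

```python
import configparser

# The simple ini example from above, held in a string so the
# sketch does not depend on any file on disk.
INI_TEXT = """
[section name]
keyword = value
"""

parser = configparser.ConfigParser()
parser.read_string(INI_TEXT)

# Sections behave like mappings; values always come back as strings.
value = parser["section name"]["keyword"]
print(value)
```
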

This means that some features can only be used if the parameter
names are valid python identifiers, right? When I put it that way,
it is a bit less attractive.

Istvan.

This is a valid point though. Dictionary syntax is good ;-)

Regards,

Fuzzy

http://www.voidspace.org.uk/atlantibots/pythonutils.html
 
Eric S. Johansson

Michael said:
Config file reading is an area where python is 'well served' with
various options.
For sheer simplicity of use you can't beat my ConfigObj. It reads the
file and presents the values as a dictionary (keyed by keyword, of
course). It supports writing the file as well. The trouble with
using attribute names is that you will have problems with keywords
that are reserved - like 'print' and 'pass'.

Well served, yes, but they are all basically the same. For camram, I
needed a different configuration file service, so I put a wrapper around
the stock configuration file utility.

Configuration data is a composite of three data sources: global, shadow,
and user. Global holds the system defaults, shadow contains local
system overrides, and user contains user-specific overrides. The global
configuration file contains the patterns describing the locations of the
shadow and user configuration files.
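
The layering described above can be sketched with the standard library's `collections.ChainMap`. This is not the camram code; the setting names and values below are made up for illustration, and in practice each dict would be loaded from its own configuration file.

```python
from collections import ChainMap

# Hypothetical settings for each layer (illustrative names and values).
global_defaults = {"spam_threshold": "5", "log_level": "info"}
shadow_overrides = {"log_level": "warning"}
user_overrides = {"spam_threshold": "8"}

# ChainMap searches its mappings left to right, so the most specific
# layer (user) goes first and the system defaults go last.
settings = ChainMap(user_overrides, shadow_overrides, global_defaults)

print(settings["spam_threshold"])  # found in the user layer
print(settings["log_level"])       # found in the shadow layer
```

A lookup falls through to the next layer only when the key is missing, so a layer overrides a value simply by defining it.
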

I was experimenting with some other features based on how people might
want to use an anti-spam system. I added the ability for the user to
delegate certain aspects of their profile to another user, or to merge
them with another user's.

I will admit it's a bit of a dog's breakfast because I've made so many
passes and tried out a bunch of different things over the past year.
One of these days I'll go back over it and simplify things.

---eric
 
Skip Montanaro

MC> A few weeks ago, the suggestion was made on Python-Dev that it might
MC> be time to consider replacing the ConfigParser module and that we
MC> should hold a "shootout" (i.e., ask for implementations and see what we
MC> get).

...

FM> Config file reading is an area where python is 'well served' with
FM> various options.

FM> For sheer simplicity of use you can't beat my ConfigObj.

Before we get too much stuff posted (which will just get lost), might I
suggest a ConfigurationParserShootout page for the wiki?

Thx,

Skip
 
Dan Gass

Mike,

Ditto on "I got one too". I wrote a configuration module that is more
powerful than any I've seen to date, yet it has a simple interface and
simple configuration file syntax. In particular it handles arbitrary
levels of hierarchical organizations of settings quite nicely. Its
drawback (and the reason it would probably never make it into the
standard python distribution) is that it has a security hole. The
configuration file format is Python (the configuration module just
executes it rather than parses it).
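
To make the trade-off concrete, here is a rough sketch of the "execute rather than parse" approach. It is not Dan's module: the function name `load_python_config` and the settings are hypothetical, and the config text is held in a string rather than read from a file.

```python
# Hypothetical config contents; a real module would read this from disk.
CONFIG_SOURCE = """
max_rows = 100
servers = {"primary": {"port": 80}, "backup": {"port": 8080}}
"""


def load_python_config(source):
    """Execute config text as Python and return the names it defines."""
    namespace = {}
    # exec runs arbitrary code, which is the security hole described
    # above: only use this on files you fully trust.
    exec(source, namespace)
    # Drop the __builtins__ entry that exec adds to the namespace.
    return {k: v for k, v in namespace.items() if not k.startswith("__")}


settings = load_python_config(CONFIG_SOURCE)
print(settings["servers"]["backup"]["port"])
```

Nested dicts give hierarchical settings for free, but anything else that is valid Python in the file gets executed as well.
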

That said, you may want to look at it for ideas. I'd be happy to
participate in discussions.

Good Luck,
Dan Gass
 
Michael Chermside

Istvan comments:
This means that some features can only be used if the parameter
names are valid python identifiers, right? When I put it that way,
it is a bit less attractive.

I'm afraid that I can't take credit for the idea of using attribute
syntax for accessing config values... it was Guido's suggestion in
the first place. But I'm not interested in an "appeal to authority"
argument here -- I think I can provide a good argument for why the
attribute syntax is a good idea (so long as there is a different
syntax available also for use with possible non-identifiers).

Code that accesses configuration values usually uses a constant for
the "key". In other words, you often look up "config.get('my_app.maxRows')",
but rarely something like "config.get('my_app.%s' % some_var)".
Furthermore, the keys themselves are rarely determined by external
forces... the programmer is usually free to select whatever name she
likes. Because of this, it is (almost always) quite easy to ensure that
only valid identifiers are used. With little downside, the convenience
to the programmer of typing "config.my_app.maxRows" rather than
"config.values['my_app.maxRows']" or "config.get('my_app.maxRows')" is
worth considering.

Of course, there are still a few issues. One is explicitness... it's
unwise to have too much "magic". On the other hand, the more common
something becomes the more it becomes an idiom in its own right, and
having a somewhat "magical" syntax is more acceptable. For instance,
something like "dup_list = a_list[:]" looks like some kind of peculiar
smiley the first time any user sees it, but once you're used to it
you instantly recognize it as the Python idiom for making a copy.
(Although the new idiom: "dup_list = list(a_list)" is probably better.)
Features like logging and configuration which are ubiquitous are good
candidates for "convenience syntax".

And the other issue is that SOMETIMES one DOES use arbitrary strings
within config files. It isn't always a good idea to tell users to enter
something like "my_app.<name-of-the-server>.numConnections", but when
you DO something like this the server could be named "def", or "44832"
or even something truly dangerous like u"xp\u01a9" or "". So there
MUST be some OTHER means of accessing values in addition to the
attribute syntax so that arbitrary values can be passed.
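
As a rough illustration of offering both access paths, here is a sketch. It is not the attached config.py; the class name `Namespace` is hypothetical, and only the `values` dictionary and the dotted-key convention come from the description above.

```python
class Namespace:
    """Attribute-style view onto a flat dict of dotted keys (a sketch)."""

    def __init__(self, values, prefix=""):
        self.values = values    # flat dict: dotted key -> string value
        self._prefix = prefix   # path accumulated so far, e.g. "my_app."

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        if name.startswith("_"):
            raise AttributeError(name)
        key = self._prefix + name
        if key in self.values:
            return self.values[key]
        # Treat the name as an intermediate path component if some key
        # lies beneath it; otherwise raise a normal AttributeError.
        if any(k.startswith(key + ".") for k in self.values):
            return Namespace(self.values, key + ".")
        raise AttributeError(name)


config = Namespace({})
config.values["my_app.maxRows"] = "500"
config.values["my_app.def.numConnections"] = "4"  # "def" is a keyword

# Convenient syntax when every path component is a valid identifier.
print(config.my_app.maxRows)
# General syntax for arbitrary strings such as "def" or "44832".
print(config.values["my_app.def.numConnections"])
```

Writing `config.my_app.def.numConnections` would be a syntax error, which is exactly why the dictionary route has to exist alongside the attribute one.
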

Anyway... I certainly think that the matter is less than clear-cut, but
in my opinion, this is one of those areas where it's worth having a
little well-bounded "magic" to make the code that much more readable.

-- Michael Chermside
 
Michael Foord

Anyway... I certainly think that the matter is less than clear-cut, but
in my opinion, this is one of those areas where it's worth having a
little well-bounded "magic" to make the code that much more readable.

-- Michael Chermside

Of course the good thing about python is that doing both is easy.
Using __setattr__ and subclassing dict will provide both methods with
minimal fuss.

I subclass caselessDict to get a case insensitive dictionary access -
but have been thinking about adding attributes - with a little magic
to avoid overriding any existing methods/attributes. (Just do a
dir(config) on an empty config object and that is the list of reserved
names...)
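
A minimal sketch of that idea, assuming a plain dict rather than caselessDict; the names `AttrDict` and `_RESERVED` are made up for illustration and are not taken from ConfigObj.

```python
class AttrDict(dict):
    """dict whose keys can also be read and written as attributes."""

    def __getattr__(self, name):
        # Called only when normal lookup fails, so real dict methods
        # such as .keys() or .items() are never shadowed on read.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        # Refuse names that already exist on an empty instance.
        if name in _RESERVED:
            raise AttributeError("%r is a reserved name" % name)
        self[name] = value


# dir() on an empty instance gives the list of reserved names.
_RESERVED = frozenset(dir(AttrDict()))

config = AttrDict()
config.max_rows = "500"                   # attribute syntax
config["keys"] = "reachable by key only"  # reserved name, dict syntax
print(config.max_rows, config["max_rows"])
```

Reserved names remain usable as dictionary keys, so nothing is lost; they just cannot be reached through the attribute shortcut.
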

Regards,

Fuzzy

http://www.voidspace.org.uk/atlantibots/pythonutils.html
 
Sion Arrowsmith

Michael Chermside said:

One thing that's really obviously missing from this as an interface,
possibly because you're seeing it as a parsing issue, is how does
config know what file (or files) the data is in? That is, given:

config.my_app.some_value

how is it going to map between "my_app" and a file name? I ask
through bitter experience of having to work around wx's "wxConfig
knows best" to supply alternative and shared configurations.
 
Paramjit Oberoi

Since then I've been playing around with this... not the parsing part (which
so far I have completely ignored) but the programmer interface. There needs
to be a well-thought-out data model for the information stored, and the user
interface needs to be very easy to use, yet not so "magical" that it becomes
difficult to understand.

From the me-too department:

For me personally, the problems with ConfigParser were inconvenient value
access/update and lack of order-preservation by the INI parser/writer. I
have developed my own solution which:

* Is convenient: both attribute access as well as container syntax can be
used to access as well as update values:

    from cfgparse.iniparser import ini_namespace
    conf = ini_namespace()
    conf.user.name = 'Oberoi'        # update via attribute access
    print(conf['user']['name'])      # read back via container syntax: Oberoi

* Preserves order/indentation/spacing when the INI file is updated and
when the data is accessed.

* I have given some thought to the data model and interface question, but
probably not as much as you. The data model simply is that of arbitrarily
nested namespaces, each containing values (although each specific
implementation of the model can have its own restrictions - for example,
INI files don't allow top-level values). The abstract model/interface
allows conversion from one config format to another.

* I have some code written using the ConfigParser API which I didn't want
to have to update, so an API-compatible interface is also available which
implements the old interface, as well as allowing access to the new one.

The code can be found here:

http://www.cs.wisc.edu/~param/software/cfgparse/

-param
 
Carlos Ribeiro

Since then I've been playing around with this... not the parsing part (which
so far I have completely ignored) but the programmer interface. There needs
to be a well-thought-out data model for the information stored, and the user
interface needs to be very easy to use, yet not so "magical" that it becomes
difficult to understand.
From the me-too department:

For me personally, the problems with ConfigParser were inconvenient value
access/update and lack of order-preservation by the INI parser/writer. I
have developed my own solution which:

* Is convenient: both attribute access as well as container syntax can be
used to access as well as update values:

    from cfgparse.iniparser import ini_namespace
    conf = ini_namespace()
    conf.user.name = 'Oberoi'        # update via attribute access
    print(conf['user']['name'])      # read back via container syntax: Oberoi

* Preserves order/indentation/spacing when the INI file is updated and
when the data is accessed.

* I have given some thought to the data model and interface question, but
probably not as much as you. The data model simply is that of arbitrarily
nested namespaces, each containing values (although each specific
implementation of the model can have its own restrictions - for example,
INI files don't allow top-level values). The abstract model/interface
allows conversion from one config format to another.

* I have some code written using the ConfigParser API which I didn't want
to have to update, so an API-compatible interface is also available which
implements the old interface, as well as allowing access to the new one.

The code can be found here:

http://www.cs.wisc.edu/~param/software/cfgparse/

-param


--
Carlos Ribeiro
Consultoria em Projetos
blog: http://rascunhosrotos.blogspot.com
blog: http://pythonnotes.blogspot.com
mail: (e-mail address removed)
mail: (e-mail address removed)
 
WakeBdr

So, I've been working on this a little yesterday and today and I've got
something that I like so far. I just thought about a "feature" and I'd
like to get some opinions on it. Should the ConfigParser object be
able to read and store configs from multiple config files or should
each config file require a new ConfigParser object?

If the ConfigParser object could read and store multiple files, I see
referencing it like so

# Load the config files
cpObj.load(file0)
cpObj.load(file1)

# get values from them
ret0 = cpObj.file0.property0
ret1 = cpObj.file1.property0

where file0 and file1 can be whatever you would like to name them. I
ask this because I like to put a lot of information in config files
making my code as configurable as possible. Sometimes, managing the
config file can be as messy as managing the code itself. So, breaking
the config files up into several smaller files may be handy.
Any thoughts on this?
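
One way the multi-file idea might look, as a sketch only. The class name `MultiConfig` is hypothetical, and `load` here takes an explicit name plus the file's text (so the example runs without files on disk), which differs from the `cpObj.load(file0)` call written above. It uses the Python 3 module name `configparser`.

```python
import configparser
from types import SimpleNamespace


class MultiConfig:
    """Holds several parsed config sources, each under its own name."""

    def load(self, name, text):
        parser = configparser.ConfigParser()
        # Prepend a section header so plain "key = value" text is accepted.
        parser.read_string("[main]\n" + text)
        # Expose the parsed values as attributes under the chosen name.
        setattr(self, name, SimpleNamespace(**parser["main"]))


cp_obj = MultiConfig()
cp_obj.load("file0", "property0 = red\n")
cp_obj.load("file1", "property0 = blue\n")

# The same property name lives independently in each file's namespace.
print(cp_obj.file0.property0, cp_obj.file1.property0)
```

Keeping each file in its own namespace avoids silent clashes between files, at the cost of the caller needing to know which file a setting lives in.
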
 
WakeBdr

I've given this some thought yesterday and today and would like to pose
another question. Should the configparser be able to read and store
properties from multiple config files or should each config file
require a separate configparser?
 
