why use special config formats?

tomerfiliba

hey

i've been seeing lots of config-file-readers for python. be it
ConfigObj (http://www.voidspace.org.uk/python/configobj.html) or the
like. seems like a trend to me.
i came to this conclusion a long time ago: YOU DON'T NEED CONFIG FILES
FOR PYTHON. why re-invent stuff and parse text by yourself, when the
interpreter can do it for you? and anyway, i find this a very ugly
format:
http://www.voidspace.org.uk/python/configobj.html#the-config-file-format

there are two use cases for configuration: static vs. dynamic
configuration.

for the most common case, static configuration, you just have a
human-edited config file holding key-and-value pairs. so just add to
your package a file called config.py, and import it.

for example, if that's our package structure:
    PyApache/
        __init__.py
        config.py
        server.py

then server.py would do:
....
import config
listener_sock.bind((config.host, config.port))
....

and config.py would look like:
# the port to bind to
port = 80
host = "localhost"
timeout = 300
enable_keep_alives = False
options = [1, 2, 3]
....

isn't python suitable enough to hold your configuration?

the second case, dynamic configuration, is when you need to alter your
configuration at runtime or programmatically, so the configuration
doesn't need to be human-readable. for that case -- use pickle. and
Bunch (as shown on the aspn python cookbook)

class Bunch(object):
    def __init__(self, **kw):
        self.__dict__.update(kw)

create the initial config file:
import pickle
config = Bunch(port = 80, host = "localhost", timeout = 300, ...)
pickle.dump(config, open("config.pkl", "wb"))  # object first, then file

of course you can nest Bunch'es inside one another, i.e.,
config = Bunch(
    # global config
    port = 80,
    host = "localhost",

    # this is per-user configuration
    users = {
        "malcom_x" : Bunch(
            http_path = "/home/joe/httpdocs",
            cgi_path = "/home/joe/cgi-bin",
            options = ["i love lucy", "bush is gay"]
        ),
        ...
    },
    ...
)

and now you use it:
# global configuration
config = pickle.load(open("config.pkl", "rb"))  # pickles are read in binary mode
listener_sock.bind((config.host, config.port))
# and per-user configuration
from getpass import getuser
print(config.users[getuser()].http_path)
....

that way, if you need to programmatically change your configuration,
just change and pickle.dump() it.
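A minimal sketch of that load, change, dump cycle, written in Python 3 syntax. The temporary path and the three settings are placeholders chosen for illustration, not part of the original post.

```python
import os
import pickle
import tempfile


class Bunch(object):
    """Tiny attribute container, as in the recipe quoted above."""

    def __init__(self, **kw):
        self.__dict__.update(kw)


# hypothetical location; a real application would pick its own path
path = os.path.join(tempfile.mkdtemp(), "config.pkl")

# create the initial config file
config = Bunch(port=80, host="localhost", timeout=300)
with open(path, "wb") as f:
    pickle.dump(config, f)  # object first, file second

# later: load, change one setting programmatically, save again
with open(path, "rb") as f:
    config = pickle.load(f)
config.timeout = 60
with open(path, "wb") as f:
    pickle.dump(config, f)

# a fresh load sees the changed value
with open(path, "rb") as f:
    reloaded = pickle.load(f)
print(reloaded.host, reloaded.port, reloaded.timeout)
```

Note that unpickling needs the Bunch class to be importable under the same module name it had when the file was written.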

hope it helps,
-tomer
 
Sybren Stuvel

(e-mail address removed) enlightened us with:
> i came to this conclusion a long time ago: YOU DON'T NEED CONFIG
> FILES FOR PYTHON. why re-invent stuff and parse text by yourself,
> when the interpreter can do it for you?

Because you generally don't want to give the configuration file writer
full control over the Python virtual machine.

> for the most common case, static configuration, you just have a
> human-edited config file holding key-and-value pairs. so just add to
> your package a file called config.py, and import it.

Which only works if there is only one configuration file per
installation of your package, and is writable by the users that need
to configure it. For example, per-user database connection parameters
should be in $HOME/.programrc on UNIX systems. A program's preference
settings should be stored in a user-writable file too, preferably in
the user's homedir.
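One way to get that layering, sketched in Python 3 with the standard configparser module (the successor of the ConfigParser module named later in this thread). The file names, the [database] section, and its keys are assumptions made for illustration; temporary directories stand in for /etc and $HOME.

```python
import configparser
import os
import tempfile

# hypothetical file names; temp dirs stand in for /etc and $HOME
etc_dir = tempfile.mkdtemp()
home_dir = tempfile.mkdtemp()
system_rc = os.path.join(etc_dir, "programrc")
user_rc = os.path.join(home_dir, ".programrc")

with open(system_rc, "w") as f:
    f.write("[database]\nhost = db.example.com\nport = 5432\n")
with open(user_rc, "w") as f:
    f.write("[database]\nuser = joe\nport = 6543\n")

parser = configparser.ConfigParser()
# later files override earlier ones; files that do not exist are skipped
loaded = parser.read([system_rc, user_rc])

db_host = parser.get("database", "host")
db_port = parser.getint("database", "port")
db_user = parser.get("database", "user")
print(db_host, db_port, db_user)
```

The per-user file only needs to list the settings that differ from the system-wide ones.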

Sybren
 
tomerfiliba

if you are really so scared of letting others exploit your config
scripts, then use the second, pickled fashion. that way you can store
the file at $HOME/blah-config.pkl, and everybody's happy.

still, my point is we dont need special config mechanisms, since the
builtin ones, like object persistency (sp) or python scripts are good
enough, less buggy, and dont require you to learn thousands of config
formats.

and you can even edit pickled files by hand (protocol 0 i believe).
it's not that complicated.


-tomer
 
Steve Holden

> if you are really so scared of letting others exploit your config
> scripts, then use the second, pickled fashion. that way you can store
> the file at $HOME/blah-config.pkl, and everybody's happy.

Except the user who wants to edit the config file.

> still, my point is we dont need special config mechanisms, since the
> builtin ones, like object persistency (sp) or python scripts are good
> enough, less buggy, and dont require you to learn thousands of config
> formats.
>
> and you can even edit pickled files by hand (protocol 0 i believe).
> it's not that complicated.

Fine. Kindly write the "How to Edit Your Configuration" instructions for
naive users. I think you might find they object to such an obscure format.
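To see what a user would actually be editing, this short Python 3 sketch prints a protocol 0 pickle of a two-key dictionary. The dictionary contents are made up for illustration.

```python
import pickle

config = {"port": 80, "host": "localhost"}

# protocol 0 is the original, text-based pickle protocol
raw = pickle.dumps(config, protocol=0)
text = raw.decode("ascii")
print(text)

# the round trip works, but a hand edit must keep every opcode valid
restored = pickle.loads(raw)
```

The output is a stream of stack-machine opcodes with the values embedded between them, not a list of key = value lines.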

regards
Steve
 
Joel Hedlund

I agree with Steve and I agree with Sybren.

Also:
This is a Bad Idea, since you should never add more complexity than needed. Imports, computation, IO and so on are generally not needed for program configuration, so standard configfile syntax should therefore not allow it. Otherwise you may easily end up with hard-to-debug errors, or even worse - weird program behavior.
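A possible middle ground, not proposed by anyone in this thread, is to keep Python's literal syntax but refuse everything else. The sketch below (Python 3) assumes one `name = value` pair per line and uses ast.literal_eval, which accepts numbers, strings, lists, dicts and similar literals but rejects calls, imports and attribute access. The function name parse_literal_config is hypothetical.

```python
import ast


def parse_literal_config(text):
    """Parse `name = <Python literal>` lines without executing any code."""
    settings = {}
    for lineno, line in enumerate(text.splitlines(), 1):
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        name, sep, value = line.partition("=")
        if not sep:
            raise ValueError("line %d: expected 'name = value'" % lineno)
        try:
            # only literals are accepted; anything else raises
            settings[name.strip()] = ast.literal_eval(value.strip())
        except (ValueError, SyntaxError):
            raise ValueError("line %d: %r is not a plain literal"
                             % (lineno, value.strip()))
    return settings


sample = '''
# the port to bind to
port = 80
host = "localhost"
options = [1, 2, 3]
'''
settings = parse_literal_config(sample)
print(settings)
```

A line such as `port = __import__('os').getpid()` is rejected with a ValueError instead of being run.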

/Joel
 
Sybren Stuvel

(e-mail address removed) enlightened us with:
> if you are really so scared of letting others exploit your config
> scripts, then use the second, pickled fashion. that way you can
> store the file at $HOME/blah-config.pkl, and everybody's happy.

Ehm... and how is a user supposed to edit that? I wouldn't be happy...

> still, my point is we dont need special config mechanisms, since the
> builtin ones, like object persistency (sp) or python scripts are
> good enough, less buggy, and dont require you to learn thousands of
> config formats.

Oh, and the ConfigParser module requires you to learn *thousands* of
config formats. Right.

I think you need to get real.

Sybren
 
Fredrik Lundh

> isn't python suitable enough to hold your configuration?

that depends on the target application, and, more importantly, the
target audience and what kind of configuration they're expected to
do.

there's no "one rule to rule them all" for configuration issues.

(except, possibly, that zero configuration is often easier to use than
any configuration file format...)

</F>
 
tomerfiliba

i dont know about your experience with config files, but there are
thousands of formats. on the python side -- just in this conversation,
we mentioned ConfigObj, ConfigParser and the Config module i linked to.
when everybody writes his own config, you get loads of unique formats.

anyway, for all the cry-babies here that can't edit pickle files. okay
-- just load() them, change what you want, and dump() them. don't cry.

and if you insist, i'm sure there's a python serializer to
XML/SOAP/whatever other readable format. persistency is far better for
configuration than config files. they are limited, have weird syntaxes,
hard to extend, and are never generic enough. with my approach --
anything you can do in python, or anything you can pickle -- is
possible.

and for security issues -- usually config files are edited by admins,
so that's not a problem. and per-user config files (at $HOME), can
easily be achieved with execfile(). the point is NOT TO WRITE A PARSER
for every config file.
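execfile() no longer exists in Python 3; a rough equivalent is runpy.run_path, which executes a file and returns its globals. The sketch below assumes a hypothetical per-user file name and default values, with a temporary directory standing in for $HOME.

```python
import os
import runpy
import tempfile

# hypothetical per-user file; a temp dir stands in for $HOME
user_rc = os.path.join(tempfile.mkdtemp(), ".pyapache_rc.py")
with open(user_rc, "w") as f:
    f.write('port = 8080\nhost = "localhost"\ntimeout = 2 * 60\n')

# hypothetical built-in defaults
defaults = {"port": 80, "host": "0.0.0.0", "timeout": 300}

# run_path executes the file as ordinary Python and returns its globals,
# so whatever the file contains runs with the program's privileges
namespace = runpy.run_path(user_rc)

# keep only the names the program knows about
config = dict(defaults)
config.update((k, v) for k, v in namespace.items() if k in defaults)
print(config)
```

No parser is written here, but the file is code: the `2 * 60` expression is evaluated, and so would be anything else placed in it.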

you can easily crash your web server (or make it non functional) if you
pass an invalid port or host, or make it act weird by changing the
timeouts or paths... so yeah, if the admin writes a config script that
does os.system("rm -rf /"), well, too bad. but then the admin can do
stupid things at the shell level as well.

again -- the points are:
* python is readable and easy to write config files with
* usually admins change the configuration, and they have too much power
anyway
* if you worry about security/too much power, pickle your config
* if you need to edit your pickled config on a regular basis, serialize
it with some other textual serializer (xml, etc).
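As an illustration of the last bullet, here is a Python 3 sketch using the standard json module. JSON is a choice made for this example; the post only says "xml, etc". The settings shown are placeholders.

```python
import json

config = {
    "port": 80,
    "host": "localhost",
    "users": {
        "joe": {
            "http_path": "/home/joe/httpdocs",
            "cgi_path": "/home/joe/cgi-bin",
        },
    },
}

# dump to readable text that can be edited by hand...
text = json.dumps(config, indent=4, sort_keys=True)
print(text)

# ...and load it back; parsing JSON does not execute code from the file
restored = json.loads(text)
```

The trade-off is that only dicts, lists, strings, numbers, booleans and null survive the round trip, so Bunch objects would have to be stored as plain dictionaries.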

but inventing proprietary formats with unique syntaxes, and having to
write and debug parsers for them -- that's stupid. a configuration is
just a persistent state of your program. it shouldnt be any more
complex than that.

-tomer
 
Sybren Stuvel

(e-mail address removed) enlightened us with:
> i dont know about your experience with config files, but there
> are thousands of formats.

All the config files I needed were either very easy to learn, or well
documented in comments.

> on the python side -- just in this conversation, we mentioned
> ConfigObj, ConfigParser and the Config module i linked to. when
> everybody writes his own config, you get loads of unique formats.

Hence the Python modules.

> anyway, for all the cry-babies here that can't edit pickle files.
> okay -- just load() them, change what you want, and dump() them.
> don't cry.

You really need to get real here. Configuration files are for *users*,
not programmers. You can't expect a user to learn about Python in
general and about pickle in particular.

> and if you insist, i'm sure there's a python serializer to
> XML/SOAP/whatever other readable format.

Which then gives you another configuration format to learn...

> and for security issues -- usually config files are edited by
> admins, so that's not a problem.

You go explain that to someone who just wants to edit his mail
client's config file.

> and per-user config files (at $HOME), can easily be achieved with
> execfile().

Which is then totally insecure. An exploit can easily be made then -
just inject a rootkit downloading & starting script into someone's
email client configuration file and boom, computer is hacked.

> the point is NOT TO WRITE A PARSER for every config file.

Hence standard config file formats and parser modules.

> * usually admins change the configuration, and they have too much
> power anyway

Admins have too much power? Go get an education.

> * if you worry about security/too much power, pickle your config

Sure, and where would you keep your comments explaining the
configuration fields?

> but inventing proprietary formats with unique syntaxes, and having
> to write and debug parsers for them -- that's stupid.

Which is why there are standard modules for them.

Sybren
 
Fuzzyman

> if you are really so scared of letting others exploit your config
> scripts, then use the second, pickled fashion. that way you can store
> the file at $HOME/blah-config.pkl, and everybody's happy.
>
> still, my point is we dont need special config mechanisms, since the
> builtin ones, like object persistency (sp) or python scripts are good
> enough, less buggy, and dont require you to learn thousands of config
> formats.

Well... ConfigObj uses the same format as ConfigParser, which is the basic
ini style.

The message is that config files are for users, and so should be in a
format convenient for them - not for the machine.

Call your users cry-babies if you want, you won't have many...

> and you can even edit pickled files by hand (protocol 0 i believe).
> it's not that complicated.

If you're happy with a hardwired config file, you don't need a config
file at all.

Fuzzyman
http://www.voidspace.org.uk/python/index.shtml
 
Steven D'Aprano

> hey
>
> i've been seeing lots of config-file-readers for python. be it
> ConfigObj (http://www.voidspace.org.uk/python/configobj.html) or the
> like. seems like a trend to me.
> i came to this conclusion a long time ago: YOU DON'T NEED CONFIG FILES
> FOR PYTHON.

Of course you do.

Sometimes you have to be able to parse and understand existing config
files that have come from somewhere else. If your task is "read and parse
a .ini file", insisting the user re-writes their ini file as Python code
isn't helpful.

Separating code from data is always a good idea. I hope I don't need to
explain why. So you want config files, the only question is, what format
should they be in?

Sometimes it can be useful, especially for quick and dirty apps, to use a
Python module as a config file. But that's not a good idea for production
level code where end users are expected to edit the data:

# config.py
value = 2.5
colour = "blue"

The user edits value to 3.1, but accidentally puts in a typo "3,1".
Now when your application imports the config.py module, it silently
assigns the tuple (3, 1) to value, and your app dies an unpleasant death
somewhere a long way away. You have no idea why.
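The failure mode is easy to reproduce. In this Python 3 sketch, exec() on a string stands in for importing the edited config.py.

```python
# the line as the user typed it, with a comma instead of a decimal point
typo = "value = 3,1\n"

namespace = {}
exec(typo, namespace)  # stands in for `import config`
value = namespace["value"]

# no error is raised at load time; the name is simply bound to a tuple
print(type(value).__name__, value)
```

Nothing complains until some later arithmetic on `value` fails, far from the line that caused it.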

So you end up coding defensively to protect against user typos or
stupidity (and believe me, even if your users are technically minded IT
professionals, they will screw up your config files):

# config.py
import logger, sys
value = 2.5 # warning: value must be a float
colour = "blue" # warning: colour must be one of "red", "blue", "green"
                # warning: quotes are compulsory
try:
    colour = colour.strip()
except AttributeError:
    pass
if type(value) != float or value < 0.0:
    logger.log("Value is %s" % value)
    sys.stderr.write("Bad value, using default\n")
    value = 2.5
if colour not in ("blue", "red", "green"):
    logger.log("Colour is %s" % colour)
    sys.stderr.write("Bad colour, using default\n")
    colour = "bleu" # oops, a bug

and now your config file is code instead of data, and you expect your
users to hack code to change a default value. B--A--D idea.

Using a data config file means you can separate the user-editable data
from the code that verifies that it has sensible values. Your config file
becomes simple again:

# config.py
value = 2.5
colour = "blue"

and your users no longer have to wade through complex code to change a few
defaults, but you still have full flexibility to vet their data.
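A sketch of that separation in Python 3, assuming the data lives in an ini-style file with a hypothetical [main] section and is read with the standard configparser module. The names load_settings, DEFAULTS and ALLOWED_COLOURS are made up for this example.

```python
import configparser

DEFAULTS = {"value": 2.5, "colour": "blue"}
ALLOWED_COLOURS = ("red", "blue", "green")


def load_settings(text):
    """Read user data, vet it in application code, fall back to defaults."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    settings = dict(DEFAULTS)
    problems = []

    try:
        value = parser.getfloat("main", "value")
        if value < 0.0:
            raise ValueError("must not be negative")
        settings["value"] = value
    except (configparser.Error, ValueError) as exc:
        problems.append("value: %s" % exc)

    colour = parser.get("main", "colour", fallback=DEFAULTS["colour"]).strip()
    if colour in ALLOWED_COLOURS:
        settings["colour"] = colour
    else:
        problems.append("colour: %r is not one of %s"
                        % (colour, ALLOWED_COLOURS))
    return settings, problems


# the "3,1" typo is caught at load time and the default is kept
settings, problems = load_settings("[main]\nvalue = 3,1\ncolour = green\n")
print(settings, problems)
```

The checks live in the application, so the file the user edits stays a short list of values.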
> why re-invent stuff and parse text by yourself, when the
> interpreter can do it for you? and anyway, i find this a very ugly
> format:
> http://www.voidspace.org.uk/python/configobj.html#the-config-file-format

You are joking right? Pulling our legs?

Here is the config file format, according to the link you supply:

# comment line
# comment line
keyword = value # inline comment

Here is the equivalent written in pure Python:

# comment line
# comment line
keyword = value # inline comment


Why is the first uglier than the second?
 
Steven D'Aprano

> you can easily crash your web server (or make it non functional) if you
> pass an invalid port or host, or make it act weird by changing the
> timeouts or paths... so yeah, if the admin writes a config script that
> does os.system("rm -rf /"), well, too bad.

Not if the code is being run on YOUR webserver and the config file is
being edited on some compromised PC in Romania.

> again -- the points are:
> * python is readable and easy to write config files with
> * usually admins change the configuration, and they have too much power
> anyway

So why do you want to give them MORE power?

> * if you worry about security/too much power, pickle your config

Huh? You think a competent sys admin can't learn enough Python to hack
your pickled file?

Binary configs only keep out legitimate users who don't have the time or
ability to learn how to hack the binary format. Black hats and power users
will break your binary format and hack them anyway.

> * if you need to edit your pickled config on a regular basis, serialize
> it with some other textual serializer (xml, etc).

But you forget the most important point of all:

* keep your data separate from your code.

> but inventing proprietary formats with unique syntaxes, and having to
> write and debug parsers for them -- that's stupid. a configuration is
> just a persistent state of your program. it shouldnt be any more
> complex than that.

Exactly. And that's why we have two or three common config file formats,
such as xml, ini files, etc. Pick one of them and stick to it.
 
gangesmaster

> Huh? You think a competent sys admin can't learn enough Python to hack
> your pickled file?
>
> Binary configs only keep out legitimate users who don't have the time or
> ability to learn how to hack the binary format. Black hats and power users
> will break your binary format and hack them anyway.

then you dont know what pickle is. pickle code is NOT python bytecode.
it's a bytecode they made in order to represent objects. you cannot
"exploit" it in the sense of running arbitrary code, unless you find
a bug in the pickle module. and that's less likely than finding a bug
in the parser of the silly config file formats you use.

i'm not hiding the configuration in "binary files", that's not the
point. pickle is just more secure by definition.

aah. you all are too stupid.


-tomer
 
gangesmaster

> Why is the first uglier than the second?

YES THATS THE POINT. PYTHON CAN BE USED JUST LIKE A CONFIG FILE.

and if your users did
timeout = "300"
instead of
timeout = 300

then either your config parser must be uber-smart and all-knowing, and
check the types of key-value pairs, or your server would crash. either
way is bad, and i prefer crash-on-use than
know-everything-and-check-at-the-parser-level.



good night,
-tomer
 
Fredrik Lundh

gangesmaster said:
> then you dont know what pickle is. pickle code is NOT python bytecode.
> it's a bytecode they made in order to represent objects. you cannot
> "exploit" it in the sense of running arbitrary code

import pickle
# loading this pickle calls os.system("echo really?")
print(pickle.loads(b"cos\nsystem\np0\n(S'echo really?'\np1\ntp2\nRp3\n."))

</F>
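The same point in a harmless Python 3 form, followed by the restriction technique described in the pickle module documentation (overriding Unpickler.find_class). Payload, SafeUnpickler and safe_loads are hypothetical names, and sum() stands in for whatever call an attacker would choose.

```python
import io
import pickle


class Payload(object):
    """Hypothetical stand-in for a hostile config.pkl."""

    def __reduce__(self):
        # tells pickle: "to rebuild this object, call sum([1, 2, 3])";
        # an attacker would name os.system or similar instead
        return (sum, ([1, 2, 3],))


hostile = pickle.dumps(Payload())
result = pickle.loads(hostile)  # the call runs during loading
print(result)


class SafeUnpickler(pickle.Unpickler):
    """Refuses every global lookup, so only plain containers load."""

    def find_class(self, module, name):
        raise pickle.UnpicklingError(
            "global %s.%s is forbidden" % (module, name))


def safe_loads(data):
    return SafeUnpickler(io.BytesIO(data)).load()


# dicts, strings and ints need no globals, so they still load
plain = safe_loads(pickle.dumps({"port": 80, "host": "localhost"}))

# the hostile pickle is refused before anything is called
try:
    safe_loads(hostile)
    blocked = False
except pickle.UnpicklingError:
    blocked = True
print(plain, blocked)
```

With every global refused, Bunch instances can no longer be loaded either, so a restricted config would have to be built from plain dicts and lists.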
 
Sybren Stuvel

gangesmaster enlightened us with:
> YES THATS THE POINT. PYTHON CAN BE USED JUST LIKE A CONFIG FILE.

AND CAN ALSO BE MISUSED AND HARDER TO USE THAN A SIMPLE CONFIG FILE.
Get it into your thick head that you're plain wrong here.

Sybren
 
Steve Holden

gangesmaster said:
> then you dont know what pickle is. pickle code is NOT python bytecode.
> it's a bytecode they made in order to represent objects. you cannot
> "exploit" it in the sense of running arbitrary code, unless you find
> a bug in the pickle module. and that's less likely than finding a bug
> in the parser of the silly config file formats you use.
>
> i'm not hiding the configuration in "binary files", that's not the
> point. pickle is just more secure by definition.
>
> aah. you all are too stupid.

Great way to win an argument. Pity we aren't as intelligent as you ...

regards
Steve
 
Steven D'Aprano

> YES THATS THE POINT. PYTHON CAN BE USED JUST LIKE A CONFIG FILE.
>
> and if your users did
> timeout = "300"
> instead of
> timeout = 300
>
> then either your config parser must be uber-smart and all-knowing, and
> check the types of key-value pairs, or your server would crash. either
> way is bad, and i prefer crash-on-use than
> know-everything-and-check-at-the-parser-level.

Well, I think this puts a new light on the argument from Tomer: he'd
prefer his server to crash than to spend some time validating his data.
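Validating at load time does not need an all-knowing parser. A small table of expected types is enough, as in this Python 3 sketch; SCHEMA and validate are hypothetical names and the three settings are taken from the earlier example.

```python
# hypothetical schema: setting name -> converter for its expected type
SCHEMA = {"port": int, "host": str, "timeout": int}


def validate(raw):
    """Coerce each setting to its expected type, failing at load time."""
    clean = {}
    for name, convert in SCHEMA.items():
        if name not in raw:
            raise ValueError("missing setting: %s" % name)
        try:
            clean[name] = convert(raw[name])
        except (TypeError, ValueError):
            raise ValueError("%s: cannot use %r as %s"
                             % (name, raw[name], convert.__name__))
    return clean


# the quoted typo, timeout = "300", is coerced instead of crashing later
config = validate({"port": 80, "host": "localhost", "timeout": "300"})
print(config)
```

A value that cannot be converted, such as timeout = "soon", raises a ValueError naming the setting as soon as the configuration is loaded.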

Would you mind telling us what software you've been involved in writing,
so we know what software to avoid?
 
