My first Python program

S

Seebs

So, I'm new to Python, though I've got a bit of experience in a few other
languages. My overall impressions are pretty mixed, but overall positive;
it's a reasonably expressive language which has a good mix between staying
out of my way and taking care of stuff I don't want to waste attention on.

My first project was to replace a shell script with a Python script. The
context is a project ("pseudo") which does some really hairy stuff in C.
Part of what it does involves creating hundreds of stub functions. The
existing shell script did this successfully, but wasn't particularly
fast, and was basically unmaintainable. So I've redone it in Python.

The input is a list of C function declarations, such as:
int foo(char *s);
and the output is several files, which include lists of these functions,
declarations for wrappers for them, and so on. So that would produce
something to the effect of:
int foo(char *s) {
/* various magic */
int rc = -1;
/* stuff happens */
rc = wrap_foo(s);
/* more magic */
return rc;
}

Where it gets complicated is that there are, of course, special cases;
for instance, the wrapper for 'int open(char *path, int mode, int flags)' has
to know that the flags argument is conditional, and not always provided, so
it declares open as "int open(char *path, int mode, ...)", then extracts
flags using a va_list. Weird stuff ensues. It's a pretty weird hunk of
code.

The source in its current form:

http://github.com/wrpseudo/pseudo/blob/master/makewrappers

The underlying task is fairly ugly, and it's my first Python project,
so the result isn't particularly pretty, but I'd be interested in
feedback on it. However, I'm not at all sure whether it's appropriate for
this group to post 467 lines of code with no question beyond "how am
I screwing this up?"

At this point, it does everything I want it to do, so the question isn't
"how can I do this?", but "how should I have done this more idiomatically?"

There's a few quirks; one is that I have to run on whatever Python happens
to be installed on a variety of systems, from RHEL4 to Fedora 13 or so.
(It is, at least for now, completely unimportant whether I run on non-Linux
systems.) I can't rely on packages or extensions which aren't going to
be in the base Python install.

Apart from that... I'm interested in feedback. I'm not expecting that
this is good or idiomatic Python, but I'd like to learn to write Python
correctly and expressively, and there's nothing like criticism to improve
code. And if people would prefer that I post the code here, I could,
I just figured that it was a bit large.

-s
 
J

Jonas H.


Just a few pointers, looks quite good to me for a newbie :)

* Less action in __init__.
* Use `open` instead of `file` to open a file
* Have a look at context managers for file handling (avoids doing
error-prune stuff like __del__)
* Your `del` in line 464 is useless. A reference will be removed from
the object bound to the local variable 'source' anyway because of the
re-assignment.
* according to common Python style guides you should not use underscores
in class names.
* no need for 'r' in `open` calls ('r' is the default mode)
* `line == ''` can be written more pythonic as `not line`
* `str.{r,l,}strip` removes '\n\t\r ' by default, no need for an
argument here (line 440 for example)
* you might want to pre-compile regular expressions (`re.compile`)
* raising `Exception` rather than a subclass of it is uncommon.

Hope that helps :)

Jonas
 
C

Chris Rebert

So, I'm new to Python, though I've got a bit of experience in a few other
languages.  My overall impressions are pretty mixed, but overall positive;
it's a reasonably expressive language which has a good mix between staying
out of my way and taking care of stuff I don't want to waste attention on..

My first project was to replace a shell script with a Python script.
The source in its current form:

http://github.com/wrpseudo/pseudo/blob/master/makewrappers

The underlying task is fairly ugly, and it's my first Python project,
so the result isn't particularly pretty, but I'd be interested in
feedback on it.  However, I'm not at all sure whether it's appropriate for
this group to post 467 lines of code with no question beyond "how am
I screwing this up?"

At this point, it does everything I want it to do, so the question isn't
"how can I do this?", but "how should I have done this more idiomatically?"

1.
class SourceFile(object):
"A template for creating a source file"

Docstrings are typically triple-quoted, so ==>

class SourceFile(object):
"""A template for creating a source file"""

2.
self.f = file(path, 'r')
if not self.f:
return None

The "if" here is pointless; I'm reasonably sure files are always
considered boolean true.

3.
if not section in self.sections:

Python's grammar has "not in" as an "operator" for just such occasions ==>

if section not in self.sections:

4.
def emit(self, template, func = None):

PEP 8: "Don't use spaces around the '=' sign when used to indicate
[...] a default parameter value." ==>

def emit(self, template, func=None):


I stopped reading around line 272.

Cheers,
Chris
 
M

MRAB

So, I'm new to Python, though I've got a bit of experience in a few other
languages. My overall impressions are pretty mixed, but overall positive;
it's a reasonably expressive language which has a good mix between staying
out of my way and taking care of stuff I don't want to waste attention on.

My first project was to replace a shell script with a Python script. The
context is a project ("pseudo") which does some really hairy stuff in C.
Part of what it does involves creating hundreds of stub functions. The
existing shell script did this successfully, but wasn't particularly
fast, and was basically unmaintainable. So I've redone it in Python.

The input is a list of C function declarations, such as:
int foo(char *s);
and the output is several files, which include lists of these functions,
declarations for wrappers for them, and so on. So that would produce
something to the effect of:
int foo(char *s) {
/* various magic */
int rc = -1;
/* stuff happens */
rc = wrap_foo(s);
/* more magic */
return rc;
}

Where it gets complicated is that there are, of course, special cases;
for instance, the wrapper for 'int open(char *path, int mode, int flags)' has
to know that the flags argument is conditional, and not always provided, so
it declares open as "int open(char *path, int mode, ...)", then extracts
flags using a va_list. Weird stuff ensues. It's a pretty weird hunk of
code.

The source in its current form:

http://github.com/wrpseudo/pseudo/blob/master/makewrappers

The underlying task is fairly ugly, and it's my first Python project,
so the result isn't particularly pretty, but I'd be interested in
feedback on it. However, I'm not at all sure whether it's appropriate for
this group to post 467 lines of code with no question beyond "how am
I screwing this up?"

At this point, it does everything I want it to do, so the question isn't
"how can I do this?", but "how should I have done this more idiomatically?"

There's a few quirks; one is that I have to run on whatever Python happens
to be installed on a variety of systems, from RHEL4 to Fedora 13 or so.
(It is, at least for now, completely unimportant whether I run on non-Linux
systems.) I can't rely on packages or extensions which aren't going to
be in the base Python install.

Apart from that... I'm interested in feedback. I'm not expecting that
this is good or idiomatic Python, but I'd like to learn to write Python
correctly and expressively, and there's nothing like criticism to improve
code. And if people would prefer that I post the code here, I could,
I just figured that it was a bit large.
The code does require Python 2 and the use of except ... as ... requires
at least version 2.6.


Line 51

The __init__ method should always return None. There's no need to be
explicit about it, just use a plain "return".


Line 68

Instead of:

if not section in self.sections:

use:

if section not in self.sections:


Line 78

This:

file(self.path, 'w')

will never return None. If it can't open the file then it'll raise an
exception.

The error message says:

"Couldn't open %s to read a template."

but it's opening the file for writing.


Line 82

You can't really rely on the destructor __del__ being called.


Line 333

Shouldn't you be checking that the name of the attribute you're setting
doesn't clash with one of the existing attributes? Are you sure that a
dict wouldn't be a better idea?


Line 447

The form:

except ... as ...

is in Python versions >= 2.6, but not earlier.


Line 464

This use of del just deletes the name from the namespace and won't
necessarily call the __del__ method of the 'source' object. It's better
to rely on something more explicit like a 'close' method. (If you can't
be sure which version of Python it'll use then context managers are
probably out anyway!)
 
M

MRAB

Just a few pointers, looks quite good to me for a newbie :)

* Less action in __init__.
* Use `open` instead of `file` to open a file
* Have a look at context managers for file handling (avoids doing
error-prune stuff like __del__)
* Your `del` in line 464 is useless. A reference will be removed from
the object bound to the local variable 'source' anyway because of the
re-assignment.
* according to common Python style guides you should not use underscores
in class names.
* no need for 'r' in `open` calls ('r' is the default mode)
* `line == ''` can be written more pythonic as `not line`
* `str.{r,l,}strip` removes '\n\t\r ' by default, no need for an
argument here (line 440 for example)
* you might want to pre-compile regular expressions (`re.compile`)

The regex will be cached anyway by the re module so it's not a big
problem.
 
H

Hallvard B Furuseth

Seebs said:
self.f = file(path, 'r')
if not self.f:
return None

No. Failures tend to raise exceptions, not return error codes.
Except in os.path.exists() & co.

$ pythonTraceback (most recent call last):

So,
import errno
...
try:
self.f = file(path, 'r')
except IOError:
if e.errno != errno.ENOENT: raise # if you are picky
return None

Nitpicks:
if not section in self.sections:

if section not in self.sections:
list = map(lambda x: x.call(), self.args)
return ', '.join(list)

return ', '.join([x.call() for x in self.args])
self.type, self.name = None, None

Actually you can write self.type = self.name = None,
though assignment statements are more limited than in C.
(And I think they're assigned left-to-right.)
match = re.match('(.*)\(\*([a-zA-Z0-9_]*)\)\((.*)\)', text)

Make a habit of using r'' for strings with lots of backslashes,
like regexps.
 
H

Hallvard B Furuseth

I said:
except IOError:
if e.errno != errno.ENOENT: raise # if you are picky

Argh, I meant "except IOError, e:". That's for Python 2 but not
Python 3. "except IOError as e:" works on Python 2.6 and above.
 
S

Seebs

Just a few pointers, looks quite good to me for a newbie :)
Thanks!

* Less action in __init__.

I'm a bit curious about this. The __init__ functions in this are, at
least for now, pretty much doing only what's needed to create the objects
from their inputs.
* Use `open` instead of `file` to open a file
Done.

* Have a look at context managers for file handling (avoids doing
error-prune stuff like __del__)

Okay. I wasn't at all sure the __del__ was needed.
* Your `del` in line 464 is useless. A reference will be removed from
the object bound to the local variable 'source' anyway because of the
re-assignment.

Oh. D'oh. You can see the C programmer instinct there; that was the
sort of thing that would, in C, be:
for (i = 0; i < n; ++i)
free(array);

But of course, that doesn't work. I guess that leads to a general question:
Is it safe for me to assume that all my files will have been flushed and
closed? I'd normally assume this, but I seem to recall that not every
language makes those guarantees.
* according to common Python style guides you should not use underscores
in class names.

Makes sense. I was finding "CArgument" hard to read, and couldn't think
of a better name.
* no need for 'r' in `open` calls ('r' is the default mode)
'k.

* `line == ''` can be written more pythonic as `not line`
'k.

* `str.{r,l,}strip` removes '\n\t\r ' by default, no need for an
argument here (line 440 for example)

Ahh, good to know.
* you might want to pre-compile regular expressions (`re.compile`)

Thought about it, but decided that it was probably more complexity than I
need -- this is not a performance-critical thing. And even if it were, well,
I'm pretty sure it's I/O bound. (And on my netbook, the time to run this
is under .2 seconds in Python, compared to 15 seconds in shell, so...)
* raising `Exception` rather than a subclass of it is uncommon.

Okay. I did that as a quick fix when, finally having hit one of them,
I found out that 'raise "Error message"' didn't work. :) I'm a bit unsure
as to how to pick the right subclass, though.

-s
 
S

Seebs

2.
self.f = file(path, 'r')
if not self.f:
return None

The "if" here is pointless; I'm reasonably sure files are always
considered boolean true.

I actually seem to have done this wrong anyway -- I was thinking in
terms of the C-like idiom of returning NULL when a constructor-like thing
fails. This ought to have been a raise of some sort to prevent the caller
from getting an object that didn't work out.
Python's grammar has "not in" as an "operator" for just such occasions ==>

Ahh!

-s
 
S

Seebs

The code does require Python 2 and the use of except ... as ... requires
at least version 2.6.
Whoops.

Line 51
The __init__ method should always return None. There's no need to be
explicit about it, just use a plain "return".

The real issue here is that I was assuming that open('nonexistent') returned
None rather than raising an exception.
The error message says:
"Couldn't open %s to read a template."
but it's opening the file for writing.

Ahh, my famed attention to detail strikes again. :)
You can't really rely on the destructor __del__ being called.

Interesting. Do I just rely on files getting closed?
Shouldn't you be checking that the name of the attribute you're setting
doesn't clash with one of the existing attributes? Are you sure that a
dict wouldn't be a better idea?

Actually, not sure at all. There's a sort of gradual evolution going on
here, in that I was trying to avoid having a separate dict that I had to
populate with a bunch of stuff -- thus the switch to a hand-written
__getitem__. What I probably ought to do/have done is use a dict first,
and set things in the dict from here, then check the dict first, then
regular getattr, and so on.
Line 447

The form:

except ... as ...

is in Python versions >= 2.6, but not earlier.

Oops. Is there a way to interact with the caught exception in earlier
Python?
This use of del just deletes the name from the namespace and won't
necessarily call the __del__ method of the 'source' object. It's better
to rely on something more explicit like a 'close' method. (If you can't
be sure which version of Python it'll use then context managers are
probably out anyway!)

Okay, good to know. I'll fix that.

-s
 
S

Seebs

list = map(lambda x: x.call(), self.args)
return ', '.join(list)

return ', '.join([x.call() for x in self.args])

I think I wrote that before I found out about list comprehensions. How
new are list comprehensions?

I do like that, it's clearer.
Actually you can write self.type = self.name = None,
though assignment statements are more limited than in C.
(And I think they're assigned left-to-right.)
Okay.
match = re.match('(.*)\(\*([a-zA-Z0-9_]*)\)\((.*)\)', text)
Make a habit of using r'' for strings with lots of backslashes,
like regexps.

Hmm. There's an interesting question -- does this work as-is? I'm
assuming it must or it would have blown up on me pretty badly, so
presumably those backslashes are getting passed through untouched
already. But if that's just coincidence (they happen not to be valid
\-sequences), I should definitely fix that.

-s
 
C

Chris Rebert

On 2010-10-12, MRAB <[email protected]> wrote:


The real issue here is that I was assuming that open('nonexistent') returned
None rather than raising an exception.

For future reference, the significant majority of things in Python
raise exceptions upon encountering errors rather than returning error
values of some sort.
Aside from APIs which explicitly provide a parameter to be returned as
a default value in case of error (e.g. getattr(obj, name, default)),
the only common exception* I can come up with off the top of my head
is str.find()**, and even that has an exception-throwing cousin,
str.index().

Cheers,
Chris
 
E

Ethan Furman

Python 2.5.4 (r254:67916, Dec 23 2008, 15:10:54) [MSC v.1310 32 bit
(Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
--> a = 2
--> b = 7
--> c = 13
--> a = b = c = 'right to left'
--> a, b, c
('right to left', 'right to left', 'right to left')
match = re.match('(.*)\(\*([a-zA-Z0-9_]*)\)\((.*)\)', text)
Make a habit of using r'' for strings with lots of backslashes,
like regexps.

Hmm. There's an interesting question -- does this work as-is? I'm
assuming it must or it would have blown up on me pretty badly, so
presumably those backslashes are getting passed through untouched
already. But if that's just coincidence (they happen not to be valid
\-sequences), I should definitely fix that.

Unknown backslash sequences are passed through as-is.

~Ethan~
 
J

Jean-Michel Pichavant

Seebs said:
So, I'm new to Python, though I've got a bit of experience in a few other
languages. My overall impressions are pretty mixed, but overall positive;
it's a reasonably expressive language which has a good mix between staying
out of my way and taking care of stuff I don't want to waste attention on.

My first project was to replace a shell script with a Python script. The
context is a project ("pseudo") which does some really hairy stuff in C.
Part of what it does involves creating hundreds of stub functions. The
existing shell script did this successfully, but wasn't particularly
fast, and was basically unmaintainable. So I've redone it in Python.

The input is a list of C function declarations, such as:
int foo(char *s);
and the output is several files, which include lists of these functions,
declarations for wrappers for them, and so on. So that would produce
something to the effect of:
int foo(char *s) {
/* various magic */
int rc = -1;
/* stuff happens */
rc = wrap_foo(s);
/* more magic */
return rc;
}

Where it gets complicated is that there are, of course, special cases;
for instance, the wrapper for 'int open(char *path, int mode, int flags)' has
to know that the flags argument is conditional, and not always provided, so
it declares open as "int open(char *path, int mode, ...)", then extracts
flags using a va_list. Weird stuff ensues. It's a pretty weird hunk of
code.

The source in its current form:

http://github.com/wrpseudo/pseudo/blob/master/makewrappers

The underlying task is fairly ugly, and it's my first Python project,
so the result isn't particularly pretty, but I'd be interested in
feedback on it. However, I'm not at all sure whether it's appropriate for
this group to post 467 lines of code with no question beyond "how am
I screwing this up?"

At this point, it does everything I want it to do, so the question isn't
"how can I do this?", but "how should I have done this more idiomatically?"

There's a few quirks; one is that I have to run on whatever Python happens
to be installed on a variety of systems, from RHEL4 to Fedora 13 or so.
(It is, at least for now, completely unimportant whether I run on non-Linux
systems.) I can't rely on packages or extensions which aren't going to
be in the base Python install.

Apart from that... I'm interested in feedback. I'm not expecting that
this is good or idiomatic Python, but I'd like to learn to write Python
correctly and expressively, and there's nothing like criticism to improve
code. And if people would prefer that I post the code here, I could,
I just figured that it was a bit large.

-s
If you plan to code in python, then you may consider using a linter.
These a formidable tools to help you set your coding rules, and most of
these tools can detect most of the basics errors, or coding wrong habits.

If you wonder about some defects reported by such linters, you can then
ask in this list why something is not that good, because it may not be
always obvious.

'pylint' is one them, pretty effective.

JM
 
M

MRAB

For future reference, the significant majority of things in Python
raise exceptions upon encountering errors rather than returning error
values of some sort.
Aside from APIs which explicitly provide a parameter to be returned as
a default value in case of error (e.g. getattr(obj, name, default)),
the only common exception* I can come up with off the top of my head
is str.find()**, and even that has an exception-throwing cousin,
str.index().

Cheers,
Chris

The re.search and so forth return Match objects or None.
 
J

Jonas H.

Is it safe for me to assume that all my files will have been flushed and
closed? I'd normally assume this, but I seem to recall that not every
language makes those guarantees.

Not really. Files will be closed when the garbage collector collects the
file object, but you can't be sure the GC will run within the next N
seconds/instructions or something like that. So you should *always* make
sure to close files after using them. That's what context managers were
introduced for.

with open('foobar') as fileobject:
do_something_with(fileobject)

basically is equivalent to (simplified!)

fileobject = open('foobar')
try:
do_something_with(fileobject)
finally:
fileobject.close()

So you can sure `fileobject.close()` is called in *any* case.
Thought about it, but decided that it was probably more complexity than I
need -- this is not a performance-critical thing. And even if it were, well,
I'm pretty sure it's I/O bound. (And on my netbook, the time to run this
is under .2 seconds in Python, compared to 15 seconds in shell, so...)

Forget about my suggestion. As someone pointed out in a another post,
regular expressions are cached anyway.
I'm a bit unsure as to how to pick the right subclass, though.

There are a few pointers in the Python documentation on exceptions.

Jonas
 
C

Chris Torek

Okay. I did that as a quick fix when, finally having hit one of them,
I found out that 'raise "Error message"' didn't work. :) I'm a bit unsure
as to how to pick the right subclass, though.

For exceptions, you have two choices:

- pick some existing exception that seems to make sense, or
- define your own.

The obvious cases for the former are things like ValueError or
IndexError. Indeed, in many cases, you just let a work-step
raise these naturally:

def frobulate(self, x):
...
self.table[x] += ... # raises IndexError when x out of range
...

For the latter, make a class that inherits from Exception. In
a whole lot of cases a trivial/empty class suffices:

class SoLongAndThanksForAllTheFish(Exception):
pass

def ...:
...
if somecondition:
raise SoLongAndThanksForAllTheFish()

Since Exception provides a base __init__() function, you can
include a string:

raise SoLongAndThanksForAllTheFish('RIP DNA')

which becomes the .message field:
'RIP DNA'
 
C

Chris Torek

Not really. Files will be closed when the garbage collector collects the
file object, but you can't be sure the GC will run within the next N
seconds/instructions or something like that. So you should *always* make
sure to close files after using them. That's what context managers were
introduced for.

with open('foobar') as fileobject:
do_something_with(fileobject)

basically is equivalent to (simplified!)

fileobject = open('foobar')
try:
do_something_with(fileobject)
finally:
fileobject.close()

So you can sure `fileobject.close()` is called in *any* case.

Unfortunately "with" is newish and this code currently has to
support python 2.3 (if not even older versions).
 
S

Seebs

For future reference, the significant majority of things in Python
raise exceptions upon encountering errors rather than returning error
values of some sort.

Yes. I'm getting used to that -- it's a bit of a shift, because I'm
used to exceptions being *exceptional* -- as in, not a failure mode
you would expect to see happening. So for instance, I wouldn't expect
to get an exception for EOF, because that's not exceptional, that's
virtually guaranteed to happen whenever you interact with files. I am
gonna have to retrain a bit.
Aside from APIs which explicitly provide a parameter to be returned as
a default value in case of error (e.g. getattr(obj, name, default)),
the only common exception* I can come up with off the top of my head
is str.find()**, and even that has an exception-throwing cousin,
str.index().

Interesting! That may take me some getting used to.

-s
 
S

Seebs

If you wonder about some defects reported by such linters, you can then
ask in this list why something is not that good, because it may not be
always obvious.
'pylint' is one them, pretty effective.

Okay, several questions about stuff pylint found.

1. If I have a message that I wish to print, it is quite possible
that message + indentation exceeds 80 lines. What's the idiomatic way
to solve this? Do I just break the string up into parts, or do I just
accept that some lines are over 80 characters, or what?
2. It warns about **kw, both because the name is short and because of
the "** magic". I was previously advised that the name "kw" is canonical
for that usage, in which case, why am I getting linted at?
3. More generally, the usage of **kw is probably not really right, but I'm
not sure what to do.

The issue here is as follows:

In C, some functions have an optional argument with the curious trait that,
if present, it is always of the same type. e.g., open(2). In my wrappers,
I indicate this by declaring it as something like "...{mode_t mode}". The
intent is that in a declaration, this will turn into:
foo(blah blah blah, ... /* mode_t mode */)
so there's an indication of why there's a "..." there.
But C comments don't nest. And there's a couple of points where I want to
put the declaration in a comment. So in those cases, I need to do something
other than /*...*/:
/* foo(blah blah blah, ... mode_t mode) */

The problem: The determination of whether I wish to do this is at the level
of the "represent this function in the following way" (Function.comment()),
but the implementation is two layers down. So right now, I am passing
down 'comment = True' from the top level through, and I'm doing it using
a **kw keyword argument.

The rationale is that I could just use
def decl(comment = False):
...
but then the invocation:
func.decl(True)
would be utterly opaque. I want to name the thing I'm passing or otherwise
indicate what it is.

It could be that I am Doing This Wrong. Is there a good idiom for labeling
optional arguments, and passing them on (if they exist)?

-s
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,768
Messages
2,569,574
Members
45,051
Latest member
CarleyMcCr

Latest Threads

Top