Re: seeking to improve Python skills

Discussion in 'Python' started by Jean-Paul Calderone, Jan 23, 2009.

  1. On Fri, 23 Jan 2009 12:17:41 +0100, mk <> wrote:
    >Hello everyone,
    >
    >I wrote the following program mainly for educational purpose of improving my
    >Python programming skills -- would Pythonistas here please look at it and
    >point out areas that could use improvement?
    >
    >This thing analyzes the stored load average record file, calculates simple
    >moving average and produces Gnuplot program to plot the thing.
    >
    >Constructive criticism is welcome.
    >
    >#!/usr/bin/python
    >
    >import re
    >import sys
    >import tempfile
    >import subprocess
    >import os
    >
    >class MovingAvg(object):


    You should have a docstring here. It should describe the overall purpose
    of the MovingAvg class and it should probably document the meaning of each
    attribute instances of the class have.

    > def __init__(self, fname):
    > try:
    > self.fo = open(fname)
    > except IOError, e:
    > print "Problem with opening file:", e
    > sys.exit(4)


    This isn't very good. Your MovingAvg class is otherwise relatively general-
    purpose and re-usable, but this makes it unusable instead. You shouldn't
    use print to report errors in general code and you shouldn't turn specific
    exceptions into SystemExit. This kind of whole-program error handling
    belongs in a different layer if your code. Your moving average code should
    just focus on computing a moving average.

    Also, filenames are rather clunky. Another improvement here would be to
    pass in a file object which the calling code has already opened. Or, better
    still, just pass in an iterator of data. Do the file handling and the
    parsing in a different layer. Then you'll be able to re-use your moving
    average class with data files in a different format - all you'll have to do
    is write another parser for the new format.

    > self.reslist = []
    > self.smalist = []
    > self.maxval = 0
    >
    > def extrfromfile(self):


    There should be a docstring here describing what this method does.

    > vre = re.compile("(\d+-\d+-\d+) (\d\d:\d\d) (\d+\.\d+)")
    > for line in self.fo:
    > res = vre.search(line)
    > if res:
    > self.reslist.append({'day':res.group(1),\
    > 'time':res.group(2),\
    > 'val':float(res.group(3))})


    The trailing backslashes on the previous lines are superfluous. The code
    is just as valid without them.

    > def calc_sma(self, smalen=4):


    Another missing docstring. Make sure you also document the meaning of the
    parameter the method accepts.

    > if smalen == 0:
    > raise AssertionError, "Moving Average sample length
    >cannot be 0"
    > if not isinstance(smalen, int):
    > raise AssertionError, "Moving Average sample length
    >has to be integer"


    The conventional way to write the previous lines would be:

    assert smallen != 0, "Moving Average sample length cannot be 0"
    assert isinstance(smallen, int), "Moving Average Sample length has to be integer"

    However, I would just leave them out. If you document the meaning of the
    smalen parameter, then callers will know it can't be 0 and must be an int.

    > total = 0


    Superfluous line above.

    > total = sum( [ x['val'] for x in self.reslist[0:smalen] ] )
    > sma = total / smalen
    > smaidx = int(smalen/2)
    > self.smalist.append((self.reslist[0]['day'],\
    > self.reslist[0]['time'],\
    > self.reslist[0]['val'],\
    > self.reslist[0]['val']))


    Superfluous backslashes again.

    > for i in range(smalen, len(self.reslist)):
    > curval = self.reslist['val']
    > self.maxval = max(self.maxval, curval)
    > total += curval
    > total -= self.reslist[i - smalen]['val']
    > sma = total / smalen
    > smaidx += 1
    > self.smalist.append((self.reslist[smaidx]['day'],\
    > self.reslist[smaidx]['time'],\
    > self.reslist[smaidx]['val'],\
    > sma))


    And again.

    >
    > def return_results(self):


    Missing docstring.

    > return (self.reslist, self.smalist, self.maxval)


    Generally, I wonder whether MovingAvg should just be a function rather
    than a class. The only place it maintains state it does so confusingly.
    I would just make the input data a parameter to calc_sma and have it return
    the results it computes. Drop the rest of the class and make calc_sma a free
    function.

    >class GnuplotWrapper(object):


    Docstring.

    > def __init__(self, smalist, maxval, outfname = "calc.png",
    >graphlen=640, graphheight=480, gnuplot = '/usr/bin/gnuplot'):


    Again, the comment about filenames. You could just make outfname a file-like
    object and skip the naming.

    > self.outfname = outfname
    > self.smalist = smalist
    > self.gnuplot = gnuplot
    > self.gprog = None
    > self.gdata = None
    > self.graphlen = graphlen
    > self.graphheight = graphheight
    >
    > def _writefiles(self):


    Docstring.

    > self.gprog = tempfile.mkstemp()
    > self.gdata = tempfile.mkstemp()
    > self.gprogfile = open(self.gprog[1], 'wb')
    > self.gdatafile = open(self.gdata[1], 'wb')
    > labelnum = int(self.graphlen / 110)
    > labelstep = int(len(self.smalist) / labelnum)
    > labels = []
    > for i in range(0, len(self.smalist), labelstep):
    > labels.append("\"%s %s\" %d" % (self.smalist[0],
    >self.smalist[1], i))
    > labelstr = ", ".join(labels)
    >
    > self.gprogfile.write("""set terminal png size %d, %d
    >set style line 1 lt 1 lw 2
    >set style line 2 lt 2 lw 2
    >set output "%s"
    >set xtics(%s)
    >set yrange [0:%f]
    >set y2range [0:%f]
    >plot "%s" using 1 with lines ls 1 title "orig" axes x1y1, "%s" using 2 with
    >lines ls 2 title "Moving Average" axes x1y2
    >""" % (self.graphlen, self.graphheight, self.outfname, labelstr,
    >float(maxval), float(maxval),\


    Superfluous backslash.

    > self.gdata[1], self.gdata[1]) )
    > self.gprogfile.close()
    > for elem in self.smalist:
    > self.gdatafile.write("%f, %f\n" % (elem[2],
    >elem[3]))
    > self.gdatafile.close()
    >
    > def plot(self):


    Docstring.

    > self._writefiles()
    > gplot = subprocess.Popen(self.gnuplot + " " +
    >self.gprog[1],\
    > shell=True, stdout=subprocess.PIPE,
    >stderr=subprocess.PIPE)


    You should *avoid* using shell=True with subprocess.Popen. There's no
    reason to use it here, as far as I can tell.

    > print "Plotting data (%s)..." % self.outfname
    > so, se = gplot.communicate()
    > if se:
    > print "Gnuplot problem output:", se
    > os.remove(self.gprog[1])
    > os.remove(self.gdata[1])


    Reporting the gnuplot failure with print is not the best thing. Reporting
    errors to users should be done at a separate layer. This is your gnuplot
    wrapper - its job is to wrap gnuplot, not to talk to the user. If there's
    a problem, make it available via some documented API (for example, the plot
    method might raise an exception if there is a gnuplot problem). Handle the
    exception at a higher level in your application where you know it's correct
    to print things for the user to read.

    >
    >
    >if __name__ == "__main__":
    > try:
    > fname = sys.argv[1]
    > except IndexError:
    > print "Specify filename with data as first argument."
    > sys.exit(1)
    > except Exception, e:
    > print "Error:", e
    > sys.exit(2)


    What is the second exception handler doing? There is no expected other
    exception from the code being protected, so you should just get rid of
    this handler. If, through some crazy accident, another exception gets
    raised, Python will take care of reporting it to the user, and it will
    do so with far more information - information that will make your job of
    debugging the problem vastly easier.

    > values = MovingAvg(fname)
    > values.extrfromfile()
    > values.calc_sma(smalen = 10)
    > (reslist, smalist, maxval) = values.return_results()
    > gp = GnuplotWrapper(smalist, maxval)
    > gp.plot()
    >


    The most significant thing missing from this code is unit tests. Developing
    automated tests for your code will help you learn a lot.

    Jean-Paul
     
    Jean-Paul Calderone, Jan 23, 2009
    #1
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. vincent wehren
    Replies:
    0
    Views:
    392
    vincent wehren
    Oct 30, 2004
  2. Richard Blackwood

    Simulation Programming Skills and Python

    Richard Blackwood, Mar 6, 2006, in forum: Python
    Replies:
    4
    Views:
    458
    Mc Osten
    Mar 7, 2006
  3. Jeff Rush
    Replies:
    0
    Views:
    383
    Jeff Rush
    Mar 9, 2007
  4. mk
    Replies:
    0
    Views:
    310
  5. Zouplaz
    Replies:
    9
    Views:
    199
    Giles Bowkett
    Dec 14, 2006
Loading...

Share This Page