Re: Organisation of python classes and their methods

Discussion in 'Python' started by Paul Rubin, Nov 2, 2012.

  1. Paul Rubin

    Martin Hewitson writes:
    > So, is there a way to put these methods in their own files and have
    > them 'included' in the class somehow? ... Is there an official python
    > way to do this? I don't like having source files with 100's of lines
    > of code in, let alone 1000's.


    That code sounds kind of smelly... why are there so many methods per
    class?

    Python lets you inject new methods into existing classes (this is
    sometimes called duck punching) but I don't recommend doing this.

    A few hundred lines of code in a file is completely reasonable. Even a
    few thousand is ok. IME it starts getting unwieldy at 3000 or so.
     
    Paul Rubin, Nov 2, 2012
    #1

  2. On 02/11/2012 08:08, Martin Hewitson wrote:
    >
    > Even if one takes reasonable numbers: 20 methods, each method has 20 lines of documentation, then we immediately have 400 lines in the file before writing a line of code. It would seem much more natural to me to have these methods in their own file, grouped nicely in sub-directories. But it seems this is not the python way. Sigh.
    >
    > Thanks for your thoughts,
    >
    > Martin
    >


    20 lines of documentation per method? As far as I'm concerned that's
    not a smell, that's a stink.

    --
    Cheers.

    Mark Lawrence.
     
    Mark Lawrence, Nov 2, 2012
    #2

  3. On Fri, Nov 2, 2012 at 7:08 PM, Martin Hewitson wrote:
    >
    > On 2, Nov, 2012, at 08:38 AM, Paul Rubin wrote:
    >
    >> Martin Hewitson writes:
    >>> So, is there a way to put these methods in their own files and have
    >>> them 'included' in the class somehow? ... Is there an official python
    >>> way to do this? I don't like having source files with 100's of lines
    >>> of code in, let alone 1000's.

    >>
    >> That code sounds kind of smelly... why are there so many methods per
    >> class?

    >
    > Simply because there are many different ways to process the data. The class encapsulates the data, and the user can process the data in many ways. Of course, one could have classes which encapsulate the algorithms, as well as the data, but it also seems natural to me to have algorithms as methods which are part of the data class, so the user operates on the data using methods of that class.


    Do these really need to be methods, or ought they to be
    module-level functions? Remember, Python code doesn't have to be
    organized into classes the way Java code is.

    ChrisA
     
    Chris Angelico, Nov 2, 2012
    #3
  4. Am 02.11.2012 09:08, schrieb Martin Hewitson:
    > On 2, Nov, 2012, at 08:38 AM, Paul Rubin wrote:
    >> Martin Hewitson writes:
    >>> So, is there a way to put these methods in their own files and
    >>> have them 'included' in the class somehow? ... Is there an
    >>> official python way to do this? I don't like having source files
    >>> with 100's of lines of code in, let alone 1000's.

    >>
    >> That code sounds kind of smelly... why are there so many methods
    >> per class?

    >
    > Simply because there are many different ways to process the data. The
    > class encapsulates the data, and the user can process the data in
    > many ways. Of course, one could have classes which encapsulate the
    > algorithms, as well as the data, but it also seems natural to me to
    > have algorithms as methods which are part of the data class, so the
    > user operates on the data using methods of that class.


    This is largely a matter of taste and a question of circumstances, but
    I'd like to point out here that your "natural" is not universal. If you
    take a different approach, namely that a class should encapsulate in
    order to maintain its internal consistency but otherwise be as small as
    possible, then algorithms operating on some data are definitely not part
    of that data. The advantage is that the data class gets smaller, and in
    the algorithms you don't risk ruining the internal integrity of the used
    data.

    Further, encapsulating algorithms into classes is also not natural.
    Algorithms are often expressed much better as functions. Shoe-horning
    things into a class in the name of OOP is IMHO misguided.
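
    A minimal sketch of that split, assuming a hypothetical Samples class
    and two helper functions, mean() and rms() (none of these names come
    from the thread). The class only guards its own data; the algorithms
    are plain functions that take the data as an argument:

```python
import math


class Samples:
    """Small data class: stores values and guards its own consistency."""

    def __init__(self, values):
        values = list(values)
        if not values:
            raise ValueError("Samples needs at least one value")
        self._values = values

    @property
    def values(self):
        # Hand out a copy so callers cannot break the internal state.
        return list(self._values)


def mean(samples):
    """Arithmetic mean, written as a plain function rather than a method."""
    vals = samples.values
    return sum(vals) / len(vals)


def rms(samples):
    """Root mean square of the stored values."""
    vals = samples.values
    return math.sqrt(sum(v * v for v in vals) / len(vals))


data = Samples([3.0, 4.0])
print(mean(data))
print(rms(data))
```

    New algorithms can then be added in any module without touching the
    class itself.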

    Concerning mixins, you can put them into separate modules[1]. If it is
    clearly documented that class FooColourMixin handles the colour-related
    stuff for class Foo, and conversely that class Foo inherits FooShapeMixin
    and FooColourMixin that provide handling of shape and colour, then that
    is fine. It allows you to not only encapsulate things inside class Foo
    but to partition things inside Foo. Note that mixins are easier to write
    than in C++. If the mixin needs access to the derived class' function
    bar(), it just calls self.bar(). There is no type-casting or other magic
    involved. The same applies to data attributes (non-function attributes),
    basically all attributes are "virtual". The compile-time, static type
    checking of e.g. C++ doesn't exist.
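
    A rough sketch of the mixin layout described above. The class names
    follow the post; the attributes (width, height, colour) and method
    bodies are made up for illustration, and everything is shown in one
    file only so that it runs as-is. In practice each mixin could live
    in its own module:

```python
class FooShapeMixin:
    """Shape-related methods for Foo (could live in its own module)."""

    def area(self):
        # width and height are set by the class that uses this mixin.
        return self.width * self.height


class FooColourMixin:
    """Colour-related methods for Foo (could live in its own module)."""

    def describe(self):
        # bar() is defined on Foo, not here; it is looked up at call time.
        return "{} {}".format(self.colour, self.bar())


class Foo(FooShapeMixin, FooColourMixin):
    def __init__(self, width, height, colour):
        self.width = width
        self.height = height
        self.colour = colour

    def bar(self):
        return "foo with area {}".format(self.area())


foo = Foo(2, 3, "red")
print(foo.describe())
```

    Note that FooColourMixin never declares bar(); the call simply
    succeeds because the instance it runs on happens to provide it.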


    >> Python lets you inject new methods into existing classes (this is
    >> sometimes called duck punching) but I don't recommend doing this.

    >
    > Is there not a way just to declare the method in the class and put
    > the actual implementation in another file on the python path so that
    > it's picked up at run time?


    To answer your question, no, not directly. Neither is there a separation
    like in C++ between interface and implementation, nor is there something
    like in C# with partial classes. C++ interface/implementation separation
    is roughly provided by abstract base classes. C# partial classes are
    most closely emulated with mixins.
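
    As a hedged illustration of the abstract-base-class route, here is a
    small Python 3 sketch using the standard abc module. Filter and
    MovingAverage are hypothetical names chosen for this example, not
    something from the thread:

```python
from abc import ABC, abstractmethod


class Filter(ABC):
    """Interface only: states what every filter must offer."""

    @abstractmethod
    def apply(self, values):
        """Return a new list of filtered values."""


class MovingAverage(Filter):
    """One implementation; it could live in a separate module."""

    def __init__(self, width):
        self.width = width

    def apply(self, values):
        out = []
        for i in range(len(values) - self.width + 1):
            window = values[i:i + self.width]
            out.append(sum(window) / self.width)
        return out


smoothed = MovingAverage(2).apply([1.0, 3.0, 5.0])
print(smoothed)
```

    Trying to instantiate Filter directly raises TypeError, which is the
    closest Python comes to enforcing the interface.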

    That said, modifying classes is neither magic nor is it uncommon:

    class foo:
        pass

    # algo_x is a separate module that holds the implementation
    import algo_x
    foo.algo = algo_x.function

    Classes are not immutable; you can add and remove things just like you
    can do with objects.
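
    A self-contained variant of the same idea, for anyone who wants to
    try it without a separate module. Foo and total() are hypothetical
    names used only for this sketch:

```python
class Foo:
    def __init__(self, values):
        self.values = values


def total(self):
    """Plain function; 'self' is whatever instance it gets bound to."""
    return sum(self.values)


# Attach the function as a method after the class has been created.
Foo.total = total
foo = Foo([1, 2, 3])
result = foo.total()
print(result)

# Attributes can be removed from a class again, too.
del Foo.total
print(hasattr(foo, "total"))
```

    Existing instances see the change immediately, because method lookup
    goes through the class at call time.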


    BTW: If you told us which language(s) you have a background in, it could
    be easier to help you with identifying the idioms in that language that
    turn into misconceptions when applied to Python.

    Greetings!

    Uli

    [1] Actually, modules themselves provide the kind of separation that I
    think you are after. Don't always think "class" when it comes to
    encapsulation and modularization!
     
    Ulrich Eckhardt, Nov 2, 2012
    #4
  5. On Fri, 02 Nov 2012 08:40:06 +0000, Mark Lawrence wrote:

    > On 02/11/2012 08:08, Martin Hewitson wrote:
    >>
    >> Even if one takes reasonable numbers: 20 methods, each method has 20
    >> lines of documentation, then we immediately have 400 lines in the file
    >> before writing a line of code. It would seem much more natural to me to
    >> have these methods in their own file, grouped nicely in
    >> sub-directories. But it seems this is not the python way. Sigh.
    >>
    >> Thanks for your thoughts,
    >>
    >> Martin
    >>
    >>

    > 20 lines of documentation per method? As far as I'm concerned that's
    > not a smell, that's a stink.


    Depends on the method. For some, 20 lines is 18 lines too many. For
    others, that's 80 lines too few.



    --
    Steven
     
    Steven D'Aprano, Nov 2, 2012
    #5
  6. Peter Otten

    Martin Hewitson wrote:

    > On 2, Nov, 2012, at 09:40 AM, Mark Lawrence wrote:


    >> 20 lines of documentation per method? As far as I'm concerned that's not
    >> a smell, that's a stink.

    >
    > Wow, I don't think I've ever been criticised before for writing too much
    > documentation :)
    >
    > I guess we have different end users. This is not a set of classes for
    > other developers to use: it's a set of classes which creates a data
    > analysis environment for scientists to use. They are not programmers, and
    > expect the algorithms to be documented in detail.


    While I would never discourage thorough documentation you may be better off
    with smaller docstrings and the details in an external document. Python
    projects typically use rst-files processed by sphinx.

    http://pypi.python.org/pypi/Sphinx/
     
    Peter Otten, Nov 2, 2012
    #6
  7. On 02/11/2012 08:45, Martin Hewitson wrote:
    >
    > On 2, Nov, 2012, at 09:40 AM, Mark Lawrence wrote:
    >
    >> On 02/11/2012 08:08, Martin Hewitson wrote:
    >>>
    >>> Even if one takes reasonable numbers: 20 methods, each method has 20 lines of documentation, then we immediately have 400 lines in the file before writing a line of code. It would seem much more natural to me to have these methods in their own file, grouped nicely in sub-directories. But it seems this is not the python way. Sigh.
    >>>
    >>> Thanks for your thoughts,
    >>>
    >>> Martin
    >>>

    >>
    >> 20 lines of documentation per method? As far as I'm concerned that's not a smell, that's a stink.

    >
    > Wow, I don't think I've ever been criticised before for writing too much documentation :)
    >
    > I guess we have different end users. This is not a set of classes for other developers to use: it's a set of classes which creates a data analysis environment for scientists to use. They are not programmers, and expect the algorithms to be documented in detail.
    >
    > Martin
    >


    You've completely missed the point. 99% of the time if you can't write
    down what a method does in at most half a dozen lines, the method is
    screaming out to be refactored. Rightly or wrongly you've already
    rejected that option, although I suspect that rightly is nearer the mark
    in this case on the grounds that practicality beats purity.

    --
    Cheers.

    Mark Lawrence.
     
    Mark Lawrence, Nov 2, 2012
    #7
  8. Robert Kern

    On 11/2/12 10:21 AM, Peter Otten wrote:
    > Martin Hewitson wrote:
    >
    >> On 2, Nov, 2012, at 09:40 AM, Mark Lawrence wrote:

    >
    >>> 20 lines of documentation per method? As far as I'm concerned that's not
    >>> a smell, that's a stink.

    >>
    >> Wow, I don't think I've ever been criticised before for writing too much
    >> documentation :)
    >>
    >> I guess we have different end users. This is not a set of classes for
    >> other developers to use: it's a set of classes which creates a data
    >> analysis environment for scientists to use. They are not programmers, and
    >> expect the algorithms to be documented in detail.

    >
    > While I would never discourage thorough documentation you may be better off
    > with smaller docstrings and the details in an external document. Python
    > projects typically use rst-files processed by sphinx.
    >
    > http://pypi.python.org/pypi/Sphinx/


    In the science/math community, we tend to build the Sphinx API reference from
    the thorough, authoritative docstrings. We like having complete docstrings
    because we are frequently at the interactive prompt. We tend to have broad APIs,
    so having a single source of documentation and not repeating ourselves is important.

    http://docs.scipy.org/doc/numpy/reference/index.html
    http://docs.scipy.org/doc/scipy/reference/index.html
    http://www.sagemath.org/doc/reference/
    http://docs.sympy.org/0.7.2/modules/index.html
    http://scikit-learn.org/stable/modules/classes.html
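
    To illustrate the kind of docstring meant here, a small sketch in the
    section layout used by the NumPy and SciPy projects. The detrend()
    function is a made-up example, not taken from any of the libraries
    linked above:

```python
import inspect


def detrend(values):
    """Remove the mean from a sequence of numbers.

    Parameters
    ----------
    values : sequence of float
        Input samples.

    Returns
    -------
    list of float
        The samples with their mean subtracted.

    Examples
    --------
    >>> detrend([1.0, 2.0, 3.0])
    [-1.0, 0.0, 1.0]
    """
    mean = sum(values) / len(values)
    return [v - mean for v in values]


# help(detrend) at the interactive prompt shows this same text.
doc = inspect.getdoc(detrend)
print(doc.splitlines()[0])
```

    The same text serves the user at the prompt and can be picked up by
    Sphinx when building the API reference, so nothing is written twice.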

    --
    Robert Kern

    "I have come to believe that the whole world is an enigma, a harmless enigma
    that is made terrible by our own mad attempt to interpret it as though it had
    an underlying truth."
    -- Umberto Eco
     
    Robert Kern, Nov 2, 2012
    #8
  9. Robert Kern

    On 11/2/12 10:48 AM, Mark Lawrence wrote:
    > On 02/11/2012 08:45, Martin Hewitson wrote:
    >>
    >> On 2, Nov, 2012, at 09:40 AM, Mark Lawrence wrote:
    >>
    >>> On 02/11/2012 08:08, Martin Hewitson wrote:
    >>>>
    >>>> Even if one takes reasonable numbers: 20 methods, each method has 20 lines
    >>>> of documentation, then we immediately have 400 lines in the file before
    >>>> writing a line of code. It would seem much more natural to me to have these
    >>>> methods in their own file, grouped nicely in sub-directories. But it seems
    >>>> this is not the python way. Sigh.
    >>>>
    >>>> Thanks for your thoughts,
    >>>>
    >>>> Martin
    >>>>
    >>>
    >>> 20 lines of documentation per method? As far as I'm concerned that's not a
    >>> smell, that's a stink.

    >>
    >> Wow, I don't think I've ever been criticised before for writing too much
    >> documentation :)
    >>
    >> I guess we have different end users. This is not a set of classes for other
    >> developers to use: it's a set of classes which creates a data analysis
    >> environment for scientists to use. They are not programmers, and expect the
    >> algorithms to be documented in detail.
    >>
    >> Martin

    >
    > You've completely missed the point. 99% of the time if you can't write down
    > what a method does in at most half a dozen lines, the method is screaming out to
    > be refactored. Rightly or wrongly you've already rejected that option, although
    > I suspect that rightly is nearer the mark in this case on the grounds that
    > practicality beats purity.


    You've completely missed the context. These are not really complicated methods
    doing lots of things all at once such that they can be refactored to simpler methods.
    The docstrings are not just glorified comments for other developers reading the
    source code. They are the online documentation for non-programmer end-users who
    are using the interactive prompt as an interactive data analysis environment.
    Frequently, they not only have to describe what it's doing, but also introduce
    the whole concept of what it's doing, why you would want to do such a thing, and
    provide examples of its use. That's why they are so long. For example:

    http://docs.scipy.org/doc/numpy/reference/generated/numpy.fft.fft.html

    --
    Robert Kern

    "I have come to believe that the whole world is an enigma, a harmless enigma
    that is made terrible by our own mad attempt to interpret it as though it had
    an underlying truth."
    -- Umberto Eco
     
    Robert Kern, Nov 2, 2012
    #9
  10. On 02/11/2012 14:49, Martin Hewitson wrote:

    [Top posting fixed]

    >
    >>
    >>
    >> BTW: If you told us which language(s) you have a background in, it could be easier to help you with identifying the idioms in that language that turn into misconceptions when applied to Python.

    >
    >>
    >> Greetings!
    >>
    >> Uli
    >>
    >> [1] Actually, modules themselves provide the kind of separation that I think you are after. Don't always think "class" when it comes to encapsulation and modularization!

    >
    > I'm considering porting some MATLAB code to python to move away from commercial software. Python seemed like the right choice simply because of the wonderful numpy, scipy and matplotlib.
    >
    > So my project will build on these packages to provide some additional state and functionality.
    >
    > Cheers,
    >
    > Martin
    >


    Just in case you're not aware there are separate mailing lists for numpy
    and matplotlib, presumably scipy as well, should you have specific
    questions. Further, matplotlib is now at version 1.2rc3 with Python 3
    support, yippee. Combine that with the recently released Python 3.3 and
    it should make one hell of a tool kit :)

    --
    Cheers.

    Mark Lawrence.
     
    Mark Lawrence, Nov 2, 2012
    #10
  11. On Fri, 02 Nov 2012 09:08:07 +0100, Martin Hewitson wrote:

    > Even if one takes reasonable numbers: 20 methods, each method has 20
    > lines of documentation, then we immediately have 400 lines in the file
    > before writing a line of code. It would seem much more natural to me to
    > have these methods in their own file, grouped nicely in sub-directories.


    Ewww. Ewww ewww ewww ewww. Just reading about it makes me feel dirty.

    Seriously. That means any time you want to jump back and forth from one
    method to another method OF THE SAME CLASS, you have to change files.
    Yuck.

    --
    Steven
     
    Steven D'Aprano, Nov 3, 2012
    #11
