trouble subclassing str

Discussion in 'Python' started by Brent, Jun 23, 2005.

  1. Brent

    Brent Guest

    I'd like to subclass the built-in str type. For example:

    --

    class MyString(str):

        def __init__(self, txt, data):
            super(MyString,self).__init__(txt)
            self.data = data

    if __name__ == '__main__':

        s1 = MyString("some text", 100)

    --

    but I get the error:

    Traceback (most recent call last):
      File "MyString.py", line 27, in ?
        s1 = MyString("some text", 12)
    TypeError: str() takes at most 1 argument (2 given)

    I am using Python 2.3 on OS X. Ideas?
     
    Brent, Jun 23, 2005
    #1

  2. Brent

    Paul McGuire Guest

    My first thought is "make sure that subclassing str is really what you
    want to do." Here is a place where I have a subclass of str that
    really is a special kind of str:

    class PaddedStr(str):
        def __new__(cls,s,l,padc=' '):
            if l > len(s):
                s2 = "%s%s" % (s,padc*(l-len(s)))
                return str.__new__(str,s2)
            else:
                return str.__new__(str,s)

    print ">%s<" % PaddedStr("aaa",10)
    print ">%s<" % PaddedStr("aaa",8,".")


    (When subclassing str, you have to call str.__new__ from your
    subclass's __new__ method, since str's are immutable. Don't forget
    that __new__ requires a first parameter which is the input class. I
    think the rest of my example is pretty self-explanatory.)

    But if you are subclassing str just so that you can easily print your
    objects, look at implementing the __str__ instance method on your
    class. Reserve inheritance for true "is-a" relationships. Often,
    inheritance is misapplied when the designer really means "has-a" or
    "is-implemented-using-a", and in these cases, the supposed superclass
    is better referenced using a member variable, and delegating to it.

    -- Paul
     
    Paul McGuire, Jun 23, 2005
    #2

  3. On Thu, 23 Jun 2005 12:25:58 -0700, Paul McGuire wrote:

    > But if you are subclassing str just so that you can easily print your
    > objects, look at implementing the __str__ instance method on your
    > class. Reserve inheritance for true "is-a" relationships. Often,
    > inheritance is misapplied when the designer really means "has-a" or
    > "is-implemented-using-a", and in these cases, the supposed superclass
    > is better referenced using a member variable, and delegating to it.


    Since we've just been talking about buzzwords in another thread, and the
    difficulty self-taught folks have in knowing what they are, I don't
    suppose somebody would like to give a simple, practical example of what
    Paul means?

    I'm going to take a punt here and guess. Instead of creating a sub-class
    of str, Paul suggests you simply create a class:

    class MyClass:
        def __init__(self, value):
            # value is expected to be a string
            self.value = self.mangle(value)
        def mangle(self, s):
            # do work on s to make sure it looks the way you want it to look
            return "*** " + s + " ***"
        def __str__(self):
            return self.value

    (only with error checking etc for production code).

    Then you use it like this:

    py> myprintablestr = MyClass("Lovely Spam!")
    py> print myprintablestr
    *** Lovely Spam! ***

    Am I close?


    --
    Steven
     
    Steven D'Aprano, Jun 24, 2005
    #3
  4. Brent

    Donn Cave Guest

    Steven D'Aprano wrote:

    > On Thu, 23 Jun 2005 12:25:58 -0700, Paul McGuire wrote:
    >
    > > But if you are subclassing str just so that you can easily print your
    > > objects, look at implementing the __str__ instance method on your
    > > class. Reserve inheritance for true "is-a" relationships. Often,
    > > inheritance is misapplied when the designer really means "has-a" or
    > > "is-implemented-using-a", and in these cases, the supposed superclass
    > > is better referenced using a member variable, and delegating to it.
    >
    > Since we've just been talking about buzzwords in another thread, and the
    > difficulty self-taught folks have in knowing what they are, I don't
    > suppose somebody would like to give a simple, practical example of what
    > Paul means?
    >
    > I'm going to take a punt here and guess. Instead of creating a sub-class
    > of str, Paul suggests you simply create a class:
    >
    > class MyClass:
    >     def __init__(self, value):
    >         # value is expected to be a string
    >         self.value = self.mangle(value)
    >     def mangle(self, s):
    >         # do work on s to make sure it looks the way you want it to look
    >         return "*** " + s + " ***"
    >     def __str__(self):
    >         return self.value
    >
    > (only with error checking etc for production code).
    >
    > Then you use it like this:
    >
    > py> myprintablestr = MyClass("Lovely Spam!")
    > py> print myprintablestr
    > *** Lovely Spam! ***
    >
    > Am I close?


    That's how I read it, with "value" as the member variable
    that you delegate to.

    Left unexplained is ``true "is-a" relationships''. Sounds
    like an implicit contradiction -- you can't implement
    something that truly is something else. Without that, and
    maybe a more nuanced replacement for "is-implemented-using-a",
    I don't see how you could really be sure of the point.

    Donn Cave
     
    Donn Cave, Jun 24, 2005
    #4
  5. Brent

    John Machin Guest

    Brent wrote:
    > I'd like to subclass the built-in str type. For example:


    You'd like to build this weird-looking semi-mutable object as a
    perceived solution to what problem? Perhaps an alternative is a class of
    objects which have a "key" (your current string value) and some data
    attributes? Maybe simply a dict ... adict["some text"] = 100?

    > class MyString(str):
    >
    >     def __init__(self, txt, data):
    >         super(MyString,self).__init__(txt)
    >         self.data = data
    >
    > if __name__ == '__main__':
    >
    >     s1 = MyString("some text", 100)
    >
    >
    > but I get the error:
    >
    > Traceback (most recent call last):
    >   File "MyString.py", line 27, in ?
    >     s1 = MyString("some text", 12)
    > TypeError: str() takes at most 1 argument (2 given)
    >
    > I am using Python 2.3 on OS X. Ideas?
    >


    __init__ is not what you want.

    If you had done some basic debugging before posting (like putting a
    print statement in your __init__), you would have found out that it is
    not even being called.

    Suggestions:

    1. Read the manual section on __new__
    2. Read & run the following:

    class MyString(str):

        def __new__(cls, txt, data):
            print "MyString.__new__:"
            print "cls is", repr(cls)
            theboss = super(MyString, cls)
            print "theboss:", repr(theboss)
            new_instance = theboss.__new__(cls, txt)
            print "new_instance:", repr(new_instance)
            new_instance.data = data
            return new_instance

    if __name__ == '__main__':

        s1 = MyString("some text", 100)
        print "s1:", type(s1), repr(s1)
        print "s1.data:", s1.data

    3. Note, *if* you provide an __init__ method, it will be called
    [seemingly redundantly???] after __new__ has returned.
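
    A minimal sketch of that call order, written in Python 3 syntax as an
    assumption (the thread itself used Python 2.3). The `calls` list is a
    hypothetical helper added only to record which hook runs when; both
    hooks receive the same constructor arguments.

```python
calls = []  # hypothetical helper: records the order in which the hooks run


class MyString(str):
    def __new__(cls, txt, data):
        calls.append("__new__")
        # str is immutable, so the text has to be supplied at creation time.
        return super().__new__(cls, txt)

    def __init__(self, txt, data):
        calls.append("__init__")
        # The instance already exists here; only extra attributes are set.
        self.data = data


s1 = MyString("some text", 100)
print(calls)
print(s1, s1.data)
```

    So __init__ is not redundant as such: it simply cannot change the
    string value any more, only attach things to the finished instance.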

    HTH,
    John
     
    John Machin, Jun 24, 2005
    #5
  6. Brent

    Paul McGuire Guest

    Dang, that class should be:

    class PaddedStr(str):
        def __new__(cls,s,l,padc=' '):
            if l > len(s):
                s2 = "%s%s" % (s,padc*(l-len(s)))
                return str.__new__(cls,s2)
            else:
                return str.__new__(cls,s)
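
    To see why the first argument matters, here is a small sketch in
    Python 3 syntax (an assumption; the thread used Python 2.3). The class
    names WrongPadded and RightPadded are hypothetical, and str.ljust
    stands in for the manual padding above.

```python
class WrongPadded(str):
    def __new__(cls, s, width, padc=" "):
        # Passing str here builds a plain str, not a WrongPadded instance.
        return str.__new__(str, s.ljust(width, padc))


class RightPadded(str):
    def __new__(cls, s, width, padc=" "):
        # Passing cls makes the new object an instance of the subclass.
        return str.__new__(cls, s.ljust(width, padc))


a = WrongPadded("aaa", 8, ".")
b = RightPadded("aaa", 8, ".")
print(type(a).__name__, type(b).__name__)
print(">%s< >%s<" % (a, b))
```

    Both print the same padded text, which is why the mistake is easy to
    miss; only the type of the result differs.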

    -- Paul
     
    Paul McGuire, Jun 24, 2005
    #6
  7. Brent

    Kent Johnson Guest

    Donn Cave wrote:
    > Left unexplained is ``true "is-a" relationships''. Sounds
    > like an implicit contradiction -- you can't implement
    > something that truly is something else. Without that, and
    > maybe a more nuanced replacement for "is-implemented-using-a",
    > I don't see how you could really be sure of the point.


    Try this article for an explanation of is-a:
    http://www.objectmentor.com/resources/articles/lsp.pdf

    IMO Robert Martin explains what good OO design is better than anyone else. His book "Agile Software Development" is excellent.

    Kent
     
    Kent Johnson, Jun 24, 2005
    #7
  8. Brent

    Paul McGuire Guest

    In purely Python terms, there is a distinction in that one of these
    classes (PaddedStr) is immutable, while the other is not. Python only
    permits immutable objects to act as dictionary keys, so this would be
    one thing to differentiate these two approaches.
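
    A rough sketch of how the two approaches behave as dictionary keys,
    in Python 3 syntax (an assumption; the thread used Python 2.3).
    Strictly speaking the requirement is hashability: an instance of a
    plain class hashes by identity by default, so it can be stored as a
    key, but it is not found again by its text. str.ljust stands in for
    the manual padding, and `table` is a hypothetical name.

```python
class PaddedStr(str):
    def __new__(cls, s, width, padc=" "):
        return str.__new__(cls, s.ljust(width, padc))


class MyClass:
    def __init__(self, value):
        self.value = "*** " + value + " ***"

    def __str__(self):
        return self.value


# The str subclass hashes and compares like the plain string it holds.
table = {PaddedStr("aaa", 5, "."): "subclass key"}
print(table["aaa.."])

# The wrapper can be stored, but only the same object finds it again.
wrapper = MyClass("Lovely Spam!")
table[wrapper] = "wrapper key"
print("*** Lovely Spam! ***" in table)
print(wrapper in table)
```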

    But on a more abstract, implementation-independent level, this is a
    distinction of inheritance vs. composition and delegation. Inheritance
    was one of the darling concepts in the early days of O-O programming,
    with promises of reusability and development speed. But before long,
    it turned out that inheritance comes with some unfriendly baggage -
    dependencies between subclasses and superclasses made refactoring more
    difficult, and modifications to supertypes had unwanted effects on
    subclasses. Sometimes subclasses would use some backdoor knowledge of
    the supertype data, thereby limiting flexibility in the superclass -
    this phenomenon is often cited as "inheritance breaks encapsulation."

    One check for good inheritance design is the Liskov Substitution
    Principle (LSP) (Thanks for the Robert Martin link, Kent - you beat me
    to it). Borrowing from Wikipedia:
    "In general, the principle mandates that at all times objects from a
    class can be swapped with objects from an inheriting class, without the
    user noticing any other new behaviour. It has effects on the paradigms
    of design by contract, especially regarding to specification:
    - postconditions for methods in the subclass should be more strict than
    those in the superclass
    - preconditions for methods in the subclass should be less strict than
    those in the superclass
    - no new exceptions should be introduced in the subclass"
    (http://en.wikipedia.org/wiki/Liskov_substitution_principle)

    One thing I like about this concept is that it is fairly independent of
    language or implementation features. I get the feeling that many such
    rules/guidelines seem to be inspired by limitations or gimmicks that
    are found in programming language X (usually C++ or Java), and then
    mistakenly generalized to be universal O-O truths.

    Looking back to PaddedStr vs. MyString, you can see that PaddedStr will
    substitute for str, and for that matter, the MyString behavior that is
    given could be a reasonable subclass of str, although maybe better
    named StarredStr. But let's take a slightly different MyString, one
    like this, where we subclass str to represent a person's name:

    class Person(str):
        def __new__(cls,s,data):
            self = str.__new__(cls,s)
            self.age = data
            return self

    p = Person("Bob",10)
    print p,p.age

    This is handy enough for printing out a Person and getting their name.
    But consider a father and son, both named "Bob".

    p1 = Person("Bob",10)
    p2 = Person("Bob",35) # p1's dad, also named Bob
    print p1 == p2 # prints True, should it?
    print p1 is p2 # prints False
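
    For comparison, a "has-a" version of the same idea might look like
    the sketch below (Python 3 syntax, which is an assumption; the thread
    used Python 2.3). The name is held as an attribute rather than being
    the object itself, so two people named Bob no longer compare equal
    by accident.

```python
class Person:
    def __init__(self, name, age):
        self.name = name  # the string is a member, not the base class
        self.age = age

    def __str__(self):
        # Printing still shows the name, as in the str-based version.
        return self.name


p1 = Person("Bob", 10)
p2 = Person("Bob", 35)  # p1's dad, also named Bob
print(p1, p1.age)
print(p1 == p2)            # default equality compares identity
print(p1.name == p2.name)  # names can still be compared explicitly
```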


    Most often, I see "is-a" confused with "is-implemented-using-a". A
    developer decides that there is some benefit (reduced storage, perhaps)
    of modeling a zip code using an integer, and feels the need to define
    some class like:

    class ZipCode(int):
        def lookupState(self):
            ...

    But zip codes *aren't* integers, they just happen to be numeric - there
    is no sense in supporting zip code arithmetic, nor in using zip codes
    as slice indices, etc. And there are other warts, such as printing zip
    codes with leading zeroes (like they have in Maine).
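
    One possible composition-based alternative, sketched in Python 3
    syntax (an assumption; the thread used Python 2.3). The five-digit
    check and the `_digits` attribute are illustrative choices, and
    lookupState is left out because its body is elided above.

```python
class ZipCode:
    def __init__(self, digits):
        text = str(digits)
        # Illustrative validation: exactly five digit characters.
        if len(text) != 5 or not text.isdigit():
            raise ValueError("expected five digits, got %r" % (digits,))
        self._digits = text  # stored as text, so leading zeroes survive

    def __str__(self):
        return self._digits

    def __eq__(self, other):
        return isinstance(other, ZipCode) and self._digits == other._digits

    def __hash__(self):
        return hash(self._digits)


z = ZipCode("04101")
print(z)
print(z == ZipCode("04101"))
```

    Because nothing is inherited from int, arithmetic and slice indexing
    are simply unavailable, and the leading zero prints without any
    special formatting.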

    So when, about once a month, we see on c.l.py "I'm having trouble
    sub-classing <built-in class XYZ>," I can't help but wonder if telling
    the poster how to sub-class an XYZ is really doing the right thing.

    In this thread, the OP wanted to extend str with something that was
    constructable with two arguments, a string and an integer, as in s1 =
    MyString("some text", 100). I tried to propose a case that would be a
    good example of inheritance, where the integer would be used to define
    and/or constrain some str attribute. A *bad* example of inheritance
    would have been one where the 100 had some independent characteristic,
    like a font size, or an age value to be associated with a string that
    happens to contain a person's name. In fact, looking at the proposed
    MyClass, this seems to be the direction he was headed.

    When *should* you use inheritance? Well, for a while, there was such
    backlash that the response was "Never". Personally, I use inheritance
    in cases where I have adopted a design pattern that incorporates it,
    such as Strategy; otherwise, I tend not to use it. (For those of you
    who use my pyparsing package, it is loaded with the Strategy pattern.
    The base class ParserElement defines an abstract do-nothing parsing
    implementation, which is overridden in subclasses such as Literal,
    Word, and Group. All derived instances are treated like the base
    ParserElement, with each subclass providing its own specialized
    parseImpl or postParse behavior, so any subclass can be substituted for
    the base ParserElement, satisfying LSP.)
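
    A toy sketch of that arrangement, in Python 3 syntax (an assumption).
    Matcher, LiteralMatcher, WordMatcher and run are hypothetical names
    made up for illustration; this is not pyparsing's API, only the shape
    of a base class whose do-nothing hook is overridden by subclasses.

```python
class Matcher:
    """Base class: parse() delegates to an overridable parse_impl()."""

    def parse_impl(self, text, pos):
        # Do-nothing default: match the empty string at pos.
        return pos, ""

    def parse(self, text, pos=0):
        return self.parse_impl(text, pos)


class LiteralMatcher(Matcher):
    def __init__(self, literal):
        self.literal = literal

    def parse_impl(self, text, pos):
        if text.startswith(self.literal, pos):
            return pos + len(self.literal), self.literal
        raise ValueError("expected %r at position %d" % (self.literal, pos))


class WordMatcher(Matcher):
    def __init__(self, chars):
        self.chars = chars

    def parse_impl(self, text, pos):
        end = pos
        while end < len(text) and text[end] in self.chars:
            end += 1
        if end == pos:
            raise ValueError("expected a word at position %d" % pos)
        return end, text[pos:end]


def run(matchers, text):
    # The caller treats every element as a plain Matcher.
    pos, tokens = 0, []
    for matcher in matchers:
        pos, token = matcher.parse(text, pos)
        tokens.append(token)
    return tokens


tokens = run([LiteralMatcher("x="), WordMatcher("0123456789"), Matcher()], "x=42")
print(tokens)
```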

    I think the current conventional wisdom is "prefer composition over
    inheritance" - never say "never"! :)

    -- Paul
     
    Paul McGuire, Jun 24, 2005
    #8
  9. Brent

    Donn Cave Guest

    Paul McGuire wrote:
    [ ... lots of interesting discussion removed ... ]

    > Most often, I see "is-a" confused with "is-implemented-using-a". A
    > developer decides that there is some benefit (reduced storage, perhaps)
    > of modeling a zip code using an integer, and feels the need to define
    > some class like:
    >
    > class ZipCode(int):
    >     def lookupState(self):
    >         ...
    >
    > But zip codes *aren't* integers, they just happen to be numeric - there
    > is no sense in supporting zip code arithmetic, nor in using zip codes
    > as slice indices, etc. And there are other warts, such as printing zip
    > codes with leading zeroes (like they have in Maine).


    I agree, but I'm not sure how easily this kind of reasoning
    can be applied more generally to objects we write. Take for
    example an indexed data structure, that's generally similar
    to a dictionary but may compute some values. I think it's
    common practice in Python to implement this just as I'm sure
    you would propose, with composition. But is that because it
    fails your "is-a" test? What is-a dictionary, or is-not-a
    dictionary? If you ask me, there isn't any obvious principle,
    it's just a question of how we arrive at a sound implementation --
    and that almost always militates against inheritance, because
    of liabilities you mentioned elsewhere in your post, but in the
    end it depends on the details of the implementation.

    Donn Cave
     
    Donn Cave, Jun 24, 2005
    #9
  10. Brent

    Paul McGuire Guest

    Look at the related post, on keeping key-key pairs in a dictionary.
    Based on our discussion in this thread, I created a subclass of dict
    called SymmetricDict, that, when storing symDict["A"] = 1, implicitly
    saves the backward looking symDict[1] = "A".

    I chose to inherit from dict, in part just to see what it would look
    like. In doing so, SymmetricDict automagically gets methods such as
    keys(), values(), items(), has_key(), and support for len, "key in
    dict", etc. However, I think SymmetricDict breaks (or at least bends)
    LSP, in that there are some cases where SymmetricDict has some
    surprising non-dict behavior. For instance, if I do:

    d = dict()
    d["A"] = 1
    d["B"] = 1
    print d.keys()

    I get ["A", "B"]. But a SymmetricDict is rather strange.

    sd = SymmetricDict()
    sd["A"] = 1
    sd["B"] = 1
    print sd.keys()

    gives ["B",1]. The second assignment wiped out the association of "A"
    to 1. (This reminds me of some maddening O-O discussions I used to
    have at a former place of employment, in which one developer cited
    similar behavior for not having Square inherit from Rectangle - calling
    Square.setWidth() would have to implicitly call setHeight() and vice
    versa, in order to maintain its squarishness, and thereby broke Liskov.
    I withdrew from the debate, citing lack of context that would have
    helped resolve how things should go. At best, you can *probably* say
    that both inherit from Shape, and can be drawn, have an area, a
    bounding rectangle, etc., but that neither inherits from the other.
    Unless I'm mistaken, I think Robert Martin has some discussion on this
    example also.)
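
    The SymmetricDict code itself is not shown in this thread, so the
    following is only a hypothetical reconstruction (Python 3 syntax, an
    assumption) that reproduces the behavior described above: storing a
    pair also stores the reverse pair, and any older pairing involving
    either object is dropped.

```python
class SymmetricDict(dict):
    def __setitem__(self, key, value):
        # Remove any existing pairing that involves key or value,
        # so each object has at most one partner.
        for k in (key, value):
            if k in self:
                partner = dict.__getitem__(self, k)
                dict.__delitem__(self, k)
                if partner in self:
                    dict.__delitem__(self, partner)
        dict.__setitem__(self, key, value)
        dict.__setitem__(self, value, key)


sd = SymmetricDict()
sd["A"] = 1
sd["B"] = 1
print(sorted(sd.keys(), key=str))
```

    Note that in this sketch dict.update() and the dict constructor do
    not go through the overridden __setitem__, which is one more way a
    dict subclass can surprise code written for plain dicts.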

    So in sum, I'd say that I would be comfortable having SymmetricDict
    extend dict *in my own code*, but that such a beast probably should
    *not* be part of the standard Python distribution, in whose scope the
    non-dictishness of SymmetricDict cannot be predicted. (And maybe this
    gives us some clue about the difficulty of deciding what and what not
    to put in to the Python language and libs.)

    -- Paul
     
    Paul McGuire, Jun 24, 2005
    #10
  11. Brent

    Donn Cave Guest

    Paul McGuire wrote:
    ....
    > This reminds me of some maddening O-O discussions I used to
    > have at a former place of employment, in which one developer cited
    > similar behavior for not having Square inherit from Rectangle - calling
    > Square.setWidth() would have to implicitly call setHeight() and vice
    > versa, in order to maintain its squarishness, and thereby broke Liskov.
    > I withdrew from the debate, citing lack of context that would have
    > helped resolve how things should go. At best, you can *probably* say
    > that both inherit from Shape, and can be drawn, have an area, a
    > bounding rectangle, etc., but that neither inherits from the other.


    This Squares and Rectangles issue sounds debatable in a
    language like C++ or Java, where it's important because
    of subtype polymorphism. In Python, does it matter?
    As a user of Square, I'm not supposed to ask about its
    parentage, I just try to be clear what's expected of it.
    There's no static typing to notice whether Square is a
    subclass of Rectangle, and if it gets out that I tried
    to discover this issubclass() relationship, I'll get a
    lecture from folks on comp.lang.python who suspect I'm
    confused about polymorphism in Python.

    This is a good thing, because as you can see it relieves
    us of the need to debate abstract principles out of context.
    It doesn't change the real issues - Square is still a lot
    like Rectangle, it still has a couple of differences, and
    the difference could be a problem in some contexts designed
    for Rectangle - but no one can fix that. If you need Square,
    you'll implement it, and whether you choose to inherit from
    Rectangle is left as a matter of implementation convenience.
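
    A small sketch of the kind of context where the difference shows up,
    in Python 3 syntax (an assumption). The method names and the stretch
    function are hypothetical; stretch is written with Rectangle in mind
    and expects the height to stay put when the width changes.

```python
class Rectangle:
    def __init__(self, width, height):
        self.width = width
        self.height = height

    def set_width(self, width):
        self.width = width

    def set_height(self, height):
        self.height = height

    def area(self):
        return self.width * self.height


class Square(Rectangle):
    def __init__(self, side):
        Rectangle.__init__(self, side, side)

    def set_width(self, width):
        # Keep both sides equal to stay square.
        self.width = self.height = width

    def set_height(self, height):
        self.width = self.height = height


def stretch(shape):
    """Widen to 10 and report whether the height was left alone."""
    old_height = shape.height
    shape.set_width(10)
    return shape.height == old_height


print(stretch(Rectangle(2, 3)))
print(stretch(Square(3)))
```

    Nothing here depends on the issubclass() relationship: the surprise
    comes from what set_width() does, whichever way Square is built.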

    Donn Cave
     
    Donn Cave, Jun 24, 2005
    #11
  12. On 23 Jun 2005 21:27:20 -0700, "Paul McGuire" wrote:

    >Dang, that class should be:
    >
    >class PaddedStr(str):
    >    def __new__(cls,s,l,padc=' '):
    >        if l > len(s):
    >            s2 = "%s%s" % (s,padc*(l-len(s)))
    >            return str.__new__(cls,s2)
    >        else:
    >            return str.__new__(cls,s)
    >

    Or you could write

    >>> class PaddedStr2(str):
    ...     def __new__(cls,s,l,padc=' '):
    ...         return str.__new__(cls, s+padc*(l-len(s)))
    ...

    Which gives

    >>> print '>%s<' % PaddedStr2('xxx',5,'.')
    >xxx..<
    >>> print '>%s<' % PaddedStr2('xxx',3,'.')
    >xxx<
    >>> print '>%s<' % PaddedStr2('xxx',2,'.')
    >xxx<


    (Taking advantage of multipliers <=0 working like 0 for strings):

    >>> for i in xrange(-3,4): print '%2s: >%s<'% (i, 'xxx'+'.'*i)
    ...
    -3: >xxx<
    -2: >xxx<
    -1: >xxx<
     0: >xxx<
     1: >xxx.<
     2: >xxx..<
     3: >xxx...<

    Regards,
    Bengt Richter
     
    Bengt Richter, Jun 26, 2005
    #12
