My Experiences Subclassing String

Discussion in 'Python' started by Fuzzyman, Jun 7, 2004.

  1. Fuzzyman

    Fuzzyman Guest

    I recently went through a bit of a headache trying to subclass
    string.... This is because the string is immutable and uses the
    mysterious __new__ method rather than __init__ to 'create' a string.
    To those who are new to subclassing the built-in types, my experiences
    might prove helpful. Hopefully not too many inaccuracies :)

    I've just spent ages trying to subclass string.... and I'm very proud
    to say I finally managed it !

    The trouble is that the string type (str) is immutable - which means
    that the value has to be set when the instance is created, by the
    mysterious __new__ method rather than by __init__ !! :) You still
    following me.... ?

    SO :

    class newstring(str):
        def __init__(self, value, othervalue):
            str.__init__(self, value)
            self.othervalue = othervalue

    astring = newstring('hello', 'othervalue')

    fails miserably. This is because the __new__ method of str is
    called *before* the __init__ method.... and it says it's been given too
    many arguments. What the __new__ method does is actually return the new
    instance - for a string the __init__ method is just a dummy.
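A minimal sketch of that failure, assuming Python 3 (the thread itself dates from Python 2, where the error message is worded differently). The class name NewString is hypothetical and used only for illustration. Both constructor arguments reach str.__new__ before __init__ ever runs, so the call raises TypeError:

```python
class NewString(str):
    # Only __init__ is overridden, so the inherited str.__new__
    # still receives *both* constructor arguments.
    def __init__(self, value, othervalue):
        self.othervalue = othervalue

error = None
try:
    NewString('hello', 'othervalue')
except TypeError as exc:
    # str.__new__ rejects the call before __init__ is reached.
    error = exc
    print("TypeError:", exc)
```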

    The bit I couldn't get (and I didn't have access to a python manual at
    the time) - if the __new__ method is responsible for returning the new
    instance of the string, surely it wouldn't have a reference to self;
    since the 'self' wouldn't be created until after __new__ has been
    called......

    Actually that's wrong - so, a simple string type might look something
    like this :

    class newstring(str):
        def __new__(self, value):
            return str.__new__(self, value)
        def __init__(self, value):
            pass

    See how the __new__ method returns the instance and the __init__ is
    just a dummy.
    If we want to add the extra attribute we can do this :

    class newstring(str):
        def __new__(self, value, othervalue):
            return str.__new__(self, value)
        def __init__(self, value, othervalue):
            self.othervalue = othervalue

    The order of creation is that the __new__ method is called which
    returns the object *then* __init__ is called. Although the __new__
    method receives the 'othervalue' it is ignored - and __init__ uses it.
    In practice __new__ could probably do all of this - but I prefer to
    mess around with __new__ as little as possible ! I was just glad I got
    it working..... What it means is that I can create my own class of
    objects - that in most situations will behave like strings, but have
    their own attributes. The only restriction is that the string value is
    immutable and must be set when the object is created. See the
    excellent path module by Jason Orendorff for another example object
    that behaves like a string but also has other attributes - although it
    doesn't use the __new__ method; or the __init__ method I think.
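As a hedged illustration of the remark that __new__ could probably do all of this, here is a sketch, assuming Python 3, in which __new__ both creates the string and attaches the extra attribute, with no __init__ at all. The name TaggedStr is hypothetical.

```python
class TaggedStr(str):
    def __new__(cls, value, othervalue):
        # Create the immutable string part first...
        instance = super().__new__(cls, value)
        # ...then attach the extra attribute to the new instance.
        instance.othervalue = othervalue
        return instance

s = TaggedStr('hello', 'extra')
print(s, s.othervalue)
print(isinstance(s, str))
# Inherited string methods return plain str objects,
# so the extra attribute does not carry over to their results.
print(type(s.upper()) is str)
```

One caveat with either approach: results of inherited methods such as upper() are ordinary str objects, so the object only behaves like the subclass until a new string is derived from it.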

    Regards,

    Fuzzy

    Posted to Voidspace - Techie Blog :
    http://www.voidspace.org.uk/voidspace/index.shtml
    Experiences used in the python modules at :
    http://www.voidspace.org.uk/atlantibots/pythonutils.html
     
    Fuzzyman, Jun 7, 2004
    #1

  2. Paul McGuire

    Paul McGuire Guest

    "Fuzzyman" <> wrote in message
    news:...
    > I recently went through a bit of a headache trying to subclass
    > string.... This is because the string is immutable and uses the
    > mysterious __new__ method rather than __init__ to 'create' a string.
    > To those who are new to subclassing the built-in types, my experiences
    > might prove helpful. Hopefully not too many inaccuracies :)


    <snip>

    > The bit I couldn't get (and I didn't have access to a python manual at
    > the time) - if the __new__ method is responsible for returning the new
    > instance of the string, surely it wouldn't have a reference to self;
    > since the 'self' wouldn't be created until after __new__ has been
    > called......
    >
    > Actually that's wrong - so, a simple string type might look something
    > like this :
    >
    > class newstring(str):
    >     def __new__(self, value):
    >         return str.__new__(self, value)
    >     def __init__(self, value):
    >         pass
    >
    > See how the __new__ method returns the instance and the __init__ is
    > just a dummy.
    > If we want to add the extra attribute we can do this :
    >
    >
    > class newstring(str):
    >     def __new__(self, value, othervalue):
    >         return str.__new__(self, value)
    >     def __init__(self, value, othervalue):
    >         self.othervalue = othervalue
    >
    > The order of creation is that the __new__ method is called which
    > returns the object *then* __init__ is called. Although the __new__
    > method receives the 'othervalue' it is ignored - and __init__ uses it.

    <snip>

    Fuzzy -

    I recently went down this rabbit hole while trying to optimize Literal
    handling in pyparsing. You are close in your description, but there is one
    basic concept that I think still needs to be sorted out for you.

    Think of __new__ as a class-level factory method, not an instance method.
    That first argument that you passed to your example as 'self' is not the
    self instance, it is the class being new'ed. By luck, even though you
    called it 'self', you passed it to str.__new__ where the class argument is
    supposed to go, so everything still worked.
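A small sketch of this point, assuming Python 3. The class name Traced and the calls list are hypothetical helpers; they record what each method actually receives as its first argument.

```python
calls = []

class Traced(str):
    def __new__(cls, value):
        # cls is the class being instantiated; no instance exists yet.
        calls.append(("__new__", cls))
        return str.__new__(cls, value)

    def __init__(self, value):
        # self is the object __new__ returned, already holding its value.
        calls.append(("__init__", str(self)))

t = Traced("abc")
for name, first_arg in calls:
    print(name, repr(first_arg))
```

The first recorded argument is the class Traced itself, and the second is the finished instance, which is why naming the first parameter of __new__ 'self' is misleading even though it works.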

    The canonical/do-nothing __new__ method looks like this:

    class A(object):
        def __new__(cls,*args):
            return object.__new__(cls)

    There's nothing stopping you from looking at the args tuple to see if you
    want to do more than this, but in truth that's what __init__ is for.

    Here's a sample of using __new__ to return a different class of object,
    depending on the initialization arguments:

    class SpecialA(object):
        pass

    class A(object):
        def __new__(cls,*args):
            print cls,":",args
            if len(args)>0 and args[0]==2:
                return object.__new__(SpecialA)
            return object.__new__(cls)

    obj = A()
    print type(obj)
    obj = A(1)
    print type(obj)
    obj = A(1,"test")
    print type(obj)
    obj = A(2,"test")
    print type(obj)

    gives the following output:

    <class '__main__.A'> : ()
    <class '__main__.A'>
    <class '__main__.A'> : (1,)
    <class '__main__.A'>
    <class '__main__.A'> : (1, 'test')
    <class '__main__.A'>
    <class '__main__.A'> : (2, 'test')
    <class '__main__.SpecialA'>
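For readers on Python 3, a hedged adaptation of the same idea follows. The added __init__ is not part of the example above; it is included here only to show one consequence worth knowing: when __new__ returns an object that is not an instance of cls, __init__ is not called on it.

```python
class SpecialA:
    pass

class A:
    def __new__(cls, *args):
        # Return a different class of object when the first argument is 2.
        if args and args[0] == 2:
            return object.__new__(SpecialA)
        return object.__new__(cls)

    def __init__(self, *args):
        # Runs only when __new__ returned an instance of A.
        self.args = args

a = A(1, "test")
b = A(2, "test")
print(type(a).__name__, a.args)              # A (1, 'test')
print(type(b).__name__, hasattr(b, "args"))  # SpecialA False
```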


    HTH,
    -- Paul
     
    Paul McGuire, Jun 7, 2004
    #2

  3. Fuzzyman

    Fuzzyman Guest

    "Paul McGuire" <._bogus_.com> wrote in message news:<510xc.52456$>...
    [reluctant snip...]

    >
    > class SpecialA(object):
    >     pass
    >
    > class A(object):
    >     def __new__(cls,*args):
    >         print cls,":",args
    >         if len(args)>0 and args[0]==2:
    >             return object.__new__(SpecialA)
    >         return object.__new__(cls)
    >
    > obj = A()
    > print type(obj)
    > obj = A(1)
    > print type(obj)
    > obj = A(1,"test")
    > print type(obj)
    > obj = A(2,"test")
    > print type(obj)
    >
    > gives the following output:
    >
    > <class '__main__.A'> : ()
    > <class '__main__.A'>
    > <class '__main__.A'> : (1,)
    > <class '__main__.A'>
    > <class '__main__.A'> : (1, 'test')
    > <class '__main__.A'>
    > <class '__main__.A'> : (2, 'test')
    > <class '__main__.SpecialA'>
    >
    >
    > HTH,
    > -- Paul


    Thanks Paul, that was helpful and interesting.
    I've posted the following correction to my blog :

    Ok... so this is a correction to my post a couple of days ago about
    subclassing the built in types (in python).

    I *nearly* got it right. Because __new__ is the 'factory method' for
    creating new instances it is actually a static method and *doesn't*
    receive a reference to self... it receives a reference to the class
    as the first argument. By convention in python this is a variable
    named cls rather than self (which refers to the instance itself).
    What it means is that the example I gave *works* fine, but the
    terminology is slightly wrong...

    See the docs on the new style classes unifying types and classes. Also
    thanks to Paul McGuire on comp.lang.python for helping me with this.

    My example ought to read :

    class newstring(str):
        def __new__(cls, value, *args, **keywargs):
            return str.__new__(cls, value)
        def __init__(self, value, othervalue):
            self.othervalue = othervalue

    See how the __new__ method collects all the other arguments (using the
    *args and **keywargs collectors) but ignores them - they are rightly
    dealt with by __init__. You *could* examine these other arguments in
    __new__ and even return an object that is an instance of a different
    class depending on the parameters - see the example Paul gives...
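A short usage sketch of the corrected class, assuming Python 3. The class is renamed NewString here purely for illustration. Because the same positional and keyword arguments are passed to both __new__ and __init__, the extra value can be given either way.

```python
class NewString(str):
    def __new__(cls, value, *args, **kwargs):
        # Only the string value is needed to build the instance;
        # the remaining arguments are collected and ignored here.
        return str.__new__(cls, value)

    def __init__(self, value, othervalue):
        # __init__ receives the same arguments and uses the extra one.
        self.othervalue = othervalue

s1 = NewString('hello', 'positional')
s2 = NewString('hello', othervalue='keyword')
print(s1, s1.othervalue)
print(s2, s2.othervalue)
print(s1 == 'hello', s1 + ' world')
```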

    Get all that then ? :)
     
    Fuzzyman, Jun 8, 2004
    #3