subclassing str

Thomas Lotze

Hi,

first of all, cheers to everyone, this is my first clp posting.

For an application, I need a special string-like object. It has some
specific semantics and behaves almost like a string, the only difference
being that its string representation should be somewhat fancy. I want
to implement it as a subclass of str:

class SpecialString(str):
    whatever

Using instances

x = SpecialString('foo')
y = SpecialString(x)

I want to get this behaviour:

str(x) => '(foo)'
str(y) => '(foo)'

That is, I want to be able to make a SpecialString from anything that has
a string representation, but at the same time leave a SpecialString
untouched in the process. After all, it already is and gets formatted as a
SpecialString.

I tried the following:

class SpecialString(str):
    def __str__(self):
        return "(" + self + ")"

This makes for str(x) => '(foo)' but str(y) => '((foo))' - as expected.

Does this accumulation of parentheses happen in the string itself, or does
the formatting routine get called several times at each str() call? How do
I fix it, depending on the answer?

I tried reimplementing __init__() in order to treat the case that a
SpecialString is given to the constructor, but with little success. I
guess that something like

    def __init__(self, obj):
        if isinstance(obj, SpecialString):
            self = raw(obj)
        else:
            self = obj

is needed, with raw() giving me the string without added parentheses,
bypassing the string representation. If this approach is sensible, what
would raw() look like?

Or is it best not to mess with __str__() at all, and introduce a render()
method?
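
For comparison, the render() route might look like the sketch below. It assumes Python 3, and render() is just the hypothetical method name from the question above. Without a __str__ override, str() returns the plain contents, so passing a SpecialString to the constructor never accumulates parentheses:

```python
class SpecialString(str):
    def render(self):
        # The fancy form lives in its own method; str() stays untouched.
        return "(" + self + ")"

x = SpecialString("foo")
y = SpecialString(x)   # str(x) is plain 'foo', so y stores 'foo'

print(str(y))          # plain contents
print(y.render())      # fancy form
```

The trade-off is that callers have to ask for render() explicitly; print() and %s formatting show the plain string.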

Thanks & greetings,
Thomas
 
Russell Blau

Thomas Lotze said:
For an application, I need a special string-like object. It has some
specific semantics and behaves almost like a string, the only difference
being that its string representation should be somewhat fancy. I want
to implement it as a subclass of str: ....
That is, I want to be able to make a SpecialString from anything that has
a string representation, but at the same time leave a SpecialString
untouched in the process. After all, it already is and gets formatted as a
SpecialString.

I tried the following:

class SpecialString(str):
    def __str__(self):
        return "(" + self + ")"

This makes for str(x) => '(foo)' but str(y) => '((foo))' - as expected.

Does this accumulation of braces happen in the string itself, or does the
formatting routine get called several times at each str() call? How to fix
it, depending on the answer?

I tried reimplementing __init__() in order to treat the case that a
SpecialString is given to the constructor, but with little success.

I think you need to use __new__() instead of __init__(), like so:

class SpecialString(str):
    def __new__(cls, seq):
        if isinstance(seq, SpecialString):
            return str.__new__(cls, str(seq)[1:-1])
        else:
            return str.__new__(cls, seq)
    def __str__(self):
        return "(" + self + ")"

>>> x = SpecialString('foo')
>>> y = SpecialString(x)
>>> print(y)
(foo)

For more info, see http://www.python.org/2.2.1/descrintro.html#__new__
 
Duncan Booth

Thomas said:
Using instances

x = SpecialString('foo')
y = SpecialString(x)

I want to get this behaviour:

str(x) => '(foo)'
str(y) => '(foo)'

That is, I want to be able to make a SpecialString from anything that
has a string representation, but at the same time leave a
SpecialString untouched in the process. After all, it already is and
gets formatted as a SpecialString.

Try this:

class SpecialString(str):
    def __new__(cls, s):
        if isinstance(s, SpecialString):
            s = s._raw()
        return str.__new__(cls, s)
    def __str__(self):
        return "(" + self._raw() + ")"
    def _raw(self):
        return str.__str__(self)

>>> x = SpecialString('foo')
>>> y = SpecialString(x)
>>> print(y)
(foo)

You have to override __new__ because the default constructor for str calls
the __str__ method on its argument (which is where your extra parens appear).
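
This also answers the earlier question of where the parentheses accumulate. A minimal sketch, assuming Python 3 and the original class with only __str__ overridden: the extra pair ends up in the characters stored in y, because the constructor receives the output of x.__str__().

```python
class SpecialString(str):
    def __str__(self):
        return "(" + self + ")"

x = SpecialString("foo")
y = SpecialString(x)   # str.__new__ converts x via x.__str__()

raw_x = x[:]           # plain-str copy of the stored characters
raw_y = y[:]

print(len(x), repr(raw_x))   # 3 'foo'
print(len(y), repr(raw_y))   # 5 '(foo)'
print(str(y))                # ((foo))
```

So each str() call formats only once; the second pair of parentheses was already baked into y when it was constructed.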

Having added a method to bypass the formatting, it makes sense to use it
inside __str__ as well; otherwise the code is very sensitive to minor edits,
e.g. changing it to:
    return "(%s)" % self
would cause an infinite loop.
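
A small sketch of that failure mode, assuming Python 3, where the runaway recursion surfaces as a RecursionError rather than a hang. LoopingString is a hypothetical name used only for this illustration:

```python
class LoopingString(str):
    def __str__(self):
        # %s formatting calls str(self), which calls __str__ again.
        return "(%s)" % self

try:
    str(LoopingString("foo"))
except RecursionError as exc:
    outcome = type(exc).__name__
else:
    outcome = "no error"

print(outcome)
```

The concatenation form "(" + self + ")" does not have this problem, since + works on the stored characters and never calls __str__.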
 
Thomas Lotze

Russell Blau said:
I think you need to use __new__() instead of __init__(), like so:

Thanks, that did the trick.

Russell Blau said:
    def __new__(cls, seq):
        if isinstance(seq, SpecialString):
            return str.__new__(cls, str(seq)[1:-1])
        else:
            return str.__new__(cls, seq)

I found it's actually possible to say seq[:] instead of str(seq)[1:-1],
which is less dependent on the exact format of the fancy representation.

So seq[:] seems to be a "trick" for getting at a sequence as it is,
without honouring its (string) representation. Thinking further, I wonder
what to do on non-sequence types, and whether there isn't a way to get to
an object's core that doesn't look like a trick. Or is it not all that
tricky after all?
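
One way to look at it, sketched under the assumption of Python 3: seq[:] and str.__str__(seq) both hand back a plain str holding the stored characters, and the second spelling says more directly what is meant. Objects that are not strings have no such stored text, so the constructor can only convert them through their ordinary string representation.

```python
class SpecialString(str):
    def __new__(cls, obj):
        if isinstance(obj, SpecialString):
            obj = str.__str__(obj)     # stored characters, no parentheses
        return str.__new__(cls, obj)   # anything else goes through str(obj)

    def __str__(self):
        return "(" + str.__str__(self) + ")"

x = SpecialString("foo")
y = SpecialString(x)
n = SpecialString(42)          # non-sequence argument

core_a = x[:]                  # slicing yields a plain str
core_b = str.__str__(x)        # so does the unbound base-class method

print(type(core_a).__name__, core_a == core_b)
print(str(y), str(n))
```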

Thanks for the pointer.
 
Thomas Lotze

Duncan Booth said:
    def __new__(cls, s):
        if isinstance(s, SpecialString):
            s = s._raw()
        return str.__new__(cls, s)
    [...]
    def _raw(self):
        return str.__str__(self)

Thanks a lot.

Duncan Booth said:
You have to override __new__ as the default constructor for str calls
the __str__ method on its argument (which is where your extra parens
appear).

Ah, good to have that piece of information.

Duncan Booth said:
Having added a method to bypass the formatting, it makes sense to use it
inside __str__ as well; otherwise the code is very sensitive to minor
edits, e.g. changing it to:
    return "(%s)" % self
would cause an infinite loop.

Yes, of course.
 
