Characters contain themselves?

WENDUM Denis 47.76.11 (agent) · Apr 7, 2006

While testing recursive algoritms dealing with generic lists I stumbled
on infinite loops which were triggered by the fact that (at least for my
version of Pyton) characters contain themselves.See session:

system prompt% python
Python 2.3.5 (#2, Feb 9 2005, 00:38:15)
[GCC 3.3.5 (Debian 1:3.3.5-8)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
....

>>> 'a' is 'a' True
>>> 'a' in 'a' True
>>> 'a' in ['a'] True
>>> ....

Click to expand...

Click to expand...

Leading to paradoxes and loops objects which contain themselves (and
other kinds of monsters) are killed in set theory with the Axiom of
Foundation:=)

But let's go back to more earthly matters. I couldn't find any clue in a
python FAQ after having googled with the following "Python strings FAQ"
about why this design choice and how to avoid falling in this trap
without having to litter my code everywhere with tests for stringiness
each time I process a generic list of items.

Any hints would be appreciated.

Rene Pijlman · Apr 7, 2006

WENDUM Denis 47.76.11 (agent):

While testing recursive algoritms dealing with generic lists I stumbled
on infinite loops which were triggered by the fact that (at least for my
version of Pyton) characters contain themselves.

No, strings contain characters. And 'a' is a string consisting of one
character.

"The items of a string are characters. There is no separate character
type; a character is represented by a string of one item."
http://docs.python.org/ref/types.html

(One item of what type, one might ask)

gry · Apr 7, 2006

In fact, not just characters, but strings contain themselves:
True

This is a very nice(i.e. clear and concise) shortcut for:
True

Which I always found contorted and awkward.

Could you be a bit more concrete about your complaint?

-- George
[Thanks, I did enjoy looking up the Axiom of Foundation!]

Mark Jackson · Apr 7, 2006

Rene Pijlman said:
WENDUM Denis 47.76.11 (agent):

No, strings contain characters. And 'a' is a string consisting of one
character.

"The items of a string are characters. There is no separate character
type; a character is represented by a string of one item."
http://docs.python.org/ref/types.html

(One item of what type, one might ask)

Good point. ". . .represented by a string of length one" would be
better.

Bruno Desthuilliers · Apr 7, 2006

WENDUM Denis 47.76.11 (agent) a écrit :

While testing recursive algoritms dealing with generic lists I stumbled
on infinite loops which were triggered by the fact that (at least for my
version of Pyton) characters contain themselves.

There is *no* character type in Python. 'a' is a string, and of course
the string 'a' is a substring of string 'a'.

Steven D'Aprano · Apr 8, 2006

While testing recursive algoritms dealing with generic lists I stumbled
on infinite loops which were triggered by the fact that (at least for my
version of Pyton) characters contain themselves.

A "character" is just a string of length 1.

[snip]

Leading to paradoxes and loops objects which contain themselves (and
other kinds of monsters) are killed in set theory with the Axiom of
Foundation:=)

Let me see if I understand you... you would like behaviour like this?
False # not the real result

Ain't going to happen.

But let's go back to more earthly matters. I couldn't find any clue in a
python FAQ after having googled with the following "Python strings FAQ"
about why this design choice and how to avoid falling in this trap
without having to litter my code everywhere with tests for stringiness
each time I process a generic list of items.

Any hints would be appreciated.

I think that if you want to do work with sets, use sets, not strings.

In Python 2.3, you need:

In Python 2.4, Sets (note the initial capital) are a built-in.

Steven D'Aprano · Apr 8, 2006

But let's go back to more earthly matters. I couldn't find any clue in a
python FAQ after having googled with the following "Python strings FAQ"
about why this design choice and how to avoid falling in this trap
without having to litter my code everywhere with tests for stringiness
each time I process a generic list of items.

Any hints would be appreciated.

Here is another suggestion.

If you absolutely must use strings rather than sets or lists, create a
custom string class:
.... def __contains__(self, other):
.... if self == other: return False
.... return super(Mystr, self).__contains__(other)
....False

Hope this helps.

Atanas Banov · Apr 8, 2006

congratulations for (ostensibly) discovering the Barber's paradox (if
the village barber shaves all and only those who don't shave
tehmselves, who shaves the barber?
http://en.wikipedia.org/wiki/Barber_paradox) in python ! :-D

as far as i see it, you complaint is not just that any string X
contains itself but that string X can contain another string Y (i.e.
object of class string to contain another of class string) - where you
understand "contain" as per the operator "in" to be set-theory
operator, when in fact the meaning put for strings is instead "has a
substring".

therefore your grudge is not just with
'a' in 'a'
but also with
'a' in 'abcd'

here is excerpt from the reference manual:
----------------------------------------
The operators in and not in test for set membership. x in s evaluates
to true if x is a member of the set s, and false otherwise. x not in s
returns the negation of x in s. The set membership test has
traditionally been bound to sequences; an object is a member of a set
if the set is a sequence and contains an element equal to that object.
However, it is possible for an object to support membership tests
without being a sequence. In particular, dictionaries support
membership testing as a nicer way of spelling key in dict; other
mapping types may follow suit.

For the list and tuple types, x in y is true if and only if there
exists an index i such that x == y is true.

For the Unicode and string types, x in y is true if and only if x is a
substring of y. An equivalent test is y.find(x) != -1. Note, x and y
need not be the same type; consequently, u'ab' in 'abc' will return
True. Empty strings are always considered to be a substring of any
other string, so "" in "abc" will return True.
----------------------------------------

it is apparent "in" was overriden for strings for convenience's sake,
not to get freaky on the therory of sets.

what can you do about it? well, you can check for string type
specifically but there are no guarantees in life: someone else can
define new type with "in" that behaves like that: say "interval(x,y)",
where "interval(x,y) in interval(a,b)" checks if [x,y] is a
sub-interval of [a,b] - very intuitive - but there you have the problem
again!

or you can specifically check if the objects are from a "semanthically
supported group" of classes - but that will hamper authomatic extension
by introducing new types.

- Nas

While testing recursive algoritms dealing with generic lists I stumbled

'a' is 'a' True
'a' in 'a' True
'a' in ['a'] True
....

Click to expand...

Click to expand...

Leading to paradoxes and loops objects which contain themselves (and
other kinds of monsters) are killed in set theory with the Axiom of
Foundation:=)

But let's go back to more earthly matters. I couldn't find any clue in a
python FAQ after having googled with the following "Python strings FAQ"
about why this design choice and how to avoid falling in this trap
without having to litter my code everywhere with tests for stringiness
each time I process a generic list of items.

Any hints would be appreciated.

Click to expand...

Kent Johnson · Apr 8, 2006

WENDUM said:
While testing recursive algoritms dealing with generic lists I stumbled
on infinite loops which were triggered by the fact that (at least for my
version of Pyton) characters contain themselves.See session:

Strings are sequences and this is a problem for recursive functions that
process nested lists. You do need to test for stringiness somehow. IIRC
the usual solutions are

- explicit test for string: isinstance(xx, basestring)

- explicit test for list: isinstance(xx, list) - this fails for
user-defined sequences that don't subclass list

- test for __iter__ attribute. IIUC this relies on an implementation
detail that strings don't define __iter__:

In [6]: hasattr('aa', '__iter__')
Out[6]: False

In [7]: hasattr([], '__iter__')
Out[7]: True

Kent

Azolex · Apr 10, 2006

WENDUM said:
While testing recursive algoritms dealing with generic lists I stumbled
on infinite loops which were triggered by the fact that (at least for my
version of Pyton) characters contain themselves.

Note that the empty string is contained in all strings, including itself.
True

WENDUM Denis 47.76.11 (agent) · Apr 10, 2006

In fact, not just characters, but strings contain themselves:

True

This is a very nice(i.e. clear and concise) shortcut for:

True

Which I always found contorted and awkward.

Could you be a bit more concrete about your complaint?

-- George
[Thanks, I did enjoy looking up the Axiom of Foundation!]

Here is the minimal context triggering an infinite descent in recursion.
From the answers I've got it seems I'll have to check if I'm iterating
on a string or on another kind of "list".

python

Python 2.3.5 (#2, Feb 9 2005, 00:38:15)
[GCC 3.3.5 (Debian 1:3.3.5-8)] on linux2
Type "help", "copyright", "credits" or "license" for more information..... try: # assuming List supports iteration
.... new=[f(item) for item in List]
.... except:# if no iteration is possible
.... new=1
.... return new
....

f([1, [2,3]]) # f substitutes each non iterable item by 1 (silly butshows the problem) [1, [1, 1]]
f([1, [2.3, (5,6,7)]]) [1, [1, [1, 1, 1]]]
f('bac') [[[[[[[[1]]]]]]], [[[[[[[1]]]]]]], [[[[[[[1]]]]]]]]
sys.setrecursionlimit(5)
f('bac') [[[1]], [[1]], [[1]]]
# each item in 'bac' is a character,

Click to expand...

Click to expand...

ie. a string of length one on wich one can iterate
and so triggers an infinite descent of recursion.
...

Scott David Daniels · Apr 10, 2006

WENDUM said:
... From the answers I've got it seems I'll have to check if I'm
iterating on a string or on another kind of "list"....... try: # assuming List supports iteration
... new=[f(item) for item in List]
... except:# if no iteration is possible
... new=1
... return new

f([1, [2,3]]) # f substitutes each non iterable item by 1 (silly
butshows the problem) [1, [1, 1]]
f([1, [2.3, (5,6,7)]]) [1, [1, [1, 1, 1]]]
f('bac') [[[[[[[[1]]]]]]], [[[[[[[1]]]]]]], [[[[[[[1]]]]]]]]
sys.setrecursionlimit(5)
f('bac') [[[1]], [[1]], [[1]]]
# each item in 'bac' is a character,

Click to expand...

Click to expand...

ie. a string of length one on wich one can iterate
and so triggers an infinite descent of recursion.

I'd rewrite your 'f' as follows:
def copier(List):
try: # assuming List supports iteration
new = [f(item) for item in List]
except TypeError: # Non-iterables get replaced
return '?'
except RuntimeError: # Recursion failures get shown
return 'Recursion'
return new

Note, even if you got your wish and somehow you couldn't have an issue
with characters in strings, you would have to deal with this:

simple_list = [3.1415]
container = [simple_list]
simple_list.append(container)

Your function (and mine) assumes the argument is a DAG (Directed Acyclic
Graph), but there is no such guarantee about data structures in python.

--Scott David Daniels
(e-mail address removed)

Graham Fawcett · Apr 10, 2006

WENDUM said:
While testing recursive algoritms dealing with generic lists I stumbled
on infinite loops which were triggered by the fact that (at least for my
version of Pyton) characters contain themselves. [snip]
Leading to paradoxes and loops objects which contain themselves (and
other kinds of monsters) are killed in set theory with the Axiom of
Foundation:=)

You could always use an "is-proper-subset-of" function, which is closer
to the intent of your algorithm. Using Jamitzky's very clever infix
recipe [1], you can even write it as an infix operator:

#Jamitzky's infix-operator class, abbreviated
class Infix:
def __init__(self, function):
self.function = function
def __ror__(self, other):
return Infix(lambda x, self=self, other=other:
self.function(other, x))
def __or__(self, other):
return self.function(other)
def __call__(self, value1, value2):
return self.function(value1, value2)

# define our is-proper-subset operator...
ips = Infix(lambda a, b: (a is not b) and (a in b))
False

Of course, |ips| looks like Lips-L, which is odd; but you could pick a
better name. ;-)

Graham

[1] http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/384122

Sion Arrowsmith · Apr 11, 2006

Graham Fawcett said:
You could always use an "is-proper-subset-of" function, which is closer
to the intent of your algorithm. Using Jamitzky's very clever infix
recipe [1], you can even write it as an infix operator:

#Jamitzky's infix-operator class, abbreviated
class Infix:
[ ... ]

# define our is-proper-subset operator...
ips = Infix(lambda a, b: (a is not b) and (a in b))

True

Click to expand...

Unfortunately:

True

Which might not be what you want. On the other hand, it's a simple
fix:

True
False

[1] http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/384122

Graham Fawcett · Apr 11, 2006

Sion said:
Unfortunately:
True

Which might not be what you want. On the other hand, it's a simple
fix:

Ah, good point.

Graham

urlencode with high characters	1	Nov 2, 2005
No module named glade?	0	Mar 17, 2006
The devolution of English language and slothful c.l.p behaviors exposed!	50	Jan 24, 2012
python-dev Summary for 2006-02-16 through 2006-02-28	1	Apr 29, 2006
python-dev Summary for 2005-04-16 through 2005-04-30	7	May 16, 2005
comp.lang.c Answers (Abridged) to Frequently Asked Questions (FAQ)	0	Jan 12, 2008
comp.lang.c Answers (Abridged) to Frequently Asked Questions (FAQ)	0	Mar 1, 2008
comp.lang.c Answers (Abridged) to Frequently Asked Questions (FAQ)	0	Dec 15, 2007

Characters contain themselves?

WENDUM Denis 47.76.11 (agent)

Rene Pijlman

gry

Mark Jackson

Bruno Desthuilliers

Steven D'Aprano

Steven D'Aprano

Atanas Banov

Kent Johnson

Azolex

WENDUM Denis 47.76.11 (agent)

Scott David Daniels

Graham Fawcett

Sion Arrowsmith

Graham Fawcett

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads