list comprehension problem


mk

Hello everyone,

print hosts
hosts = [ s.strip() for s in hosts if s is not '' and s is not None and
s is not '\n' ]
print hosts

['9.156.44.227\n', '9.156.46.34 \n', '\n']
['9.156.44.227', '9.156.46.34', '']

Why does the hosts list after list comprehension still contain '' in
last position?

I checked that:

print hosts
hosts = [ s.strip() for s in hosts if s != '' and s != None and s != '\n' ]
print hosts

...works as expected:

['9.156.44.227\n', '9.156.46.34 \n', '\n']
['9.156.44.227', '9.156.46.34']


Are there two '\n' strings in the interpreter's memory or something, so that
the identity check "s is not '\n'" does not work as expected?

This is weird. I expected that at all times there is only one '\n'
string in Python's cache or whatever that all labels meant by the
programmer as '\n' string actually point to. Is that a wrong assumption?



Regards,
mk
 

Diez B. Roggisch

mk said:
Hello everyone,

print hosts
hosts = [ s.strip() for s in hosts if s is not '' and s is not None and
s is not '\n' ]
print hosts

['9.156.44.227\n', '9.156.46.34 \n', '\n']
['9.156.44.227', '9.156.46.34', '']

Why does the hosts list after list comprehension still contain '' in
last position?

I checked that:

print hosts
hosts = [ s.strip() for s in hosts if s != '' and s != None and s != '\n' ]
print hosts

...works as expected:

['9.156.44.227\n', '9.156.46.34 \n', '\n']
['9.156.44.227', '9.156.46.34']


Are there two '\n' strings in the interpreter's memory or smth so the
identity check "s is not '\n'" does not work as expected?

This is weird. I expected that at all times there is only one '\n'
string in Python's cache or whatever that all labels meant by the
programmer as '\n' string actually point to. Is that wrong assumption?

Yes. Never use "is" unless you know 100% that you are talking about the same
object, not just equality.

Diez
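
A small sketch of the difference, written in Python 3 syntax (the thread itself uses Python 2 print statements). The names `literal`, `built` and `cleaned` are illustrative only. Two strings can compare equal without being the same object; whether they happen to be the same object is up to the interpreter, so the identity result is printed rather than promised.

```python
# Two strings with equal contents, produced in different ways.
literal = "9.156.44.227\n"
built = "".join(["9.156.44.227", "\n"])   # assembled at run time

print(literal == built)   # equality compares contents: True
print(literal is built)   # identity: implementation-dependent, do not rely on it

# Filtering with != behaves predictably whatever the interpreter caches.
hosts = ["9.156.44.227\n", "9.156.46.34 \n", "\n"]
cleaned = [s.strip() for s in hosts if s != "" and s is not None and s != "\n"]
print(cleaned)   # ['9.156.44.227', '9.156.46.34']
```

Note that `is not None` is kept as an identity check here, since None is the one case where identity is the right question.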
 

Falcolas

I'd also recommend trying the following filter, since it is identical
to what you're trying to do, and will probably catch some additional
edge cases without any additional effort from you.

[s.strip() for s in hosts if s.strip()]

This will check the results of s.strip(), and since empty strings are
considered false, they will not make it into your results.

Garrick
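
For illustration, here is that filter run against the list from the original post, in Python 3 syntax. The two extra entries (an empty string and a whitespace-only string) are made-up sample data, added only to show the additional edge cases being caught.

```python
hosts = ["9.156.44.227\n", "9.156.46.34 \n", "\n", "", "   \t\n"]

# s.strip() returns '' for blank entries, and '' is falsy, so they are dropped.
cleaned = [s.strip() for s in hosts if s.strip()]
print(cleaned)   # ['9.156.44.227', '9.156.46.34']
```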
 

MRAB

Diez said:
Yes. Never use "is" unless you know 100% that you are talking about the same
object, not just equality.

Some objects are singletons, i.e. there's only ever one of them. The most
common singleton is None. In virtually every other case you should be
using "==" and "!=".
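
A minimal sketch of that rule of thumb, in Python 3 syntax. The function `describe` is a hypothetical helper invented for this example: it uses `is` only for the None singleton and `==` for everything else.

```python
def describe(value):
    """Classify a value, using `is` only for the None singleton."""
    if value is None:          # identity check: there is exactly one None
        return "missing"
    if value == "":            # equality check for ordinary values
        return "empty"
    return "present"

print(describe(None))   # missing
print(describe(""))     # empty
print(describe("\n"))   # present
```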
 

Bruno Desthuilliers

Falcolas wrote:
(snip)
I'd also recommend trying the following filter, since it is identical
to what you're trying to do, and will probably catch some additional
edge cases without any additional effort from you.

[s.strip() for s in hosts if s.strip()]

The problem with this expression is that it calls str.strip two times...
Sometimes, a more lispish approach is better:

whatever = filter(None, map(str.strip, hosts))

or just a plain procedural loop:

whatever = []
for s in hosts:
    s = s.strip()
    if s:
        whatever.append(s)

As far as I'm concerned, I have a clear preference for the first
version, but well, YMMV...
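
One caveat for anyone trying the first version today: in Python 3, `filter` and `map` return lazy iterators rather than lists, so the result needs wrapping in `list()` if an actual list is wanted. A sketch using the list from the original post:

```python
hosts = ["9.156.44.227\n", "9.156.46.34 \n", "\n"]

# filter(None, ...) keeps only truthy items; map applies str.strip once per item.
# In Python 3 both are lazy, so list() is needed to get an actual list.
whatever = list(filter(None, map(str.strip, hosts)))
print(whatever)   # ['9.156.44.227', '9.156.46.34']
```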
 

Bruno Desthuilliers

mk wrote:
Hello everyone,

print hosts

['9.156.44.227\n', '9.156.46.34 \n', '\n']


Just for the record, where did you get this "hosts" list from? (Hint:
depending on the answer, there might be a way to avoid having to filter
out the list)
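
The thread never says where the list came from. Purely as an assumption, suppose it was read line by line from a text file; in that case the stripping and filtering could happen while reading. In this Python 3 sketch `io.StringIO` stands in for the real file, and `fake_file` is a made-up name.

```python
import io

# Stand-in for an open text file; the contents mirror the list in the thread.
fake_file = io.StringIO("9.156.44.227\n9.156.46.34 \n\n")

# Strip each line as it is read and skip the ones that end up empty.
hosts = [line.strip() for line in fake_file if line.strip()]
print(hosts)   # ['9.156.44.227', '9.156.46.34']
```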
 

Nick Stinemates

Some objects are singletons, i.e. there's only ever one of them. The most
common singleton is None. In virtually every other case you should be
using "==" and "!=".

Please correct me if I am wrong, but I believe you meant to say some
objects are immutable, in which case you would be correct.
 

alex23

Please correct me if I am wrong, but I believe you meant to say some
objects are immutable, in which case you would be correct.

You're completely wrong. Immutability has nothing to do with identity,
which is what 'is' is testing for:
True

MRAB was referring to the singleton pattern[1], of which None is the
predominant example in Python. None is _always_ None, as it's always
the same object.

1: http://en.wikipedia.org/wiki/Singleton_pattern
 

Terry Reedy

alex23 said:
You're completely wrong. Immutability has nothing to do with identity,
which is what 'is' is testing for:

What immutability has to do with identity is that 'two' immutable
objects with the same value *may* actually be the same object,
*depending on the particular version of a particular implementation*.

Whether or not this is 'another' object or the same object is irrelevant
for all purposes except identity checking. It is completely up to the
interpreter.

In this case, but it could have been True.
True

MRAB was referring to the singleton pattern[1], of which None is the
predominant example in Python. None is _always_ None, as it's always
the same object.

And in 3.x, the same is true of True and False.
 

alex23

Terry Reedy said:
What immutability has to do with identity is that 'two' immutable
objects with the same value *may* actually be the same object,
*depending on the particular version of a particular implementation*.

See, I prefer a little more certainty in my code. Isn't this why we
continually caution people against relying on implementation details?

In this case, but it could have been True.

Yes, and if my aunt had a penis she'd be my uncle. But she doesn't. So
what's the point here? Under certain implementations, _some_ immutable
objects _may_ share identity, but you shouldn't rely on it? Are you
trying to advocate a use for this behaviour by highlighting it?

I'm honestly not getting your point here.

MRAB was referring to the singleton pattern[1], of which None is the
predominant example in Python. None is _always_ None, as it's always
the same object.

And in 3.x, the same is true of True and False.

None of which refutes or lessens anything I wrote. What my post was
_countering_ was the claim that immutables should use identity checks,
mutables should use equality checks. That the implementation caches
_some_ objects for performance reasons certainly doesn't make that
claim any less wrong.

Again, what was the point of this other than "things differ on the
implementation level"? Isn't it better to talk about the level of the
language that you can _expect_ to be consistent?
 

Terry Reedy

alex23 said:
...
I'm honestly not getting your point here.

Let me try again, a bit differently.

I claim that the second statement, and therefore the first, can be seen
as wrong. I also claim that (Python) programmers need to understand why.

In mathematics, we generally have immutable values whose 'identity' is
their value. There is, for example, only one, immutable, empty set.

In informatics, and in particular in Python, in order to have
mutability, we have objects with value and an identity that is separate
from their value. There can be, for example, multiple mutable empty
sets. Identity is important because we must care about which empty set
we add things to. 'Identity' is only needed because of 'mutability', so
it is mistaken to say they have nothing to do with each other.

Ideally, from both a conceptual and space efficiency view, an
implementation would allow only one copy for each value of immutable
classes. This is what new programmers assume when they blithely use 'is'
instead of '==' (as would usually be correct in math).

However, for time efficiency reasons, there is no unique copy guarantee,
so one must use '==' instead of 'is', except in those few cases where
there is a unique copy guarantee, either by the language spec or by
one's own design, when one must use 'is' and not '=='. Here 'must'
means 'must to be generally assured of program correctness as intended'.

We obviously agree on this guideline.
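
A case of a unique-copy guarantee "by one's own design" is a sentinel object. The sketch below is in Python 3 syntax; `_MISSING`, `lookup` and `settings` are hypothetical names invented for the example. Because exactly one `_MISSING` object is ever created, `is` is the right test for it, and `==` would be the wrong tool.

```python
_MISSING = object()   # a unique sentinel; no other object is identical to it

def lookup(mapping, key, default=_MISSING):
    """Return mapping[key]; raise KeyError unless a default was supplied."""
    if key in mapping:
        return mapping[key]
    if default is _MISSING:      # identity: was the argument omitted?
        raise KeyError(key)
    return default               # None is a perfectly valid default here

settings = {"host": "9.156.44.227"}
print(lookup(settings, "host"))         # 9.156.44.227
print(lookup(settings, "port", None))   # None
```

Using None itself as the "not supplied" marker would make it impossible to pass None as a real default, which is why a private sentinel is used here.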

Terry Jan Reedy
 

Steven D'Aprano

Let me try again, a bit differently.

I claim that the second statement, and therefore the first, can be seen
as wrong. I also claim that (Python) programmers need to understand why.

In mathematics, we generally have immutable values whose 'identity' is
their value. There is, for example, only one, immutable, empty set.


I think it's more than that -- I don't think pure mathematics makes any
distinction at all between identity and equality. There are no instances
at all, so you can't talk about individual values. It's not that the
empty set is a singleton, because the empty set isn't a concrete object-
with-existence at all. It's an abstraction, and as such, questions of
"how many separate empty sets are there?" are meaningless.

There are an infinite number of empty sets that differ according to their
construction:

The set of all American Presidents called Boris Nogoodnik.
The set of all human languages with exactly one noun and one verb.
The set of all fire-breathing mammals.
The set of all real numbers equal to sqrt(-1).
The set of all even prime numbers other than 2.
The set of all integers between 0 and 1 exclusive.
The set of all integers between 1 and 2 exclusive.
The set of all positive integers between 2/5 and 4/5.
The set of all multiples of five between 26 and 29.
The set of all non-zero circles in Euclidean geometry where the radius
equals the circumference.
....

I certainly wouldn't say all fire-breathing mammals are integers between
0 and 1, so those sets are "different", and yet clearly they're also "the
same" in some sense. I think this demonstrates that the question of how
many different empty sets is meaningless -- it depends on what you mean
by different and how many.


In informatics, and in particular in Python, in order to have
mutability, we have objects with value and an identity that is separate
from their value.

I think you have this backwards. We have value and identity because of
the hardware we use -- we store values in memory locations, which gives
identity. Our universe imposes the distinction between value and
identity. To simulate immutability and singletons is hard, and needs to
be worked at.

Nevertheless, it would be possible to go the other way. Given
hypothetical hardware which only supported mutable singletons, we could
simulate multiple instances. It would be horribly inefficient, but it
could be done. Imagine a singleton-mutable-set implementation, something
like this:

class set:
    def __init__(id):
        return singleton
    def add(id, obj):
        singleton.elements.append((id, obj))
    def __contains__(id, element):
        return (id, element) in singleton.elements


and so forth.

You might notice that this is not terribly different from how one might
define non-singleton sets. The difference being, Python sets have
identity implied by storage in distinct memory locations, while this
hypothetical singleton-set has to explicitly code for identity.
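
A runnable approximation of the same idea, in Python 3 syntax. This is only a sketch under obvious assumptions: real Python objects always have their own identity, so the `SimulatedSet` instances below are thin handles, and `SharedStore` and `SimulatedSet` are made-up names. The point is that all data lives in one shared container and "which set" is tracked by an explicit tag.

```python
class SharedStore:
    """The single shared container that every simulated set writes into."""
    elements = []

class SimulatedSet:
    """A 'set' whose identity is an explicit tag rather than a memory address."""
    _next_id = 0

    def __init__(self):
        self.id = SimulatedSet._next_id     # explicit, hand-coded identity
        SimulatedSet._next_id += 1

    def add(self, obj):
        if obj not in self:
            SharedStore.elements.append((self.id, obj))

    def __contains__(self, obj):
        return (self.id, obj) in SharedStore.elements

a = SimulatedSet()
b = SimulatedSet()
a.add("spam")
print("spam" in a)   # True
print("spam" in b)   # False: same shared store, different explicit id
```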


There can be, for example, multiple mutable empty
sets. Identity is important because we must care about which empty set
we add things to. 'Identity' is only needed because of 'mutability', so
it is mistaken to say they have nothing to do with each other.

True, but it is not a mistake to say that identity and mutability are
independent: there are immutable singletons, and mutable singletons, and
immutable non-singletons, and mutable non-singletons. Clearly, knowing
that an object is mutable doesn't tell you whether it is a singleton or
not, and knowing it is a singleton doesn't tell you whether it is
immutable or not.

E.g. under normal circumstances modules are singletons, but they are
mutable; frozensets are immutable, but they are not singletons.
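
Both halves of that can be checked quickly in Python 3. In this sketch `thread_demo_flag` is a made-up attribute name used only for the demonstration, and the identity of the two frozensets is printed without any promise about the result.

```python
import importlib
import math

# Importing a module again hands back the same object from sys.modules.
same_module = importlib.import_module("math")
print(same_module is math)            # True

# Modules are mutable: attributes can be added at run time.
math.thread_demo_flag = True          # hypothetical attribute, illustration only
print(same_module.thread_demo_flag)   # True
del math.thread_demo_flag             # tidy up

# Frozensets are immutable, yet equal ones need not be the same object.
f1 = frozenset([1, 2, 3])
f2 = frozenset([3, 2, 1])
print(f1 == f2)                       # True
print(f1 is f2)                       # implementation-dependent
```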

Ideally, from both a conceptual and space efficiency view, an
implementation would allow only one copy for each value of immutable
classes.

Ideally, from a complexity of implementation view, an implementation
would allow an unlimited number of copies of each value of immutable
classes.

This is what new programmers assume when they blithely use 'is'
instead of '==' (as would usually be correct in math).

Nah, I think you're crediting them with far more sophistication than they
actually have. I think most people in general, including many new
programmers, simply don't have a good grasp of the conceptual difference
between equality and identity. In plain language, "is" and its
grammatical forms "be", "are", "am" etc. have many meanings:

(1) Set membership testing:
Socrates is a man.
This is a hammer.

(2) Existence:
There is a computer language called Python.
There is a monster under the bed.

(3) Identity:
Iron Man is Tony Stark.
The butler is the murderer.

(4) Mathematical equality:
If x is 5, and y is 11, then y is 2x+1.

(5) Equivalence:
The winner of this race is the champion.
The diameter of a circle is twice the radius.

(6) Metaphoric equivalence:
Kali is death.
Life is like a box of chocolates.

(7) Testing of state:
My ankle is sore.
Fred is left-handed.

(8) Consequence
If George won the lottery, he would say he is happy.

(9) Cost
A cup of coffee is $3.


Only two of these usages work at all in any language I know of: equality
and identity testing, although it would be interesting to consider a
language that allowed type testing:

45 is an int -> returns True
"abc" is a float -> returns False

Some languages, like Hypertalk (by memory) and related languages, make
"is" a synonym for equals.

However, for time efficiency reasons, there is no unique copy guarantee,
so one must use '==' instead of 'is', except in those few cases where
there is a unique copy guarantee, either by the language spec or by
one's own design, when one must use 'is' and not '=='. Here 'must'
means 'must to be generally assured of program correctness as intended'.

We obviously agree on this guideline.


Yes.
 

Cousin Stanley

....
There are an infinite number of empty sets
that differ according to their construction:
....
The set of all fire-breathing mammals.
....

Apparently, you have never been a witness
to someone who recently ingested one of
Cousin Chuy's Super-Burritos .......... :)
 

Mel

Steven said:
(6) Metaphoric equivalence:
Kali is death.
Life is like a box of chocolates.

OK to here, but this one switches between metaphor and simile, and arguably,
between identity and equality.

Mel.
 

Mick Krippendorf

Steven said:
There are an infinite number of empty sets that differ according to their
construction:

The set of all American Presidents called Boris Nogoodnik.
The set of all human languages with exactly one noun and one verb.
The set of all fire-breathing mammals.
The set of all real numbers equal to sqrt(-1).
The set of all even prime numbers other than 2.
The set of all integers between 0 and 1 exclusive.
The set of all integers between 1 and 2 exclusive.
The set of all positive integers between 2/5 and 4/5.
The set of all multiples of five between 26 and 29.
The set of all non-zero circles in Euclidean geometry where the radius
equals the circumference.
...

Logically, they're all the same, by extensionality. There is of course a
difference between the reference of an expression and its meaning, but
logical truth only depends on reference.

In mathematical logic 'the thing, that ...' can be expressed with the
iota operator (i...), defined like this:

((ia)phi e b) := (Ec)((c e b) & (Aa)((a = c) <-> phi)).

with phi being a formula, E and A the existential and universal
quantifiers, resp., e the membership relation, & the conjunction operator
and <-> the biconditional operator.

When we want to find out if two sets s1 and s2 are the same we only need to
look at their extensions, so given:

(i s1)(Ay)(y e s1 <-> y is a fire-breathing animal)
(i s2)(Ay)(y e s2 <-> y is a real number equal to sqrt(-1))

we only need to find out if:

(Ax)(x is a fire-breathing animal <-> x is a real number equal to
sqrt(-1)).

And since there are no such things of either kind, it follows that s1 = s2.
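
Python's set equality is extensional in this sense. A small Python 3 sketch, using three of the constructions from the list above restricted to small sample ranges (the ranges and variable names are assumptions chosen to keep the example finite):

```python
# Three empty sets described by different (sample) constructions.
multiples_of_five = {n for n in range(26, 30) if n % 5 == 0}
between_zero_and_one = {n for n in range(-10, 11) if 0 < n < 1}
even_primes_not_two = {n for n in (2, 3, 5, 7, 11, 13) if n % 2 == 0 and n != 2}

# Set equality looks only at the members, so all three are equal.
print(multiples_of_five == between_zero_and_one == even_primes_not_two)   # True
print(multiples_of_five == set())                                         # True

# They are still distinct mutable objects, which `is` reveals.
print(multiples_of_five is between_zero_and_one)   # False
```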

BTW, '=' is usually defined as:

a = b := (AabP)(Pa <-> Pb)

AKA the Leibniz-Principle, but this definition is 2nd order logic. If we
have sets at our disposal when we're axiomatizing mathematics, we can
also define it 1st-orderly:

a = b := (Aabc)((a e c) <-> (b e c))

Regards,
Mick.
 

Steven D'Aprano

When we want to find out if two sets s1 and s2 are the same we only need to
look at their extensions, so given:

(i s1)(Ay)(y e s1 <-> y is a fire-breathing animal)
(i s2)(Ay)(y e s2 <-> y is a real number equal to sqrt(-1))

we only need to find out if:

(Ax)(x is a fire-breathing animal <-> x is a real number equal to
sqrt(-1)).

And since there are neither such things, it follows that s1 = s2.

That assumes that all({}) is defined as true. That is a common definition
(Python uses it), it is what classical logic uses, and it often leads to
the "obvious" behaviour you want, but there is no a priori reason to
accept that all({}) is true, and indeed it leads to some difficulties:

All invisible men are alive.
All invisible men are dead.

are both true. Consequently, not all logic systems accept vacuous truths.

http://en.wikipedia.org/wiki/Vacuous_truth
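
The Python behaviour mentioned above is easy to see directly. In this Python 3 sketch `invisible_men` is just an empty list standing in for the empty set of invisible men:

```python
invisible_men = []   # no such things exist

# all() over an empty iterable is True, so both claims hold vacuously.
print(all(man == "alive" for man in invisible_men))   # True
print(all(man == "dead" for man in invisible_men))    # True

# any() is the dual: over an empty iterable it is False.
print(any(man == "alive" for man in invisible_men))   # False
```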
 

Mick Krippendorf

Steven said:
That assumes that all({}) is defined as true. That is a common definition
(Python uses it), it is what classical logic uses, and it often leads to
the "obvious" behaviour you want, but there is no a priori reason to
accept that all({}) is true, and indeed it leads to some difficulties:

All invisible men are alive.
All invisible men are dead.

are both true. Consequently, not all logic systems accept vacuous truths.

http://en.wikipedia.org/wiki/Vacuous_truth

You're right, of course, but I'm an old-fashioned Quinean guy :) Also,
in relevance logic and similar systems my beloved proof that there are
no facts (Davidson's Slingshot) goes down the drain. So I think I'll
stay with classical logic FTTB.

Regards,
Mick.
 

Aahz

I'd also recommend trying the following filter, since it is identical
to what you're trying to do, and will probably catch some additional
edge cases without any additional effort from you.

[s.strip() for s in hosts if s.strip()]

This breaks if s might be None
 

Krister Svanlund

I'd also recommend trying the following filter, since it is identical
to what you're trying to do, and will probably catch some additional
edge cases without any additional effort from you.

[s.strip() for s in hosts if s.strip()]

This breaks if s might be None

If you don't want Nones in your list just make a check for it...
[s.strip() for s in hosts if s is not None and s.strip()]
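
To see both the failure and the fix side by side, here is a Python 3 sketch. The None entry in `hosts` is made-up sample data; the original list in this thread contained only strings.

```python
hosts = ["9.156.44.227\n", None, "9.156.46.34 \n", "\n"]

# Without a guard, None.strip() raises AttributeError.
try:
    [s.strip() for s in hosts if s.strip()]
except AttributeError as exc:
    print("unguarded version failed:", exc)

# `s is not None` is evaluated first, so strip() is never called on None.
cleaned = [s.strip() for s in hosts if s is not None and s.strip()]
print(cleaned)   # ['9.156.44.227', '9.156.46.34']
```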
 

Paul Rudin

Falcolas said:
[s.strip() for s in hosts if s.strip()]

There's something in me that rebels against seeing the same call
twice. I'd probably write:

filter(None, (s.strip() for s in hosts))
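
In Python 3 that expression yields a lazy filter object, so it needs `list()` around it to produce a list. The sketch below shows that form, plus one alternative that does not appear anywhere in this thread: an assignment expression (`:=`, available from Python 3.8), which also strips each item only once.

```python
hosts = ["9.156.44.227\n", "9.156.46.34 \n", "\n"]

# Generator expression strips each item once; filter(None, ...) drops the empties.
via_filter = list(filter(None, (s.strip() for s in hosts)))

# Alternative not in the thread: bind the stripped value with := and test it.
via_walrus = [t for s in hosts if (t := s.strip())]

print(via_filter)   # ['9.156.44.227', '9.156.46.34']
print(via_walrus)   # ['9.156.44.227', '9.156.46.34']
```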
 
