ordered sets operations on lists..

A

Amit Khemka

Hello, Is there a *direct* way of doing set operations on lists which
preserve the order of the input lists ?
For Ex. l1 = [1, 5, 3, 2, 4, 7]
l2 = [3, 5, 10]

and (l1 intersect l2) returns [5, 3] .... (and (l2 intersect l1)
returns [3, 5])

thanks in advance,
amit.
 
B

bonono

Amit said:
Hello, Is there a *direct* way of doing set operations on lists which
preserve the order of the input lists ?
For Ex. l1 = [1, 5, 3, 2, 4, 7]
l2 = [3, 5, 10]

and (l1 intersect l2) returns [5, 3] .... (and (l2 intersect l1)
returns [3, 5])
what do you mean by "direct" way ? ugly(some said) one liner ?

filter(set(l1).intersection(set(l2)).__contains__, l1)
filter(set(l1).intersection(set(l2)).__contains__, l2)
 
S

Scott David Daniels

Amit said:
Hello, Is there a *direct* way of doing set operations on lists which
preserve the order of the input lists ? Nope

For Ex. l1 = [1, 5, 3, 2, 4, 7]
l2 = [3, 5, 10]

and (l1 intersect l2) returns [5, 3] .... (and (l2 intersect l1)
returns [3, 5])

However:
intersection = set(list1) & set(list2)
[element for element in list1 if element in intersection]
or
[element for element in list2 if element in intersection]
Give you the result you'd like.

--Scott David Daniels
(e-mail address removed)
 
R

Raymond Hettinger

[Amit Khemka]
Hello, Is there a *direct* way of doing set operations on lists which
preserve the order of the input lists ?
For Ex. l1 = [1, 5, 3, 2, 4, 7]
l2 = [3, 5, 10]

and (l1 intersect l2) returns [5, 3] .... (and (l2 intersect l1)
[bonono]
what do you mean by "direct" way ? ugly(some said) one liner ?

filter(set(l1).intersection(set(l2)).__contains__, l1)
filter(set(l1).intersection(set(l2)).__contains__, l2)

The intersection step is unnecessary, so the answer can be simplified a
bit:
filter(set(l2).__contains__, l1) [5, 3]
filter(set(l1).__contains__, l2)
[3, 5]
 
A

Alex Martelli

Raymond Hettinger said:
The intersection step is unnecessary, so the answer can be simplified a
bit:
filter(set(l2).__contains__, l1) [5, 3]
filter(set(l1).__contains__, l2)
[3, 5]

....and if one has time to waste, "setification" being only an
optimization, it can also be removed: filter(l2.__contains__, l1) etc
(very slow for long lists, of course).

Personally, I'd always use (depending on guesses regarding lengths of
lists) [x for x in l1 if x in l2] or the setified equivalent, of course.


Alex
 
B

Bengt Richter

Raymond Hettinger said:
The intersection step is unnecessary, so the answer can be simplified a
bit:
filter(set(l2).__contains__, l1) [5, 3]
filter(set(l1).__contains__, l2)
[3, 5]

...and if one has time to waste, "setification" being only an
optimization, it can also be removed: filter(l2.__contains__, l1) etc
(very slow for long lists, of course).

Personally, I'd always use (depending on guesses regarding lengths of
lists) [x for x in l1 if x in l2] or the setified equivalent, of course.
Perhaps newbies should be advised that

[x for x in l1 if x in set(l2)]

is not a (well) setified equivalent? I could see them being tempted.

Regards,
Bengt Richter
 
A

Alex Martelli

Bengt Richter said:
Personally, I'd always use (depending on guesses regarding lengths of
lists) [x for x in l1 if x in l2] or the setified equivalent, of course.
Perhaps newbies should be advised that

[x for x in l1 if x in set(l2)]

is not a (well) setified equivalent? I could see them being tempted.

You mean, newbies should be advised that Python does NOT hoist any
computations whatsoever from the body of a loop (including LCs and
genexps), so if you want anything hoisted you need to hoist it yourself?
Yes, it is a point worth making, since the lack of hoisting is a
frequent cause of performance loss.


Alex
 
S

Steve Holden

Alex said:
Personally, I'd always use (depending on guesses regarding lengths of
lists) [x for x in l1 if x in l2] or the setified equivalent, of course.

Perhaps newbies should be advised that

[x for x in l1 if x in set(l2)]

is not a (well) setified equivalent? I could see them being tempted.


You mean, newbies should be advised that Python does NOT hoist any
computations whatsoever from the body of a loop (including LCs and
genexps), so if you want anything hoisted you need to hoist it yourself?
Yes, it is a point worth making, since the lack of hoisting is a
frequent cause of performance loss.
Of course now 2.5 is planning to include the AST parser there's fruitful
ground for optimization studies that will perform hoisting automatically
(should anyone be looking for a project ;-)

Given that Python 2.4 doesn't even perform simple constant folding for
arithmetic expressions
1 0 LOAD_CONST 0 (1)
3 LOAD_CONST 1 (2)
6 BINARY_ADD
7 PRINT_ITEM
8 PRINT_NEWLINE
9 LOAD_CONST 2 (None)
12 RETURN_VALUE

even hoisting could be seen as an "advanced" optimization.

regards
Steve
 
F

Felipe Almeida Lessa

Em Dom, 2006-02-12 às 23:15 -0500, Steve Holden escreveu:
Given that Python 2.4 doesn't even perform simple constant folding for
arithmetic expressions
[snip]

May I ask why doesn't it perform such optimization? Is there any special
difficulties in doing so with the Python compiler?

Also, IIRC Psyco does optimize these constant expressions. Or am I
wrong?

Cheers,
Felipe.

--
"Quem excele em empregar a força militar subjulga os exércitos dos
outros povos sem travar batalha, toma cidades fortificadas dos outros
povos sem as atacar e destrói os estados dos outros povos sem lutas
prolongadas. Deve lutar sob o Céu com o propósito primordial da
'preservação'. Desse modo suas armas não se embotarão, e os ganhos
poderão ser preservados. Essa é a estratégia para planejar ofensivas."

-- Sun Tzu, em "A arte da guerra"
 
S

Steve Holden

Felipe said:
Em Dom, 2006-02-12 às 23:15 -0500, Steve Holden escreveu:
Given that Python 2.4 doesn't even perform simple constant folding for
arithmetic expressions
[snip]


May I ask why doesn't it perform such optimization? Is there any special
difficulties in doing so with the Python compiler?
As well to ask why the sky is blue, and has those little white things in
it (unless you live in Arizona) :)

The basic answer is that so far no developer has felt it worthwhile to
expend time on adding these optimizations.
Also, IIRC Psyco does optimize these constant expressions. Or am I
wrong?
Psyco does some very advanced things, but it does them all at run-time.
Unless I misunderstand (not unheard of), there are no circumstances
under which Psyco will improve run-time for a piece of code that is only
executed once.

regards
Steve
 
F

Felipe Almeida Lessa

Em Dom, 2006-02-12 às 23:51 -0500, Steve Holden escreveu:
The basic answer is that so far no developer has felt it worthwhile to
expend time on adding these optimizations.

I always thought these small optimizations could lead Python to be
faster overall. I remember about this every time I see CPython vs.
IronPython benchmarks (.NET and Mono do some nice optimizations at
compile and run times).
Psyco does some very advanced things, but it does them all at run-time.
Unless I misunderstand (not unheard of), there are no circumstances
under which Psyco will improve run-time for a piece of code that is only
executed once.

Sorry, I think I should have been clearer. Yes, Psyco only helps at
runtime (when the function is called), but those constant folds only
practically help on parts of the code that are called many times anyway,
right?

--
"Quem excele em empregar a força militar subjulga os exércitos dos
outros povos sem travar batalha, toma cidades fortificadas dos outros
povos sem as atacar e destrói os estados dos outros povos sem lutas
prolongadas. Deve lutar sob o Céu com o propósito primordial da
'preservação'. Desse modo suas armas não se embotarão, e os ganhos
poderão ser preservados. Essa é a estratégia para planejar ofensivas."

-- Sun Tzu, em "A arte da guerra"
 
S

Steve Holden

Felipe said:
Em Dom, 2006-02-12 às 23:51 -0500, Steve Holden escreveu:



I always thought these small optimizations could lead Python to be
faster overall. I remember about this every time I see CPython vs.
IronPython benchmarks (.NET and Mono do some nice optimizations at
compile and run times).
Indeed it is true that on some benchmarks IronPython is faster than
CPython. I suspect this was helped by the fact that IronPython is Jim's
third implementation of Python (I believe he was familiar with CPython
before he started on Jython, formerly known as JPython).

The fact remains that until someone writes the code it can't be included
in the implementation. Performance optimization, unfortunately, doesn't
have the same glamorous cachet as more esoteric language features.
Sorry, I think I should have been clearer. Yes, Psyco only helps at
runtime (when the function is called), but those constant folds only
practically help on parts of the code that are called many times anyway,
right?
Right. Technically constant folding is a win for a single execution, but
only if you don't take the slightly-increased compile time into account :)

regards
Steve
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,764
Messages
2,569,564
Members
45,040
Latest member
papereejit

Latest Threads

Top