question on list comprehensions

Darren Dale

Hi,

I need to replace the following loop with a list comprehension:

res = [0]
for i in arange(10000):
    res[0] = res[0] + i

In practice, res is a complex 2D numarray. For this reason, the regular
output of a list comprehension will not work: constructing a list of every
intermediate result will result in huge hits in speed and memory.

I saw this article at ASPN:
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/204297

def thislist():
    """Return a reference to the list object being constructed by the
    list comprehension from which this function is called. Raises an
    exception if called from anywhere else.
    """
    import sys
    d = sys._getframe(1).f_locals
    nestlevel = 1
    while '_[%d]' % nestlevel in d:
        nestlevel += 1
    return d['_[%d]' % (nestlevel - 1)].__self__

Could the list comprehension include something like thislist().pop(0), to be
called when len(thislist()) > 1? (I think this could work, but am having
trouble with the syntax.) Or is there a better way to approach the problem?

Thank you,

Darren
 
Roberto Antonio Ferreira De Almeida

Darren said:
Hi,

I need to replace the following loop with a list comprehension:

res = [0]
for i in arange(10000):
    res[0] = res[0] + i

res[0] = (10000 * (10000-1))/2.0 ;-)
In practice, res is a complex 2D numarray. For this reason, the regular
output of a list comprehension will not work: constructing a list of every
intermediate result will result in huge hits in speed and memory.

Why do you *need* to replace the for loop with a listcomp? Could you
give more details about what you're doing with the complex array?

Roberto
 
Darren Dale

Roberto said:
Darren said:
Hi,

I need to replace the following loop with a list comprehension:

res = [0]
for i in arange(10000):
    res[0] = res[0] + i

res[0] = (10000 * (10000-1))/2.0 ;-)
In practice, res is a complex 2D numarray. For this reason, the regular
output of a list comprehension will not work: constructing a list of
every intermediate result will result in huge hits in speed and memory.

Why do you *need* to replace the for loop with a listcomp? Could you
give more details about what you're doing with the complex array?

Roberto


OK. As usual, I am having trouble clearly expressing myself. Sorry about
that. Prepare for some physics:

I am simulating diffraction from an array of particles. I have to calculate
a complex array for each particle, add up all these arrays, and square the
magnitude of the result. If I do a for loop, it takes about 6-8 seconds for
a 2000 element array added up over 250 particles. In reality, I will have
2500 particles, or even 2500x2500 particles.

The list comprehension takes only 1.5 seconds for 250 particles. Already,
that means the time has decreased from 40 hours to 10, and that time can be
reduced further if Python is not constantly requesting additional memory
to grow the resulting list.

I guess that means I would like to avoid growing the list and popping
the previous result if possible, and just overwrite the previous result.
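
A minimal sketch of the two styles being compared here, written with modern
NumPy rather than the numarray of the thread; the grid, the particle phases,
and the amplitude() function are all made up for illustration.

import numpy as np

q = np.linspace(0.0, 10.0, 2000)       # stand-in for the 2000-element grid
phases = np.random.random(250)         # stand-in for the 250 particles

def amplitude(p):
    # Made-up per-particle complex array; the real one comes from the model.
    return np.exp(1j * q * p)

# Style 1: explicit for loop, accumulating in place -- only one result array
# ever lives in memory.
total = np.zeros_like(q, dtype=complex)
for p in phases:
    total += amplitude(p)
intensity = np.abs(total) ** 2

# Style 2: list comprehension -- builds a 250-element list of 2000-element
# complex arrays before summing them, which is the memory cost described above.
intensity2 = np.abs(sum([amplitude(p) for p in phases])) ** 2

assert np.allclose(intensity, intensity2)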
 
Carlos Ribeiro

I am simulating diffraction from an array of particles. I have to calculate
a complex array for each particle, add up all these arrays, and square the
magnitude of the result. If I do a for loop, it takes about 6-8 seconds for
a 2000 element array added up over 250 particles. In reality, I will have
2500 particles, or even 2500x2500 particles.

Can't you just sum them inplace as you calculate every new array, for
each particle? Something like this (pseudo code):

result = make_empty_array()
for particle in particles:
    result += make_complex_array(particle)
return square(result)

It does not grow the result array indefinitely.... so it solves your
problem, if that's what you need/want to avoid.

--
Carlos Ribeiro
Consultoria em Projetos
blog: http://rascunhosrotos.blogspot.com
blog: http://pythonnotes.blogspot.com
mail: (e-mail address removed)
mail: (e-mail address removed)
 
Darren Dale

Carlos said:
Can't you just sum them inplace as you calculate every new array, for
each particle? Something like this (pseudo code):

result = make_empty_array()
for particle in particles:
    result += make_complex_array(particle)
return square(result)

It does not grow the result array indefinitely.... so it solves your
problem, if that's what you need/want to avoid.

Your example is what I have now; it's what I want to replace.

I am trying to make the operations general, so I can do 1x1, 1xN, and NxN
result arrays. In the case of NxN, the overhead in the for loop can be
small compared to the time required for the array operations. But for a 1x1
result array, the for loop is a huge time sink. I thought it would be nice
to get the speedup of the list comprehension, but see no way to change the
array in place. The list comprehension wants to create a new list instead.

At any rate, I found another way to frame my problem to get the speedup.
It's entirely based on linear algebra operations, no ingenious coding
involved, so I won't share the details. They are pretty mundane.
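
Darren doesn't give the details of his reformulation, but one common way to
turn this kind of per-particle sum into a single linear-algebra operation is
sketched below; the 1-D geometry, the names, and the use of NumPy are
assumptions for illustration, not his code.

import numpy as np

q = np.linspace(0.0, 10.0, 2000)   # scattering grid (made up)
r = np.random.random(250)          # particle positions (made up, 1-D for brevity)

# Loop formulation: accumulate exp(1j*q*r_j) one particle at a time.
loop_total = np.zeros(q.shape, dtype=complex)
for rj in r:
    loop_total += np.exp(1j * q * rj)

# Linear-algebra formulation: build the 2000 x 250 phase matrix once and let a
# single matrix-vector product do the sum over particles.
matrix_total = np.exp(1j * np.outer(q, r)) @ np.ones(len(r))

assert np.allclose(loop_total, matrix_total)
intensity = np.abs(matrix_total) ** 2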
 
Josiah Carlson

OK. As usual, I am having trouble clearly expressing myself. Sorry about
that. Prepare for some physics:

I am simulating diffraction from an array of particles. I have to calculate
a complex array for each particle, add up all these arrays, and square the
magnitude of the result. If I do a for loop, it takes about 6-8 seconds for
a 2000 element array added up over 250 particles. In reality, I will have
2500 particles, or even 2500x2500 particles.

The list comprehension takes only 1.5 seconds for 250 particles. Already,
that means the time has decreased from 40 hours to 10, and that time can be
reduced further if Python is not constantly requesting additional memory
to grow the resulting list.

I guess that means I would like to avoid growing the list and popping
the previous result if possible, and just overwrite the previous result.

Download Scientific Python: http://www.scipy.org/
And get your linear algebra on. Using Numeric arrays, the size of the
matrices (because they are matrices) is much smaller than the size of
an equivalent Python list of lists, so memory may stop being a concern.

- Josiah
 
Mustafa Demirhan

Why not just use while loops instead of for loops? You don't have to
create a new array each time you want a loop - you can simply use an
index integer.

i = 0
while i < 5000000:
    res[0] = res[0] + i
    i = i + 1

Takes less than 2 seconds on my laptop.
 
Alex Martelli

Mustafa Demirhan said:
Why not just use while loops instead of for loops? You don't have to
create a new array each time you want a loop - you can simply use an
index integer.

i = 0
while i < 5000000:
    res[0] = res[0] + i
    i = i + 1

Takes less than 2 seconds on my laptop.

Sure, this is fine, but low-level twiddling with indices isn't all that
nice. A compact alternative such as

res[0] = sum(xrange(5000000))

is, IMHO, preferable to your while loop, not so much because it may be
faster, but because it expresses a single design idea ("let's sum the
first 5 million nonnegative integers") very directly, rather than
getting into the low-level implementation details of _how_ we generate
those integers one after the other, and how we sum them up ditto.


Alex
 
Bengt Richter

Mustafa Demirhan said:
Why not just use while loops instead of for loops? You don't have to
create a new array each time you want a loop - you can simply use an
index integer.

i = 0
while i < 5000000:
    res[0] = res[0] + i
    i = i + 1

Takes less than 2 seconds on my laptop.

Sure, this is fine, but low-level twiddling with indices isn't all that
nice. A compact alternative such as

res[0] = sum(xrange(5000000))

is, IMHO, preferable to your while loop, not so much because it may be
faster, but because it expresses a single design idea ("let's sum the
first 5 million nonnegative integers") very directly, rather than
getting into the low-level implementation details of _how_ we generate
those integers one after the other, and how we sum them up ditto.
As I'm sure you know, sum(xrange(n)) is pretty predictable:

>>> for n in range(10) + [5000000]:
...     print n, sum(xrange(n)), n*(n-1)/2
...
0 0 0
1 0 0
2 1 1
3 3 3
4 6 6
5 10 10
6 15 15
7 21 21
8 28 28
9 36 36
5000000 12499997500000 12499997500000

Guess where the time and space was consumed ;-)

Regards,
Bengt Richter
 
Alex Martelli

Bengt Richter said:
i = 0
while i < 5000000:
    res[0] = res[0] + i
    i = i + 1

Takes less than 2 seconds on my laptop.

Sure, this is fine, but low-level twiddling with indices isn't all that
nice. A compact alternative such as

res[0] = sum(xrange(5000000))
...
As I'm sure you know, sum(xrange(n)) is pretty predictable:

Of course (Gauss is said to have proven that theorem in primary school,
to solve just such a summation which the teacher had posed to the class
to get -- the teacher hoped -- some longer time of quiet;-).

If I recall correctly, you can generally find closed-form solutions for
summations of polynomials in i. But I had assumed that the poster meant
his code to stand for the summation of some generic nonpolynomial form
in i, and just wanted to contrast his lower-level approach with a
higher-level and more concise one -- essentially teaching Python, not
number theory;-).


Alex
 
Terry Reedy

Alex Martelli said:
If I recall correctly, you can generally find closed-form solutions for
summations of polynomials in i.

Yes, as with integrals, the sum of an nth degree poly is (n+1)th degree.
The n+2 coefficients for the sum from 0 to k can be determined by actually
summing the poly for each of 0 to n+1 and equating the n+2 partial sums to
the result poly (with powers evaluated) to get n+2 equations in n+2
unknowns. Even with integral coefficients in the poly to be summed, the
coefficients are generally non-integral rationals and can get pretty nasty
to calculate, so for exact results, it may well be easier and faster to
write and run a program.

Terry J. Reedy
 
Josiah Carlson

Yes, as with integrals, the sum of an nth degree poly is (n+1)th degree.
The n+2 coefficients for the sum from 0 to k can be determined by actually
summing the poly for each of 0 to n+1 and equating the n+2 partial sums to
the result poly (with powers evaluated) to get n+2 equations in n+2
unknowns. Even with integral coefficients in the poly to be summed, the
coefficients are generally non-integral rationals and can get pretty nasty
to calculate, so for exact results, it may well be easier and faster to
write and run a program.


As long as you can solve for a (k+1)-degree polynomial with linear algebra
(via Gaussian elimination, etc.), you can find:
sum([i**k for i in xrange(1, n+1)])
... with little issue (if you know the trick).

- Josiah
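
A minimal sketch of the trick Terry and Josiah describe: write down k+2 sample
points of the running sum and let a linear solver recover the coefficients.
NumPy is an assumption here; any linear-algebra routine would do. For k = 2
the closed form of sum(i**2) comes out as (2*n**3 + 3*n**2 + n)/6.

import numpy as np

# Unknowns: the four coefficients of q(n) = a3*n**3 + a2*n**2 + a1*n + a0,
# where q(n) == sum(i**2 for i in range(1, n + 1)). Four samples pin them down.
ns = np.arange(4)                                   # n = 0, 1, 2, 3
rhs = np.array([float(sum(i**2 for i in range(1, n + 1))) for n in ns])
vandermonde = np.vander(ns, 4).astype(float)        # rows: [n**3, n**2, n, 1]
coeffs = np.linalg.solve(vandermonde, rhs)
print(coeffs)                                       # approx [0.3333, 0.5, 0.1667, 0.0]

# Check the closed form at a point that was not used in the fit.
n = 1000
assert round(np.polyval(coeffs, n)) == sum(i**2 for i in range(1, n + 1))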
 
Bengt Richter

Yes, as with integrals, the sum of an nth degree poly is (n+1)th degree.
The n+2 coefficients for the sum from 0 to k can be determined by actually
summing the poly for each of 0 to n+1 and equating the n+2 partial sums to
the result poly (with powers evaluated) to get n+2 equations in n+2
unknowns. Even with integral coefficients in the poly to be summed, the
coefficients are generally non-integral rationals and can get pretty nasty
to calculate, so for exact results, it may well be easier and faster to
write and run a program.
Got curious, so I automated creating a polynomial based on an integer series:
>>> from coeffs import getlambda
>>> getlambda([0,1,2,3])
'lambda x: x'
>>> getlambda([1,2,3])
'lambda x: x +1'
>>> L = [sum(xrange(i)) for i in xrange(1,10)]
>>> L
[0, 1, 3, 6, 10, 15, 21, 28, 36]
>>> getlambda(L)
'lambda x: (x**2 +x)/2'
>>> f = eval(getlambda(L))
>>> [f(x) for x in xrange(9)]
[0, 1, 3, 6, 10, 15, 21, 28, 36]
>>> L2 = [sum([i**2 for i in xrange(j)]) for j in xrange(1,10)]
>>> L2
[0, 1, 5, 14, 30, 55, 91, 140, 204]
>>> f2 = eval(getlambda(L2))
>>> [f2(x) for x in xrange(9)]
[0, 1, 5, 14, 30, 55, 91, 140, 204]
>>> getlambda(L2)
'lambda x: (2*x**3 +3*x**2 +x)/6'
>>>

Now we'll try an arbitrary series:
>>> f3 = eval(getlambda([3,1,1,0,3,5]))
>>> [f3(x) for x in xrange(6)]
[3, 1, 1, 0, 3, 5]
>>> getlambda([3,1,1,0,3,5])
'lambda x: (-1080*x**5 +13200*x**4 -55800*x**3 +98400*x**2 -69120*x +21600)/7200'

Maybe I'll clean it up some time ;-)
It's a simultaneous equation solver using my exact decimal/rational class for math,
and then formatting the lambda expression for the nonzero polynomial coefficients.
Not very tested, but seems to work.

Regards,
Bengt Richter
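
Bengt's coeffs module isn't shown, so here is a rough, hypothetical
reconstruction of the idea: an exact polynomial fit through the series using
the standard-library fractions module and Gauss-Jordan elimination, rendered
as a lambda string. The names and the output formatting are made up; this is
not his code.

from fractions import Fraction

def fit_exact_poly(series):
    # Fit the unique degree-(n-1) polynomial through (0, series[0]),
    # (1, series[1]), ... using exact rational arithmetic. Returns the
    # coefficients c[0..n-1], where c[k] multiplies x**k.
    n = len(series)
    # Augmented Vandermonde system [x**0 ... x**(n-1) | y], reduced by
    # Gauss-Jordan elimination over Fractions so nothing gets rounded.
    rows = [[Fraction(x) ** k for k in range(n)] + [Fraction(y)]
            for x, y in enumerate(series)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [a / rows[col][col] for a in rows[col]]
        for r in range(n):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return [rows[k][n] for k in range(n)]

def poly_str(coeffs):
    # Render the coefficients as a lambda string, highest power first.
    terms = ["(%s)*x**%d" % (c, k) for k, c in enumerate(coeffs) if c != 0]
    return "lambda x: " + (" + ".join(reversed(terms)) or "0")

# The triangular numbers should come back as x**2/2 + x/2, in Fraction form:
print(poly_str(fit_exact_poly([0, 1, 3, 6, 10, 15])))

# The arbitrary series from the post, checked by exact evaluation:
coeffs = fit_exact_poly([3, 1, 1, 0, 3, 5])
print([sum(c * x ** k for k, c in enumerate(coeffs)) for x in range(6)])
# -> [Fraction(3, 1), Fraction(1, 1), Fraction(1, 1), Fraction(0, 1), Fraction(3, 1), Fraction(5, 1)]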
 
