array manipulation without for loops

Sheldon

Hi,

I have two arrays that are of the same dimension but having 3 different
values: 255, 1 or 2.
I would like to set all the positions in both arrays having 255 to be
equal, i.e., where one array has 255, I set the same elements in the
other array to 255 and vice versa. Does anyone know how to do this
without using for loops?

Sincerely,
Sheldon
 
Gary Herron

Sheldon said:
Hi,

I have two arrays that are of the same dimension but having 3 different
values: 255, 1 or 2.
I would like to set all the positions in both arrays having 255 to be
equal, i.e., where one array has 255, I set the same elements in the
other array to 255 and vice versa. Does anyone know how to do this
without using for loops?

Sincerely,
Sheldon

Whatever for? Have you got something against for loops?

However...

You could roll your own loop:

i = 0
while i < whatever:
    # ... do something with i
    i += 1

But what's the point? This does the same as a for loop but slower.

If you don't want any kind of a loop (again, What's the point?) you
could write something recursive:

def proc(i, ...):
    # ... do something with i
    if i < whatever:
        proc(i+1, ...)

But this would be even slower.

Gary Herron
 
Sheldon

Hi Gary,

I am really trying to cut the time down as I have 600+ arrays with
dimensions (1215,1215) to compare and I do a lot more things with the
arrays. If I understand you correctly, there is no way around a for
loop?


/Sheldon
 
Alex Martelli

Sheldon said:
I have two arrays that are of the same dimension but having 3 different
values: 255, 1 or 2.
I would like to set all the positions in both arrays having 255 to be
equal, i.e., where one array has 255, I set the same elements in the
other array to 255 and vice versa. Does anyone know how to do this
without using for loops?

Python's Numeric extension package (still available, but not actively
developed any more) and its successors (numarray, and the even newer
numpy) are replete with such functionality -- indeed, to really use
arrays in Python you should get any one of these packages (Python offers
arrays only in the very limited incarnation of the standard library
module named array -- better than nothing, but little built-in
functionality, while those packages have plenty).

Of course, if you say "array" when actually you mean "list", the
situation is completely different (it may be worth turning lists into
Numeric arrays for manipulation, then back when you're done, if speed is
absolutely of the essence and you desperately need lists as inputs and
outputs but do a lot of manipulation in between).


Alex
 
Alex Martelli

Sheldon said:
Hi Gary,

I am really trying to cut the time down as I have 600+ arrays with
dimensions (1215,1215) to compare and I do a lot more things with the
arrays. If I understand you correctly, there is no way around a for
loop?

In pure Python (w/o extension packages) there are no 2-D arrays; so
either you're using lists of lists (and I wonder how you fit even one of
them in memory, if they're 1215 by 1215, much less 600!) or you're
already using some extension (Numeric, numarray, numpy) and aren't
telling us which one. If you're using pure Python, add your extension of
choice; if you're using an extension already, tell us which one. In
either case there will be ways to perform your manipulation tasks faster
than Python-level for loops would afford.
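
To put rough numbers on the memory remark, here is a back-of-the-envelope sketch. It assumes each element is stored as an 8-byte double-precision float, which is an assumption about the storage format; a list of lists of Python float objects would cost noticeably more per element.

```python
# Rough memory estimate for the arrays discussed in this thread.
ROWS, COLS = 1215, 1215
BYTES_PER_FLOAT = 8   # assumption: one double-precision float per element
N_ARRAYS = 600        # "600+ arrays" from the question

elements = ROWS * COLS
bytes_per_array = elements * BYTES_PER_FLOAT
total_bytes = bytes_per_array * N_ARRAYS

print(elements)                           # 1476225
print(round(bytes_per_array / 2**20, 1))  # size of one array in MiB
print(round(total_bytes / 2**30, 1))      # size of all arrays in GiB
```

Under that assumption one array is on the order of 11 MiB and 600 of them run to several GiB, so processing them a pair at a time rather than all at once is the practical route.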


Alex
 
Gary Herron

Sheldon said:
Hi Gary,

I am really trying to cut the time down as I have 600+ arrays with
dimensions (1215,1215) to compare and I do a lot more things with the
arrays. If I understand you correctly, there is no way around a for
loop?

Well, no. I gave you two alternatives to for loops. But once we learn
that your motivation is speed on large arrays, then, by all means, go
with Alex's suggestion. Use numpy (or one of its earlier incarnations).
See: http://numeric.scipy.org/

This is a HIGHLY efficient implementation of arrays for Python. It
provides a number of very general operations that can be performed
across arrays.

Good luck
Gary Herron
 
Sheldon

Alex,

I am using Numeric and have created 3 arrays: zero((1215,1215),Float)
Two arrays are compared and one is used to hold the mean difference
between the two compared arrays. Then I compare 290 or 340 pairs of
arrays. I know that memory is a problem and that is why I don't open
all of these arrays at the same time. I cannot install Numpy due to my
working conditions. Sorry, I should have made it clear that it was
Numeric I was working with.

/Sheldon
 
Tim Chase

I have two arrays that are of the same dimension but having 3 different
values: 255, 1 or 2.
I would like to set all the positions in both arrays having 255 to be
equal, i.e., where one array has 255, I set the same elements in the
other array to 255 and vice versa. Does anyone know how to do this
without using for loops?

>>> # make some sample data
>>> c = 20
>>> import random, itertools
>>> a1 = [(1,2,255)[random.randint(0,2)] for x in xrange(c)]
>>> a2 = [(1,2,255)[random.randint(0,2)] for x in xrange(c)]
>>>
>>> # actually do the work
>>> all = [(x==255 or y==255) and (255, 255) or (x,y) for (x,y) in itertools.izip(a1,a2)]
>>> b1 = [x[0] for x in all]
>>> b2 = [x[1] for x in all]
>>> a1, a2 = b1, b2 # if you want them to replace the originals

Seems to do what I understand that you're describing using "no"
loops (other than those implied by list comprehension).

There may be some nice pythonic way to "unzip" a list of tuples
created by zip() but I couldn't scare up such a method, and
searching for "unzip" turned up a blizzard of hits for dealing
with ZIP files, not for unzipping that which was previously
zip()'ed. My Google-fu must be broken. :) Otherwise, you could
just do

>>> b1,b2 = magic_unzip([(x==255 or y==255) and (255, 255) or
...     (x,y) for (x,y) in itertools.izip(a1,a2)])

or

>>> a1,a2 = magic_unzip([(x==255 or y==255) and (255, 255) or
...     (x,y) for (x,y) in itertools.izip(a1,a2)])

if you just want to dispose of your originals.

-tkc
 
Alex Martelli

Sheldon said:
Alex,

I am using Numeric and have created 3 arrays: zero((1215,1215),Float)
Two arrays are compared and one is used to hold the mean difference
between the two compared arrays. Then I compare 290 or 340 pairs of
arrays. I know that memory is a problem and that is why I don't open
all of these arrays at the same time. I cannot install Numpy due to my
working conditions. Sorry, I should have made it clear that it was
Numeric I was working with.

It's OK, even if the hard-core numeric-python people are all
evangelizing for migration to numpy (for reasons that are of course
quite defensible!), I think it's quite OK to stick with good old Numeric
for the moment (and that's exactly what I do for my own personal use!).

So, anyway, I'll assume you mean your 1215 x 1215 arrays were created by
calling Numeric.zeros, not "zero" (with no trailing s), which is a name
that does not exist in Numeric.

Looking back to your original post, let's say that you have two such
arrays, a and b, both 1215x1215 and of Numeric.Float type, and the
entries of each array are all worth 1, 2, or 255 (that's how I read your
original post; if that's not the case, please specify). We want to
write a function that alters both a and b, specifically setting to 255
all entries in each array whose corresponding entries are 255 in the
other array.

Now that's pretty easy -- for example:

import Numeric

def equalize(a, b, v=255):
    # First pass: copy b's v positions into a, so a now holds v wherever
    # either array had it. Second pass: copy that full set into b.
    Numeric.putmask(a, b==v, v)
    Numeric.putmask(b, a==v, v)

if __name__ == '__main__':
    a = Numeric.zeros((5,5), Numeric.Float)
    b = Numeric.zeros((5,5), Numeric.Float)
    a[1,2]=a[2,1]=b[3,4]=b[0,2]=255
    a[3,0]=a[0,0]=1
    b[0,3]=b[4,4]=2
    print "Before:"
    print a
    print b
    equalize(a, b)
    print "After:"
    print a
    print b


brain:~/pynut alex$ python ab.py
Before:
[[   1.    0.    0.    0.    0.]
 [   0.    0.  255.    0.    0.]
 [   0.  255.    0.    0.    0.]
 [   1.    0.    0.    0.    0.]
 [   0.    0.    0.    0.    0.]]
[[   0.    0.  255.    2.    0.]
 [   0.    0.    0.    0.    0.]
 [   0.    0.    0.    0.    0.]
 [   0.    0.    0.    0.  255.]
 [   0.    0.    0.    0.    2.]]
After:
[[   1.    0.  255.    0.    0.]
 [   0.    0.  255.    0.    0.]
 [   0.  255.    0.    0.    0.]
 [   1.    0.    0.    0.  255.]
 [   0.    0.    0.    0.    0.]]
[[   0.    0.  255.    2.    0.]
 [   0.    0.  255.    0.    0.]
 [   0.  255.    0.    0.    0.]
 [   0.    0.    0.    0.  255.]
 [   0.    0.    0.    0.    2.]]
brain:~/pynut alex$

Of course I'm using tiny arrays here, for speed of running and ease of
display and eyeball-checking, but everything should work just as well in
your case. Care to check and let us know?
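
To see why two masked assignments are enough, here is a small pure-Python sketch of the same two-pass logic on flat lists. The helper names `put_where` and `equalize_lists` are made up for illustration, and the explicit loops are there only to spell out what each pass does; this is not meant as a fast replacement for the Numeric version.

```python
def put_where(target, mask, value):
    """Set target[i] = value wherever mask[i] is true (modifies target in place)."""
    for i, flag in enumerate(mask):
        if flag:
            target[i] = value

def equalize_lists(a, b, v=255):
    # Pass 1: a receives v wherever b has it, so a now carries the
    # union of both arrays' v positions.
    put_where(a, [y == v for y in b], v)
    # Pass 2: b receives v wherever a has it, i.e. the same union.
    put_where(b, [x == v for x in a], v)

a = [1, 255, 2, 1]
b = [255, 2, 2, 1]
equalize_lists(a, b)
print(a)  # [255, 255, 2, 1]
print(b)  # [255, 255, 2, 1]
```

The second mask is computed after the first pass has already modified `a`, which is why no separate "either one is 255" mask has to be built.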

Numeric has pretty good documentation (numpy's is probably even better,
but it is not available for free, so I don't know!), and if you don't
find that documentation sufficient you might want to have a look at my
book "Python in a Nutshell" which devotes a chapter to Numeric (it also
is not available for free, but you can get a subscription to O'Reilly's
Safari online-books repository, which is free for the first two weeks,
and lets you look at many books including Python in a Nutshell -- if you
don't want to pay monthly subscription fees, make sure you cancel your
trial subscription before two weeks have passed!!!).

I strongly recommend that, in some way or other, you DO get a taste of
the huge amount of functionality that Numeric provides for you -- with
the size of computational tasks you're talking about, an investment of
2-3 hours spent becoming deeply familiar with everything Numeric offers
may well repay itself in savings of ten times as much execution time,
and what other investments offer such ROI as 1000%?-)


Alex
 
Alex Martelli

Tim Chase said:
all = [(x==255 or y==255) and (255, 255) or (x,y) for (x,y) in itertools.izip(a1,a2)]
b1 = [x[0] for x in all]
b2 = [x[1] for x in all]
a1, a2 = b1, b2 # if you want them to replace the originals

Seems to do what I understand that you're describing using "no"
loops (other than those implied by list comprehension).

Yep, but the performance cost of the for-loops in the comprehension is
essentially the same as for such loops written "normally".

Tim Chase said:
There may be some nice pythonic way to "unzip" a list of tuples
created by zip() but I couldn't scare up such a method, and

Perhaps what you have in mind is:
>>> a = zip('feep', 'grol')
>>> a
[('f', 'g'), ('e', 'r'), ('e', 'o'), ('p', 'l')]
>>> zip(*a)
[('f', 'e', 'e', 'p'), ('g', 'r', 'o', 'l')]


But this wouldn't help the OP all that much with his performance
problems with large 2-D arrays (though it required some guessing to
gauge that it _was_ Numeric arrays that he was dealing with;-).
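
For readers on Python 3, where zip() returns an iterator, the zip(*...) idiom can be combined with Tim's comprehension roughly as follows. This is only a sketch for plain lists; the sample data and the names `pairs`, `b1` and `b2` are illustrative, and it still loops at Python level.

```python
a1 = [1, 255, 2, 1]
a2 = [255, 2, 2, 1]

# Build the equalized pairs, as in Tim's comprehension.
pairs = [(255, 255) if 255 in (x, y) else (x, y) for x, y in zip(a1, a2)]

# "Unzip": zip(*pairs) regroups the first items and the second items.
b1, b2 = (list(column) for column in zip(*pairs))

print(b1)  # [255, 255, 2, 1]
print(b2)  # [255, 255, 2, 1]
```

Note that the two-name unpacking fails on empty input, because zip(*[]) yields nothing to unpack.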


Alex
 
Sheldon

Hi Alex,

I will code this in a little while and get back to you. Terrific! I saw
this function but I skipped over it without realizing what it could do.

The Numeric doc is not very good and I am just getting into Python so
your book sounds great especially since it covers Numeric. I will look
into it when I get back to work tomorrow.

Bye for now,
Sheldon

 
Sheldon

The following script (using your function) raised no exception so it
worked! Elegant Alex, thanx.

res = equalize_arrays(msgtmp,ppstmp,255) # class
(ppstmp,msgtmp) = res.equalize() # class method
for i in range(int(main.xsize)):
    for j in range(int(main.ysize)):
        if msgtmp[i,j] == 255 and ppstmp[i,j] != 255:
            raise "equalize error!"
        if ppstmp[i,j] == 255 and msgtmp[i,j] != 255:
            raise "equalize error!"

I read up on the putmask function and I don't understand this part:

>>> print x
[10 1 30 3 50]
>>> putmask(x, [1,0,1,0,1], [-1,-2])
>>> print x
[-1 1 -1 3 -1]

Can you explain why the -2 didn't factor in?

/Sheldon


 
Alex Martelli

Sheldon said:
The following script (using your function) raised no exception so it
worked! Elegant Alex, thanx.

res = equalize_arrays(msgtmp,ppstmp,255) # class
(ppstmp,msgtmp) = res.equalize() # class method
for i in range(int(main.xsize)):
    for j in range(int(main.ysize)):
        if msgtmp[i,j] == 255 and ppstmp[i,j] != 255:
            raise "equalize error!"
        if ppstmp[i,j] == 255 and msgtmp[i,j] != 255:
            raise "equalize error!"

I read up on the putmask function and I don't understand this part:

>>> print x
[10 1 30 3 50]
>>> putmask(x, [1,0,1,0,1], [-1,-2])
>>> print x
[-1 1 -1 3 -1]

Can you explain why the -2 didn't factor in?

Because it always lands in places where the mask is 0, of course --
the third argument gets conceptually "repeated" to match the length of
the mask, giving [-1, -2, -1, -2, -1] -- and the -2 entries always occur
where the mask is 0, so they don't matter. Exactly as I would expect
from the documentation:

    putmask(a, mask, v) results in a = v for all places mask is true.
    If v is shorter than mask it will be repeated as necessary.
    In particular v can be a scalar or length 1 array.

and I just can't see where you might have formed any different
expectations from this documentation. Use a different mask, say
[1,0,0,1,1] -- and the -2 in 4th place will be set into x, just like the
-1 occurrences at the start and end.
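
The repetition rule can be spelled out with a small pure-Python emulation. This is only a sketch of the documented behaviour quoted above, not Numeric's implementation, and the function name `putmask_like` is made up for illustration.

```python
from itertools import cycle, islice

def putmask_like(target, mask, values):
    """Emulate the documented rule: values is repeated to the mask's length,
    then copied into target wherever the mask is true (in place)."""
    if not isinstance(values, (list, tuple)):
        values = [values]  # a scalar behaves like a length-1 sequence
    repeated = list(islice(cycle(values), len(mask)))
    for i, flag in enumerate(mask):
        if flag:
            target[i] = repeated[i]
    return repeated

x = [10, 1, 30, 3, 50]
print(putmask_like(x, [1, 0, 1, 0, 1], [-1, -2]))  # [-1, -2, -1, -2, -1]
print(x)                                           # [-1, 1, -1, 3, -1]

y = [10, 1, 30, 3, 50]
putmask_like(y, [1, 0, 0, 1, 1], [-1, -2])
print(y)                                           # [-1, 1, 30, -2, -1]
```

With the first mask every -2 in the repeated sequence lines up with a 0 in the mask; with the second mask the -2 at index 3 lines up with a 1 and is written.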

Similarly, say:

>>> Numeric.compress([1,0,1,0,1], [-1, -2]*3)
array([-1, -1, -1])

even though here we have to explicitly use the "*3" part for repetition
since compress, differently from putmask, doesn't implicitly repeat the
last argument, the idea is similar: pick only elements corresponding to
a true value in the mask argument.

If what you want is to put -1 where the first 1 in the mask occurs, -2
where the 2nd 1 in the mask occurs, and so forth, you need some
auxiliary manipulation of the indices to prepare the proper "values"
array, for example:

import Numeric

class SequenceRepeater(object):
    def __init__(self, seq, thelen):
        self.seq = seq
        self.len = thelen
    def __len__(self):
        return self.len
    def __getitem__(self, i):
        if i<0: i += self.len
        return self.seq[i % len(self.seq)]

def strangeput(anarray, amask, somevalues):
    repeater = SequenceRepeater(somevalues, len(amask))
    somevalues = Numeric.take(repeater, Numeric.cumsum(amask)-1)
    Numeric.putmask(anarray, amask, somevalues)

if __name__ == '__main__':
    x = Numeric.zeros(5)
    strangeput(x, [1, 0, 1, 0, 1], [-1, -2])
    print x


brain:~/pynut alex$ python pr.py
[-1 0 -2 0 -1]


There may be simpler and faster approaches for this, of course, but I
had this SequenceRepeater auxiliary class in my "mixed bag of useful
stuff" so I just copied-and-pasted a solution based on it!-)
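
The index trick behind strangeput (a running count of the mask, minus one, tells each true position which value to take) can be traced in pure Python. This is a sketch of the idea only, using itertools.accumulate in place of Numeric.cumsum; the name `strangeput_like` is made up for illustration.

```python
from itertools import accumulate

def strangeput_like(target, mask, values):
    # Running count of true flags, minus one: for each position, the index
    # of the most recent true flag (0 for the first, 1 for the second, ...).
    order = [count - 1 for count in accumulate(mask)]
    # Pick values by that index, wrapping around when values is short.
    picked = [values[k % len(values)] for k in order]
    # Copy the picked values into target only where the mask is true.
    for i, flag in enumerate(mask):
        if flag:
            target[i] = picked[i]
    return order, picked

x = [0, 0, 0, 0, 0]
order, picked = strangeput_like(x, [1, 0, 1, 0, 1], [-1, -2])
print(order)   # [0, 0, 1, 1, 2]
print(picked)  # [-1, -1, -2, -2, -1]
print(x)       # [-1, 0, -2, 0, -1]
```

The entries of `picked` at positions where the mask is 0 are never used, so it does not matter what they contain.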


Alex
 
