speed up a numpy code with huge array

A

Alexzive

Hello Pythonguys!

is there a way to improve the performance of the attached code ? it
takes about 5 h on a dual-core (using only one core) when len(V)
~1MIL. V is an array which is supposed to store all the volumes of
tetrahedral elements of a grid whose coord. are stored in NN (accessed
trough the list of tetraelements --> EL)


Thanks in advance!
Alex

####
print 'start ' + nameodb
#path = '/windows/D/SIM-MM/3D/E_ortho/' + nameodb + '.odb'
path = pt + nameodb + '.odb'
odb = openOdb(path)

N = odb.rootAssembly.instances['PART-1-1'].nodes
if loadV==1:
pathV=pt+vtet
V=numpy.loadtxt(pathV)
VTOT = V[0]
L3 = V[1]
print 'using ' + vtet
else:
NN=[]
B=[0,0,0,0]
for i in range(len(N)):
B[0] = N.label
B[1] = N.coordinates[0]
B[2] = N.coordinates[1]
B[3] = N.coordinates[2]
NN = append(NN,B)

NN=NN.reshape(-1,4)
EL = odb.rootAssembly.instances['PART-1-1'].elements

L1 = max(NN[:,1])-min(NN[:,1])
L2 = max(NN[:,2])-min(NN[:,2])
L3 = max(NN[:,3])-min(NN[:,3])
VTOT=L1*L2*L3
print 'VTOT: [mm³]' + str(VTOT)

V = array([])

print 'calculating new Vtet '
V = range(len(EL)+2)
V[0] = VTOT
V[1] = L3
for j in range(0,len(EL)):
Va = EL[j].connectivity[0]
Vb = EL[j].connectivity[1]
Vc = EL[j].connectivity[2]
Vd = EL[j].connectivity[3]
ix = where(NN[:,0] == Va)
Xa = NN[ix,1][0][0]
Ya = NN[ix,2][0][0]
Za = NN[ix,3][0][0]
ix = where(NN[:,0] == Vb)
Xb = NN[ix,1][0][0]
Yb = NN[ix,2][0][0]
Zb = NN[ix,3][0][0]
ix = where(NN[:,0] == Vc)
Xc = NN[ix,1][0][0]
Yc = NN[ix,2][0][0]
Zc = NN[ix,3][0][0]
ix = where(NN[:,0] == Vd)
Xd = NN[ix,1][0][0]
Yd = NN[ix,2][0][0]
Zd = NN[ix,3][0][0]
a = [Xa,Ya,Za]
b = [Xb,Yb,Zb]
c = [Xc,Yc,Zc]
d = [Xd,Yd,Zd]
aa = numpy.diff([b,a],axis=0)[0]
bb = numpy.diff([c,b],axis=0)[0]
cc = numpy.diff([d,c],axis=0)[0]
D=array([aa,bb,cc])
det=numpy.linalg.det(D)
V[j+2] = abs(det)/6
pathV = pt + vtet
savetxt(pathV, V, fmt='%.3e')
###
 
D

draco parvus

Hello Pythonguys!

is there a way to improve the performance of the attached code ? it
takes about 5 h on a dual-core (using only one core) when len(V)
~1MIL. V is an array which is supposed to store all the volumes of
tetrahedral elements of a grid whose coord. are stored in NN (accessed
trough the list of tetraelements --> EL)

Thanks in advance!
Alex

####
print 'start ' + nameodb
#path = '/windows/D/SIM-MM/3D/E_ortho/' + nameodb + '.odb'
path = pt + nameodb + '.odb'
odb = openOdb(path)

N = odb.rootAssembly.instances['PART-1-1'].nodes
if loadV==1:
  pathV=pt+vtet
  V=numpy.loadtxt(pathV)
  VTOT = V[0]
  L3 = V[1]
  print 'using ' + vtet
else:
  NN=[]
  B=[0,0,0,0]
  for i in range(len(N)):
        B[0] = N.label
        B[1] = N.coordinates[0]
        B[2] = N.coordinates[1]
        B[3] = N.coordinates[2]
        NN = append(NN,B)

  NN=NN.reshape(-1,4)
  EL = odb.rootAssembly.instances['PART-1-1'].elements

  L1 = max(NN[:,1])-min(NN[:,1])
  L2 = max(NN[:,2])-min(NN[:,2])
  L3 = max(NN[:,3])-min(NN[:,3])
  VTOT=L1*L2*L3
  print 'VTOT: [mm³]' + str(VTOT)

  V = array([])

  print 'calculating new Vtet '
  V = range(len(EL)+2)
  V[0] = VTOT
  V[1] = L3
  for j in range(0,len(EL)):
        Va = EL[j].connectivity[0]
        Vb = EL[j].connectivity[1]
        Vc = EL[j].connectivity[2]
        Vd = EL[j].connectivity[3]
        ix = where(NN[:,0] == Va)
        Xa = NN[ix,1][0][0]
        Ya = NN[ix,2][0][0]
        Za = NN[ix,3][0][0]
        ix = where(NN[:,0] == Vb)
        Xb = NN[ix,1][0][0]
        Yb = NN[ix,2][0][0]
        Zb = NN[ix,3][0][0]
        ix = where(NN[:,0] == Vc)
        Xc = NN[ix,1][0][0]
        Yc = NN[ix,2][0][0]
        Zc = NN[ix,3][0][0]
        ix = where(NN[:,0] == Vd)
        Xd = NN[ix,1][0][0]
        Yd = NN[ix,2][0][0]
        Zd = NN[ix,3][0][0]
        a =  [Xa,Ya,Za]
        b =  [Xb,Yb,Zb]
        c =  [Xc,Yc,Zc]
        d =  [Xd,Yd,Zd]
        aa = numpy.diff([b,a],axis=0)[0]
        bb = numpy.diff([c,b],axis=0)[0]
        cc = numpy.diff([d,c],axis=0)[0]
        D=array([aa,bb,cc])
        det=numpy.linalg.det(D)
        V[j+2] = abs(det)/6
  pathV = pt + vtet
  savetxt(pathV, V, fmt='%.3e')
###


Main problem you've got is quadratic behaviour. For each vertex of
each of your million tets, you go through the entire node list to find
its coordinates. You should use a dict instead, such as:

allnodes = {}
for node in N:
allnodes[node.label] = node.coordinates

And later, instead of using numpy.where, directly use:

Xa, Ya, Za = allnodes[Va] # with Va = EL[j].connectivity[0]
....

Should speed things up a bit. But manipulating odb files is never very
fast.

d.
 
S

Stefan Behnel

Alexzive, 25.05.2010 21:05:
is there a way to improve the performance of the attached code ? it
takes about 5 h on a dual-core (using only one core) when len(V)
~1MIL. V is an array which is supposed to store all the volumes of
tetrahedral elements of a grid whose coord. are stored in NN (accessed
trough the list of tetraelements --> EL)

Consider using Cython for your algorithm. It has direct support for NumPy
arrays and translates to fast C code.

Stefan
 
A

Alexzive

thank you all for the tips.
I 'll try them soon.

I also notice another bottleneck, when python tries to access some
array data stored in the odb files (---> in text below), even before
starting the algoritm:

###
EPS_nodes = range(len(frames))
for f in frames:
.... sum = 0
---> UN = F[f].fieldOutputs['U'].getSubset(region=TOP).values <---
.... EPS_nodes[f] = UN[10].data[Scomp-1]/L3

###

unfortunately I don't have time to learn cython. Using dictionaries
sounds promising.
Thanks!
Alex
 
A

Alexzive

sorry it was just bullshit what I wrote about the second bottleneck,
it seemed to hang up but it was just me forgetting to double-enter
during debugging after "for cycle".

thank you all for the tips.
I 'll try them soon.

I also notice another bottleneck, when python tries to access some
array data stored in the odb files (---> in text below), even before
starting the algoritm:

###
EPS_nodes = range(len(frames))
for f in frames:
...     sum = 0
--->    UN = F[f].fieldOutputs['U'].getSubset(region=TOP).values <---
...     EPS_nodes[f] = UN[10].data[Scomp-1]/L3

###

unfortunately I don't have time to learn cython. Using dictionaries
sounds promising.
Thanks!
Alex

Alexzive, 25.05.2010 21:05:
Consider using Cython for your algorithm. It has direct support for NumPy
arrays and translates to fast C code.
 
B

bobicanprogram

thank you all for the tips.
I 'll try them soon.

I also notice another bottleneck, when python tries to access some
array data stored in the odb files (---> in text below), even before
starting the algoritm:

###
EPS_nodes = range(len(frames))
for f in frames:
... sum = 0
---> UN = F[f].fieldOutputs['U'].getSubset(region=TOP).values <---
... EPS_nodes[f] = UN[10].data[Scomp-1]/L3

###

unfortunately I don't have time to learn cython. Using dictionaries
sounds promising.
Thanks!
Alex

Alexzive, 25.05.2010 21:05:
Consider using Cython for your algorithm. It has direct support for NumPy
arrays and translates to fast C code.


The SIMPL toolkit (http://www.icanprogram.com/06py/lesson1/
lesson1.html) might be a simpler way to offload some processing to
faster C code.

bob
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,743
Messages
2,569,478
Members
44,899
Latest member
RodneyMcAu

Latest Threads

Top