nearest neighbor in 2D

M

Miki Tebeka

Hello John,
Can anyone point me to some code or module that provides the
appropriate data structures and algorithms to handle this task
efficiently? The size of the list will likely be in the range of
10-1000 elements.
boost (http://www.boost.org) has a graph library and a good connection
to Python using Boost.Python.

HTH.
Miki
 
P

Peter Otten

Jim said:
I am new to timeit.py, but this is odd.

jim@grendel:~$ /usr/lib/python2.3/timeit.py -c ' p=1.6+2.5j;np=2.4+1.3j;
d=abs(p-np)' 100000 loops, best of 3: 3.1 usec per loop

The above is actually timed.
jim@grendel:~$ /usr/lib/python2.3/timeit.py -c -s'import math;
p=1.6+2.5j;np=2.4+1.3j; d=abs(p-np)' 10000000 loops, best of 3: 0.141 usec
per loop

Here you put everything in the setup parameter. As the timed statement
defaults to pass, you are essentially timing an empty loop :-(
Is it because the builtin math functions are much slower?

You should have used math.abs() or from math import abs to really get the
abs() function in the math module. But
False

Not there, so we cannot compare.

Peter
 
W

WEINHANDL Herbert

John said:
I have a list of two tuples containing x and y coord

(x0, y0)
(x1, y1)
...
(xn, yn)

Given a new point x,y, I would like to find the point in the list
closest to x,y. I have to do this a lot, in an inner loop, and then I
add each new point x,y to the list. I know the range of x and y in
advance.
Can anyone point me to some code or module that provides the
appropriate data structures and algorithms to handle this task
efficiently? The size of the list will likely be in the range of
10-1000 elements.

The plotting-library dislin (http://www.dislin.com)
contains a really fast triangulation subroutine
(~1 hour for 700000 points on an 1.5 GHz Athlon).

For an example triangulation/nearest-neighbor-search see
the attached python-script.

Dislin is available for many platforms (Linux, Winxxx, ...)
and it can be used for free if you are using free languages like
Python, Perl, etc.

Happy pythoning

Herbert


#!/usr/bin/python

# (c) by H. Weinhandl 04.Nov.2003

import math
import dislin


def dist( ia,ib ) :
return math.sqrt( (X1[ia]-X1[ib])**2 + (Y1[ia]-Y1[ib])**2 )


def find_nearest_neighb() :

print 'extracting neighbours ... ',

for i in range( nr_dat+3 ) :
# initualize list wit the point i itself
neighb.append( )

for i in range( nr_tri ) :
# get a indes-triple, dislin seems to start indices with 1,
# so I'm subtracting 1 to get a zero-based index
U,V,W = I1-1, I2-1, I3-1

# add indices from all triangles, which contain the point itself
if ( U in neighb ) :
if not (V in neighb ) : neighb.append( V ) # do not append V if already in the list
if not (W in neighb ) : neighb.append( W ) # do not append W if already in the list

if ( V in neighb[V] ) :
if not (U in neighb[V] ) : neighb[V].append( U ) # do not append U if already in the list
if not (W in neighb[V] ) : neighb[V].append( W ) # do not append W if already in the list

if ( W in neighb[W] ) :
if not (U in neighb[W] ) : neighb[W].append( U ) # do not append U if already in the list
if not (V in neighb[W] ) : neighb[W].append( V ) # do not append V if already in the list

print ' OK'
for i in range( nr_dat ) :
neighb.remove( i ) # remove the point i itself
neighb.sort()
r_mi = 9.9e9
i_mi = i

# search for the nearest neighbor of point i
for j in neighb :
r = dist( i, j )
if r < r_mi :
r_mi = r
i_mi = j
print 'NB %2d : r_mi=%5.3f, i_mi=%2d '% (i, r_mi, i_mi), neighb




if __name__ == '__main__' :

X1 = [ 1.0, 2.0, 3.0, 4.0, 5.0, 4.0, 3.0 ]
Y1 = [ 1.0, 6.0, 3.0, 3.0, 2.0, 6.0, 0.0 ]
nr_dat = len( X1 )
nr_max = 2 * nr_dat + 1
I1 = [ 0 ] * nr_max
I2 = [ 0 ] * nr_max
I3 = [ 0 ] * nr_max
neighb = [ ]

# padding the 2 input-lists with 3 additional elements is required by dislin
X1.append( 0 )
X1.append( 0 )
X1.append( 0 )
Y1.append( 0 )
Y1.append( 0 )
Y1.append( 0 )


# delaunay triangulation of the input lists
# I1,I2,I3 are the indices of the triangular edge-points
nr_tri = dislin.triang( X1,Y1, nr_dat, I1,I2,I3, nr_max )

print X1
print Y1
print nr_dat, nr_tri, nr_max
print I1
print I2
print I3

find_nearest_neighb()

#---- end ----
 
J

Jim Richardson

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

The above is actually timed.


Here you put everything in the setup parameter. As the timed statement
defaults to pass, you are essentially timing an empty loop :-(


Ah! that explains it, thanks. breaking the import off, gets the loop to
3.2 usec per loop, which makes more sense.
You should have used math.abs() or from math import abs to really get the
abs() function in the math module. But

False

Not there, so we cannot compare.

Actually, I just cut'n'pasted, with a tiny alteration (timeit.py isn't
in my path) I like the idea of using the complex number there, it's a
tad easier that the other method for me to visualize.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.3 (GNU/Linux)

iD8DBQE/qbFpd90bcYOAWPYRArgzAKDiNBXw3tIpooWqglmxoqn2GebyxgCcDEKV
u60SPPopUrkJ1Kak0t4eias=
=+Guh
-----END PGP SIGNATURE-----
 
C

Colin J. Williams

This enquiry has generated a lot of interest. I thought it might be
useful to try numarray on the problem. numarray has a compress
function, which is used to discard points which are not of interest.

The code is below.

As would be expected, each scan over the points in the neigbourhood
discards, on average, a little more than half the points.

I have not tried it, but it should be possible, with numarray, to
generalize this from 2D to multimimensional space.

Colin W.

# Neighbour.py
''' To find the closest neighbour, in a neghbourhood P,
to a point p.
'''
import math
import numarray as N
import numarray.numerictypes as _nt
import numarray.random_array as R

trial= 0
lengthRemaining= []
def find(P, p):
''' In the 2D neighbourhood P, find the point closest to p.
The approach is based on the selection of a trial value Pt, from
P, and
discarding all values further than Pt from p.
To avoid repeated sqrt calculations the discard is based on an
enclosing square.
'''
global lengthRemaining, trial
lengthRemaining+= [[P.shape[0]]]
Pz= P - p # zero based neighbourhood
while len(Pz):
Pt= Pz[0] # trial value
Pta= N.abs(Pt)
Pz= Pz[1:]
pd= math.sqrt(N.dot(Pta, Pta)) # distance of p from the trial
value
goodCases= N.logical_and((Pz < pd),
(Pz > -pd))# use the enclosing square
goodCases= N.logical_and.reduce(goodCases, 1)
Pz= N.compress(goodCases, Pz) # discard points outside the square
if len(Pz) == 1:
Pt= Pz[0] # We have found the closest
Pz= []
lengthRemaining[trial]+= [len(Pz)]
z= 100
trial+= 1
return Pt + p

if __name__ == '__main__':
for sampleSize in range(100, 5000, 100):
P= R.random(shape= (sampleSize, 2))
for i in range(20):
p= R.random((1, 2)) # point
a= find(P, p)
## print 'Closest neighbour:', a[0]
## print 'Check - Point(p):', p[0]
## check= []
## for i in range(len(P)):
## check+= [(math.sqrt((P[i, 0]-p[0, 0])**2 + (P[i, 1]-p[0,
1])**2), P[i, 0], P[i, 1])]
## print P, math.sqrt((P[i, 0]-p[0, 0])**2 + (P[i, 1]-p[0, 1])**2)
## check.sort()
## print check[0]
## print check
print 'Number of scans:', sum([len(lst) for lst in lengthRemaining])
print 'Sample size:', P.shape[0], ' Average numner of scans:',
sum([len(lst) for lst in lengthRemaining])/float(len(lengthRemaining))


WEINHANDL said:
John said:
I have a list of two tuples containing x and y coord
(x0, y0)
(x1, y1)
...
(xn, yn)

Given a new point x,y, I would like to find the point in the list
closest to x,y. I have to do this a lot, in an inner loop, and then I
add each new point x,y to the list. I know the range of x and y in
advance.

Can anyone point me to some code or module that provides the
appropriate data structures and algorithms to handle this task
efficiently? The size of the list will likely be in the range of
10-1000 elements.


The plotting-library dislin (http://www.dislin.com)
contains a really fast triangulation subroutine
(~1 hour for 700000 points on an 1.5 GHz Athlon).

For an example triangulation/nearest-neighbor-search see
the attached python-script.

Dislin is available for many platforms (Linux, Winxxx, ...)
and it can be used for free if you are using free languages like
Python, Perl, etc.

Happy pythoning

Herbert


------------------------------------------------------------------------

#!/usr/bin/python

# (c) by H. Weinhandl 04.Nov.2003

import math
import dislin


def dist( ia,ib ) :
return math.sqrt( (X1[ia]-X1[ib])**2 + (Y1[ia]-Y1[ib])**2 )


def find_nearest_neighb() :

print 'extracting neighbours ... ',

for i in range( nr_dat+3 ) :
# initualize list wit the point i itself
neighb.append( )

for i in range( nr_tri ) :
# get a indes-triple, dislin seems to start indices with 1,
# so I'm subtracting 1 to get a zero-based index
U,V,W = I1-1, I2-1, I3-1

# add indices from all triangles, which contain the point itself
if ( U in neighb ) :
if not (V in neighb ) : neighb.append( V ) # do not append V if already in the list
if not (W in neighb ) : neighb.append( W ) # do not append W if already in the list

if ( V in neighb[V] ) :
if not (U in neighb[V] ) : neighb[V].append( U ) # do not append U if already in the list
if not (W in neighb[V] ) : neighb[V].append( W ) # do not append W if already in the list

if ( W in neighb[W] ) :
if not (U in neighb[W] ) : neighb[W].append( U ) # do not append U if already in the list
if not (V in neighb[W] ) : neighb[W].append( V ) # do not append V if already in the list

print ' OK'
for i in range( nr_dat ) :
neighb.remove( i ) # remove the point i itself
neighb.sort()
r_mi = 9.9e9
i_mi = i

# search for the nearest neighbor of point i
for j in neighb :
r = dist( i, j )
if r < r_mi :
r_mi = r
i_mi = j
print 'NB %2d : r_mi=%5.3f, i_mi=%2d '% (i, r_mi, i_mi), neighb




if __name__ == '__main__' :

X1 = [ 1.0, 2.0, 3.0, 4.0, 5.0, 4.0, 3.0 ]
Y1 = [ 1.0, 6.0, 3.0, 3.0, 2.0, 6.0, 0.0 ]
nr_dat = len( X1 )
nr_max = 2 * nr_dat + 1
I1 = [ 0 ] * nr_max
I2 = [ 0 ] * nr_max
I3 = [ 0 ] * nr_max
neighb = [ ]

# padding the 2 input-lists with 3 additional elements is required by dislin
X1.append( 0 )
X1.append( 0 )
X1.append( 0 )
Y1.append( 0 )
Y1.append( 0 )
Y1.append( 0 )


# delaunay triangulation of the input lists
# I1,I2,I3 are the indices of the triangular edge-points
nr_tri = dislin.triang( X1,Y1, nr_dat, I1,I2,I3, nr_max )

print X1
print Y1
print nr_dat, nr_tri, nr_max
print I1
print I2
print I3

find_nearest_neighb()

#---- end ----
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,743
Messages
2,569,478
Members
44,899
Latest member
RodneyMcAu

Latest Threads

Top