# Hashing function for integer coordinates

Discussion in 'Java' started by Owen Jacobson, May 26, 2005.

1. ### Owen JacobsonGuest

Salve.

I find myself in need of a hashing function mapping (int x, int y) -> int
such that either A) no (or very few) points "near"[0] a given point have
identical hash values as that point or at least B) points in the vicinity
of (0, 0) have unique hash values.

If anyone has such a function on hand, I'd be very grateful; even better
would be pointers on how to develop such a function for myself.

-Owen

[0] within some small fraction of the representible range.

Owen Jacobson, May 26, 2005

2. ### Chris McDonaldGuest

What about: sprintf(str, "(%.10f,%.10f)", x, y);
MD5(result, str);

______________________________________________________________________________
Dr Chris McDonald E:
Computer Science & Software Engineering W: http://www.csse.uwa.edu.au/~chris
The University of Western Australia, M002 T: +618 6488 2533
Crawley, Western Australia, 6009 F: +618 6488 1089

Chris McDonald, May 26, 2005

3. ### Chris UppalGuest

If you want that property then you have to ensure that the low-order bits of
both coordinates are preserved in the hash.

One simple implementation of that idea is simply to discard the high 16-bits of
each coordinate, and then combine their low bits into one hash:

int hash(int x, int y)
{
int lowX = x & 0xFFFF;
int lowY = y & 0xFFFF;
int hash = (lowX << 16) | lowY;

return hash;
}

Since two distinct points near to each other must differ in the low bits of
either their x or y co-ordinates, and since the hash preserves those bits, the
points must have different hashes. Essentially you are dividing the plane into
64K x 64K squares and then giving each point in each square the same hash as
the corresponding points in all the other squares, but giving each point
/within/ a square a unique hash.

BTW, please don't assume that just because the hashes are unique (in the above
sense) that they are also necessarily well-scattered for use in any given
hash-table implementation.

More sophisticated schemes are possible. For instance you could preserve the
low 8 bits of each co-ordinate to make 16 bits of the hash, and then hash
together the high 24 bits of the coordinates in some more "random" way to
supply the other 16 bits of the hash

-- chris

Chris Uppal, May 26, 2005
4. ### CBFalconerGuest

You can't have such a thing without a completely pre-determined
input set, when you should look for means of creating a "perfect
hash". Your criteria are such that the resultant hash should give
greater weight to the less significant bits of the points. Most
hashing systems tend to give equal weight to all input bits. Here
weight refers to the probability of the bits affecting the output
value.

As a crude example, the logarithm of the x and y displacements
emphasizes small displacements, at the cost of large displacements,
for values > 1.

Polar coordinates might make more sense, since they immediately
codify the meaning of 'near'.

CBFalconer, May 26, 2005