Distinct ID Number Per Object?

Hal Vaughan

I have a case where I'll need distinct and printable names to use in a
reference table. I'd like to make it so each object, whether it's of the
same class as any other object or not, can produce a distinct number. It
looks like if I get the hashcode for any object, the JVM attempts to give
each object a unique hashcode, but it doesn't seem to guarantee it.

Is there any way to get a unique string or number for each object that is
created by a particular JVM?

Thanks!

Hal
 
Stefan Ram

Hal Vaughan said:
Is there any way to get a unique string or number for each
object that is created by a particular JVM?

The size of a string is bounded by the finite memory the JVM
can allocate. With, for example, 4 GB of memory, a string
might be at most 2^32 Bytes long, so there are at most
2^(2^40) unique strings of this length.

On the other hand, the loop

while( true )new java.lang.Object();

can run for more than 2^(2^40) iterations, as memory is
reclaimed by the garbage collector. Thus it can create an
unlimited number of new objects, especially 2^(2^40)+1.

In this case, there are not enough distinct strings
possible in this JVM to give each of these 2^(2^40)+1 objects
a unique string.
 
Hal Vaughan

Stefan said:
The size of a string is bounded by the finite memory the JVM
can allocate. With, for example, 4 GB of memory, a string
might be at most 2^32 Bytes long, so there are at most
2^(2^40) unique strings of this length.

On the other hand, the loop

while( true )new java.lang.Object();

can run for more than 2^(2^40) iterations, as memory is
reclaimed by the garbage collector. Thus it can create an
unlimited number of new objects, especially 2^(2^40)+1.

In this case, there are not enough distinct strings
possible in this JVM to give each of these 2^(2^40)+1 objects
a unique string.

So is it only in extreme cases like this where hashcodes would be
duplicated?

Hal
 
Twisted

So is it only in extreme cases like this where hashcodes would be
duplicated?

Not necessarily, but it's unlikely you'll run out of unique ones in a
production environment.

I suggest you use System.identityHashCode(Object) to get these
numbers. It should be a) fixed for an object's lifetime in one session
(it will change when the object is serialized and later deserialized);
b) globally unique (within the one JVM anyway) as the usual
implementation of the default hash code for Object is the memory
address of that object, which is necessarily globally unique in that
scope; and c) not subject to being overridden unlike calling
hashCode() on the object. This of course works if you need a globally
unique ID per object, even to the point of two copies of a single
object (so equals()) have distinct such IDs. Try to just use hashCode
otherwise.
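As a rough illustration of points (a) and (c), here is a minimal sketch. The Point class is hypothetical and exists only to show an overridden hashCode(). Note that the documentation of System.identityHashCode does not promise distinct values for distinct objects, so the last line may occasionally print true.

```java
// Hypothetical value class, for illustration only: it overrides
// equals() and hashCode() so that equal coordinates give equal hashes.
final class Point {
    final int x, y;
    Point(int x, int y) { this.x = x; this.y = y; }

    @Override public boolean equals(Object o) {
        return o instanceof Point && ((Point) o).x == x && ((Point) o).y == y;
    }
    @Override public int hashCode() { return 31 * x + y; }
}

final class IdentityHashDemo {
    public static void main(String[] args) {
        Point a = new Point(1, 2);
        Point b = new Point(1, 2);

        // The overridden hashCode() cannot tell the two copies apart.
        System.out.println(a.equals(b));
        System.out.println(a.hashCode() == b.hashCode());

        // identityHashCode ignores the override and is stable per object.
        System.out.println(System.identityHashCode(a) == System.identityHashCode(a));

        // Usually false, but not guaranteed: two objects may share a value.
        System.out.println(System.identityHashCode(a) == System.identityHashCode(b));
    }
}
```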

A second option is to create an IdentityHashMap<Object, Integer> and
stuff objects into that. Distinct objects act as distinct keys (even
if equals()) and you'd assign a new higher integer to each one. Use
Long if you run out of Integers (unlikely). Use BigInteger if you run
out of Longs (unlikely until we have converted most of the visible
universe into computronium). This has the benefit of giving out ever-
higher numbers even to objects that use the memory space where an
earlier object was before being garbage collected. You can't twiddle
the Object constructor to put everything in automatically, but you can
fake it by having the get-ID method you plan to use actually assign an
ID to objects that don't already have one lazily when it's first
requested. Of course, the objects won't get garbage collected unless
you use a WeakReference, in which case you may as well use an ordinary
HashMap<WeakReference<Object>, Integer> as the WeakReference hashCode
is the default one if I recall correctly. (WeakHashMap would cause
distinct objects that compare equal to have the same ID.) This is
complicated because you need to carry the WeakReference around with
you to look up in the HashMap (you can't just make a new one to the
same object and expect it to work). Perhaps a better option is to wrap
objects that will need IDs in a dummy object that has a single public
Object field, the default equals and hashCode, and make a
WeakHashMap<Dummy, Integer>. The wrapper object has to be used in
place of the original object on any path that leads to getting its ID.
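The simplest form of this second option, without the weak-reference refinements, might look like the sketch below. IdRegistry and idOf are made-up names, and the map holds strong references, so registered objects are never garbage collected (the drawback mentioned above).

```java
import java.util.IdentityHashMap;
import java.util.Map;

// Hypothetical helper: assigns a sequence number the first time an
// object is seen and returns the same number on every later request.
final class IdRegistry {
    // IdentityHashMap compares keys with ==, not equals().
    private static final Map<Object, Long> IDS = new IdentityHashMap<>();
    private static long next = 0;

    private IdRegistry() {}

    static synchronized long idOf(Object o) {
        Long id = IDS.get(o);
        if (id == null) {          // lazily assign on first request
            id = next++;
            IDS.put(o, id);
        }
        return id;
    }
}

final class IdRegistryDemo {
    public static void main(String[] args) {
        String s1 = new String("same");
        String s2 = new String("same");   // equals(s1), but a distinct object
        System.out.println(IdRegistry.idOf(s1));
        System.out.println(IdRegistry.idOf(s2));
        System.out.println(IdRegistry.idOf(s1));  // same number as the first line
    }
}
```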

Finally, if this ID is only needed for objects of classes you control,
you can make the base classes you control generate unique ID numbers
to put in a public final field. This is easiest if you can derive all
the classes for which you have this requirement from a single base.
Otherwise, the ID can be a long with 32 bits a sequentially-assigned
int and 32 bits the hashCode of the particular base class.
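For the single-base-class case, a minimal sketch could look like this. The class names Identified, MasterTable and TrackedTable are invented for the example, and AtomicLong is used only so that the counter stays correct if objects are created from several threads.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical common base class: every instance of every subclass
// receives the next number from one shared counter.
abstract class Identified {
    private static final AtomicLong NEXT_ID = new AtomicLong();

    public final long id = NEXT_ID.getAndIncrement();
}

// Two unrelated subclasses, named here only for illustration.
final class MasterTable extends Identified {}
final class TrackedTable extends Identified {}

final class IdentifiedDemo {
    public static void main(String[] args) {
        Identified a = new MasterTable();
        Identified b = new TrackedTable();
        Identified c = new TrackedTable();
        // Three different numbers, regardless of the concrete class.
        System.out.println(a.id + " " + b.id + " " + c.id);
    }
}
```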
 
Lew

Hash codes have even fewer values than Strings. That means there must be
proportionately more collisions. Have you read the Javadocs on the hashCode()
method? You should. Also read the Javadocs on Map, HashMap and IdentityHashMap.

As Twisted pointed out, the "Identity", i.e., the internal "address" of an
object, is unique for the lifetime of that object. Even without
IdentityHashMap, any Map can use an object that doesn't override equals()
(most custom objects, for example) as a unique key into a lookup. It is
sufficient to use a regular Map (e.g., HashMap) when equals() and == define
the same relation. IdentityHashMap is for when they differ and you want the
key selection to be based on ==.

Twisted said:
as the usual implementation of the default hash code for Object is the memory address of that object,

"converted to an integer".
(This is typically implemented by converting the internal address of the object into an integer, but this implementation technique is not required by the Java(TM) programming language.)

You certainly cannot rely on a correspondence. That is what Sun's
implementation of Object.hashCode() does, but many, many subclasses override
it. It is a best practice (see /Effective Java/ by Josh Bloch) to override
hashCode() in any class that overrides equals(). Since most of the objects in
an application likely are of subtypes of Object, it is common that their
hashCode() will not return the "address" of the object.

Javadocs rule.
 
Stefan Ram

Hal Vaughan said:
So is it only in extreme cases like this where hashcodes would
be duplicated?

The hashcode only needs to fulfill the requirements of

http://download.java.net/jdk7/docs/api/java/lang/Object.html#hashCode()

These requirements are compatible with an implementation
that always returns the same value for each object.

The hashcode of an object depends on the class.
Therefore, if the class of an object is not known,
one can not assert more than the above requirements.
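To make the point concrete, here is a small sketch of a class whose hashCode() always returns the same value. ConstantHash is a made-up name. The class still satisfies the hashCode() contract and still works as a HashMap key, only slowly, because every key lands in the same bucket.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical class with a legal but useless hashCode().
final class ConstantHash {
    private final String name;
    ConstantHash(String name) { this.name = name; }

    @Override public boolean equals(Object o) {
        return o instanceof ConstantHash && ((ConstantHash) o).name.equals(name);
    }

    // Equal objects return equal hash codes, which is all the contract asks.
    @Override public int hashCode() { return 42; }
}

final class ConstantHashDemo {
    public static void main(String[] args) {
        Map<ConstantHash, String> map = new HashMap<>();
        map.put(new ConstantHash("a"), "first");
        map.put(new ConstantHash("b"), "second");
        // Lookups still work; equals() resolves the collision.
        System.out.println(map.get(new ConstantHash("a")));
        System.out.println(map.get(new ConstantHash("b")));
    }
}
```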
 
Karl Uppiano

Hal Vaughan said:
I have a case where I'll need distinct and printable names to use in a
reference table. I'd like to make it so each object, whether it's of the
same class as any other object or not, can produce a distinct number. It
looks like if I get the hashcode for any object, the JVM attempts to give
each object a unique hashcode, but it doesn't seem to guarantee it.

Is there any way to get a unique string or number for each object that is
created by a particular JVM?

UUIDs are sometimes used for applications like this, as long as you remain
cognizant of the possible dynamic range and/or performance limitations.
A UUID is a 128-bit value, though, and java.util.UUID.randomUUID has
been fast enough for my needs.

http://en.wikipedia.org/wiki/UUID

http://java.sun.com/javase/6/docs/api/index.html
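A minimal sketch of the UUID approach follows. The UuidNames class and the "obj-" prefix are assumptions made for the example. Random UUIDs are not strictly guaranteed to differ, but a collision is extremely unlikely.

```java
import java.util.UUID;

// Hypothetical helper that produces printable names from random UUIDs.
final class UuidNames {
    private UuidNames() {}

    static String newName() {
        // UUID.toString() yields 36 characters: 32 hex digits and 4 hyphens.
        return "obj-" + UUID.randomUUID();
    }
}

final class UuidNamesDemo {
    public static void main(String[] args) {
        System.out.println(UuidNames.newName());
        System.out.println(UuidNames.newName());
    }
}
```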
 
Hal Vaughan

Lew said:
Hash codes have even fewer values than Strings. That means there must be
proportionately more collisions. Have you read the Javadocs on the hashCode()
method? You should. Also read the Javadocs on Map, HashMap and IdentityHashMap.

I did. The part that concerned me was this: "It is not required that if two
objects are unequal according to the equals(java.lang.Object) method, then
calling the hashCode method on each of the two objects must produce
distinct integer results."

That's why I was asking about whether they were unique within a particular
runtime.

As Twisted pointed out, the "Identity", i.e., the internal "address" of an
object, is unique for the lifetime of that object. Even without
IdentityHashMap, any Map can use an object that doesn't override equals()
(most custom objects, for example) as a unique key into a lookup. It is
sufficient to use a regular Map (e.g., HashMap) when equals() and == define
the same relation. IdentityHashMap is for when they differ and you want
the key selection to be based on ==.

Twisted said:

"converted to an integer".


You certainly cannot rely on a correspondence. That is what Sun's
implementation of Object.hashCode() does, but many, many subclasses
override it.

It's in one of my own classes, so I'm not concerned about it being
overridden.

It is a best practice (see /Effective Java/ by Josh Bloch) to override
hashCode() in any class that overrides equals().

That part I did find, but I won't be overriding either one.

Since most of the objects in an application likely are of subtypes of
Object, it is common that their hashCode() will not return the "address"
of the object.

I don't need to separate all objects. I have a set of data tables that all
have a master table, but then they have sub tables that are tracked subsets
of the master tables. I need to make sure that if I create a tracked
table, it has a different name from all the other tracked tables. I have
one particular class that will be generating names for those tracked tables
on its own and I want to make sure that if I create, say, 5 instances of
that class, that each separate instance will create names that are
different than the names created by the other instances.

I don't need an object's address or anything, I just want to be sure that
each instance of this one class has some kind of unique ID I can use to
specify unique names for the tracked tables.

Hal
 
Stefan Ram

Twisted said:
I suggest you use System.identityHashCode(Object) to get these
numbers. It should be a) fixed for an object's lifetime in one
session (it will change when the object is serialized and later
deserialized); b) globally unique (within the one JVM anyway)
as the usual implementation of the default hash code for Object
is the memory

http://download.java.net/jdk7/docs/api/java/lang/System.html#identityHashCode(java.lang.Object)

is not guaranteed to be »unique« by the documentation, this is
only a property of some implementations (as you have written
yourself).

On a 32-bit-system, two objects with non-overlapping lifetime
might share the same address.

On a 64-bit-system, even two objects with overlapping lifetime
might need to share the same 32-bit identity hash code.

The original poster might explain, what it is that he wants to
accomplish with the unique ID, as this might provide better
answers.
 
Stefan Ram

Hal Vaughan said:
I don't need an object's address or anything, I just want to be
sure that each instance of this one class has some kind of
unique ID I can use to specify unique names for the tracked
tables.

This could also be solved by a counter singleton invoked
upon creation of such an instance. The instance then
uses its unique counter value as a prefix for its IDs.
 
Hal Vaughan

Stefan said:
This could also be solved by a counter singleton invoked
upon creation of such an instance. The instance then
uses its unique counter value as a prefix for its IDs.

It took me a bit to think through this. Do you mean making a static int and
each instance uses it as an ID, then increments it for the next one?
That's what I got, or rather, worked out.

Thanks!

Hal
 
Stefan Ram

Twisted said:
implementation of the default hash code for Object is the memory
address of that object, which is necessarily globally unique in that

Readers are encouraged to run the following program (and be
prepared to wait several minutes, but not more than 15, for
the output) and then report the outcome here.

public class Main
{ final static java.lang.String lineSeparator =
    java.lang.System.getProperty( "line.separator" );

  public static void main( final java.lang.String[] args )
  { final java.lang.Object object = new java.lang.Object();
    final int code = object.hashCode();
    java.lang.Object object1;
    int code1;
    /* allocate new objects until one reports the same hash code */
    do
    { code1 =( object1 = new java.lang.Object() ).hashCode(); }
    while( code1 != code );
    /* object is still reachable, so object1 is a distinct object
       and the first line printed is always "false" */
    java.lang.System.out.print
    (( object == object1 )+ lineSeparator +
      code + lineSeparator +
      code1 + lineSeparator ); }}
 
Hal Vaughan

Stefan said:
public class Main
{ final static java.lang.String lineSeparator =
java.lang.System.getProperty( "line.separator" );
public static void main( final java.lang.String[] args )
{ final java.lang.Object object = new java.lang.Object();
final int code = object.hashCode();
java.lang.Object object1;
int code1;
do
{ code1 =( object1 = new java.lang.Object() ).hashCode(); }
while( code1 != code );
java.lang.System.out.print
(( object == object1 )+ lineSeparator +
code + lineSeparator +
code1 + lineSeparator ); }}

What kind of safeguards does Java have in place so this doesn't overload my
CPU or RAM?

Hal
 
Stefan Ram

Hal Vaughan said:
It took me a bit to think through this. Do you mean making a
static int and each instance uses it as an ID, then increments
it for the next one? That's what I got, or rather, worked out.

class globalCounter
{ private static int value = 0;
  public static int getValue(){ return value++; }}

class Identifier
{ final private java.lang.String prefix;
  private int count;
  public Identifier()
  { this.prefix = java.lang.String.valueOf( globalCounter.getValue() );
    this.count = 0; }
  public java.lang.String get()
  { return prefix + "-" + java.lang.String.valueOf( count++ ); }}

public class Main
{ final static java.lang.String lineSeparator =
    java.lang.System.getProperty( "line.separator" );
  public static void main( final java.lang.String[] args )
  { final Identifier identifier0 = new Identifier();
    final Identifier identifier1 = new Identifier();
    java.lang.System.out.println
    ( identifier0.get() + lineSeparator +
      identifier0.get() + lineSeparator +
      identifier0.get() + lineSeparator +
      identifier1.get() + lineSeparator +
      identifier1.get() + lineSeparator +
      identifier1.get() + lineSeparator ); }}

0-0
0-1
0-2
1-0
1-1
1-2
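The counter above is not thread-safe, because value++ and count++ are not atomic. If instances might be created or used from more than one thread, a variant along the following lines could be used instead. This is only a sketch: SafeIdentifier is an invented name, not part of the code above.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical thread-safe variant of the prefix-plus-counter scheme.
final class SafeIdentifier {
    // Shared across all instances; hands out one prefix per instance.
    private static final AtomicInteger INSTANCES = new AtomicInteger();

    private final int prefix = INSTANCES.getAndIncrement();
    private final AtomicInteger count = new AtomicInteger();

    // Returns "<prefix>-<n>", where n counts up from 0 for this instance.
    String get() {
        return prefix + "-" + count.getAndIncrement();
    }
}

final class SafeIdentifierDemo {
    public static void main(String[] args) {
        SafeIdentifier first = new SafeIdentifier();
        SafeIdentifier second = new SafeIdentifier();
        System.out.println(first.get());
        System.out.println(first.get());
        System.out.println(second.get());
    }
}
```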
 
Lew

Hal Vaughan wrote:
Lew wrote:

Hal said:
The part that concerned me was this: "It is not required that if two
objects are unequal according to the equals(java.lang.Object) method, then
calling the hashCode method on each of the two objects must produce
distinct integer results."

That's why I was asking about whether they were unique within a particular
runtime.

They aren't, necessarily. It depends on the hashCode() method of the object
in question.

It's in one of my own classes, so I'm not concerned about it being
overridden.

I don't understand. If you control the hashCode() then you know what it does.
Where does the question come from?

That part I did find, but I won't be overriding either one.


I don't need to separate all objects. I have a set of data tables that all
have a master table, but then they have sub tables that are tracked subsets
of the master tables. I need to make sure that if I create a tracked
table, it has a different name from all the other tracked tables. I have
one particular class that will be generating names for those tracked tables
on its own and I want to make sure that if I create, say, 5 instances of
that class, that each separate instance will create names that are
different than the names created by the other instances.

But names aren't hash codes. By definition, a hash reduces the size of the
value set from the domain to the range.

I don't need an object's address or anything, I just want to be sure that
each instance of this one class has some kind of unique ID I can use to
specify unique names for the tracked tables.

So create unique names. Your issue has nothing to do with hashCode().

If you override equals, let's say to guarantee that two objects with the same
name are considered equal, i.e., to "mean" the same real-world object (in your
case, a "table"), then also override hashCode().

Why not just override those methods so that any time two objects with the same
name, which may be different objects in the JVM, are understood to refer to
the same table? This is a more normal idiom and should do everything you
need. Then you can use normal Maps to map the name to the object that models
the table.

class TableModel
{
    private final String name;
    public TableModel( String n )
    {
        name = n;
    }
    public final String getName() { return name; }
    // other attributes
}

then in some other code

Map <String, TableModel> tables = new HashMap <String, TableModel> ();
public TableModel put( TableModel table )
{
    return tables.put( table.getName(), table );
}

If you need a table object at a later time in your code, obtain
tables.get( name )
given the name of the table you want. (If you get null, create a TableModel
and put() it into the Map.)

This idiom might save you the trouble of UUID generation.
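The look-up-or-create step described above could be sketched as follows. NamedTable is a hypothetical stand-in for TableModel, TableRegistry is an invented wrapper, and Map.computeIfAbsent requires Java 8 or later, which is an assumption about the environment.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the TableModel class shown above.
final class NamedTable {
    private final String name;
    NamedTable(String name) { this.name = name; }
    String getName() { return name; }
}

// Hypothetical registry: one NamedTable object per distinct name.
final class TableRegistry {
    private final Map<String, NamedTable> tables = new HashMap<>();

    // Returns the existing table for this name, or creates and stores one.
    NamedTable getOrCreate(String name) {
        return tables.computeIfAbsent(name, NamedTable::new);
    }
}

final class TableRegistryDemo {
    public static void main(String[] args) {
        TableRegistry registry = new TableRegistry();
        NamedTable first = registry.getOrCreate("orders");
        NamedTable again = registry.getOrCreate("orders");
        System.out.println(first == again);   // the same object is returned
    }
}
```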
 
Hal Vaughan

Stefan said:
Hal Vaughan said:
It took me a bit to think through this. Do you mean making a
static int and each instance uses it as an ID, then increments
it for the next one? That's what I got, or rather, worked out.

class globalCounter { private static int value = 0;
public static int getValue(){ return value++; }}

class Identifier
{ final private java.lang.String prefix;
private int count;
public Identifier()
{ this.prefix = java.lang.String.valueOf( globalCounter.getValue() );
this.count = 0; }
public java.lang.String get()
{ return prefix + "-" + java.lang.String.valueOf( count++ ); }}

public class Main
{ final static java.lang.String lineSeparator =
java.lang.System.getProperty( "line.separator" );
public static void main( final java.lang.String[] args )
{ final Identifier identifier0 = new Identifier();
final Identifier identifier1 = new Identifier();
java.lang.System.out.println
( identifier0.get() + lineSeparator +
identifier0.get() + lineSeparator +
identifier0.get() + lineSeparator +
identifier1.get() + lineSeparator +
identifier1.get() + lineSeparator +
identifier1.get() + lineSeparator ); }}

0-0
0-1
0-2
1-0
1-1
1-2

Okay, I got one working. Thanks!

Hal
 
Hal Vaughan

Lew said:
Hal Vaughan wrote:
Lew wrote:



They aren't, necessarily. It depends on the hashCode() method of the
object in question.



I don't understand. If you control the hashCode() then you know what it
does. Where does the question come from?

I need a unique ID for each instance of the class. Yes, I can control the
hashCode() method, but I wanted to know if it would be unique. At this
point, I've taken Stefan Ram's suggestion and used a static int and each
time an instance is created this int is used as the ID number and the
static int is incremented for the next one.

....
So create unique names. Your issue has nothing to do with hashCode().

Now I see there are other ways to do this. When I looked through the
Javadocs, the hashCode() function was the only thing I found that I thought
would give a unique id for each class created.

If you override equals, let's say to guarantee that two objects with the
same name are considered equal, i.e., to "mean" the same real-world object
(in your case, a "table"), then also override hashCode().

Why not just override those methods so that any time two objects with the
same name, which may be different objects in the JVM, are understood to
refer to the same table?

Each table could contain a different subset, so if two objects with the same
name were created, they would use the same table, but they might need
separate tables.

This is a more normal idiom and should do everything you
need. Then you can use normal Maps to map the name to the object that
models the table.

class TableModel
{
private final String name;
public TableModel( String n )
{
name = n;
}
public final String getName() { return name; }
// other attributes
}

then in some other code
Map <String, TableModel> tables = new HashMap <String, TableModel> ();
public TableModel put( TableModel table )
{
return tables.put( table.getName(), table );
}

If I follow this correctly, then one issue is that I have to be sure that
each time an instance of this class is used, I have to make sure it is
passed a unique number as an ID. That means keeping track of those
numbers. I'm using a number of different modules, some I know I'll be
adding months or years from now, so I'm doing as much as possible inside
the classes I'm doing now so later I can create them and use them without
the need to go through much in docs. When I do use them, I will likely
have only a short time to put them in place in a new module, so I'm making
them as easy as possible to use. Essentially, I'm frontloading the
work. More work now isn't fun, but it means when I have to quickly put
together a new module using these classes, I'll hardly have to remember a
thing or look up much in the Javadocs I create.

Hal
 
Eric Sosman

Lew said:
Hash codes have even fewer values than Strings. That means there must
be proportionately more collisions. Have you read the Javadocs on the
hashCode() method? You should. Also read the Javadocs on Map, HashMap
and IdentityHashMap.

As Twisted pointed out, the "Identity", i.e., the internal "address" of
an object, is unique for the lifetime of that object. [...]

Can you find this guarantee in the Javadoc or other
authoritative place? Does this rule out 64-bit JVM's?
 
Lew

Eric said:
Lew said:
Hal said:
So is it only in extreme cases like this where hashcodes would be
duplicated?

Hash codes have even fewer values than Strings. That means there must
be proportionately more collisions. Have you read the Javadocs on the
hashCode() method? You should. Also read the Javadocs on Map,
HashMap and IdentityHashMap.

As Twisted pointed out, the "Identity", i.e., the internal "address"
of an object, is unique for the lifetime of that object. [...]

Can you find this guarantee in the Javadoc or other
authoritative place? Does this rule out 64-bit JVM's?

You mean the guarantee that an object's "address" is unique during its
lifetime? How else would the JVM find a particular instance? In other words,
how could it possibly not be?

If two objects had the same "address", then a reference using that "address"
would not reference a single object, which contradicts the very definition of
an object reference.

"a reference to the newly created object is returned as the result [of] the indicated constructor"

There is no way for one "address" to point to two objects simultaneously.

This question has nothing to do with bit width, AFAICS. I'm not really sure
how there could even be a question here.
 
