Ranking

C

Crouchez

What's the best way to have in-memory ranking? For example each element is
counted for the number of times it appears and ranks are updated with every
count. Eg

element 1, 4 times
element 4, 3 times
element 3, 2 times
element 2, 1 times

Can I use a hashtable type for this?
 
J

Jacky

What's the best way to have in-memory ranking? For example each element is
counted for the number of times it appears and ranks are updated with every
count. Eg

element 1, 4 times
element 4, 3 times
element 3, 2 times
element 2, 1 times

Can I use a hashtable type for this?


You can use Collections.sort(list) to sort you record.
Of course,the element in list should implements Comparable interface.
 
I

Ishwor Gurung

Crouchez said:
What's the best way to have in-memory ranking? For example each element is
counted for the number of times it appears and ranks are updated with
every count. Eg

element 1, 4 times
element 4, 3 times
element 3, 2 times
element 2, 1 times

Is this your homework or something ?
e.g., informal pseudocode below -

a[],b[]=a;
found[] = {0}; //make found the same size as a in general;
i =0, foreach i in a.length, i++:
j=0,foreach j in b.lengh, j++:
if b[j] = a:
found++;
Can I use a hashtable type for this?

Yes, sure. Hashtables are nothing but a mapping system where a=>b (map keys
to values) so if you take a bit of time and play around with the the
collections API, it'll work. you'd probably need it when in above code
after "found++;". :)

hth
 
J

Jani Tiainen

Ishwor Gurung kirjoitti:
Crouchez said:
What's the best way to have in-memory ranking? For example each element is
counted for the number of times it appears and ranks are updated with
every count. Eg

element 1, 4 times
element 4, 3 times
element 3, 2 times
element 2, 1 times

Is this your homework or something ?
Can I use a hashtable type for this?

Yes, sure. Hashtables are nothing but a mapping system where a=>b (map keys
to values) so if you take a bit of time and play around with the the
collections API, it'll work. you'd probably need it when in above code
after "found++;". :)


I think "priority queue" is what Crouchez is after. And that is part of
java collections framework since 1.5
 
L

Lew

Jani said:
Ishwor Gurung kirjoitti:
Crouchez said:
What's the best way to have in-memory ranking? For example each
element is
counted for the number of times it appears and ranks are updated with
every count. Eg

element 1, 4 times
element 4, 3 times
element 3, 2 times
element 2, 1 times

Is this your homework or something ?
Can I use a hashtable type for this?

Yes, sure. Hashtables are nothing but a mapping system where a=>b (map
keys
to values) so if you take a bit of time and play around with the the
collections API, it'll work. you'd probably need it when in above code
after "found++;". :)


I think "priority queue" is what Crouchez is after. And that is part of
java collections framework since 1.5


Also, watch out for the term "hash table" in Java; it might lead you to use
java.util.Hashtable, which would be suboptimal in most cases. Declare the
variable as Map when such things are needed, and pick the appropriate
implementation thereof, such as HashMap, or very rarely, Hashtable.

Similarly with Vector vs. ArrayList.
 
C

Crouchez

Jacky said:
You can use Collections.sort(list) to sort you record.
Of course,the element in list should implements Comparable interface.

A list is a array of single items but I have two items of data for one
object?
 
C

Crouchez

Lew said:
Jani said:
Ishwor Gurung kirjoitti:
Crouchez wrote:

What's the best way to have in-memory ranking? For example each element
is
counted for the number of times it appears and ranks are updated with
every count. Eg

element 1, 4 times
element 4, 3 times
element 3, 2 times
element 2, 1 times

Is this your homework or something ?

Can I use a hashtable type for this?

Yes, sure. Hashtables are nothing but a mapping system where a=>b (map
keys
to values) so if you take a bit of time and play around with the the
collections API, it'll work. you'd probably need it when in above code
after "found++;". :)


I think "priority queue" is what Crouchez is after. And that is part of
java collections framework since 1.5


Also, watch out for the term "hash table" in Java; it might lead you to
use java.util.Hashtable, which would be suboptimal in most cases. Declare
the variable as Map when such things are needed, and pick the appropriate
implementation thereof, such as HashMap, or very rarely, Hashtable.

Similarly with Vector vs. ArrayList.


why rarely hashtable? - it's auto synchronized which is a big help
 
C

Crouchez

Basically what I want to do is maintain a constant top 10 list that is
accessed and modified by multiple threads. The list contains objects that
have two values - a name and an access count. Think I might use this?
Collections.sort(List list, Comparator c) like Jacky said but with a
different comparator
 
J

Joshua Cranmer

Crouchez said:
Basically what I want to do is maintain a constant top 10 list that is
accessed and modified by multiple threads. The list contains objects that
have two values - a name and an access count. Think I might use this?
Collections.sort(List list, Comparator c) like Jacky said but with a
different comparator

Your choice depends on a few things:

If your list has a total of N items, finding the top k in this list is
O(k*N) using a hash table, O(k) for a sorted list, O(k*N) for an
unsorted list, and O(k*lg N) for a priority queue.

Updating the list requires O(1) for a hash table, O(N) for a sorted
list, O(1) for an unsorted list, and O(lg N) for priority queues.

If the ratio of modifications to generations is m (i.e., for every m
modifications, the list is regenerated), then the total time is O(m+k*N)
for hash table, O(N+m*k) for sorted lists, O(k*N+m) for unsorted lists,
and O((k+m)*lg N) for priority queues.

If m is 1, these formulas degenerate into O(k*N), O(N+k), O(k*N), and
O(k*lg N) respectively.

I should note that metrics for the last two imply O(1) access to a
specific element, which means that an auxiliary hash table mapping
objects to indices is needed. For unsorted lists, that means that it is
essentially a hash table. Without that unit, the times go up to O(N) for
access in both cases.

Furthermore, the time for a sorted list implies using a
slightly-modified insertion sort rather than a quick sort. Updating goes
up to an O(N lg N) in that case. I am also assuming that a custom heap
is used for the priority queue case to take advantage of the hash table.

If you want to write as little code as possible, a hash table
(implemented via HashMap and Collections.synchronizedMap()) is fastest.
In high-throughput conditions, a custom-made priority queue works best;
the sorted list is a good middle-of-the-road, assuming you use a
modified insertion sort rather than the default quicksort.
 
M

Martin Gregorie

Crouchez said:
A list is a array of single items but I have two items of data for one
object?
A single item can be an Object provided (see above) that it implements
the Comparable interface. The Object can contain an arbitrary number of
data
items.
 
L

Lew

Crouchez said:
A list is a array of single items but I have two items of data for one
object?

use a Map if one of the elements is a key to the other, which in this case it
seems to be:

{ item, count(item) }

Use HashMap first, another implementation if you need the characteristics of it.
 
L

Lew

why rarely hashtable? - it's auto synchronized which is a big help

That's "Hashtable", not "hashtable". You hadn't previously stated a need for
synchronization, so of course I spoke to the general case.

Hashtable isn't even necessarily the best synchronized Map because of
non-Collection methods it still supports.

You can use Collections.synchronizedMap( someMap ) for better safety.

Or use Hashtable. Notice that I said it's "suboptimal in most cases", not "in
all cases".

Are you writing multi-threaded code that shares the Map, then?
 
C

Crouchez

Lew said:
That's "Hashtable", not "hashtable". You hadn't previously stated a need
for synchronization, so of course I spoke to the general case.

Hashtable isn't even necessarily the best synchronized Map because of
non-Collection methods it still supports.

You can use Collections.synchronizedMap( someMap ) for better safety.

Or use Hashtable. Notice that I said it's "suboptimal in most cases", not
"in all cases".

Are you writing multi-threaded code that shares the Map, then?

Yeah. I've tried Collection.synchronizedMap(map) but it throughs a
concurrentmodification when used with an Iterator even when wrapped in a
synchnronized(map) block. Hashtable with enumeration doesn't do this?
 
R

Roedy Green

What's the best way to have in-memory ranking? For example each element is
counted for the number of times it appears and ranks are updated with every
count. Eg

element 1, 4 times
element 4, 3 times
element 3, 2 times
element 2, 1 times

Can I use a hashtable type for this?

If what you are trying to do is count item of various kinds there are
three basic approaches:

1. HashMap. Look up object and increment its count.
see http://mindprod.com/jgloss/hashmap.html

2. array of counts. Compute which bin and increment it.
see http://mindprod.com/jgloss/array.html

3. sort and collapse.
see http://mindprod.com/jgloss/sort.html
 
C

Crouchez

Crouchez said:
Yeah. I've tried Collection.synchronizedMap(map) but it throughs a
concurrentmodification when used with an Iterator even when wrapped in a
synchnronized(map) block. Hashtable with enumeration doesn't do this?
*throughs?? throws that should be
 
L

Lew

Crouchez said:
*throughs?? throws that should be

Have you tried it?

ConcurrentModificationException is not inherently a threading issue. It means
something changed the underlying Collection but not through the Iterator.
Synchronization will not fix that.
 
C

Crouchez

Lew said:
Have you tried it?

ConcurrentModificationException is not inherently a threading issue. It
means something changed the underlying Collection but not through the
Iterator. Synchronization will not fix that.

Yep. I can confirm that Collection.synchronizedMap(map) with iterator throws
an concurrentmodificationexception with multithreading read/write. Hashtable
with enumeration works ok.
 
L

Lew

Yep. I can confirm that Collection.synchronizedMap(map) with iterator throws
an concurrentmodificationexception with multithreading read/write. Hashtable
with enumeration [sic] works ok.

It'll throw that exception even without more than one thread.

So will Hashtable, if you use an Iterator.

When you said "enumeration", did you mean "Enumeration", as in
"java.util.Enumeration<E>"? If so then you are mistaken, it does not "work
ok" because there is no ability for the Enumeration to detect concurrent
modification. Therefore the code is silently breaking.

Again - ConcurrentModificationException is not inherently a threading issue.
It does not require more than one thread to happen. If you are getting that
exception, it's because you're not protecting the Map against "side" access,
ergo you're getting inconsistent or wrong results. The exception alerts you
to that issue. Using an Enumeration to hide the error doesn't mean the error
isn't happening.
 
C

Crouchez

Lew said:
Yep. I can confirm that Collection.synchronizedMap(map) with iterator
throws an concurrentmodificationexception with multithreading read/write.
Hashtable with enumeration [sic] works ok.

It'll throw that exception even without more than one thread.

So will Hashtable, if you use an Iterator.

When you said "enumeration", did you mean "Enumeration", as in
"java.util.Enumeration<E>"? If so then you are mistaken, it does not
"work ok" because there is no ability for the Enumeration to detect
concurrent modification. Therefore the code is silently breaking.

Again - ConcurrentModificationException is not inherently a threading
issue. It does not require more than one thread to happen. If you are
getting that exception, it's because you're not protecting the Map against
"side" access, ergo you're getting inconsistent or wrong results. The
exception alerts you to that issue. Using an Enumeration to hide the
error doesn't mean the error isn't happening.

--
Lew
Again - ConcurrentModificationException is not inherently a threading
issue. It does not require more than one thread to happen.

Concurrent means multiple threads though doesn't it?
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,755
Messages
2,569,536
Members
45,019
Latest member
RoxannaSta

Latest Threads

Top