Mike
Please bear with me. I'm a bit new to Java. I have a list of strings
that I'd like to summarize. For example, I'd like to take this data:
Apple
Orange
Apple
Apple
Pear
Peach
Banana
Banana
And generate this data:
Term    Count
------  -----
Apple   3
Orange  1
Pear    1
Peach   1
Banana  2
I'd then like to sort this data so the items with the higher counts come
first in the list. The list of source strings can hold anywhere from 1
to 20,000 strings containing anywhere from 1 to 500 unique terms. The
average set will be roughly 300 strings with about 30 unique terms. I am
currently doing this by looping through the strings and keeping the
term/count data in an ArrayList. I use a binary search to determine
whether the string is already in my list. If it is, I increment its
count. If not, I insert the string into the ArrayList at the appropriate
location to keep the list sorted so the binary search will continue to
work.
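The approach described above can be sketched roughly as follows. This is
an illustrative reconstruction under Java 1.4 constraints (no generics,
no autoboxing), not the actual production code; the names TermTally,
TermCount, and tally are hypothetical and exist only for this post.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class TermTally {

    /** One row of the result: a term and how many times it occurred. */
    public static class TermCount {
        public final String term;
        public int count;

        TermCount(String term) {
            this.term = term;
            this.count = 1;
        }
    }

    /** Orders rows alphabetically by term; used by the binary search. */
    private static final Comparator BY_TERM = new Comparator() {
        public int compare(Object a, Object b) {
            return ((TermCount) a).term.compareTo(((TermCount) b).term);
        }
    };

    /** Orders rows by count, highest first. */
    private static final Comparator BY_COUNT_DESC = new Comparator() {
        public int compare(Object a, Object b) {
            int x = ((TermCount) a).count;
            int y = ((TermCount) b).count;
            return x > y ? -1 : (x < y ? 1 : 0);
        }
    };

    /** Tallies the source strings, then sorts the rows by descending count. */
    public static List tally(String[] source) {
        List rows = new ArrayList(); // kept sorted by term while tallying
        for (int i = 0; i < source.length; i++) {
            TermCount probe = new TermCount(source[i]);
            int pos = Collections.binarySearch(rows, probe, BY_TERM);
            if (pos >= 0) {
                // Term already present: bump its count.
                ((TermCount) rows.get(pos)).count++;
            } else {
                // Not found: binarySearch returns -(insertionPoint) - 1.
                rows.add(-pos - 1, probe);
            }
        }
        // Final pass: reorder by count, highest first.
        Collections.sort(rows, BY_COUNT_DESC);
        return rows;
    }
}
```

Calling TermTally.tally(...) on the sample data returns Apple first with
a count of 3, then Banana with 2. Collections.sort is stable, so terms
with equal counts stay in alphabetical order.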
My method is fast, but I'm hoping it can be faster. I'm running this
code in a web app and it needs to be as fast as possible. The data
constantly changes so caching is not an option. It's also not coming
from a SQL data source so performing a query to generate this is also
not an option.
What I'm wondering is whether there's some built-in Java class that does
this sort of thing already and hopefully does it much faster than I'm
doing it. The code needs to work with Java version 1.4.2_09. I've looked
at TreeMap and some of the other collection classes, but I'm not sure
from the API docs which one might be best suited to my scenario.
I'm hoping the collective wisdom of this group can help save me some
time and aggravation.
Thanks in advance!
- Mike