Question about hashing

Guest

Dear All,

I have a text file that contains 10,000,000 data items. I would like to use
hashing to look up the desired data instead of doing a linear search. Could
you please give me some hints on how to implement that?

Best and Regards,
David
 
Benji

You're going to have to be more specific about your requirements. What are
the "data"? What are your size and speed requirements?

If you have 10 million of anything that's even close to large, you won't be
able to store it all in memory, and you'll have to index the data if you
want searching to be fast. (Whether that's possible depends on what
kind of data it is, so please be more specific.)
 
Guest

I'm glad to receive your reply. Actually, I have a frequency table that
stores words and their frequencies. I tried reading these values into a
hashtable and then searching for a word. However, it took a long time to finish.
Thanks.

Frequency Table :
Word Frequency
Java 45
.NET 11

The size of this table is only 5 MB.

Best and Regards,
David
 
Ingo R. Homann

Hi,

That doesn't sound like a problem, I think. Can you post a code snippet showing
how you read the file and how you store the data in a HashMap?

What takes so long? Reading the file or looking up an entry? What do
you mean by "long"? 10 ms? 1 sec? 10 sec?
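
One way to answer those questions is to time the two phases separately. The following is only a rough sketch, assuming Java and the "word frequency" line format shown in the table above; the class name FrequencyTable, the method parse, and taking the file path from args[0] are all made up for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not taken from the original poster's code.
final class FrequencyTable {
    /** Parses "word frequency" lines into a map; lines that do not fit are skipped. */
    static Map<String, Integer> parse(List<String> lines) {
        Map<String, Integer> table = new HashMap<>();
        for (String line : lines) {
            String[] parts = line.trim().split("\\s+");
            if (parts.length != 2) {
                continue; // blank or malformed line
            }
            try {
                table.put(parts[0], Integer.parseInt(parts[1]));
            } catch (NumberFormatException e) {
                // e.g. the "Word Frequency" header line: skip it
            }
        }
        return table;
    }

    public static void main(String[] args) throws IOException {
        long t0 = System.nanoTime();
        Map<String, Integer> table = parse(Files.readAllLines(Path.of(args[0])));
        long t1 = System.nanoTime();
        Integer frequency = table.get("Java"); // a single hash lookup
        long t2 = System.nanoTime();

        System.out.printf("load:   %d ms (%d entries)%n", (t1 - t0) / 1_000_000, table.size());
        System.out.printf("lookup: %d us (Java -> %s)%n", (t2 - t1) / 1_000, frequency);
    }
}
```

If the load time dominates, the file reading is the place to look; if the lookup is slow, something other than the HashMap itself is probably at fault.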

Ciao,
Ingo
 
Roedy Green

You would have to break each line into words. You would do that with
a regex split. See http://mindprod.com/jgloss/regex.html

Then you would add each word to a HashMap, with the word as the key and the
line number or offset as the value.

see http://mindprod.com/jgloss/hashmap.html
http://mindprod.com/jgloss/hashtable.html

The default hashCode for String will do fine.
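
The steps above can be sketched roughly as follows. This is only an illustration: the class name WordIndex and the method build are invented, the split is on runs of whitespace, and keeping only the first line on which a word occurs is an assumption, since the thread does not say what should happen to repeated words.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical helper illustrating the regex split plus HashMap approach.
final class WordIndex {
    // Compiled once and reused for every line.
    private static final Pattern WHITESPACE = Pattern.compile("\\s+");

    /** Maps each word to the 1-based number of the first line it appears on. */
    static Map<String, Integer> build(List<String> lines) {
        Map<String, Integer> index = new HashMap<>();
        int lineNumber = 0;
        for (String line : lines) {
            lineNumber++;
            for (String word : WHITESPACE.split(line.trim())) {
                if (!word.isEmpty()) {
                    // Assumption: the first occurrence wins; later ones are ignored.
                    index.putIfAbsent(word, lineNumber);
                }
            }
        }
        return index;
    }
}
```

A lookup is then a single call such as WordIndex.build(lines).get("Java"), which relies on String's own hashCode and equals.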
 
Guest

Hi,

I simply use a StreamReader to read the file with readline(), then put
the word as the key and the frequency as the value into the hashtable.
After updating the frequencies in the hashtable, I write the whole
table back to the frequency table file.
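
For comparison, the update-and-write-back part of that cycle might look roughly like this in Java. It is a sketch under assumptions: the names FrequencyUpdater, increment, toLines and save are invented, the output is one "word frequency" pair per line, and the table is written once after all updates rather than after each one.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper; not the original poster's code.
final class FrequencyUpdater {
    /** Adds one occurrence of the word, inserting it with count 1 if absent. */
    static void increment(Map<String, Integer> table, String word) {
        table.merge(word, 1, Integer::sum);
    }

    /** Renders the table as "word frequency" lines, sorted by word. */
    static List<String> toLines(Map<String, Integer> table) {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : new TreeMap<>(table).entrySet()) {
            lines.add(entry.getKey() + " " + entry.getValue());
        }
        return lines;
    }

    /** Writes the whole table in one call, after all updates are done. */
    static void save(Map<String, Integer> table, Path file) throws IOException {
        Files.write(file, toLines(table));
    }
}
```

Rewriting the file after every single update would repeat the full I/O cost each time, so that is one thing worth ruling out when the program seems slow.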

Thanks.
 
Chris Uppal

You'll have to post some code. And say why you think it's too slow.

(Yes, that /is/ just repeating the questions that Ingo has already asked, but
which you didn't answer).

-- chris
 