Compress Lexico

kubek82

Hi.
I have a list of words stored in a file. The words (Strings) are separated
by newline characters.
I need to keep the content of the file in memory at run time, but
for some languages the file grows quite
large.

I would like to compress the content of the file and load it into
memory.
Is it worth doing that with a binary tree? Or would plain Huffman coding
be the better choice? With a compression factor of 2, I would halve the
memory use for a small programming effort.

Is a binary tree a good solution for Scrabble? Let's say I want to go
through the lexicon and check whether a given word can be formed from my
letters.

I don't expect anyone to do my homework for me; I just hope you can
give me some ideas, keywords I should look for, maybe links to code
snippets, etc.

With regards
Chris
 
Ben Kraufmann

Hi!

You can compress and decompress the file itself using the
java.util.zip package.
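
A minimal sketch of that java.util.zip suggestion, assuming the word list is held as UTF-8 text and compressed with GZIP. The class name WordListGzip and its two helper methods are made up for illustration; only the GZIPOutputStream and GZIPInputStream classes come from the JDK.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper: packs a newline-separated word list with GZIP.
class WordListGzip {

    // Join the words with '\n' and gzip the UTF-8 bytes.
    static byte[] compress(List<String> words) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(String.join("\n", words).getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Unpack the gzip bytes and split them back into one word per line.
    static List<String> decompress(byte[] compressed) {
        List<String> words = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                new GZIPInputStream(new ByteArrayInputStream(compressed)),
                StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                words.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return words;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("dog", "cat", "cats", "candle");
        byte[] packed = compress(words);
        System.out.println(decompress(packed).equals(words));
    }
}
```

Keep in mind that a gzip-compressed list cannot be searched directly: it has to be unpacked before lookups, so this mainly shrinks the file on disk rather than the working set in memory.
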
Hmmm, Scrabble... Have a look at n-ary search trees. The idea is that a
path in the search tree represents a word in your dictionary:

-|-d-o-g.
 |-c-a-t.-s.
     |-n-d-l-e.

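Imagine this was a tree ;)
Each '.' means that you have found a complete word; the path may still
continue, as with cat and cats.
I think this kind of tree is called a trie (prefix tree). A Directed
Acyclic Word Graph is the closely related structure that also merges
shared word endings, which makes it more compact still.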
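
The tree sketched above could look roughly like this in Java. This is only an illustration under a few assumptions: the class name LexiconTrie and its methods are invented for the example, each node keeps its children in a TreeMap, and a boolean flag plays the role of the '.' marker. The wordsFromRack method shows one way to answer the Scrabble question by walking only those branches whose letters are still available on the rack.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical n-ary search tree over characters, one path per word.
class LexiconTrie {

    private static final class Node {
        final Map<Character, Node> children = new TreeMap<>();
        boolean isWord; // plays the role of the '.' in the diagram
    }

    private final Node root = new Node();

    // Insert a word, creating nodes along its path as needed.
    void add(String word) {
        Node node = root;
        for (char c : word.toCharArray()) {
            node = node.children.computeIfAbsent(c, k -> new Node());
        }
        node.isWord = true;
    }

    // True only if the path exists and ends on a word marker.
    boolean contains(String word) {
        Node node = root;
        for (char c : word.toCharArray()) {
            node = node.children.get(c);
            if (node == null) {
                return false;
            }
        }
        return node.isWord;
    }

    // All stored words that use each rack letter at most once.
    List<String> wordsFromRack(String rack) {
        Map<Character, Integer> available = new TreeMap<>();
        for (char c : rack.toCharArray()) {
            available.merge(c, 1, Integer::sum);
        }
        List<String> found = new ArrayList<>();
        collect(root, available, new StringBuilder(), found);
        return found;
    }

    // Depth-first walk that spends one rack letter per edge and backtracks.
    private void collect(Node node, Map<Character, Integer> available,
                         StringBuilder prefix, List<String> found) {
        if (node.isWord) {
            found.add(prefix.toString());
        }
        for (Map.Entry<Character, Node> edge : node.children.entrySet()) {
            char c = edge.getKey();
            int left = available.getOrDefault(c, 0);
            if (left == 0) {
                continue; // letter not on the rack (or already used up)
            }
            available.put(c, left - 1);
            prefix.append(c);
            collect(edge.getValue(), available, prefix, found);
            prefix.setLength(prefix.length() - 1);
            available.put(c, left);
        }
    }

    public static void main(String[] args) {
        LexiconTrie trie = new LexiconTrie();
        for (String w : new String[] {"dog", "cat", "cats", "candle"}) {
            trie.add(w);
        }
        System.out.println(trie.wordsFromRack("tacsdog")); // [cat, cats, dog]
    }
}
```

Shared prefixes such as "ca" are stored once, which is where the memory saving comes from. A per-node map is convenient but not the most compact layout, so a real lexicon might use arrays or a packed representation instead.
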

Ben
 
