Free Dictionary German

  • Thread starter HansHenning.Gabriel

HansHenning.Gabriel

Hi,

I would like to implement a small spell-checking tool. For that I need some
kind of free dictionary API that contains all possible German words.
Does anybody know where to get something like this?

Thanks!
 

Oliver Wong

HansHenning.Gabriel said:
Hi,

I would like to implement a small spell-checking tool. For that I need some
kind of free dictionary API that contains all possible German words.
Does anybody know where to get something like this?

This is the wrong approach, as German words can almost infinitely be
composed together to form yet more words. If you were to actually have a
file with every possible German word, the file would end up being several
gigabytes.

See http://j3e.de/ispell/igerman98/todo.html for more details and see
http://hunspell.sourceforge.net/ for an example implementation of a German
spellchecker.

- Oliver
 

Rhino

Oliver Wong said:
This is the wrong approach, as German words can almost infinitely be
composed together to form yet more words. If you were to actually have a
file with every possible German word, the file would end up being several
gigabytes.

While you're right that German words can be blended together to make bigger
words, I don't think anyone is expecting a dictionary to have every single
compound that can be made.

<Pedantic aside>I think the longest real German word I ever saw was in a
German textbook in a university course. It was something like:
Vierwaldstatterseedampfschiffgesellschaft. Despite this horrendously long
word, it is actually easy to break down:

Vier = four
wald = forest
Tattersee = a proper place name containing "see", which means "lake"
dampf = steam
schiff = ship
gesellschaft = company

If you put it all together, it meant something like:

Lake of the Four Woods Steamship Company

(There might have been another word in there too, something that meant
"travel" or "excursion" but I don't recall for sure.)

Basically, this is analogous to making "raincoat" from "rain" and "coat";
it's just that German does this more frequently than English.
 

Andrey Kuznetsov

Rhino said:
German textbook in a university course. It was something like:
Vierwaldstatterseedampfschiffgesellschaft. Despite this horrendously long
word, it is actually easy to break down:

There are some words which can be split in different ways, with funny
results:
rohrohrzucker

roh|rohr|zucker:
roh=raw
rohr=cane
zucker=sugar

rohr|ohr|zucker
rohr=cane
ohr=ear
zucker=sugar

The second one does not make sense,
but how should a spellchecker know about it?

Andrey
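
To make the ambiguity concrete, here is a minimal Java sketch that lists every way a word can be cut into known parts. The class name CompoundSplitter and the four-word part list are made up for illustration; a real spellchecker would need a much larger lexicon plus rules for linking elements, which this sketch ignores.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Sketch only: enumerates every segmentation of a word into known parts.
// The class name and the tiny part list are assumptions for illustration.
class CompoundSplitter {

    // Returns all ways to cut `word` so that every piece is in `parts`.
    static List<List<String>> splits(String word, Set<String> parts) {
        List<List<String>> results = new ArrayList<>();
        collect(word, 0, parts, new ArrayList<>(), results);
        return results;
    }

    // Backtracking search: try every known part that starts at `start`.
    private static void collect(String word, int start, Set<String> parts,
                                List<String> current, List<List<String>> results) {
        if (start == word.length()) {
            results.add(new ArrayList<>(current)); // whole word consumed
            return;
        }
        for (int end = start + 1; end <= word.length(); end++) {
            String candidate = word.substring(start, end);
            if (parts.contains(candidate)) {
                current.add(candidate);
                collect(word, end, parts, current, results);
                current.remove(current.size() - 1); // undo and try a longer piece
            }
        }
    }

    public static void main(String[] args) {
        Set<String> parts = Set.of("roh", "rohr", "ohr", "zucker");
        for (List<String> split : splits("rohrohrzucker", parts)) {
            System.out.println(String.join("|", split));
        }
        // prints:
        // roh|rohr|zucker
        // rohr|ohr|zucker
    }
}
```

Both splits come back as equally valid, which is exactly the problem: nothing in a plain word list says that "cane ear sugar" is nonsense.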
 

Rhino

Andrey Kuznetsov said:
There are some words which can be split in different ways, with funny
results:
rohrohrzucker

roh|rohr|zucker:
roh=raw
rohr=cane
zucker=sugar

rohr|ohr|zucker
rohr=cane
ohr=ear
zucker=sugar

The second one does not make sense,
but how should a spellchecker know about it?

I really don't know how you can write a spellchecker to handle a case like
that :)

I think this is just another case that a spellchecker will simply not handle
correctly. That's why spellcheckers typically have options to let the user
accept words that were flagged as errors.
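
The "let the user accept flagged words" option could be sketched roughly like this in Java. The class UserAwareChecker and its method names are hypothetical, and the base word list is a toy example; the point is only that accepted words live in a second set that is consulted alongside the shipped dictionary.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Sketch only: a checker with a user-maintained list of accepted words.
// Class and method names are assumptions, not any real library's API.
class UserAwareChecker {
    private final Set<String> baseWords;
    private final Set<String> acceptedByUser = new HashSet<>();

    UserAwareChecker(Set<String> baseWords) {
        this.baseWords = new HashSet<>(baseWords);
    }

    // Called when the user decides a flagged word is fine after all.
    void accept(String word) {
        acceptedByUser.add(word);
    }

    boolean isKnown(String word) {
        return baseWords.contains(word) || acceptedByUser.contains(word);
    }

    // Returns the words that are in neither the base nor the user list.
    List<String> flag(List<String> words) {
        List<String> flagged = new ArrayList<>();
        for (String word : words) {
            if (!isKnown(word)) {
                flagged.add(word);
            }
        }
        return flagged;
    }

    public static void main(String[] args) {
        UserAwareChecker checker = new UserAwareChecker(Set.of("the", "is", "new"));
        List<String> text = List.of("the", "Internet", "is", "new");
        System.out.println(checker.flag(text)); // prints [Internet]
        checker.accept("Internet");
        System.out.println(checker.flag(text)); // prints []
    }
}
```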

Let's face it: languages are constantly evolving and new words are joining
the language all the time. Fifty years ago, a spellchecker - whether a
software one or a human - would have rejected "Internet" since it wasn't a
word yet; today it is utterly commonplace and no spellchecker should ever
reject it. Any software spellchecker is always going to fail to recognize
the newest words.

It's just foolish to expect a perfect spellchecker.
 

HansHenning.Gabriel

Thanks for the links.
But my problem is still not solved. I do not need to check compound
words! I just need some kind of "normal" German dictionary that I can
access from within my Java code! I guess the hunspell project does not
have a Java API?!

So, does anybody have some more suggestions?
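
If a plain word list is enough, one possible approach needs no dictionary API at all: read the list into a HashSet and test membership. The sketch below assumes a UTF-8 text file with one word per line; the class name WordListDictionary is hypothetical, and the file path is whatever list you manage to obtain. Because German nouns are capitalized, the lookup accepts an exact match or the lowercased form (for words at the start of a sentence), but not a lowercased noun.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Sketch only: an in-memory dictionary backed by a plain word list.
// Assumes one word per line, UTF-8; class name is made up for illustration.
class WordListDictionary {
    private final Set<String> words = new HashSet<>();

    WordListDictionary(Collection<String> entries) {
        for (String entry : entries) {
            String trimmed = entry.trim();
            if (!trimmed.isEmpty()) {
                words.add(trimmed);
            }
        }
    }

    // Loads the word list from a file (path supplied by the caller).
    static WordListDictionary load(Path wordList) throws IOException {
        return new WordListDictionary(
                Files.readAllLines(wordList, StandardCharsets.UTF_8));
    }

    // Exact match first; otherwise allow a capitalized sentence-initial
    // form of a word that is listed in lowercase ("Gehen" for "gehen").
    boolean contains(String word) {
        if (words.contains(word)) {
            return true;
        }
        return words.contains(word.toLowerCase(Locale.GERMAN));
    }

    public static void main(String[] args) throws IOException {
        WordListDictionary dict = args.length > 0
                ? load(Path.of(args[0]))
                : new WordListDictionary(List.of("Haus", "gehen", "Zucker"));
        for (String word : List.of("Haus", "Gehen", "haus", "Zuker")) {
            System.out.println(word + " -> " + dict.contains(word));
        }
    }
}
```

A HashSet holding a few hundred thousand short strings fits comfortably in memory on a typical desktop JVM, so a plain base-form list is workable this way; it will of course flag every compound that is not listed.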
 

Thomas Fritsch

Rhino said:
Vierwaldstatterseedampfschiffgesellschaft.

Lake of the Four Woods Steamship Company

(There might have been another word in there too, something that meant
"travel" or "excursion" but I don't recall for sure.)

Right! Actually it was
Vierwaldstätterseedampfschifffahrtgesellschaft
where
fahrt = travel

A fairly obvious extension is :)
Vierwaldstätterseedampfschifffahrtgesellschaftskapitänsmütze
where
Kapitän = captain
Mütze = cap
==> Cap of the captain of the steamship travel company at the Four Woods
Lake Site
 
