How to guess the language of a given text string?


Roman

Does anybody know an easy way (or tool) to guess the language of a
given text string?

e.g.
Feeding in "This is an example." --> should return "english" or ISO code
Feeding in "Das ist ein Beispiel." --> should return "german" or ISO code
Feeding in "Esto es un ejemplo." --> should return "spanish" or ISO code

I would prefer something more lightweight than using nltk/corpus/...

And it's OK if the success rate is only about 90% or so.

Roman
 

Lonnie Princehouse

Counting how many words from the text appear in common-word
dictionaries for English, German, and Spanish would be an easy way to
get a crude guess at the language.
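
As a rough illustration of that idea, here is a minimal sketch using only the standard library. The name guess_language and the tiny word lists are made up for this example (they are assumptions, not a vetted dictionary), so a real version would need larger lists of common words per language.

```python
import re

# Tiny illustrative lists of common words, keyed by ISO 639-1 code.
# These lists are assumptions for the sketch; real use needs larger ones.
COMMON_WORDS = {
    "en": {"the", "is", "this", "an", "a", "and", "of", "to", "in", "it"},
    "de": {"das", "ist", "ein", "eine", "der", "die", "und", "nicht", "zu", "ich"},
    "es": {"esto", "es", "un", "una", "el", "la", "los", "y", "de", "que"},
}


def guess_language(text):
    """Return the code of the language whose common words match most often.

    Returns None when no word of the text appears in any list.
    """
    # Split into lowercase runs of letters (Unicode-aware, so umlauts survive).
    words = re.findall(r"[^\W\d_]+", text.lower())
    # Score each language by how many words of the text are in its list.
    scores = {
        lang: sum(word in vocab for word in words)
        for lang, vocab in COMMON_WORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


print(guess_language("This is an example."))
print(guess_language("Das ist ein Beispiel."))
print(guess_language("Esto es un ejemplo."))
```

Very short strings, or strings that share words across languages, can still tie or be misclassified. On a tie this sketch simply returns whichever language comes first in COMMON_WORDS.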
 
