recognizing language

Discussion in 'Java' started by Andrius Klimavičius, Jul 1, 2005.

  1. Hello,

    I have a task to distinguish between two languages: English and Lithuanian.
    My first idea is to search for language-specific letters, and also for
    common words such as "are", "to", "into", "the", etc. Am I going the right way? :)

    Any ideas on how to do that?
     
    Andrius Klimavičius, Jul 1, 2005
    #1

  2. On 1 Jul 2005 01:35:28 -0700, Andrius Klimavičius wrote:
    > First thing, i think, is to search for specific letters, also for
    > words such like: are, to, into, the etc. Am i going the right way?:)
    >
    > any ideas how to do that?


    Yes, I believe that searching for "stop words" in each of the
    languages is a simple and reasonably accurate method.

    This kind of problem is often discussed in comp.ai.nat-lang, so you
    might want to ask there too.

    There are various lists of stop words available on the web,
    such as this one: http://meta.wikimedia.org/wiki/Stop_word_list
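
    As a rough illustration of the stop-word idea, here is a minimal sketch
    in Java (9 or later, because of Set.of). The class name LanguageGuesser,
    the method guess, and the two tiny word lists are assumptions made up for
    this example; a real list from a source like the one above would be much
    longer. It also counts Lithuanian-specific letters, as suggested in the
    question, and treats each one as an extra vote for Lithuanian.

```java
import java.util.Locale;
import java.util.Set;

// Hypothetical helper: guesses English vs. Lithuanian from stop words and letters.
final class LanguageGuesser {
    // Tiny illustrative samples only, not complete stop-word lists.
    private static final Set<String> ENGLISH =
            Set.of("the", "are", "to", "into", "and", "of", "is", "in");
    private static final Set<String> LITHUANIAN =
            Set.of("ir", "yra", "kad", "bet", "su", "tai", "kaip", "\u012f");

    // Lowercase letters used in Lithuanian but not in English,
    // written as Unicode escapes to avoid source-encoding problems.
    private static final String LITHUANIAN_LETTERS =
            "\u0105\u010d\u0119\u0117\u012f\u0161\u0173\u016b\u017e";

    static String guess(String text) {
        String lower = text.toLowerCase(Locale.ROOT);
        int english = 0;
        int lithuanian = 0;

        // Split on runs of non-letters; \p{L} matches any Unicode letter.
        for (String word : lower.split("[^\\p{L}]+")) {
            if (ENGLISH.contains(word)) english++;
            if (LITHUANIAN.contains(word)) lithuanian++;
        }

        // Each Lithuanian-specific letter counts as one more vote.
        for (int i = 0; i < lower.length(); i++) {
            if (LITHUANIAN_LETTERS.indexOf(lower.charAt(i)) >= 0) lithuanian++;
        }

        if (english == lithuanian) return "unknown";
        return english > lithuanian ? "English" : "Lithuanian";
    }
}
```

    Calling LanguageGuesser.guess("The cats are going into the house")
    returns "English". Very short inputs can easily tie or mislead, so the
    "unknown" result is worth handling in the caller.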

    /gordon

    --
    [ do not email me copies of your followups ]
    g o r d o n + n e w s @ b a l d e r 1 3 . s e
     
    Gordon Beaton, Jul 1, 2005
    #2

  3. Thanks, Gordon :)
     
    Andrius Klimavičius, Jul 4, 2005
    #3
