problem matching accented chars on OS X

Discussion in 'Ruby' started by Alex Fenton, Jun 11, 2005.

  1. Alex Fenton

    Alex Fenton Guest

    Hi

    I'm finding words within strings in Western European languages, so I
    need to account for accented characters, such as ê (e circumflex) and
    à (a grave). On Ruby 1.8.2 on Windows (MSW) the following works for me
    (simplified):

    WORD_PATTERN = /^[\w\xC0-\xD6\xD8-\xF6\xF8-\xFF]+$/s

    \w gets me a-z and A-Z (plus digits and underscore); the hex ranges are
    the positions of the accented characters in ISO-8859-1 encoding. This
    seems to work, but when I run the same code on OS X, I get

    .../lib/weft/backend/sqlite.rb:533: mismatch multibyte code length in
    char-class range: /^[\w\xC0-\xD6\xD8-\xF6\xF8-\xFF]+$/ (SyntaxError)
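
A possible explanation, offered as an assumption and not a confirmed diagnosis: in Ruby the trailing `s` on a regex literal selects the Shift_JIS encoding (it is Perl where `/s` means "dot matches newline"), so the engine may be reading `\xD8-\xF6` as a range between a one-byte and a two-byte character. The sketch below uses the `n` flag ("no multibyte encoding") instead, so each `\xNN` is treated as a single byte. It is written for current Ruby; `latin1_word?` is a hypothetical helper name, and `String#b` does not exist in 1.8.2.

```ruby
# Sketch only. Assumption: the input is ISO-8859-1 bytes, as in the post.
# The /n flag marks the pattern as having no multibyte encoding, so
# \xC0-\xD6 is read as a range of single bytes.
# \A and \z anchor to the whole string (^ and $ anchor to lines in Ruby).
WORD_PATTERN = /\A[\w\xC0-\xD6\xD8-\xF6\xF8-\xFF]+\z/n

# latin1_word? is a hypothetical helper, not part of the original code.
def latin1_word?(bytes)
  # String#b (Ruby 2.0+) returns a binary copy, so the string's encoding
  # agrees with the byte-oriented pattern.
  !!(WORD_PATTERN =~ bytes.b)
end

latin1_word?("\xEAtre")    # the bytes for "être" in ISO-8859-1
latin1_word?("deux mots")  # contains a space, so not a single word
```

On Ruby 1.8 itself, only the change of flag from `s` to `n` (or dropping the flag) would apply; whether that alone removes the error on OS X is untested here.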

    Any pointers? I'm not sure what is going wrong.

    Is there a library that can help me match letter characters (ideally in
    a variety of codesets)? The [:alpha:] regex class seemed to be
    synonymous with \w, which doesn't match enough.
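
For matching letters across several codesets, one approach in later Ruby versions (1.9 onward, so not available in the 1.8.2 mentioned above) is to transcode the input to UTF-8 and match with the Unicode letter property `\p{L}`. This is a minimal sketch under that assumption; `LETTERS_ONLY` and `letters_only?` are hypothetical names, and `Regexp#match?` needs Ruby 2.4 or later.

```ruby
# Sketch for modern Ruby only; none of this exists in Ruby 1.8.2.
# \p{L} matches any Unicode letter, accented or not.
LETTERS_ONLY = /\A\p{L}+\z/

# letters_only? is a hypothetical helper. The caller states which
# encoding the raw bytes are in; ISO-8859-1 is assumed by default.
def letters_only?(raw, source_encoding = Encoding::ISO_8859_1)
  # Label the bytes with their real encoding, then transcode to UTF-8.
  text = raw.dup.force_encoding(source_encoding).encode(Encoding::UTF_8)
  LETTERS_ONLY.match?(text)
end

letters_only?("\xEAtre")                   # ISO-8859-1 bytes for "être"
letters_only?("\u00E0", Encoding::UTF_8)   # "à" already in UTF-8
letters_only?("abc123")                    # digits are not letters
```

Note that `\p{L}` is not equivalent to the pattern in the post: it excludes digits and underscore, and it accepts letters outside the Latin-1 range.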

    cheers
    alex
     
    Alex Fenton, Jun 11, 2005
    #1