RegExp pattern to escape ALL special characters (but exclude unicode chars)

Discussion in 'Javascript' started by Gabriela, Dec 22, 2008.

  1. Gabriela

    Gabriela Guest

    Hi,
    I'd like to write a regexp that converts all special chars to "-".
    I've used this pattern
    [^a-z0-9]
    with ignore case, and it works beautifully.
    BUT - I also want to support unicode chars (and not escape them).
    I could not find a way to do it, except for listing all special
    characters "manually" and escaping them. I'd rather prepare a
    "whitelist" of the chars allowed than a "blacklist" of all special
    chars.
    Any ideas?
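
    To make the problem concrete, here is a minimal sketch of the pattern above applied to ASCII and non-ASCII input. The function name replaceSpecial is only illustrative.

```javascript
// The pattern from the post: anything that is not an ASCII letter or digit.
const asciiOnly = /[^a-z0-9]/gi;

// Hypothetical helper name, used only for this illustration.
function replaceSpecial(text) {
  return text.replace(asciiOnly, "-");
}

console.log(replaceSpecial("hello world!")); // "hello-world-"
// The accented letter is replaced too, which is the unwanted part.
console.log(replaceSpecial("caf\u00e9"));    // "caf-"
```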
    Thanx,
    Gabi
    Gabriela, Dec 22, 2008
    #1

  2. Re: RegExp pattern to escape ALL special characters (but exclude unicode chars)

    Gabriela wrote:

    > I'd like to write a regexp that converts all special chars to "-".
    > I've used this pattern
    > [^a-z0-9]
    > with ignore case, and it works beautifully.
    > BUT - I also want to support unicode chars (and not escape them).
    > I could not find a way to do it, except for listing all special
    > characters "manually" and escaping them. I'd rather prepare a
    > "whitelist" of the chars allowed than a "blacklist" of all special
    > chars.


    Well, in a language where a string is a sequence of Unicode characters,
    any character is a Unicode character, so I have no idea which kinds of
    characters you want to convert and which you don't.

    --

    Martin Honnen
    http://JavaScript.FAQTs.com/
    Martin Honnen, Dec 22, 2008
    #2

  3. Gabriela

    Gabriela Guest

    Re: RegExp pattern to escape ALL special characters (but exclude unicode chars)

    On Dec 22, 5:57 pm, Martin Honnen wrote:
    > Gabriela wrote:
    > > I'd like to write a regexp that converts all special chars to "-".
    > > I've used this pattern
    > > [^a-z0-9]
    > > with ignore case, and it works beautifully.
    > > BUT - I also want to support unicode chars (and not escape them).
    > > I could not find a way to do it, except for listing all special
    > > characters "manually" and escaping them. I'd rather prepare a
    > > "whitelist" of the chars allowed than a "blacklist" of all special
    > > chars.

    >
    > Well, in a language where a string is a sequence of Unicode characters,
    > any character is a Unicode character, so I have no idea which kinds of
    > characters you want to convert and which you don't.
    >
    > --
    >
    > Martin Honnen
    > http://JavaScript.FAQTs.com/


    Isn't there a distinction between a special character (!@#$%^&*()_-
    +..._) and all alphanumeric/literal characters?
    Gabriela, Dec 22, 2008
    #3
  4. Re: RegExp pattern to escape ALL special characters (but exclude unicode chars)

    Gabriela wrote:

    > Isn't there a distinction between a special character (!@#$%^&*()_-
    > +..._) and all alphanumeric/literal characters?


    Maybe you are looking for letters and digits. Unicode defines classes
    for those, but the regular expression language in JavaScript/ECMAScript
    does not have much support for such constructs.
    \d
    is defined as 0..9, \D as anything which is not in \d.
    \w
    is defined as a..zA..Z0..9_, \W as anything which is not in \w.
    Then there is \s for white space characters. And \S for anything not a
    white space character.

    Other than that you need to define your own ranges of characters.
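
    As a rough sketch of what "your own ranges" could look like: the ranges below (Latin-1 letters, Latin Extended-A, Hebrew letters) are only an assumed example of scripts one might want to keep, and the names allowed, notAllowed and dashify are hypothetical. Any script not listed is still replaced.

```javascript
// Hypothetical whitelist: ASCII letters and digits plus hand-picked ranges.
// The choice of ranges is an assumption for illustration, not a complete list.
const allowed =
  "a-zA-Z0-9" +
  "\\u00C0-\\u00D6\\u00D8-\\u00F6\\u00F8-\\u00FF" + // Latin-1 letters (skips U+00D7 and U+00F7)
  "\\u0100-\\u017F" +                               // Latin Extended-A
  "\\u05D0-\\u05EA";                                // Hebrew letters alef..tav

// Negated character class: matches every character NOT on the whitelist.
const notAllowed = new RegExp("[^" + allowed + "]", "g");

function dashify(text) {
  return text.replace(notAllowed, "-");
}

console.log(dashify("caf\u00e9 au lait!")); // "café-au-lait-"
```

    The whitelist is kept as a string so that ranges can be added or removed in one place before the pattern is built.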


    --

    Martin Honnen
    http://JavaScript.FAQTs.com/
    Martin Honnen, Dec 22, 2008
    #4
  5. Tim Greer

    Tim Greer Guest

    Re: RegExp pattern to escape ALL special characters (but exclude unicode chars)

    Gabriela wrote:

    > Hi,
    > I'd like to write a regexp that converts all special chars to "-".
    > I've used this pattern
    > [^a-z0-9]
    > with ignore case, and it works beautifully.
    > BUT - I also want to support unicode chars (and not escape them).
    > I could not find a way to do it, except for listing all special
    > characters "manually" and escaping them. I'd rather prepare a
    > "whitelist" of the chars allowed than a "blacklist" of all special
    > chars.
    > Any ideas?
    > Thanx,
    > Gabi


    Just remember, it's better to deny all by default and then specifically
    have a list (whitelist) of allowed characters, than to try and
    specifically list all invalid characters.
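
    A minimal sketch of such a deny-by-default whitelist, assuming an engine that implements Unicode property escapes (added to the language in ES2018, so not available in older browsers). The name slugify is hypothetical; \p{L}, \p{M} and \p{N} are the Unicode general categories for letters, combining marks and numbers.

```javascript
// Whitelist by Unicode category: keep letters, combining marks and numbers,
// replace everything else. Requires the "u" flag and ES2018 support.
const notLetterOrDigit = /[^\p{L}\p{M}\p{N}]/gu;

// Hypothetical helper name, used only for this illustration.
function slugify(text) {
  return text.replace(notLetterOrDigit, "-");
}

console.log(slugify("na\u00efve caf\u00e9!")); // "naïve-café-"
console.log(slugify("a_b"));                   // "a-b"
```

    Including \p{M} keeps accents that are stored as separate combining characters; drop it if only precomposed letters are expected.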
    --
    Tim Greer, CEO/Founder/CTO, BurlyHost.com, Inc.
    Shared Hosting, Reseller Hosting, Dedicated & Semi-Dedicated servers
    and Custom Hosting. 24/7 support, 30 day guarantee, secure servers.
    Industry's most experienced staff! -- Web Hosting With Muscle!
    Tim Greer, Dec 22, 2008
    #5
