Regexp Unicode property names strange behavior?

Discussion in 'Ruby' started by Ammar Ali, Oct 23, 2010.

  1. Ammar Ali

    Ammar Ali Guest

    On 1.9.2, I'm seeing an "invalid character property name" error from Regexp
    for the named properties Any, Ascii, and Xdigit, but none of the others. If
    I add the u option to the expression, it works.


    # All good, with or without u option
    [ 'Alnum', 'Alpha', 'Blank', 'Cntrl', 'Digit', 'Graph', 'Lower',
    'Print', 'Punct', 'Space', 'Upper', 'Word'
    ].each {|name| puts /\p{#{name}}/ }


    # Errors raised without u option
    ['Any', 'Ascii', 'Xdigit'].each {|name| puts /\p{#{name}}/ }


    # Now it's good
    ['Any', 'Ascii', 'Xdigit'].each {|name| puts /\p{#{name}}/u }


    I expected that all the names would either require the u option, or they
    wouldn't. If it was just Any and Ascii, I would accept it and move on, but
    Xdigit doesn't seem to belong with the other two.

    Trying to understand why Any, Ascii, and Xdigit are "special". Any clues
    greatly appreciated.


    Thanks,
    Ammar
     
    Ammar Ali, Oct 23, 2010
    #1

  2. Ammar Ali

    Ammar Ali Guest

    It was bad documentation! Xdigit should be XDigit, and Ascii should be
    ASCII. Any requires an encoding to be specified (hence the u option),
    which makes sense.

    Sorry about the noise.
    Ammar
     
    Ammar Ali, Oct 23, 2010
    #2
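A minimal sketch for checking which property names a given Ruby build accepts, rather than relying on documentation. The helper name `property_compiles?` is hypothetical, not part of Ruby. Forcing the pattern string to US-ASCII is an assumption about how to approximate the "no u option" case from the post. Results depend on the Ruby version: later releases may accept spellings that 1.9.2 rejected, so no output is shown here.

```ruby
# Hypothetical helper (not part of Ruby): returns true if \p{name} compiles
# when the pattern string carries the given encoding, false if Regexp
# rejects it (e.g. "invalid character property name").
def property_compiles?(name, encoding)
  source = "\\p{#{name}}".force_encoding(encoding)
  Regexp.new(source)
  true
rescue RegexpError
  false
end

# Compare the spellings from the post under both encodings.
# US-ASCII stands in for "without u", UTF-8 for "with u" (an assumption).
%w[Any ASCII Ascii XDigit Xdigit].each do |name|
  ascii = property_compiles?(name, 'US-ASCII')
  utf8  = property_compiles?(name, 'UTF-8')
  puts format('%-7s US-ASCII=%-5s UTF-8=%s', name, ascii, utf8)
end
```

If a name reports false under one encoding and true under the other, the spelling or the encoding is the likely culprit, which is the situation described in the post.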

  3. Ryan Davis

    Ryan Davis Guest

    On Oct 23, 2010, at 00:54 , Ammar Ali wrote:

    > On Sat, Oct 23, 2010 at 9:40 AM, Ammar Ali <> wrote:
    >
    >> On 1.9.2, I'm seeing an "invalid character property name" error from Regexp
    >> for the named properties Any, Ascii, and Xdigit, but none of the others. If
    >> I add the u option to the expression, it works.
    >
    > It was bad documentation! Xdigit should be XDigit, and Ascii should be
    > ASCII. Any requires encoding to be specified, which makes sense.


    dude. help out. where?
     
    Ryan Davis, Oct 23, 2010
    #3
  4. Ammar Ali

    Ammar Ali Guest

    On Sat, Oct 23, 2010 at 12:13 PM, Ryan Davis <> wrote:

    >
    > dude. help out. where?
    >



    If I understood your question correctly, then the erroneous docs are at:
    http://ruby.runpaint.org/regexps#properties

    I submitted an issue against it, with reference to source code, at:
    http://github.com/runpaint/read-ruby/issues/issue/68

    It has been very difficult finding detailed information about many of the
    1.9 regular expression features. Read Ruby has the most coverage I have
    found so far.

    Regards,
    Ammar
     
    Ammar Ali, Oct 23, 2010
    #4
