Formatting a long regex: can a character class [] be split over lines?

Discussion in 'Ruby' started by Alexey Muranov, May 1, 2011.

  1. Hello,
    I am wondering whether it is possible to split a character class ([...])
    in a Ruby regex over multiple lines.

    I know that the /x option makes the regex ignore whitespace, so I can write:

    email_format = /\A(
    [A-Za-z\d\!\#\$\%\&\'\*\+\-\/\=\?\^\_\`\{\|\}\~]+
    \.)*
    [A-Za-z\d\!\#\$\%\&\'\*\+\-\/\=\?\^\_\`\{\|\}\~]+
    @([a-z\d\-]+\.)+[a-z\d\-]+\z/x

    However, if I try to split inside a character class:

    name_format = /\A[A-Za-z\d
    \!\#\$\%\&\'\*\+\-\/\=\?\^\_\`\{\|\}\~]\z/x

    I get the warning:

    warning: character class has duplicated range

    (apparently because the whitespace characters end up inside [] more
    than once).
    I want spaces and newlines to be disregarded inside [] so that I can
    format the class over multiple lines. Is this possible?

    Thanks,

    Alexey.

    --
    Posted via http://www.ruby-forum.com/.
    Alexey Muranov, May 1, 2011
    #1

  2. [Note: parts of this message were removed to make it a legal post.]

    http://es.w3support.net/index.php?db=so&id=150095

    ---
    Jose Calderon-Celis





    Jose Calderon-Celis, May 1, 2011
    #2

  3. Alexey Muranov

    7stud -- Guest


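    1) Don't write a regex with that many escapes. Inside a character class
    most of the special regex characters lose their special meaning; only
    \, ], ^ (when it comes first) and - (between two characters) still need
    care, and Ruby also warns about an unescaped [.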

    2) Break up long regexes into smaller pieces.


    # '-' goes last so it is a literal hyphen rather than the range '+'..'/'
    my_char_class = '[A-Za-z\d!#$%&\'*+/=?^_`{|}~-]'

    my_regex = /
    \A
    #{my_char_class}
    [.]
    #{my_char_class}
    \z
    /x


    if my_regex.match "?./"
      puts 'yes'
    end

    --output:--
    yes
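
    The same idea can be pushed one step further to split the character
    class itself over several lines. This is only a sketch, not something
    from the original posts: the names name_chars, name_class and
    name_format are made up for illustration, and the pieces are plain
    single-quoted strings joined before they are interpolated.

```ruby
# Each piece is one line of the character class body.
# Single quotes keep \d as a literal backslash + d for the regex engine.
name_chars = [
  'A-Za-z\d',        # letters and digits
  '!#$%&\'*+/=?^_`', # punctuation, first part
  '{|}~',            # punctuation, second part
  '-'                # literal hyphen, kept last so it cannot form a range
].join

name_class  = "[#{name_chars}]"
name_format = /\A#{name_class}+\z/

puts name_format.match?("foo+bar")  # => true
puts name_format.match?("foo bar")  # => false (space is not in the class)
```

    Because the whitespace lives in the Ruby source and not in the regex,
    the pieces can be indented and commented freely.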

    7stud --, May 2, 2011
    #3
  4. Alexey Muranov

    7stud -- Guest

    It's also possible to escape a newline:

    name_format = /\A[A-Za-z\d\
    \!\#\$\%\&\'\*\+\-\/\=\?\^\_\`\{\|\}\~]\z/x

    but then you can't indent the second line, or else the character class
    will contain those spaces (/x does not ignore whitespace inside []).
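
    A small sketch of that difference, using a shortened class so it is
    easy to read. The variable names flush_left and indented are made up
    for the example, and the behaviour shown is what I would expect from
    current Ruby versions rather than something verified in this thread.

```ruby
# Backslash-newline inside a regex literal acts as a line continuation,
# so this class contains neither a newline nor a space.
flush_left = /\A[A-Za-z\d\
_~-]\z/x

# Same pattern with the continuation line indented: the two leading
# spaces become members of the class, because /x only ignores
# whitespace outside of [].
indented = /\A[A-Za-z\d\
  _~-]\z/x

puts flush_left.match?(" ")  # space is not in the class
puts indented.match?(" ")    # space is in the class
```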

    7stud --, May 2, 2011
    #4
  5. Alexey Muranov

    7stud -- Guest

    You could also use a here doc and avoid having to escape any character
    inside the string:

    str = <<'LOTS_OF_SYMBOLS'
    [A-Za-z\d!#$%&'*+/=?^_`{|}~-]
    LOTS_OF_SYMBOLS

    puts str.chomp

    --output:--
    [A-Za-z\d!#$%&'*+/=?^_`{|}~-]
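
    A here doc also makes it possible to spread the class body over several
    indented lines, as long as the whitespace is stripped before the string
    becomes a regex. The following is a sketch under that assumption; the
    names class_body and name_format and the CHARS delimiter are invented
    for the example.

```ruby
# The single-quoted heredoc keeps every character as written;
# gsub then removes all spaces and newlines used for layout.
class_body = <<'CHARS'.gsub(/\s+/, '')
  A-Za-z\d
  !#$%&'*+/=?^_`
  {|}~-
CHARS

# Regexp.new compiles the assembled string; \\A and \\z are the
# string-escaped forms of the \A and \z anchors.
name_format = Regexp.new("\\A[#{class_body}]+\\z")

puts name_format.source
puts name_format.match?("a+b")  # => true
puts name_format.match?("a.b")  # => false
```

    One limitation: since all whitespace is stripped, a class that really
    needs to match a space would have to spell it as \s or \x20.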

    7stud --, May 2, 2011
    #5
