regexp splitting problem

Discussion in 'Ruby' started by Brett S Hallett, Nov 29, 2003.

  1. Hi,
    I am trying to split the following line of text:

    <button> "btn Exit" "Exit Button" ( note the quotes may be
    " or ' , read from a file)

    in such a way that I can say

    txt = line.split(/regrex/)

    and get back

    txt[0] = <button>
    txt[1] = btn Exit
    txt[2] = Exit Button

    my current regexp

    ans = tst.split(/[\"|\']/)

    does this, except that the last set is missing:


    txt[0] = <button>
    txt[1] = btn Exit
    txt[2] =

    So how do I get the expression to continue processing the line?
    Thanks
     
    Brett S Hallett, Nov 29, 2003
    #1

  2. Brett S Hallett

    Maik Schmidt Guest

    Brett S Hallett wrote:
    >
    > ans = tst.split(/[\"|\']/)

    Your regex can be simplified, because within a character class the pipe
    character means "match a pipe character" and not "or". Additionally, you
    do not have to escape the quotes, so the resulting regex would be /["']/.
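A quick way to see the difference, as a minimal sketch (the helper name `split_on` and the sample string are made up for illustration, they are not from the original post):

```ruby
# Hypothetical helper: split text on the given pattern.
def split_on(pattern, text)
  text.split(pattern)
end

with_pipe    = /[\"|\']/  # the original class: matches ", | or '
without_pipe = /["']/     # simplified class: matches " or ' only

# Inside a character class the pipe is a literal character,
# so the original pattern also splits on "|".
p split_on(with_pipe, 'a|b"c')     # => ["a", "b", "c"]
p split_on(without_pipe, 'a|b"c')  # => ["a|b", "c"]
```

For the sample line in the question both patterns behave the same, since it contains no pipe.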
    >
    > does this , except that the last set is missing ! ,
    >

    That's not quite right. The last set isn't missing; the 3rd element is
    just the whitespace between the two quoted strings, and the text you
    want follows it at index 3. For easier debugging try:

    puts text.split(/["']/).join("\n")
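Printing the array with `p` instead of `puts` makes the whitespace-only element easier to spot. The sketch below also shows one possible cleanup that stays with `split`; the method names are hypothetical, and it assumes a quoted string never contains the other kind of quote (something like "it's" would be cut in two):

```ruby
# Hypothetical helper: the plain split from the original post.
def quote_split(line)
  line.split(/["']/)
end

# Hypothetical helper: same split, then trim each piece and
# drop the pieces that were only whitespace.
def clean_quote_split(line)
  quote_split(line).map(&:strip).reject(&:empty?)
end

line = '<button> "btn Exit" "Exit Button"'

p quote_split(line)        # => ["<button> ", "btn Exit", " ", "Exit Button"]
p clean_quote_split(line)  # => ["<button>", "btn Exit", "Exit Button"]
```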

    > so how do I get the expression to continue processing the line ??

    As mentioned before, that isn't the problem. You are looking for a
    regex that splits a line into tokens. Some of the tokens are enclosed in
    quotes and some are not, and both kinds can contain whitespace. I am not
    sure whether your problem can easily be solved with a single regex. If
    you can, you should change your input format.

    Is the first token always enclosed in [<>] characters? Are the following
    tokens always enclosed in quotes? Then it would be easier to split the
    line, but you still would need more than one split call. Maybe then it
    would fit in a single call of scan?

    Cheers,

    <maik/>
     
    Maik Schmidt, Nov 29, 2003
    #2

  3. "Brett S Hallett" <> wrote in message
    news:...
    > Hi,
    > I am trying to split the following line of text:
    >
    > <button> "btn Exit" "Exit Button" ( note the quotes may be
    > " or ' , read from a file)
    >
    > in such a way that I can say
    >
    > txt = line.split(/regrex/)
    >
    > and get back
    >
    > txt[0] = <button>
    > txt[1] = btn Exit
    > txt[2] = Exit Button
    >
    > my current regexp
    >
    > ans = tst.split(/[\"|\']/)
    >
    > does this , except that the last set is missing ! ,
    >
    >
    > txt[0] = <button>
    > txt[1] = btn Exit
    > txt[2] =
    >
    > so how do I get the expression to continue processing the line ??


    txt = line.scan(/"[^"]*" | '[^']*' | \S+/x)
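Note that a scan like this keeps the surrounding quotes in each token. If the bare text is wanted, one option is to add capture groups and pick whichever group matched. This is only a sketch of that idea; the method name `tokenize` is made up here:

```ruby
# Hypothetical helper: return the tokens of a line without their quotes.
# With capture groups, scan yields one array per match holding the three
# groups; exactly one of them is non-nil, so compact.first picks it.
def tokenize(line)
  line.scan(/"([^"]*)" | '([^']*)' | (\S+)/x).map { |groups| groups.compact.first }
end

p tokenize('<button> "btn Exit" "Exit Button"')
# => ["<button>", "btn Exit", "Exit Button"]

p tokenize(%q{<button> 'btn Exit' "it's"})
# => ["<button>", "btn Exit", "it's"]
```

Because each alternative only looks for its own closing quote, a double-quoted token may contain an apostrophe and vice versa.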

    robert
     
    Robert Klemme, Dec 1, 2003
    #3
  4. Brett S Hallett

    Alan Chen Guest

    Brett S Hallett <> wrote in message news:<>...
    > Hi,
    > I am trying to split the following line of text:
    >
    > <button> "btn Exit" "Exit Button" ( note the quotes may be
    > " or ' , read from a file)
    >
    > in such a way that I can say
    >
    > txt = line.split(/regrex/)
    >
    > and get back
    >
    > txt[0] = <button>
    > txt[1] = btn Exit
    > txt[2] = Exit Button


    This works for your example, but may be somewhat fragile when you go
    to expand its use over a wider range of inputs...

    require 'test/unit'

    class TC_one < Test::Unit::TestCase
      def test_01
        str = %Q/<button> "btn Exit" "Exit Button"/
        ans = str.split( / *[\"\'] *\"?/)

        assert_equal( ["<button>", "btn Exit", "Exit Button"], ans)
      end
    end
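To illustrate the fragility mentioned above: the trailing `\"?` in that pattern only swallows a following double quote, so the same line written with single quotes leaves an empty string in the result. A small sketch, with the hypothetical method name `quote_delim_split` wrapping the same split:

```ruby
# Hypothetical wrapper around the split pattern from the test above.
def quote_delim_split(line)
  line.split(/ *["'] *"?/)
end

p quote_delim_split('<button> "btn Exit" "Exit Button"')
# => ["<button>", "btn Exit", "Exit Button"]

# With single quotes the closing quote and the next opening quote are
# matched as two separate delimiters, so an empty string appears.
p quote_delim_split("<button> 'btn Exit' 'Exit Button'")
# => ["<button>", "btn Exit", "", "Exit Button"]
```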

    Cheers,
    - alan
     
    Alan Chen, Dec 3, 2003
    #4
