Simple question: combine a quoted string into a single token

Discussion in 'Ruby' started by Squeamizh, Aug 7, 2006.

  1. Squeamizh

    Squeamizh Guest

    Hi,

    I have a program which separates each line of a text file into tokens,
    using whitespace as a delimiter (I do this with String#split). This
    suits my needs for the most part, but now I need the ability to treat
    quoted strings as single tokens. Note that the quoted string could be
    multiple words, or even a 0-length string.
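
To make the problem concrete, here is a small sketch of what a plain whitespace split does to a quoted phrase. The sample line is hypothetical; the post does not show any actual input.

```ruby
# Hypothetical sample line (not from the original post).
line = 'set title "hello world" ""'

# String#split with no arguments splits on runs of whitespace,
# so the quoted phrase is broken apart at the space inside it.
naive_tokens = line.split
p naive_tokens
```

The quoted phrase ends up as two separate tokens (`"hello` and `world"`), which is the behaviour the question wants to avoid.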

    Could anyone recommend a basic strategy for doing this? Should I deal
    with this when I first tokenize each line, or should I combine tokens
    appropriately during parsing when I see a double-quote?

    Help would be greatly appreciated.
     
    Squeamizh, Aug 7, 2006
    #1

  2. On Tuesday, August 08, 2006, at 2:40 AM, Squeamizh wrote:
    >Hi,
    >
    >I have a program which separates each line of a text file into tokens,
    >using whitespace as a delimiter (I do this with String.split). This
    >suits my needs for the most part, but now I need the ability to treat
    >quoted strings as single tokens. Note that the quoted string could be
    >multiple words, or even a 0-length string.
    >
    >Could anyone recommend a basic strategy for doing this? Should I deal
    >with this when I first tokenize each line, or should I combine tokens
    >appropriately during parsing when I see a double-quote?
    >
    >Help would be greatly appreciated.
    >
    >


    You could pull out all the quoted strings into an array and then delete
    them from the original before processing it normally.
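
A minimal sketch of that suggestion, assuming double quotes only and no escaped quotes inside a quoted string. The input line and the variable names are illustrative, not from the thread.

```ruby
# Hypothetical sample line (not from the original post).
line = 'set title "hello world" ""'

# Step 1: collect every double-quoted string, quotes included.
# The empty string "" is matched too, because [^"]* may match nothing.
quoted = line.scan(/"[^"]*"/)

# Step 2: blank out the quoted strings in a copy of the line,
# then split what is left on whitespace as before.
rest  = line.gsub(/"[^"]*"/, ' ')
words = rest.split

p quoted
p words
```

One consequence of this approach: the two arrays no longer record where each quoted string sat relative to the plain words, so it fits best when token order across the two groups does not matter.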


    _Kevin
    www.sciwerks.com
     
    Kevin Olbrich, Aug 7, 2006
    #2

  3. Squeamizh wrote:
    > Hi,
    >
    > I have a program which separates each line of a text file into tokens,
    > using whitespace as a delimiter (I do this with String.split). This
    > suits my needs for the most part, but now I need the ability to treat
    > quoted strings as single tokens. Note that the quoted string could be
    > multiple words, or even a 0-length string.
    >
    > Could anyone recommend a basic strategy for doing this? Should I deal
    > with this when I first tokenize each line, or should I combine tokens
    > appropriately during parsing when I see a double-quote?
    >
    > Help would be greatly appreciated.
    >


    Something along the lines of

    line.scan(%r{
      "[^"]*" |   # a double-quoted string, possibly empty
      \S+         # otherwise, a run of non-whitespace characters
    }x)
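
As a usage sketch, the same pattern can be applied to a sample line, followed by an optional step that strips the surrounding quotes. The input line and the names `token_pattern`, `tokens` and `unquoted` are assumptions made for illustration.

```ruby
# Same pattern as in the post, stored in a local variable (name is illustrative).
token_pattern = %r{
  "[^"]*" |   # a double-quoted string, possibly empty
  \S+         # otherwise, a run of non-whitespace characters
}x

# Hypothetical sample line (not from the original post).
line = 'set title "hello world" ""'

# scan returns the matches in the order they appear in the line.
tokens = line.scan(token_pattern)

# Optional: drop the surrounding quotes from quoted tokens.
# String#[] with a regexp and a group index returns nil when there is no match,
# so unquoted tokens fall through to the original value.
unquoted = tokens.map { |t| t[/\A"(.*)"\z/m, 1] || t }

p tokens
p unquoted
```

Unlike extracting the quoted strings separately, this keeps all tokens in their original order. It does not handle escaped quotes, and a quote that begins in the middle of a word (as in `foo"bar baz"`) is not treated as the start of a quoted string.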

    HTH

    robert
     
    Robert Klemme, Aug 7, 2006
    #3