Complex GSUB query

Discussion in 'Ruby' started by Ne Scripter, Oct 20, 2009.

  1. Ne Scripter


    Hello all,

    I am struggling with something and have not yet been able to find
    anything that helps.

    I have a string like the following:

    string = "\"bloggs, Joe (JBloggs, INFO)\" joebloggs, \"bloggs, Bill (BBloggs, INFO)\" billbloggs"

    I want to break this string up into two entries using a _ separator, one
    for Joe and the other for Bill. I could do this with a simple

    string.gsub(",", "_")

    However, the problem with doing this is that there are commas elsewhere
    in the string. So what I need is: if a comma is outside the double
    quotes, replace it with _.
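To make the problem concrete, here is a small sketch of what the plain substitution does to the sample string. The variable name `naive` is only for illustration.

```ruby
string = "\"bloggs, Joe (JBloggs, INFO)\" joebloggs, \"bloggs, Bill (BBloggs, INFO)\" billbloggs"

# A plain string pattern replaces every comma, including those inside the quotes.
naive = string.gsub(",", "_")
puts naive
# "bloggs_ Joe (JBloggs_ INFO)" joebloggs_ "bloggs_ Bill (BBloggs_ INFO)" billbloggs
```

All five commas are replaced, although only the one between the two records should be.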

    Could anyone possibly help me with this?

    Thanks
    --
    Posted via http://www.ruby-forum.com/.
    Ne Scripter, Oct 20, 2009
    #1

  2. Ne Scripter wrote:
    >
    > string = "\"bloggs, Joe (JBloggs, INFO)\" joebloggs, \"bloggs, Bill
    > (BBloggs, INFO)\" billbloggs"
    >
    > I want to break this string up into two entries using a _ separator, one
    > for Joe and the other for Bill. I could do this with a simple
    >
    > string.gsub(",", "_")
    >
    > However the problem with doing this is that there are commas elsewhere
    > in the string. So what I need to say is, if the comma is outside of ""
    > (quotes) replace it with the _
    >
    > Could anyone possibly help me with this?
    >
    > Thanks


    I'd do:
    string.gsub!(", \"", "_ \"")
    # If the comma is followed by a space and a double quote, replace it
    # with an underscore, a space and a double quote.
    But that's because I'm really lazy.
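A quick sketch of how that one-liner behaves on the sample string. The `.dup` calls and the `unchanged` string are additions for illustration; `.dup` only makes sure the literal can be modified in place.

```ruby
string = "\"bloggs, Joe (JBloggs, INFO)\" joebloggs, \"bloggs, Bill (BBloggs, INFO)\" billbloggs".dup

# gsub! changes the string in place; only the comma that sits directly
# before a space and an opening quote matches the literal pattern.
string.gsub!(", \"", "_ \"")
puts string
# "bloggs, Joe (JBloggs, INFO)" joebloggs_ "bloggs, Bill (BBloggs, INFO)" billbloggs

# Caveat: gsub! returns nil when nothing was replaced, so avoid chaining on it.
unchanged = "no quoted records here".dup
result = unchanged.gsub!(", \"", "_ \"")
p result  # nil
```

If the return value is needed, the non-bang `gsub` is the safer choice because it always returns a string.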
    Aldric Giacomoni, Oct 20, 2009
    #2

  3. Hi --

    On Wed, 21 Oct 2009, Ne Scripter wrote:

    > Hello all,
    >
    > I am struggling with something and I have yet been able to find anything
    > that may help me.
    >
    > I have a string like follows:
    >
    > string = "\"bloggs, Joe (JBloggs, INFO)\" joebloggs, \"bloggs, Bill
    > (BBloggs, INFO)\" billbloggs"
    >
    > I want to break this string up into two entries using a _ separator, one
    > for Joe and the other for Bill. I could do this with a simple
    >
    > string.gsub(",", "_")
    >
    > However the problem with doing this is that there are commas elsewhere
    > in the string. So what I need to say is, if the comma is outside of ""
    > (quotes) replace it with the _
    >
    > Could anyone possibly help me with this?


    It looks like the pattern /, "/ occurs where one record ends and the
    next one begins, and nowhere else. Assuming that's correct, it
    suggests something like:

    string.gsub(/,(?=\s+")/, '_')

    i.e., for any comma which is followed by some whitespace and a double
    quote character, replace the comma with an underscore.
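As a rough illustration of the lookahead on the sample string: the `split` variant at the end is an assumption about what might be wanted next, not something asked for in the question, and the names `joined` and `records` are made up.

```ruby
string = "\"bloggs, Joe (JBloggs, INFO)\" joebloggs, \"bloggs, Bill (BBloggs, INFO)\" billbloggs"

# (?=\s+") is a lookahead: it requires whitespace and a double quote after
# the comma but does not consume them, so only the comma itself is replaced.
joined = string.gsub(/,(?=\s+")/, '_')
puts joined
# "bloggs, Joe (JBloggs, INFO)" joebloggs_ "bloggs, Bill (BBloggs, INFO)" billbloggs

# The same lookahead idea with split returns the records as an array.
records = string.split(/,\s+(?=")/)
p records.length  # 2
```

Because the lookahead leaves the space and the quote in place, the replacement string only has to contain the underscore.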


    David

    David A. Black, Oct 20, 2009
    #3
  4. On Oct 20, 2009, at 11:37 AM, David A. Black wrote:

    > It looks like the pattern /, "/ occurs where one record ends and the
    > next one begins, and nowhere else. Assuming that's correct, it
    > suggests something like:
    >
    > string.gsub(/,(?=\s+")/, '_')
    >
    > i.e., for any comma which is followed by some whitespace and a double
    > quote character, replace the comma with an underscore.

    Or perhaps scan is a better hammer for this nail:

    irb> string = "\"bloggs, Joe (JBloggs, INFO)\" joebloggs, \"bloggs, Bill (BBloggs, INFO)\" billbloggs"
    => "\"bloggs, Joe (JBloggs, INFO)\" joebloggs, \"bloggs, Bill (BBloggs, INFO)\" billbloggs"
    irb> re = %r{"\w+, \w+ \(\w+, \w+\)" \w+}
    => /"\w+, \w+ \(\w+, \w+\)" \w+/
    irb> string.scan(re)
    => ["\"bloggs, Joe (JBloggs, INFO)\" joebloggs", "\"bloggs, Bill (BBloggs, INFO)\" billbloggs"]

    You could paste them back together with a .join('_'), but I suspect
    that you want the pieces later anyway.
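For completeness, a sketch of both options: joining the scanned pieces, and using capture groups to get at the fields. The block parameter names (`surname`, `first`, `login`, `dept`, `account`) are guesses at what the fields mean and are not from the original data description.

```ruby
string = "\"bloggs, Joe (JBloggs, INFO)\" joebloggs, \"bloggs, Bill (BBloggs, INFO)\" billbloggs"

# Without capture groups, scan returns each whole match as a string.
re = %r{"\w+, \w+ \(\w+, \w+\)" \w+}
pieces = string.scan(re)
puts pieces.join('_')
# "bloggs, Joe (JBloggs, INFO)" joebloggs_"bloggs, Bill (BBloggs, INFO)" billbloggs

# With capture groups, scan returns one array of captures per match.
fields_re = %r{"(\w+), (\w+) \((\w+), (\w+)\)" (\w+)}
string.scan(fields_re).each do |surname, first, login, dept, account|
  puts "#{first} #{surname}: #{account}"
end
# Joe bloggs: joebloggs
# Bill bloggs: billbloggs
```

Note that the joined result has no space after the underscore, because the separator between the records is not part of either match.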

    -Rob

    Rob Biedenharn http://agileconsultingllc.com
    Rob Biedenharn, Oct 20, 2009
    #4
  5. Ne Scripter


    Thanks all. I went with the suggestion given by David because, although
    the structure is consistent, I can never be sure of the number of
    elements in the string.

    Many thanks
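A small sketch of that point, assuming "number of elements" means the number of records: the lookahead substitution does not care how many there are. The third record below is made up for illustration, as are the names `joined` and `entries`.

```ruby
# The third record (Jane) is invented for this example.
string = "\"bloggs, Joe (JBloggs, INFO)\" joebloggs, " \
         "\"bloggs, Bill (BBloggs, INFO)\" billbloggs, " \
         "\"bloggs, Jane (JaBloggs, INFO)\" janebloggs"

# Every comma followed by whitespace and a double quote becomes an underscore,
# however many records the string holds.
joined = string.gsub(/,(?=\s+")/, '_')

# Splitting on "_ " assumes that sequence never occurs inside a record.
entries = joined.split('_ ')
p entries.length  # 3
```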


    Ne Scripter, Oct 21, 2009
    #5