Complex GSUB query

Ne Scripter

Hello all,

I am struggling with something and I have yet to find anything that
may help me.

I have a string like the following:

string = "\"bloggs, Joe (JBloggs, INFO)\" joebloggs, \"bloggs, Bill (BBloggs, INFO)\" billbloggs"

I want to break this string up into two entries using a _ separator, one
for Joe and the other for Bill. I could do this with a simple

string.gsub(",", "_")

However, the problem with doing this is that there are commas elsewhere
in the string. So what I need to say is: if the comma is outside the
double quotes, replace it with the _.

Could anyone possibly help me with this?

Thanks
 
Aldric Giacomoni

I'd do:
string.gsub!(", \"", "_ \"")
# If the comma is followed by a space and a double quote, replace that
# with an underscore, a space and a double quote.
But that's because I'm really lazy.
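
For the question as literally asked (replace a comma only when it sits outside the quotes), one possible approach is to let the pattern match either a whole quoted section or a bare comma, and decide in a block which one was hit. This is only a sketch: it assumes the quoted sections never contain escaped double quotes, and the variable names are just for illustration.

```ruby
# Sample data from the original post, joined onto one line.
string = "\"bloggs, Joe (JBloggs, INFO)\" joebloggs, \"bloggs, Bill (BBloggs, INFO)\" billbloggs"

# The pattern matches either a complete quoted section or a single comma.
# Quoted sections are handed back unchanged, so only commas that fall
# outside the quotes are turned into underscores.
# Assumption: no escaped double quotes inside the quoted sections.
result = string.gsub(/"[^"]*"|,/) { |match| match == ',' ? '_' : match }

puts result
# "bloggs, Joe (JBloggs, INFO)" joebloggs_ "bloggs, Bill (BBloggs, INFO)" billbloggs
```

Because the quoted sections are consumed as whole matches, the commas inside them are never seen as separate matches.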
 
David A. Black

Hi --

It looks like the pattern /, "/ occurs at the end of one record into
the beginning of the next one, and nowhere else. Assuming that's
correct, it suggests something like:

string.gsub(/,(?=\s+")/, '_')

i.e., for any comma which is followed by some whitespace and a double
quote character, replace the comma with an underscore.
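
A quick way to check this against the sample data is sketched below. It assumes the string really is a single line, as in the original post, and the split at the end is an optional extra in case the separate records are wanted rather than one joined string.

```ruby
# Sample data from the original post, joined onto one line.
string = "\"bloggs, Joe (JBloggs, INFO)\" joebloggs, \"bloggs, Bill (BBloggs, INFO)\" billbloggs"

# The lookahead (?=\s+") checks what follows the comma without consuming
# it, so only the comma itself is replaced.
result = string.gsub(/,(?=\s+")/, '_')
puts result
# "bloggs, Joe (JBloggs, INFO)" joebloggs_ "bloggs, Bill (BBloggs, INFO)" billbloggs

# The same idea with split: break at a comma plus whitespace that is
# followed by a double quote, keeping the quote on the next record.
records = string.split(/,\s+(?=")/)
puts records.length
# 2
```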


David

--
The          Ruby training with D. Black, G. Brown, J.McAnally
Compleat     Jan 22-23, 2010, Tampa, FL
Rubyist      http://www.thecompleatrubyist.com

David A. Black/Ruby Power and Light, LLC (http://www.rubypal.com)
 
Rob Biedenharn

Or perhaps scan is a better hammer for this nail:

irb> string = "\"bloggs, Joe (JBloggs, INFO)\" joebloggs, \"bloggs, Bill (BBloggs, INFO)\" billbloggs"
=> "\"bloggs, Joe (JBloggs, INFO)\" joebloggs, \"bloggs, Bill (BBloggs, INFO)\" billbloggs"
irb> re = %r{"\w+, \w+ \(\w+, \w+\)" \w+}
=> /"\w+, \w+ \(\w+, \w+\)" \w+/
irb> string.scan(re)
=> ["\"bloggs, Joe (JBloggs, INFO)\" joebloggs", "\"bloggs, Bill (BBloggs, INFO)\" billbloggs"]

You could paste them back together with a .join('_'), but I suspect
that you want the pieces later anyway.
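
If the contents of the quoted part are not always exactly "word, word (word, word)", a looser pattern might be easier to live with. The sketch below assumes only that each record is a double-quoted section followed by whitespace and a single word; the name `loose` is made up for the example.

```ruby
# Sample data from the original post, joined onto one line.
string = "\"bloggs, Joe (JBloggs, INFO)\" joebloggs, \"bloggs, Bill (BBloggs, INFO)\" billbloggs"

# A quoted section (anything except a double quote between the quotes),
# then whitespace, then one word. Assumption: no escaped quotes inside.
loose = /"[^"]*"\s+\w+/

records = string.scan(loose)
records.each { |record| puts record }
# "bloggs, Joe (JBloggs, INFO)" joebloggs
# "bloggs, Bill (BBloggs, INFO)" billbloggs

# Joining is optional; the array is often more useful on its own.
joined = records.join('_')
```

scan returns however many records it finds, so the same pattern works whether the string holds two entries or twenty.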

-Rob

Rob Biedenharn http://agileconsultingllc.com
(e-mail address removed)
 
Ne Scripter

Thanks all. I went with the suggestion given by David because, although
the structure is consistent, I can never be sure of the number of
elements in the string.

Many thanks

