regexp conditional

ciapecki

Hi,

How can I easily do the following?

input = '111|"aaaa" bbbbb|c'

I would like to get the following output:

output = '111|"""aaaa"" bbbbb"|c'

This is to correct a wrongly prepared pipe-separated file by doubling the
embedded quotation characters and enclosing the affected field in quotes.

thanks
chris
 
Xavier Noria

My interpretation of what you need is:

input = '111|"aaaa" bbbbb|c'

fields = input.split('|')
fields[1].gsub!('"', '""')
fields[1] = %Q{"#{fields[1]}"}

output = fields.join('|')

Depending on the input, this may not be valid, for example if fields[1]
can itself contain a pipe. In any case, since the input is malformed, the
right assumptions depend on the actual data.
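
If more than one field can contain quotes, the same split-based idea can be applied to every field instead of only fields[1]. This is only a sketch: the method name quote_fields and its sep parameter are made up for illustration, and it still assumes that no field contains the separator itself.

```ruby
# Hypothetical helper: wrap every field that contains a double quote,
# doubling the embedded quotes first.
def quote_fields(line, sep = '|')
  # The -1 limit keeps trailing empty fields, which split drops by default.
  fields = line.split(sep, -1).map do |field|
    field.include?('"') ? %Q{"#{field.gsub('"', '""')}"} : field
  end
  fields.join(sep)
end

puts quote_fields('111|"aaaa" bbbbb|c')   # 111|"""aaaa"" bbbbb"|c
```

Fields without quotes pass through unchanged, so a line with 49 fields needs no extra code.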

-- fxn
 
ciapecki

Your solution works, and thanks for that.
I am still hoping for a regexp solution, though.

thanks anyway
chris
 
Robert Klemme

Even 2 regexps:

irb(main):001:0> input = '111|"aaaa" bbbbb|c'
=> "111|\"aaaa\" bbbbb|c"
irb(main):002:0> input.gsub(/[^|]+/) {|m| m.gsub!(/"/,'""') ? '"'<<m<<'"' : m}
=> "111|\"\"\"aaaa\"\" bbbbb\"|c"
irb(main):003:0> puts input.gsub(/[^|]+/) {|m| m.gsub!(/"/,'""') ? '"'<<m<<'"' : m}
111|"""aaaa"" bbbbb"|c
=> nil
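
For readers who find the one-liner dense, here is roughly the same thing as a script. The outer gsub visits each non-empty field, and the inner gsub! returns nil when it replaced nothing, which is what lets the ternary wrap only the fields that actually contained a quote. The variable names are mine, and I use %Q interpolation instead of << so the string literal is not mutated.

```ruby
input = '111|"aaaa" bbbbb|c'

output = input.gsub(/[^|]+/) do |field|
  # gsub! doubles the quotes in place and returns nil if there were none,
  # so only fields that contained a quote get wrapped.
  field.gsub!(/"/, '""') ? %Q{"#{field}"} : field
end

puts output   # 111|"""aaaa"" bbbbb"|c
```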

Cheers

robert
 
Paul Mckibbin

Robert,

How fixed is the input? If it always has the same format, then what
about:

input = '111|"aaaa" bbbbb|c'
output=input.gsub(/\"/,'""').gsub(/(.*)\|(.*)\|(.*)/,'\1|"\2"|\3')

Mac
 
ciapecki

Paul Mckibbin said:
Oops. I meant Chris of course.

Hi Mac,

The format is fixed but contains 49 fields separated by | so your one-
liner could not fit into one line :)

thanks,
chris
 
ciapecki

this is just great,
thanks robert for this


chris
 
Robert Klemme

You're welcome. By the way, this is even better (and also faster):

irb(main):003:0> input = '111|"aaaa" bbbbb|c'
=> "111|\"aaaa\" bbbbb|c"
irb(main):004:0> input.gsub(/"/,'""').gsub(/[^|]*"[^|]*/,'"\\&"')
=> "111|\"\"\"aaaa\"\" bbbbb\"|c"
irb(main):005:0> puts input.gsub(/"/,'""').gsub(/[^|]*"[^|]*/,'"\\&"')
111|"""aaaa"" bbbbb"|c
=> nil

This could even be a bit faster:

irb(main):006:0> input.gsub(/"/,'""').gsub(/[^|"]*"[^|]*/,'"\\&"')
=> "111|\"\"\"aaaa\"\" bbbbb\"|c"
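
A quick way to convince yourself that the three variants agree is to run them on a line with a few more fields. The sample line below is made up (it adds an empty field and a fully quoted one), and the variable names are only for illustration. Note that in a single-quoted replacement string '\\&' is the two characters \&, which gsub expands to the whole match.

```ruby
line = '111|"aaaa" bbbbb|c||"x"|plain'   # made-up sample line

# Variant 1: nested gsub with a block.
nested = line.gsub(/[^|]+/) { |m| m.gsub!(/"/, '""') ? %Q{"#{m}"} : m }

# Variant 2: double all quotes, then wrap every field containing a quote.
chained = line.gsub(/"/, '""').gsub(/[^|]*"[^|]*/, '"\\&"')

# Variant 3: same, but the leading part of the match may not contain a quote.
anchored = line.gsub(/"/, '""').gsub(/[^|"]*"[^|]*/, '"\\&"')

puts nested
puts nested == chained && chained == anchored   # true
```

This only checks that the results match on one sample. It says nothing about speed, which would have to be measured on the real data.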

Kind regards

robert
 
Paul Mckibbin

ciapecki said:
The format is fixed but contains 49 fields separated by | so your one-
liner could not fit into one line :)

A minor change :)

input = '111|"aaaa" bbbbb|c|"22222" asdasd|ddd|"aaaa"qqqqq|jjjj'
output=input.gsub(/"/,'""').gsub(/(.*?)\|(.*?)\|(.*?)/,'\1|"\2"|\3')

=>111|"""aaaa"" bbbbb"|c|"""22222"" asdasd"|ddd|"""aaaa""qqqqq"|jjjj

This was just to point out that there is no need to spell out every
field in the replacement, but you do need to know that the layout is
correct and follows a given pattern. Robert's is the better and more
robust solution (and also shorter).
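
To make the layout caveat concrete, here is a small sketch with a made-up line in which the quoted field is the third one rather than the second. The positional pattern wraps whatever sits in every second position, while the content-based pattern from Robert's post wraps only fields that contain a quote. The variable names are hypothetical.

```ruby
line = 'a|b|"c" d'   # made-up input: the quoted field is third, not second

# Positional: wraps the second field of each matched group of fields.
positional = line.gsub(/"/, '""').gsub(/(.*?)\|(.*?)\|(.*?)/, '\1|"\2"|\3')

# Content-based: wraps any field that contains a quote.
by_content = line.gsub(/"/, '""').gsub(/[^|"]*"[^|]*/, '"\\&"')

puts positional   # a|"b"|""c"" d
puts by_content   # a|b|"""c"" d"
```

So the positional version is fine as long as the quoted fields really do sit in every second position, and wrong otherwise.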

Mac
 
