regexp conditional

Discussion in 'Ruby' started by ciapecki, Mar 19, 2008.

  1. ciapecki

    ciapecki Guest

    Hi,

    How could I easily do the following?:

    input = '111|"aaaa" bbbbb|c'

    I would like to get as output following:

    output = '111|"""aaaa"" bbbbb"|c'

    to correct a wrongly prepared pipe-separated file by enclosing the
    affected fields in quotes and escaping the quotation characters.

    thanks
    chris
    ciapecki, Mar 19, 2008
    #1
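The goal in the question can be checked mechanically. The sketch below assumes the target format is CSV-style quoting with `|` as the separator, as understood by Ruby's standard `csv` library; that assumption is not stated in the thread. It parses the desired output and shows what a strict parser does with the original input.

```ruby
require 'csv'

good = '111|"""aaaa"" bbbbb"|c'   # desired output from the question
bad  = '111|"aaaa" bbbbb|c'       # original, wrongly prepared input

# The corrected line parses back into the three original field values.
p CSV.parse_line(good, col_sep: '|')

# The uncorrected line is expected to be rejected by a strict parser.
begin
  p CSV.parse_line(bad, col_sep: '|')
rescue CSV::MalformedCSVError => e
  puts "rejected: #{e.message}"
end
```

If the file is consumed by some other tool, its quoting rules may differ, so this check is only as good as the CSV assumption.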

  2. Xavier Noria

    Xavier Noria Guest

    My interpretation of what you need is:

    input = '111|"aaaa" bbbbb|c'

    fields = input.split('|')
    fields[1].gsub!('"', '""')
    fields[1] = %Q{"#{fields[1]}"}

    output = fields.join('|')

    This may not be valid for every input, for example if fields[1] can
    itself contain a pipe. Since the input is malformed anyway, the right
    assumptions depend on the actual data.

    -- fxn
    Xavier Noria, Mar 19, 2008
    #2
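The split/join approach above fixes only fields[1]. A minimal sketch of the same idea applied to every field follows; the helper name `quote_fields` is hypothetical, and it assumes that no field contains a literal pipe (the caveat already raised above).

```ruby
# Hypothetical helper: quote every field that contains a double quote.
# Assumes fields never contain the separator itself.
def quote_fields(line, sep = '|')
  # The -1 limit keeps trailing empty fields, which split drops by default.
  fields = line.split(sep, -1)
  fixed = fields.map do |field|
    # Double the embedded quotes and wrap the field only when needed.
    field.include?('"') ? %Q{"#{field.gsub('"', '""')}"} : field
  end
  fixed.join(sep)
end

puts quote_fields('111|"aaaa" bbbbb|c')
# prints 111|"""aaaa"" bbbbb"|c
```

Because it decides per field by content, it does not care how many fields a line has.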

  3. ciapecki

    ciapecki Guest

    your solution works, and thanks for that,
    I am waiting though for a regexp solution,

    thanks anyway
    chris
    ciapecki, Mar 19, 2008
    #3
  4. Robert Klemme

    On 19.03.2008 11:21, ciapecki wrote:
    > your solution works, and thanks for that,
    > I am waiting though for a regexp solution,


    Even 2 regexps:

    irb(main):001:0> input = '111|"aaaa" bbbbb|c'
    => "111|\"aaaa\" bbbbb|c"
    irb(main):002:0> input.gsub(/[^|]+/) {|m| m.gsub!(/"/,'""') ? '"'<<m<<'"' : m}
    => "111|\"\"\"aaaa\"\" bbbbb\"|c"
    irb(main):003:0> puts input.gsub(/[^|]+/) {|m| m.gsub!(/"/,'""') ? '"'<<m<<'"' : m}
    111|"""aaaa"" bbbbb"|c
    => nil

    Cheers

    robert
    Robert Klemme, Mar 19, 2008
    #4
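The one-liner above relies on `String#gsub!` returning `nil` when nothing was replaced, which is what makes it usable as the condition of the ternary. A small sketch of that behaviour, wrapped in a hypothetical method `fix_line` and using interpolation instead of `<<` so that no string literal is mutated:

```ruby
# gsub! returns the modified string when it changed something, nil otherwise.
p 'abc'.dup.gsub!('"', '""')     # prints nil
p 'a"b'.dup.gsub!('"', '""')     # prints "a\"\"b"

# Hypothetical wrapper around the same technique.
def fix_line(line)
  line.gsub(/[^|]+/) do |field|
    # field is a fresh string for each match, so mutating it is safe.
    field.gsub!('"', '""') ? "\"#{field}\"" : field
  end
end

puts fix_line('111|"aaaa" bbbbb|c')
# prints 111|"""aaaa"" bbbbb"|c
```

Fields without quotes fall through the `nil` branch and are returned unchanged.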
  5. Paul Mckibbin

    Robert Klemme wrote:
    >> I am waiting though for a regexp solution,
    > Even 2 regexps:

    Robert,

    How fixed is the input? If it is always in the same format, then what
    about:

    input = '111|"aaaa" bbbbb|c'
    output=input.gsub(/\"/,'""').gsub(/(.*)\|(.*)\|(.*)/,'\1|"\2"|\3')

    Mac
    --
    Posted via http://www.ruby-forum.com/.
    Paul Mckibbin, Mar 20, 2008
    #5
  6. Paul Mckibbin wrote:
    > Robert,

    Oops. I meant Chris of course.

    Mac
    Paul Mckibbin, Mar 20, 2008
    #6
  7. ciapecki

    ciapecki Guest

    On 20 Mar., 01:18, Paul Mckibbin <> wrote:
    > Paul Mckibbin wrote:
    > > Robert,
    >
    > Oops. I meant Chris of course.

    Hi Mac,

    The format is fixed, but it contains 49 fields separated by |, so your
    one-liner would not fit on one line :)

    thanks,
    chris
    ciapecki, Mar 20, 2008
    #7
  8. ciapecki

    ciapecki Guest

    On 19 Mar., 22:28, Robert Klemme <> wrote:
    > Even 2 regexps:
    >
    > input.gsub(/[^|]+/) {|m| m.gsub!(/"/,'""') ? '"'<<m<<'"' : m}


    this is just great,
    thanks robert for this


    chris
    ciapecki, Mar 20, 2008
    #8
  9. Robert Klemme

    2008/3/20, ciapecki <>:
    > this is just great,
    > thanks robert for this


    You're welcome. Btw, this is even better (also faster):

    irb(main):003:0> input = '111|"aaaa" bbbbb|c'
    => "111|\"aaaa\" bbbbb|c"
    irb(main):004:0> input.gsub(/"/,'""').gsub(/[^|]*"[^|]*/,'"\\&"')
    => "111|\"\"\"aaaa\"\" bbbbb\"|c"
    irb(main):005:0> puts input.gsub(/"/,'""').gsub(/[^|]*"[^|]*/,'"\\&"')
    111|"""aaaa"" bbbbb"|c
    => nil

    This could even be a bit faster:

    irb(main):006:0> input.gsub(/"/,'""').gsub(/[^|"]*"[^|]*/,'"\\&"')
    => "111|\"\"\"aaaa\"\" bbbbb\"|c"

    Kind regards

    robert

    --
    use.inject do |as, often| as.you_can - without end
    Robert Klemme, Mar 20, 2008
    #9
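In the two-pass version above, `\&` in the replacement string stands for the whole match, so each field that contains a quote is wrapped as a unit after the quotes have been doubled. The speed claim can be tried with a rough timing sketch like the one below; the method names, the sample line and the iteration count are arbitrary choices made here, and results will vary by Ruby version and data.

```ruby
require 'benchmark'

LINE = '111|"aaaa" bbbbb|c'

# Block-based variant: one outer gsub with a nested gsub! per field.
def with_block(line)
  line.gsub(/[^|]+/) { |f| f.gsub!('"', '""') ? "\"#{f}\"" : f }
end

# Two-pass variant: double all quotes, then wrap fields that contain one.
# '"\\&"' is the three characters quote, \&, quote; \& is the whole match.
def two_pass(line)
  line.gsub('"', '""').gsub(/[^|"]*"[^|]*/, '"\\&"')
end

n = 10_000
block_time = Benchmark.realtime { n.times { with_block(LINE) } }
pass_time  = Benchmark.realtime { n.times { two_pass(LINE) } }

puts with_block(LINE)
puts two_pass(LINE)
puts format('block: %.4fs, two-pass: %.4fs', block_time, pass_time)
```

Both variants should print the same corrected line; only the timings are expected to differ.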
  10. Paul Mckibbin

    ciapecki wrote:
    > The format is fixed but contains 49 fields separated by | so your
    > one-liner could not fit into one line :)


    A minor change :)

    input = '111|"aaaa" bbbbb|c|"22222" asdasd|ddd|"aaaa"qqqqq|jjjj'
    output=input.gsub(/"/,'""').gsub(/(.*?)\|(.*?)\|(.*?)/,'\1|"\2"|\3')

    =>111|"""aaaa"" bbbbb"|c|"""22222"" asdasd"|ddd|"""aaaa""qqqqq"|jjjj

    This was just to point out that there is no need for multiple replace
    options, but you need to know the layout is correct and in a given
    pattern. Robert's is the better and more robust solution (and also
    shorter).

    Mac
    Paul Mckibbin, Mar 20, 2008
    #10
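The difference in robustness mentioned above can be made concrete. The pattern with lazy groups is positional: it quotes the second field of every consecutive pair, whatever that field contains. The content-based pattern quotes only fields that contain a quote. The sketch below uses hypothetical method names and made-up sample lines to show where the two diverge.

```ruby
# Positional: quote the second field of each "field|field|" pair.
def positional(line)
  line.gsub('"', '""').gsub(/(.*?)\|(.*?)\|(.*?)/, '\1|"\2"|\3')
end

# Content-based: quote any field that contains a double quote.
def content_based(line)
  line.gsub('"', '""').gsub(/[^|"]*"[^|]*/, '"\\&"')
end

# On a line that follows the expected layout, both agree.
regular = '111|"aaaa" bbbbb|c'
puts positional(regular)      # prints 111|"""aaaa"" bbbbb"|c
puts content_based(regular)   # prints 111|"""aaaa"" bbbbb"|c

# When the quoted field sits in a different position, they do not.
shifted = '"x"|b|c'
puts positional(shifted)      # prints ""x""|"b"|c
puts content_based(shifted)   # prints """x"""|b|c
```

So the positional form is fine when the layout is known and fixed, and the content-based form is the safer default when it is not.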