Multiple substitution in a complex RE

Discussion in 'Perl Misc' started by SqlSlinger, Feb 18, 2004.

  1. SqlSlinger

    SqlSlinger Guest

    I'm trying to convert all occurrences of variables of the form

    VARIABLE_ONE, VARIABLE_NUMBER_TWO, a_really_long_VARIABLE_name

    to the form

    VariableOne, VariableNumberTwo, AReallyLongVariableName

    Here's my Program:

    $_='a_really_long_VARIABLE_name';
    s/([A-Za-z])([^_ \t\n]*?)(_([A-Za-z0-9])([^_ \t\n]*?))+/\u$1\L$2\E\u$4\L$5\E/;
    print;
    __END__

    Prints "AReally_long_VARIABLE_name".

    Can anyone help?
    SqlSlinger, Feb 18, 2004
    #1
  2. SqlSlinger <> wrote:
    > I'm trying to convert all occurrences of variables of the form
    > VARIABLE_ONE, VARIABLE_NUMBER_TWO, a_really_long_VARIABLE_name
    > to the form
    > VariableOne, VariableNumberTwo, AReallyLongVariableName



    my @vars = qw(VARIABLE_ONE VARIABLE_NUMBER_TWO a_really_long_VARIABLE_name);
    my %StudlyCaps;
    foreach (@vars) {
        my $var = ucfirst lc;
        $var =~ s/_(.)/\U$1/g;
        $StudlyCaps{$_} = $var;
    }
    use Data::Dumper;
    print Dumper(\%StudlyCaps);


    --
    Glenn Jackman
    NCF Sysadmin
    Glenn Jackman, Feb 18, 2004
    #2
  3. Uri Guttman

    Uri Guttman Guest

    >>>>> "GJ" == Glenn Jackman <> writes:


    GJ> my @vars = qw(VARIABLE_ONE VARIABLE_NUMBER_TWO
    GJ> a_really_long_VARIABLE_name);

    GJ> foreach (@vars) {
    GJ> my $var = ucfirst lc;
    GJ> $var =~ s/_(.)/\U$1/g;
    GJ> }

    s/_?(.)([^_]*)/\U$1\L$2/g ;

    uri

    --
    Uri Guttman ------ -------- http://www.stemsystems.com
    --Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
    Search or Offer Perl Jobs ---------------------------- http://jobs.perl.org
    Uri Guttman, Feb 18, 2004
    #3
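
For readers who want to try the one-line substitution above, here is a minimal sketch that wraps it in a helper and runs it over the three sample names from the original question. The sub name `studly` is made up for illustration and does not appear in the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: applies the one-line substitution to a single name.
sub studly {
    my ($name) = @_;
    # An optional underscore is consumed and dropped, the next character is
    # upper-cased, and everything up to the following underscore is lower-cased.
    $name =~ s/_?(.)([^_]*)/\U$1\L$2/g;
    return $name;
}

print studly($_), "\n"
    for qw(VARIABLE_ONE VARIABLE_NUMBER_TWO a_really_long_VARIABLE_name);
# VariableOne
# VariableNumberTwo
# AReallyLongVariableName
```

The substitution relies on the /g modifier to restart after each segment, which matters later in the thread.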
  4. Uri Guttman <> wrote:
    > >>>>> "GJ" == Glenn Jackman <> writes:

    > GJ> my $var = ucfirst lc;
    > GJ> $var =~ s/_(.)/\U$1/g;
    >
    > s/_?(.)([^_]*)/\U$1\L$2/g ;


    Good one.

    --
    Glenn Jackman
    NCF Sysadmin
    Glenn Jackman, Feb 18, 2004
    #4
  5. SqlSlinger wrote:
    >
    > I'm trying to convert all occurrences of variables of the form
    >
    > VARIABLE_ONE, VARIABLE_NUMBER_TWO, a_really_long_VARIABLE_name
    >
    > to the form
    >
    > VariableOne, VariableNumberTwo, AReallyLongVariableName
    >
    > Here's my Program:
    >
    > $_='a_really_long_VARIABLE_name';
    > s/([A-Za-z])([^_ \t\n]*?)(_([A-Za-z0-9])([^_ \t\n]*?))+/\u$1\L$2\E\u$4\L$5\E/;
    > print;
    > __END__
    >
    > Prints "AReally_long_VARIABLE_name".
    >
    > Can anyone help?



    for ( qw/ VARIABLE_ONE VARIABLE_NUMBER_TWO a_really_long_VARIABLE_name / ) {
        # print the original name, then its converted form, one pair per line
        print "$_ => ", join( '', map "\u\L$_", split /_/ ), "\n";
    }



    John
    --
    use Perl;
    program
    fulfillment
    John W. Krahn, Feb 19, 2004
    #5
  6. SqlSlinger

    SqlSlinger Guest

    Glenn, Uri, John,

    Excellent suggestions all!

    Now here's the rest of what I should have told you. I use Perl every
    chance I get, but in this case I'm hoping to program a macro in
    TextPad to perform this kind of substitution. TextPad REs support
    most of the Perl RE features, but not the g (global) modifier and not
    loops. This is why I was attempting to get it all done in the s///
    statement. If it won't work, of course, I can program a quick perl
    script to do it using any of your suggestions, but that's slightly
    more inconvenient than the TextPad macro.

    For my own info, is there any way to get a repeated outer grouping in
    the pattern to replace while referencing (and modifying) the inner
    grouping in the replace string?
    SqlSlinger, Feb 19, 2004
    #6
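
On the closing question in the post above: in Perl, a capture group inside a repeated group keeps only the text from its last iteration, so the replacement string of a single non-global s/// can see at most one of the inner segments. The sketch below shows this with a simplified, greedy pattern that is an illustration, not the pattern from the original post. (In the original attempt the lazy quantifiers let the match succeed after just "a_r", which is why only the first two letters were changed.)

```perl
use strict;
use warnings;

my $name = 'a_really_long_VARIABLE_name';

# Simplified illustrative pattern: one leading word, then one or more
# "_word" segments, each captured by the same inner group.
my ($first, $last);
if ($name =~ /^([A-Za-z0-9]+)(?:_([A-Za-z0-9]+))+$/) {
    # $2 was overwritten on every pass through the (?:...)+ group,
    # so only the final segment is still available here.
    ($first, $last) = ($1, $2);
}

print "first segment: $first\n";   # a
print "last segment:  $last\n";    # name
```

The segments "really", "long" and "VARIABLE" were matched but are no longer reachable through $2, which is why the answers in this thread use /g or split instead.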
  7. SqlSlinger

    SqlSlinger Guest

    Uri Guttman <> wrote in message news:<>...
    > >>>>> "GJ" == Glenn Jackman <> writes:

    >
    >
    > GJ> my @vars = qw(VARIABLE_ONE VARIABLE_NUMBER_TWO
    > GJ> a_really_long_VARIABLE_name);
    >
    > GJ> foreach (@vars) {
    > GJ> my $var = ucfirst lc;
    > GJ> $var =~ s/_(.)/\U$1/g;
    > GJ> }
    >
    > s/_?(.)([^_]*)/\U$1\L$2/g ;
    >
    > uri


    Nice and compact, but doesn't isolate only variables containing an
    underscore. If run against a source file, would match any word,
    right?

    Vince
    SqlSlinger, Feb 19, 2004
    #7
  8. fifo

    fifo Guest

    At 2004-02-19 05:37 -0800, SqlSlinger wrote:
    > Now here's the rest of what I should have told you. I use Perl every
    > chance I get, but in this case I'm hoping to program a macro in
    > TextPad to perform this kind of substitution. TextPad REs support
    > most of the Perl RE features, but not the g (global) modifier and not
    > loops. This is why I was attempting to get it all done in the s///
    > statement. If it won't work, of course, I can program a quick perl
    > script to do it using any of your suggestions, but that's slightly
    > more inconvenient than the TextPad macro.
    >


    If you need a single substitution to work over the entire file, how
    about doing something like

    s/ ([a-zA-Z0-9])([a-zA-Z0-9]*)(?=_[a-zA-Z0-9_])
    | (?<=[a-zA-Z0-9_]_)([a-zA-Z0-9])([a-zA-Z0-9]*)
    /\U$1$3\L$2$4/xg
    fifo, Feb 19, 2004
    #8
  9. Uri Guttman

    Uri Guttman Guest

    >>>>> "S" == SqlSlinger <> writes:

    S> Uri Guttman <> wrote in message news:<>...
    >> >>>>> "GJ" == Glenn Jackman <> writes:

    >>
    >>

    GJ> my @vars = qw(VARIABLE_ONE VARIABLE_NUMBER_TWO
    GJ> a_really_long_VARIABLE_name);
    >>

    GJ> foreach (@vars) {
    GJ> my $var = ucfirst lc;
    GJ> $var =~ s/_(.)/\U$1/g;
    GJ> }
    >>
    >> s/_?(.)([^_]*)/\U$1\L$2/g ;
    >>
    >> uri


    S> Nice and compact, but doesn't isolate only variables containing an
    S> underscore. If run against a source file, would match any word,
    S> right?

    probably but that could be fixed with some guard regex stuff. and it
    wasn't in the spec which showed a qw list of tokens. you need to code to
    what is posted (unless it is obvious the OP is having an XY problem or
    way off base).

    uri

    --
    Uri Guttman ------ -------- http://www.stemsystems.com
    --Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
    Search or Offer Perl Jobs ---------------------------- http://jobs.perl.org
    Uri Guttman, Feb 19, 2004
    #9
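
One possible shape for the "guard regex stuff" mentioned above, offered as a sketch and not necessarily what Uri had in mind: an outer substitution selects only identifiers that contain an underscore between alphanumerics, and the compact substitution is then applied to each of those. The sample line and the variable names are invented for illustration. It uses the /e modifier, so it is Perl-only and would not carry over to a TextPad macro.

```perl
use strict;
use warnings;

# Illustrative input: two names with underscores, the rest without.
my $line = 'my $total = VARIABLE_ONE + count + a_really_long_VARIABLE_name;';

# Guard: match only words made of alphanumeric runs joined by underscores,
# then convert each matched word with the compact substitution.
(my $converted = $line) =~ s{\b([A-Za-z0-9]+(?:_[A-Za-z0-9]+)+)\b}{
    my $word = $1;
    $word =~ s/_?(.)([^_]*)/\U$1\L$2/g;
    $word;
}ge;

print "$converted\n";
# my $total = VariableOne + count + AReallyLongVariableName;
```

Words such as "my", "total" and "count" never satisfy the outer pattern, so they pass through unchanged.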
  10. SqlSlinger

    SqlSlinger Guest

    fifo <> wrote in message news:<20040219143341.GA14410@fleece>...
    > At 2004-02-19 05:37 -0800, SqlSlinger wrote:
    > > Now here's the rest of what I should have told you. I use Perl every
    > > chance I get, but in this case I'm hoping to program a macro in
    > > TextPad to perform this kind of substitution. TextPad REs support
    > > most of the Perl RE features, but not the g (global) modifier and not
    > > loops. This is why I was attempting to get it all done in the s///
    > > statement. If it won't work, of course, I can program a quick perl
    > > script to do it using any of your suggestions, but that's slightly
    > > more inconvenient than the TextPad macro.
    > >

    >
    > If you need a single substitution to work over the entire file, how
    > about doing something like
    >
    > s/ ([a-zA-Z0-9])([a-zA-Z0-9]*)(?=_[a-zA-Z0-9_])
    > | (?<=[a-zA-Z0-9_]_)([a-zA-Z0-9])([a-zA-Z0-9]*)
    > /\U$1$3\L$2$4/xg


    Prints A_Really_Long_Variable_Name. Can we lose the underscores? I'm
    not familiar with the RE directives you've used to be able to work up
    an adjusted version, but I'll look into it. Doesn't seem to affect
    text without the underscore, which is nice.

    Thanks!
    SqlSlinger, Feb 20, 2004
    #10
  11. SqlSlinger <> wrote:

    > Can we lose the underscores?



    Yes (but I think it is a horrid idea...).


    tr/_//d;


    --
    Tad McClellan SGML consulting
    Perl programming
    Fort Worth, Texas
    Tad McClellan, Feb 20, 2004
    #11
  12. fifo

    fifo Guest

    At 2004-02-20 12:05 -0800, SqlSlinger wrote:
    > fifo <> wrote in message news:<20040219143341.GA14410@fleece>...
    > > If you need a single substitution to work over the entire file, how
    > > about doing something like
    > >
    > > s/ ([a-zA-Z0-9])([a-zA-Z0-9]*)(?=_[a-zA-Z0-9_])
    > > | (?<=[a-zA-Z0-9_]_)([a-zA-Z0-9])([a-zA-Z0-9]*)
    > > /\U$1$3\L$2$4/xg

    >
    > Prints A_Really_Long_Variable_Name. Can we lose the underscores?


    Sure, you just need to transpose two characters:

    s/ ([a-zA-Z0-9])([a-zA-Z0-9]*)(?=_[a-zA-Z0-9_])
    | (?<=[a-zA-Z0-9_])_([a-zA-Z0-9])([a-zA-Z0-9]*)
    /\U$1$3\L$2$4/xg

    Now you should get AReallyLongVariableName.
    fifo, Feb 21, 2004
    #12
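
The two versions of the substitution can be compared side by side. This is a sketch using the sample name from the thread plus an underscore-free word ("plain", added here for illustration). Under `use warnings` the captures from the alternative that did not match are undefined, so the sketch switches off that one warning category locally.

```perl
use strict;
use warnings;

my $input = 'a_really_long_VARIABLE_name plain';
my ($kept, $dropped) = ($input, $input);

{
    # The captures of the alternative that did not match are undef,
    # which would otherwise trigger "uninitialized" warnings.
    no warnings 'uninitialized';

    # First version: "_" is inside the lookbehind, so it is not consumed
    # and stays in the text.
    $kept =~ s/ ([a-zA-Z0-9])([a-zA-Z0-9]*)(?=_[a-zA-Z0-9_])
              | (?<=[a-zA-Z0-9_]_)([a-zA-Z0-9])([a-zA-Z0-9]*)
              /\U$1$3\L$2$4/xg;

    # Second version: "_" is outside the lookbehind, so it is consumed
    # and, not being captured, is missing from the replacement.
    $dropped =~ s/ ([a-zA-Z0-9])([a-zA-Z0-9]*)(?=_[a-zA-Z0-9_])
                 | (?<=[a-zA-Z0-9_])_([a-zA-Z0-9])([a-zA-Z0-9]*)
                 /\U$1$3\L$2$4/xg;
}

print "$kept\n";     # A_Really_Long_Variable_Name plain
print "$dropped\n";  # AReallyLongVariableName plain
```

In both versions the word without an underscore is left alone, because neither alternative can match it.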
  13. SqlSlinger

    SqlSlinger Guest

    Fifo,

    > Sure, you just need to transpose two characters:
    >
    > s/ ([a-zA-Z0-9])([a-zA-Z0-9]*)(?=_[a-zA-Z0-9_])
    > | (?<=[a-zA-Z0-9_])_([a-zA-Z0-9])([a-zA-Z0-9]*)
    > /\U$1$3\L$2$4/xg
    >
    > Now you should get AReallyLongVariableName.


    I did indeed! I've been staring at the expression and trying to make
    sense of it. I've looked up ?= and ?<= directives, and they're
    starting to crack my brain's permafrost, but I'm still confused as to
    why this works. On first glance, the replacement expression would
    appear to give all upper case followed by all lower case. I would
    have expected the replacement expression to look like

    \u$1\L$2\u$3\L$4

    Can you explain this a bit further? I'm hoping to make greater use of
    the directives once I understand this example...
    SqlSlinger, Feb 24, 2004
    #13
  14. fifo

    fifo Guest

    At 2004-02-24 06:29 -0800, SqlSlinger wrote:
    > >
    > > s/ ([a-zA-Z0-9])([a-zA-Z0-9]*)(?=_[a-zA-Z0-9_])
    > > | (?<=[a-zA-Z0-9_])_([a-zA-Z0-9])([a-zA-Z0-9]*)
    > > /\U$1$3\L$2$4/xg
    > >
    > > Now you should get AReallyLongVariableName.

    >
    > I did indeed! I've been staring at the expression and trying to make
    > sense of it. I've looked up ?= and ?<= directives, and they're
    > starting to crack my brain's permafrost, but I'm still confused as to
    > why this works. On first glance, the replacement expression would
    > appear to give all upper case followed by all lower case. I would
    > have expected the replacement expression to look like
    >
    > \u$1\L$2\u$3\L$4
    >
    > Can you explain this a bit further? I'm hoping to make greater use of
    > the directives once I understand this example...


    The regex has two alternatives. The first matches a run of alphanumeric
    characters [a-zA-Z0-9] that is followed by an underscore and then another
    alphanumeric character or underscore. The second matches an underscore
    that is preceded by an alphanumeric character or underscore, followed by
    a run of alphanumeric characters. In the first case, $1 is set to the
    first character and $2 to the rest; in the second case, $3 is set to the
    first character and $4 to the rest (and the underscore is discarded).
    Only one alternative can match at a time, so the other pair of captures
    is empty. That means "$1$3" is always just the first letter, which we
    upcase, and "$2$4" is always just the rest, which we downcase.
    fifo, Feb 24, 2004
    #14
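
To see the explanation above in action, the following sketch walks the same pattern over a shortened sample name with a while loop and records which alternative matched each time. The `@trace` array and the "first"/"second" labels are illustrative additions.

```perl
use strict;
use warnings;

my $name = 'a_really_long';
my @trace;

while ($name =~ / ([a-zA-Z0-9])([a-zA-Z0-9]*)(?=_[a-zA-Z0-9_])
                | (?<=[a-zA-Z0-9_])_([a-zA-Z0-9])([a-zA-Z0-9]*)
                /xg) {
    # Exactly one alternative matched, so either ($1,$2) or ($3,$4) is set
    # and the other pair is undefined.
    my $which = defined $1 ? 'first' : 'second';
    my ($head, $tail) = defined $1 ? ($1, $2) : ($3, $4);
    push @trace, "$which: head='$head' tail='$tail'";
}

print "$_\n" for @trace;
# first: head='a' tail=''
# second: head='r' tail='eally'
# second: head='l' tail='ong'
```

Only the leading word goes through the first alternative. Every later word starts at an underscore, which only the second alternative can consume.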
  15. SqlSlinger

    SqlSlinger Guest

    Thanks, Fifo. Making more sense now.
    SqlSlinger, Feb 27, 2004
    #15