search and replace

Discussion in 'Perl Misc' started by yeti349@yahoo.com, Sep 2, 2005.

  1. Guest

    Hi. The following snippet seems to be doing what I want, but I'd like
    to get suggestions for refinement or improving the code. Thank you.

    #!/usr/bin/perl -w

    use strict;

    my $phrase = "MARKEM 1ST HOLDINGS INC. & MONEY MARKET LLC";

    my @terms = terms();

    foreach my $term (@terms)
    {
    $phrase =~ s/\W/ /ig; #remove nonword characters
    $phrase =~ s/\S*\d\S*/ /ig; #remove numeric strings
    $phrase =~ s/\b($term)\b//ig; #remove business terms
    }

    #new phrase: "MARKEM MONEY MARKET"
    print "$phrase\n";

    sub terms
    {
    my @terms = qw(holdings inc llc);
    }
     
    , Sep 2, 2005
    #1
    1. Advertising

  2. Paul Lalli Guest

    wrote:
    > Hi. The following snippet seems to be doing what I want, but I'd like
    > to get suggestions for refinement or improving the code. Thank you.
    >
    > #!/usr/bin/perl -w


    You enabled Warnings. Very good. However, these days, it is
    preferable to write
    use warnings;
    rather than the -w option on the shebang.

    > use strict;


    Excellent!

    > my $phrase = "MARKEM 1ST HOLDINGS INC. & MONEY MARKET LLC";


    My personal preference is to not use double quotes unless you actually
    need interpolation.

    >
    > my @terms = terms();


    I get nervous when using the same identifier for multiple pieces of
    information. There's nothing *wrong* with this, of course, and Perl
    will readily do exactly what you want. It just seems needlessly
    confusing to the reader. Consider something like `my @terms =
    get_terms();` instead.

    >
    > foreach my $term (@terms)
    > {
    > $phrase =~ s/\W/ /ig; #remove nonword characters


    Your comment doesn't match what the code actually does. You're not
    removing non-word characters. You're replacing each non-word character
    with a single space character.

    There is no reason for the /i modifier in that regexp.

    You're specifying the /g modifer, which means this one single statement
    removes *all* the non-word characters. So why is this line in a loop?
    This code does not in any way depend on $term, and it should be moved
    outside the loop so that it's only executed once.

    > $phrase =~ s/\S*\d\S*/ /ig; #remove numeric strings


    Pretty much the same critiques for this one. The comment doesn't match
    what's happening, the /i modifier is pointless, and it should only be
    executed once.

    > $phrase =~ s/\b($term)\b//ig; #remove business terms


    Okay. *This* one should absolutely remain in the loop, and should use
    the /i modifier, and it is in fact actually removing the term.

    My only critique here is that there's no reason to use (), as you're
    not using $1 anywhere.

    > }
    >
    > #new phrase: "MARKEM MONEY MARKET"
    > print "$phrase\n";


    Again, your comment doesn't match. That is not the new phrase. The
    new phrase is: 'MARKEM MONEY MARKET '

    > sub terms
    > {
    > my @terms = qw(holdings inc llc);
    > }


    If this subroutine has no point other than returning a list of values,
    why are you bothering to create a new variable at all?


    Putting all my critiques together, we're left with:
    #!/usr/bin/perl
    use strict;
    use warnings;

    my $phrase = 'MARKEM 1ST HOLDINGS INC. & MONEY MARKET LLC';

    my @terms = get_terms();

    $phrase =~ s/\W/ /ig; #replace non-word chars with a space
    $phrase =~ s/\S*\d\S*/ /ig; #replace strings containing a digit
    #with a space

    for my $term (@terms) {
    $phrase =~ s/\b$term\b//ig; #remove business terms
    }

    #new phrase: 'MARKEM MONEY MARKET '
    print "$phrase\n";

    sub get_terms {
    qw/holdings inc llc/;
    }

    __END__
     
    Paul Lalli, Sep 2, 2005
    #2
    1. Advertising

  3. Guest

    Excellent repsonse Paul! Thank you for your help.
     
    , Sep 2, 2005
    #3
  4. Anno Siegel Guest

    Paul Lalli <> wrote in comp.lang.perl.misc:
    > wrote:
    > > Hi. The following snippet seems to be doing what I want, but I'd like
    > > to get suggestions for refinement or improving the code. Thank you.


    [most of excellent code revision snipped]

    > > #!/usr/bin/perl -w
    > > use strict;
    > >
    > > my $phrase = "MARKEM 1ST HOLDINGS INC. & MONEY MARKET LLC";
    > >
    > > my @terms = terms();

    >
    > I get nervous when using the same identifier for multiple pieces of
    > information. There's nothing *wrong* with this, of course, and Perl
    > will readily do exactly what you want. It just seems needlessly
    > confusing to the reader. Consider something like `my @terms =
    > get_terms();` instead.


    I don't get nervous when the like-named things represent essentially
    the same information. As this is the case here, I might have used
    "terms" in both senses without compunctions.

    However, in this case the issue can be dodged. The variable @terms is only
    used once, so it can go (okay, there are exceptions). Just call terms()
    directly in its place.

    > > foreach my $term (@terms)


    foreach my $term ( terms() ) {

    Anno
    --
    If you want to post a followup via groups.google.com, don't use
    the broken "Reply" link at the bottom of the article. Click on
    "show options" at the top of the article, then click on the
    "Reply" at the bottom of the article headers.
     
    Anno Siegel, Sep 2, 2005
    #4
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Mark McKay
    Replies:
    3
    Views:
    1,316
    Thomas Weidenfeller
    Jan 21, 2004
  2. Brian Blais
    Replies:
    1
    Views:
    382
    Bruno Desthuilliers
    Jun 27, 2006
  3. Greg Ewing
    Replies:
    2
    Views:
    345
    Dieter Maurer
    Jun 29, 2006
  4. Abby Lee
    Replies:
    5
    Views:
    414
    Abby Lee
    Aug 2, 2004
  5. Replies:
    1
    Views:
    518
    Rainer Weikusat
    Jun 21, 2012
Loading...

Share This Page