Substitution

Discussion in 'Perl Misc' started by Vishal G, Nov 6, 2008.

  1. Vishal G

    Vishal G Guest

    Hi Guys,

    A Simple Substitution Problem

    my $dna =
    "***********acgtgcta*****atctgat******actgtaaa***tttt**cccc******ccccc******";

    # I want to replace asterisk(*) with dot(.) if 10 or more asterisks occur together

    $dna =~ s/\*{10,}/./g;

    print "$dna\n";

    # output

    .acgtgcta*****atctgat******actgtaaa***tttt**cccc******ccccc******

    As you can see, each run of 10 or more asterisks is replaced with a
    single dot, but what I want is this:

    ...........acgtgcta*****atctgat******actgtaaa***tttt**cccc******ccccc******


    How can I do this?

    Vishal
     
    Vishal G, Nov 6, 2008
    #1

  2. Vishal G

    xiechao Guest

    $dna=~s/\*{10,}/"." x length($&)/eg;

     
    xiechao, Nov 6, 2008
    #2

  3. Vishal G

    Tim Greer Guest

    Here's one way, off the top of my head:

    $dna =~ s/(\*{10,})/'.' x length($1)/eg;

    It replaces each run of 10 or more asterisks with the same number of
    dots. The /e modifier evaluates the replacement as Perl code, so
    '.' x length($1) repeats the dot once per character captured by
    (\*{10,}), with no upper limit on the run length. It just sticks with
    your original regex substitution approach.
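
    To make the behaviour easy to check, here is a minimal sketch of the
    same substitution wrapped in a sub. The name dot_long_runs and the
    optional $min threshold are illustrative assumptions, not something
    from the original post; the substitution itself is the one shown above.

```perl
use strict;
use warnings;

# Hypothetical helper: replace every run of $min or more asterisks
# with the same number of dots. $min defaults to 10, as in the question.
sub dot_long_runs {
    my ($seq, $min) = @_;
    $min = 10 unless defined $min;

    # /e evaluates the replacement as Perl code, /g repeats it for every run.
    # '.' x length($1) builds one dot per captured asterisk.
    $seq =~ s/(\*{$min,})/'.' x length($1)/eg;
    return $seq;
}

my $dna = "***********acgtgcta*****atctgat******actgtaaa***tttt**cccc******ccccc******";
print dot_long_runs($dna), "\n";
```

    Runs shorter than the threshold are left alone, so only the leading
    block of asterisks in the sample string should change.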
    --
    Tim Greer, CEO/Founder/CTO, BurlyHost.com, Inc.
    Shared Hosting, Reseller Hosting, Dedicated & Semi-Dedicated servers
    and Custom Hosting. 24/7 support, 30 day guarantee, secure servers.
    Industry's most experienced staff! -- Web Hosting With Muscle!
     
    Tim Greer, Nov 6, 2008
    #3
  4. Vishal G

    Uri Guttman Guest

    >>>>> "x" == xiechao <> writes:

    x> $dna=~s/\*{10,}/"." x length($&)/eg;

    Don't use $&, as it will slow down all the other regexes in your
    program. This is a known problem and trivial to get around: just
    explicitly capture the matched string and refer to it with $1.


    $dna =~ s/(\*{10,})/ '.' x length($1) /eg;
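
    For comparison, a small sketch of two ways to avoid $&. The sub names
    are made up for illustration. The second variant assumes Perl 5.10 or
    later, where the /p modifier and ${^MATCH} are available. As far as I
    know, the $& penalty was largely removed in Perl 5.20, so this matters
    mostly on older perls.

```perl
use strict;
use warnings;
use 5.010;   # /p, ${^MATCH} and say need Perl 5.10 or later

# Variant 1 (hypothetical name): explicit capture group, read back via $1.
sub dots_via_capture {
    my ($seq) = @_;
    $seq =~ s/(\*{10,})/'.' x length($1)/eg;
    return $seq;
}

# Variant 2 (hypothetical name): no capture group; /p makes the matched
# text available in ${^MATCH} for this match only, without touching $&.
sub dots_via_p_flag {
    my ($seq) = @_;
    $seq =~ s/\*{10,}/'.' x length(${^MATCH})/egp;
    return $seq;
}

my $sample = ('*' x 12) . 'acgt' . ('*' x 3);
say dots_via_capture($sample);
say dots_via_p_flag($sample);
```

    Both subs should give the same result; the capture-group form is the
    more portable of the two.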

    uri

    --
    Uri Guttman ------ -------- http://www.sysarch.com --
    ----- Perl Code Review , Architecture, Development, Training, Support ------
    --------- Free Perl Training --- http://perlhunter.com/college.html ---------
    --------- Gourmet Hot Cocoa Mix ---- http://bestfriendscocoa.com ---------
     
    Uri Guttman, Nov 6, 2008
    #4
  5. Vishal G

    Guest

    So you want 5,000,000 asterisks replaced with 5,000,000 dots?
    Why?

    sln
     
    , Nov 7, 2008
    #5
  6. Vishal G

    Rocco Caputo Guest

    On Wed, 5 Nov 2008 22:17:37 -0800 (PST), Vishal G wrote:
    >
    > my $dna =
    > "***********acgtgcta*****atctgat******actgtaaa***tttt**cccc******ccccc******";


    You seem to be working with genomes. Have you looked at bioperl?
    http://www.bioperl.org/wiki/Main_Page

    I don't know whether it will solve your problem, but I suspect it may.

    --
    Rocco Caputo - http://poe.perl.org/
     
    Rocco Caputo, Nov 7, 2008
    #6