running a sub inside regex

Discussion in 'Perl Misc' started by Thomas Isenbarger, Nov 17, 2003.

  1. I have never posted to this group before, so please forgive me if I am
    posting in the wrong place

    in the perl script below, I am trying to construct a string, within a
    regex, based on an earlier captured match.

    i think something about the last elsif block doesn't work. the regex
    within the while loop near the bottom (before the subroutine) returns no
    matches to sequences that should work.

    for molecular biologists out there, I am trying to find the reverse
    complement on the fly inside the regex. the @pairing array is something
    like @pairing = (AU, CG, GC, UA) and defines the substitutions to be made
    in order to build a reverse complement.

    for non molecular biologists out there, a sequence that should match is
    'acgu' if you need a test case.

    input for the program:

    use "acgu" for the target sequence.

    then, for the sequence elements enter "1s=ac", RETURN, "1r", RETURN, RETURN.

    please contact me at isen (AT) mgh (DOT) molbio (DOT) harvard (DOT) edu

    if you are willing to help and/or need more information.

    thank-you,
    Tom Isenbarger


    #!/usr/bin/perl -w
    use re 'eval'; #allows
    execution of code within regex expressions (??{})

    @pairing = (AU, GC, CG, UA, NN);

    print "enter target sequence ";
    $target = uc<STDIN>;
    chomp($target);

    if ($target =~ /[^ACGTU]/) {
    die "invalid characters found in target sequence. exiting isenfind\n\n";
    }

    #ask to keep IUB codes or convert them?
    #ask to leave N or not?

    print "enter sequence elements, one line at a time (return to stop)\n";

    $input = "";
    $i = 0;

    do {
    $input = uc<STDIN>;
    chomp ($input);
    $input =~ s/\s+//g;
    $element[$i] = $input;
    $i++;
    } until ($input eq "");

    for ($i = 0; $i < @element; $i++) {
    $item = $element[$i];
    print "$item\n";
    if ($item =~ /^{/) { #a base
    pairing rule
    $pairingstring = $item;
    $pairingstring =~ s/[{}]//g;
    @pairing = split /,/, $pairingstring;
    #check for valid pair rule syntax
    print "pairing now @pairing\n";
    }

    elsif ($item =~ /^(\d+)S=([ACGUBDHKMNRSVWY]+)/) { #a specific
    sequence element to remember
    $pattern = "(".$2.")";
    $regex .= $pattern;
    $regexpos[$1] = ($i+1); #record the
    position of this sequence in the element list
    }

    elsif ($item =~ /^(\d+)S=(\d+)-(\d+)/) { #a
    non-specific sequence element to remember
    $pattern = "([ACGU]{$2,$3})";
    $regex .= $pattern;
    $regexpos[$1] = ($i+1);
    }

    elsif ($item =~/^(\d+)S/) {
    $lookwhere = $regexpos[$1];
    $pattern = "(\\".$lookwhere.")";
    $regex .= $pattern;
    }

    elsif ($item =~ /^[ACGUBDHKMNRSVWY]+/) { #a specific
    sequence element
    $pattern = "(".$item.")";
    $regex .= $pattern;
    }

    elsif ($item =~ /^(\d+)-(\d+)/) { #a
    non-specific sequence element
    $pattern = "([ACGU]{$1,$2})";
    $regex .= $pattern;
    }

    elsif ($item =~ /^(\d+)P/) { #a palindrome
    of an earlier saved element
    $lookwhere = $regexpos[$1];
    $pattern = "(??{reverse \$".$lookwhere."})";
    $regex .= $pattern;
    }

    elsif ($item =~ /^(\d+)R/) { #a reverse
    complement of an earlier saved element
    $lookwhere = $regexpos[$1];
    $pattern = "(??{revcomp (\$".$lookwhere.', @pairing)})'; #use '
    quotes so that @pairing is not interpolated
    $regex .= $pattern;
    }

    }

    use re 'debug';

    print "regex $regex\n";
    print "target $target\n";

    while ($target =~ /$regex/g) {
    $position = pos $target;
    print "$& $position\n";
    }

    ### subroutines

    sub revcomp {
    my $sequence = shift (@_);
    my @pairing = @_;
    my $rc = undef;

    foreach $pair (@pairing) {
    ($first, $second) = split //, $pair;
    $match{$first} .= $second;
    }
    foreach $key (keys(%match)) {
    if (length($match{$key}) > 1) {
    $match{$key} = "[".$match{$key}."]";
    }
    }

    @string = split (//, reverse ($sequence)); #process string
    one char at a time using substitutions in %match

    foreach $base (@string) {
    $rc .= $match{$base};
    }

    print "in sub rc = $rc\n\n";

    return $rc; #return reverse
    complement of sequence
    }
     
    Thomas Isenbarger, Nov 17, 2003
    #1
    1. Advertising

  2. Thomas Isenbarger wrote:

    > in the perl script below, I am trying to construct a string, within a
    > regex, based on an earlier captured match.


    > i think something about the last elsif block doesn't work. the regex
    > within the while loop near the bottom (before the subroutine) returns no
    > matches to sequences that should work.
    >
    > for molecular biologists out there, I am trying to find the reverse
    > complement on the fly inside the regex. the @pairing array is something
    > like @pairing = (AU, CG, GC, UA) and defines the substitutions to be made
    > in order to build a reverse complement.


    It is quite possible to do it on the fly...something like

    if ($string =~ m/($substring|revcomp($substring))/ig) {
    $match_position = pos + 1;
    $match = $1;
    ...
    }

    sub revcomp {
    #make reverse complement
    return
    }

    Everyone has their own style of perl, but I'm having a hard time
    deciphering what you are hoping to do...so as I'm not all that sure what
    you wish to do, this particular solution may not be what you want...

    Austin
     
    Austin P. So (Hae Jin), Nov 17, 2003
    #2
    1. Advertising

  3. Thomas Isenbarger

    Jay Tilton Guest

    "Austin P. So (Hae Jin)" <> wrote:

    : if ($string =~ m/($substring|revcomp($substring))/ig) {
    : $match_position = pos + 1;
    : $match = $1;
    : ...
    : }
    :
    : sub revcomp {
    : #make reverse complement
    : return
    : }

    Did you test that?

    The m// operator is like a double-quotish string. Subroutine calls are
    not interpolated without some extra work.

    if ($string =~ m/($substring|@{[revcomp($substring)]})/ig) { ... }
    ^^^ ^^
     
    Jay Tilton, Nov 17, 2003
    #3
  4. Jay Tilton <> wrote:
    > "Austin P. So (Hae Jin)" <> wrote:
    >
    >: if ($string =~ m/($substring|revcomp($substring))/ig) {
    >: $match_position = pos + 1;
    >: $match = $1;
    >: ...
    >: }
    >:
    >: sub revcomp {
    >: #make reverse complement
    >: return
    >: }
    >
    > Did you test that?
    >
    > The m// operator is like a double-quotish string. Subroutine calls are
    > not interpolated without some extra work.
    >
    > if ($string =~ m/($substring|@{[revcomp($substring)]})/ig) { ... }



    And shouldn't it be either:

    if ( m// )

    or

    while ( m//g )

    ??


    --
    Tad McClellan SGML consulting
    Perl programming
    Fort Worth, Texas
     
    Tad McClellan, Nov 18, 2003
    #4
  5. [posted & mailed]

    On Mon, 17 Nov 2003, Thomas Isenbarger wrote:

    >for molecular biologists out there, I am trying to find the reverse
    >complement on the fly inside the regex. the @pairing array is something
    >like @pairing = (AU, CG, GC, UA) and defines the substitutions to be made
    >in order to build a reverse complement.
    >
    >for non molecular biologists out there, a sequence that should match is
    >'acgu' if you need a test case.


    A context-free grammar for this would be:

    S -> a S u | u S a | c S g | g S c | [nothing]

    You *could* do this with Perl regexes, but it's unweildy (and inefficient,
    I can assure you):

    my $pair;
    $pair = qr{
    a (??{ $pair }) u |
    u (??{ $pair }) a |
    c (??{ $pair }) g |
    g (??{ $pair }) c |
    (?# nothing )
    }x;
    if ("acgu" =~ /^($pair)$/) {
    print "matched '$1'\n";
    }

    However, it's probably easier just to match a sequence, and then try to
    match its reverse:

    if ("acgu" =~ /^(([acgu]+)(??{ complement($2) }))$/) {
    print "matched '$1' ('$2')\n";
    }

    sub complement {
    my $str = reverse shift;
    $str =~ tr/aucg/uagc/;
    return $str;
    }

    I didn't need to use "use re 'eval'" for either of these, by the way,
    because the variables in the regexes are qr// objects.

    --
    Jeff Pinyan RPI Acacia Brother #734 2003 Rush Chairman
    "And I vos head of Gestapo for ten | Michael Palin (as Heinrich Bimmler)
    years. Ah! Five years! Nein! No! | in: The North Minehead Bye-Election
    Oh. Was NOT head of Gestapo AT ALL!" | (Monty Python's Flying Circus)
     
    Jeff 'japhy' Pinyan, Nov 18, 2003
    #5
  6. Jay Tilton wrote:
    > "Austin P. So (Hae Jin)" <> wrote:
    >
    > : if ($string =~ m/($substring|revcomp($substring))/ig) {
    > : $match_position = pos + 1;
    > : $match = $1;
    > : ...
    > : }


    Oops...it should be:
    $match_position = (pos $string) + 1;

    > : sub revcomp {
    > : #make reverse complement
    > : return
    > : }
    >
    > Did you test that?


    Actually I did after you posted...just to be sure since it had been a
    while since I did this...

    And of course it didn't work...I guess my fallback is that I'm a crappy
    perl programmer...:)...I think I actually rewrote it too way back when...

    > The m// operator is like a double-quotish string. Subroutine calls are
    > not interpolated without some extra work.
    >
    > if ($string =~ m/($substring|@{[revcomp($substring)]})/ig) { ... }


    Yep. That works brilliantly. Thanks.

    And just to polish it further, to get the substring start site, it
    should be:

    if ($string =~ m/(?=($substring|@{[revComp($substring)]}))/ig) {...}

    Good thing I started lurking this newsgroup again... :)

    BTW...where is a reference for subroutine calls within a regex?


    Austin
     
    Austin P. So (Hae Jin), Nov 18, 2003
    #6
  7. Tad McClellan wrote:

    > And shouldn't it be either:
    >
    > if ( m// )
    >
    > or
    >
    > while ( m//g )
    >
    > ??


    Right. My bad...

    the latter to get all the substring instances...

    Austin
     
    Austin P. So (Hae Jin), Nov 18, 2003
    #7
  8. Austin P. So (Hae Jin) <> wrote:

    > BTW...where is a reference for subroutine calls within a regex?



    Step 1 is to recognize that it isn't the regexness that matters,
    it is the double-quotishness that matters.

    Step 2 is to lookup subroutine calls within a double-quotish string. :)


    perldoc -q expand

    How do I expand function calls in a string?


    --
    Tad McClellan SGML consulting
    Perl programming
    Fort Worth, Texas
     
    Tad McClellan, Nov 18, 2003
    #8
  9. Tad McClellan wrote:
    > Austin P. So (Hae Jin) <> wrote:
    >>BTW...where is a reference for subroutine calls within a regex?


    > Step 1 is to recognize that it isn't the regexness that matters,
    > it is the double-quotishness that matters.


    Uh...okay...obviously I haven't gotten to a point where I've had to
    understand these kinds of subtleties in perl...

    > Step 2 is to lookup subroutine calls within a double-quotish string. :)
    > perldoc -q expand
    > How do I expand function calls in a string?


    Hmmm...I guess I never really thought about the search pattern within a
    regex to be considered a "double-quotish string"...


    Austin
     
    Austin P. So (Hae Jin), Nov 18, 2003
    #9
  10. thanks for all your help people.

    here is a (perhaps) better description of what i want to do:

    first, I want to allow the user to input a literal sequence to match such
    as 'ACCCUCUAUUCUC', and also allow the user to match arbitrary seqeunce
    elements of any length and then be able to match that pattern again, the
    reverse, or the reverse complement of that sequence. I also want to allow
    the user to input his own pairing rules for revcomp matches. For example,
    if you would want to match an RNA hairpin structure of 10-20 bases with a
    4 base loop, allowing for non watson and crick pairing, I would think of
    it this way:

    match any sequence that is a minimum length of 10 to a maximum length of
    20, followed by any 4 bases, followed by the reverse complement of what
    was matched in the first part as defined by the pairing rules A-U, C-G,
    G-C, U-A, U-G, G-U.

    in my program this would be:

    (AU, CG, GC, UA, UG, GU)
    1s=10-20
    4-4
    1r

    of course, i don't want any of these particular values hard wired, I want
    them all to be input by the user. the elements I want to include (so far)
    are these:

    -arbitrary sequence of length N to M for matching later: 1s=N-M
    -literal: some string of ACGU such as: ACCCUAUA
    -literal to match and remember for later: 1s=ACCCUA (for example)
    -match a remembered sequence again: 1s (for example to find tandem
    repeats 1 to 10 bases long I would use:

    1s=1-10
    1s

    -match the reverse of a sequence already matched: 1p (for example to
    find inverted repeats 1 to 10 bases long I would use:

    1s=1-10
    1p

    -match the revcomp of a sequence already matched: 1r (for example to
    find hairpins:

    1s=1-10
    1r

    -pairing rules: (AU, CG, UA, GC, AG, GA, UG, GU)

    I also want to eventually do this with approximate matching to allow
    mismatches, insertions, and deletions.

    I am essentially trying to duplicate the PatSearch program (Nucleic Acids
    research 2003, 31(13):3608) that is available on the web, but not as an
    executable I can install locally.

    Thanks for the offer to help and let me know what more information you
    need from me.
     
    Thomas Isenbarger, Nov 18, 2003
    #10
  11. On Mon, 17 Nov 2003, Jeff 'japhy' Pinyan wrote:

    >A context-free grammar for this would be:
    >
    > S -> a S u | u S a | c S g | g S c | [nothing]
    >
    >You *could* do this with Perl regexes, but it's unweildy (and inefficient,
    >I can assure you):


    Actually, in my tests, this method is faster than the "match a lot and
    then match its reverse" method.

    > my $pair;
    > $pair = qr{
    > a (??{ $pair }) u |
    > u (??{ $pair }) a |
    > c (??{ $pair }) g |
    > g (??{ $pair }) c |
    > (?# nothing )
    > }x;
    > if ("acgu" =~ /^($pair)$/) {
    > print "matched '$1'\n";
    > }


    This is also more easily extensible.

    --
    Jeff Pinyan RPI Acacia Brother #734 2003 Rush Chairman
    "And I vos head of Gestapo for ten | Michael Palin (as Heinrich Bimmler)
    years. Ah! Five years! Nein! No! | in: The North Minehead Bye-Election
    Oh. Was NOT head of Gestapo AT ALL!" | (Monty Python's Flying Circus)
     
    Jeff 'japhy' Pinyan, Nov 18, 2003
    #11
  12. Thomas Isenbarger

    Steve Guest

    (Thomas Isenbarger) wrote in message news:<>...
    ......
    > for molecular biologists out there, I am trying to find the reverse
    > complement on the fly inside the regex. the @pairing array is something
    > like @pairing = (AU, CG, GC, UA) and defines the substitutions to be made
    > in order to build a reverse complement.
    >


    I haven't looked in detail at what you are trying to do - I can't
    readily follow it !
    However, your revcomp sub seems a very complex way to go about it. Why
    not just do this:

    ############################################################################
    sub revcompl() #reverse complements primer
    {
    $_[0]=lc($_[0]); # to lowercase
    $_[0]=reverse($_[0]); # reverse order
    $_[0] =~ tr/gatc/ctag/; # complement
    }
    ############################################################################

    or am I missing something ?

    Steve
     
    Steve, Nov 18, 2003
    #12
  13. Thomas Isenbarger

    Greg Bacon Guest

    In article <>,
    Thomas Isenbarger <> wrote:

    : [...]
    : for molecular biologists out there, I am trying to find the reverse
    : complement on the fly inside the regex. the @pairing array is something
    : like @pairing = (AU, CG, GC, UA) and defines the substitutions to be made
    : in order to build a reverse complement.

    Well, Abigail hasn't posted a Single Spiffy Regular Expression to
    do it, so I feel better about not having found one.

    Included below is a solution I developed in test-driven fashion. I
    believe it implements everything but on-the-fly pairing rules.

    [...]

    Enjoy,
    Greg

    -----

    #! /usr/local/bin/perl

    use warnings;
    use strict;

    use Data::Dumper;

    sub revcomp {
    local $_ = reverse shift;

    tr[AGCUN]
    [UCGAN];

    $_;
    }

    sub cmp_pairs {
    my $l1 = shift;
    my $l2 = shift;

    my $pass = @$l1 == @$l2;

    if ($pass) {
    foreach my $i (0 .. $#$l1) {
    my $a = $l1->[$i];
    my $b = $l2->[$i];

    if (@$a != @$b || grep $a->[$_] ne $b->[$_], 0 .. $#$a) {
    $pass = 0;
    last;
    }
    }
    }

    unless ($pass) {
    local $Data::Dumper::Indent = 1;
    my $want = Dumper $l1;
    my $got = Dumper $l2;

    chomp $got;
    die "bad result:\n" .
    " - want:\n" .
    $want .
    " - got:\n" .
    $got;
    }

    1;
    }

    sub cmp_lstpairs {
    my $want = shift;
    my $got = shift;
    my $tag = shift || "";

    $tag = $tag . ": " if $tag;
    die "${tag}bad results (" . scalar @$got . ")\n"
    unless @$got == @$want;

    cmp_pairs $want->[$_], $got->[$_] for 0 .. $#$want;

    1;
    }

    sub mkspec {
    my @spec;

    my %search = (
    # a specific sequence element to remember
    literal => qr/^(\w+-)?S=([ACGUBDHKMNRSVWY]+)\z/s,

    # replay something we remembered
    replay => qr/^(\w+-)?S-(\w+)\z/s,

    # a reverse complement of an earlier saved element
    revcomp => qr/^(\w+-)?R-(\w+)\z/s,

    # a palindrome of an earlier saved element
    palindrome => qr/^(\w+-)?P-(\w+)\z/s,

    # a range; becomes [ACGU]{$m,$n}
    range => qr/^(\w+-)?S=(\d+-\d+)\z/s,
    );

    PART:
    for (@_) {
    while (my($type,$pat) = each %search) {
    if (/$pat/) {
    my @name = defined $1 ? substr($1,0,-1) : ();
    push @spec => [ $type => $2, @name ];

    keys %search; # reset each
    next PART;
    }
    }

    # unrecognized
    die "$0: unrecognized spec: [$_]\n";
    }

    \@spec;
    }

    sub pattern {
    my $pair = shift;
    my $got = shift || {};

    my($type,$what,$name) = @$pair;

    my %mkpat = (
    literal => sub { qr/($what)/ },

    revcomp => sub {
    if (exists $got->{$what}) {
    my $revc = revcomp $got->{$what};
    qr/($revc)/;
    }
    },

    replay => sub {
    if (exists $got->{$what}) {
    my $str = $got->{$what};
    qr/($str)/;
    }
    },

    palindrome => sub {
    if (exists $got->{$what}) {
    my $str = reverse $got->{$what};
    qr/($str)/;
    }
    },

    range => sub {
    my($m,$n) = split /-/, $what;
    qr/([ACGU]{$m,$n})/;
    },
    );

    my $pat = exists $mkpat{$type}
    ? $mkpat{$type}->() || qr/(?!)/
    : qr/(?!)/;

    ($name, $pat);
    }

    sub find {
    my $targ = shift;
    my $spec = shift;

    return unless @$spec;

    my @hits;
    my($name,$pat) = pattern $spec->[0];

    ATTEMPT:
    while ($targ =~ /$pat/g) {
    my $got = {};
    $got->{$name} = $1 if defined $name;

    my @maybe = [ $1, pos($targ) - length $1 ];

    if (@$spec > 1) {
    foreach my $part (@{$spec}[1 .. $#$spec]) {
    my($name,$pat) = pattern $part, $got;

    # need /gc to adjust pos
    if ($targ =~ /\G$pat/gc) {
    push @maybe => [ $1, pos($targ) - length $1 ];
    }
    else {
    next ATTEMPT;
    }
    }
    }

    push @hits => \@maybe;
    }

    @hits;
    }

    ## tests
    my $TEST_TAG = qr/^test_/;

    sub test_cmp_pairs {
    die "should have passed" unless eval {
    cmp_pairs [[ a => 1 ], [ b => 2 ]],
    [[ a => 1 ], [ b => 2 ]]
    };

    die "should have failed" if eval {
    cmp_pairs [[ a => 1 ], [ b => 2 ]],
    [[ a => 1 ], [ b => 2 ], [ c => 3 ]]
    };

    die "should have failed" if eval {
    cmp_pairs [[ a => 1 ], [ b => 2 ]],
    [[ a => 1 ], [ c => 2 ]]
    };
    }

    sub test_cmp_lstpairs {
    my @base;
    my @chk;

    @base = ([[ a => 1 ], [ b => 2 ]]);
    @chk = ([[ a => 1 ], [ b => 2 ]]);
    die "should have passed" unless eval { cmp_lstpairs \@base, \@chk };

    @base = ([[ a => 1 ], [ b => 2 ]], [[ c => 3 ]]);
    @chk = ([[ a => 1 ], [ b => 2 ]], [[ c => 3 ]]);
    die "should have passed" unless eval { cmp_lstpairs \@base, \@chk };

    @base = ([[ a => 1 ], [ b => 2 ]], [[ c => 3 ]]);
    @chk = ([[ a => 1 ], [ b => 2 ]], [[ c => 4 ]]);
    die "should have failed" if eval { cmp_lstpairs \@base, \@chk };
    }

    sub test_mkspec_lit_AC {
    my $want = [[ literal => 'AC', 'str' ]];
    my $got = mkspec 'str-S=AC';

    cmp_pairs $want, $got;
    }

    sub test_mkspec_lit_AC_rev_1 {
    my $want = [[ literal => 'AC', 'one' ], [ revcomp => 'one' ]];
    my $got = mkspec 'one-S=AC', 'R-one';

    cmp_pairs $want, $got;
    }

    sub test_find_lit_AC {
    my @want = ([[ AC => 0 ]]);

    my $spec = mkspec 'S=AC';
    my @got = find 'AC' => $spec;

    cmp_lstpairs \@want, \@got;
    }

    sub test_find_lit_AC_offset {
    my @want = ([[ AC => 3 ]]);

    my $spec = mkspec 'str-S=AC';
    my @got = find 'XXXAC' => $spec;

    cmp_lstpairs \@want, \@got, "str-S=AC";

    $spec = mkspec 'S=AC';
    @got = find 'XXXAC' => $spec;

    cmp_lstpairs \@want, \@got, "S=AC";
    }

    sub test_find_multi_lit_AC {
    my @want = ([[ AC => 0 ]], [[ AC => 3 ]]);

    my $spec = mkspec 'S=AC';
    my @got = find 'ACXAC' => $spec;

    cmp_lstpairs \@want, \@got;
    }

    sub test_find_adjacent_lit {
    my @want;
    my $spec;
    my @got;

    @want = ([[ UNAANU => 0 ], [ GC => 6 ]]);

    $spec = mkspec qw/ S=UNAANU S=GC /;
    @got = find 'UNAANUGC' => $spec;

    cmp_lstpairs \@want, \@got;

    #####

    @want = ();

    $spec = mkspec qw/ S=UNAANU S=GC /;
    @got = find 'UNAANU#GC' => $spec;

    cmp_lstpairs \@want, \@got, "expect no match";
    }

    sub test_find_lit_rev {
    my @want = ([[ AC => 0 ], [ GU => 2 ]]);

    my $spec = mkspec 'foo-S=AC', 'R-foo';
    my @got = find 'ACGU' => $spec;

    cmp_lstpairs \@want, \@got;
    }

    sub test_find_lit_rev_lit {
    my @want = ([[ AC => 0 ], [ GU => 2 ], [ NN => 4 ]]);

    my $spec = mkspec 'a-S=AC', 'R-a', 'b-S=NN';
    my @got = find 'ACGUNN' => $spec;

    cmp_lstpairs \@want, \@got;
    }

    sub test_mkspec_separated {
    my $want = [
    [ literal => 'NN', 'foo' ],
    [ literal => 'AC' ],
    [ replay => 'foo' ],
    ];
    my $got = mkspec 'foo-S=NN', 'S=AC', 'S-foo';

    cmp_pairs $want, $got;
    }

    sub test_find_separated {
    my @want = ([[ NN => 0 ], [ AC => 2 ], [ NN => 4 ]]);

    my $spec = mkspec 'foo-S=NN', 'S=AC', 'S-foo';
    my @got = find 'NNACNN' => $spec;

    cmp_lstpairs \@want, \@got;
    }

    sub test_mkspec_palindrome {
    my $want = [
    [ literal => 'ACGU', 'abc' ],
    [ palindrome => 'abc' ],
    ];
    my $got = mkspec 'abc-S=ACGU', 'P-abc';

    cmp_pairs $want, $got;
    }

    sub test_find_palindrome {
    my @want = ([[ ACGU => 0 ], [ UGCA => 4 ]]);

    my $spec = mkspec 'abc-S=ACGU', 'P-abc';
    my @got = find 'ACGUUGCA' => $spec;

    cmp_lstpairs \@want, \@got;
    }

    sub test_mkspec_range {
    my $want = [[ range => '3-5' ]];
    my $got = mkspec 'S=3-5';

    cmp_pairs $want, $got;

    #####

    $want = [[ range => '1-3', 'quux' ]];
    $got = mkspec 'quux-S=1-3';

    cmp_pairs $want, $got;
    }

    sub test_find_range {
    my @want;
    my $spec;
    my @got;

    @want = ([[ ACG => 0 ]]);

    $spec = mkspec 'S=1-3';
    @got = find 'ACG' => $spec;

    cmp_lstpairs \@want, \@got;

    #######

    @want = ([[ AC => 0 ]]);

    $spec = mkspec 'S=1-3';
    @got = find 'AC' => $spec;

    cmp_lstpairs \@want, \@got;

    #######

    @want = ([[ U => 0 ]]);

    $spec = mkspec 'S=1-3';
    @got = find 'UXXX' => $spec;

    cmp_lstpairs \@want, \@got;

    #####

    @want = ([[ UUU => 2 ]]);

    $spec = mkspec 'S=1-3';
    @got = find 'XXUUU' => $spec;

    cmp_lstpairs \@want, \@got, "UUU => 2";
    }

    sub test_find_dangling_backref {
    my @want;
    my $spec;
    my @got;

    @want = ();
    $spec = mkspec 'S-foobar';
    @got = find 'ACGUUGCA' => $spec;

    cmp_lstpairs \@want, \@got;

    #####

    @want = ();
    $spec = mkspec 'R-notthere';
    @got = find 'ACGUUGCA' => $spec;

    cmp_lstpairs \@want, \@got;
    }

    sub subname {
    my $sub = (caller 1)[3];

    $sub =~ s/^main:://;

    $sub;
    }

    sub find_tests {
    my $caller = shift;

    my $test = {};

    foreach my $sym (keys %main::) {
    no strict 'refs';

    next unless $sym =~ /$TEST_TAG/;

    $test->{$sym} = \&$sym if exists &$sym;
    }

    $test;
    }

    sub run_tests {
    my $total = 0;
    my $pass = 0;
    my $fail = 0;

    my $test = find_tests;

    local $| = 1;
    local $@;

    foreach my $name (sort keys %$test) {
    ++$total;

    my $label = $name;
    $label =~ s/$TEST_TAG//;
    print "$label... ";

    $@ = "";
    eval { $test->{$name}->() };

    if ($@) {
    ++$fail;
    print "FAIL: $@";
    }
    else {
    ++$pass;
    print "ok\n";
    }
    }

    ($total, $pass, $fail);
    }

    sub supertest {
    my $name = subname;
    die "$0: name = [$name]\n" unless $name eq 'supertest';
    die "$0: bad supertest name: $name\n" if $name =~ /$TEST_TAG/;

    my($total, $pass, $fail) = run_tests;

    if ($total != $pass + $fail) {
    print "pass ($pass) + fail ($fail) != total ($total)\n";
    }
    else {
    if ($total) {
    my $result = $fail ? "FAIL" : "PASS";
    my $score = sprintf "%.1f", $pass / $total * 100;

    print "\n",
    "$result: $pass/$total ($score%)\n";
    }
    else {
    print "No tests.\n";
    }
    }

    not $fail == 0;
    }

    ## main
    while (@ARGV && $ARGV[0] =~ /^-/) {
    my $arg = shift;

    last if $arg eq '--';

    if ($arg eq '--test') {
    exit supertest;
    }
    else {
    die "$0: unknown option '$arg'\n";
    }
    }

    my @search = qw/ 1-S=AC R-1 /;
    my $str = 'ACGUXYZACGU';

    print "Searching for [@search] in $str:\n";
    my @matches = find $str => mkspec qw/ 1-S=AC R-1 /;

    if (@matches) {
    for (@matches) {
    print " - match:\n";
    foreach my $match (@$_) {
    my($str,$pos) = @$match;

    print " - [$str] => $pos\n";
    }
    }
    }
    else {
    print " - no matches.\n";
    }

    __END__
    --
    If credit expansion, protectionism, and government spending were a path
    to prosperity, mankind would have long ago created heaven on earth.
    -- Lew Rockwell
     
    Greg Bacon, Nov 21, 2003
    #13
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. THY
    Replies:
    1
    Views:
    414
    Steve C. Orr, MCSD
    Aug 19, 2003
  2. Ben
    Replies:
    2
    Views:
    964
  3. Replies:
    3
    Views:
    834
    Reedick, Andrew
    Jul 1, 2008
  4. Lawrence D'Oliveiro

    Death To Sub-Sub-Sub-Directories!

    Lawrence D'Oliveiro, May 5, 2011, in forum: Java
    Replies:
    92
    Views:
    2,179
    Lawrence D'Oliveiro
    May 20, 2011
  5. Koszalek Opalek

    Outer scope of a sub inside a sub

    Koszalek Opalek, Oct 27, 2010, in forum: Perl Misc
    Replies:
    10
    Views:
    196
    Ilya Zakharevich
    Oct 29, 2010
Loading...

Share This Page