How a JAPH script on CPAN works.

As a learning exercise, I'm trying to understand a number of the JAPH
examples on CPAN (http://www.cpan.org/misc/japh). One script
(attributed to MeowChow at PerlMonks) is interesting - it looks like a
piece of a DNA molecule, and when run it prints the requisite "Just
Another Perl Hacker".

I figured out how this works, and thought I would post it here in case
anyone else is interested. I've reformatted the "processing" part of
the script (at the end) for readability and have added comments to
explain what's happening, but otherwise I have not changed MeowChow's
code.

[...snip...]============================================================
C---G
C--G
CG
AT
T--A
C---G
G----C
G----C
A----T
T---A
G--C
CG
TA
G--C
A---T
G----C
A----T
G----C
.;
# Adenine=0 Cytosine=1 Guanine=2 Thymine=3
@_{A => C => G => T =>} = 0..3;

# Parse the molecule, ignoring all but the nucleotides.
# Every 9 lines, switch whether the first or second base
# is "significant" (used). Replace the significant base
# with its numeric equivalent from the above hash.
s!.*(\w).*(\w).*\n!$_{ ($- ++ /9 % 2) ? $2 : $1}!gex;
# Now $_ is a long string of numbers 0-3, ie: 130013021221...

# Every four characters of $_ spell out the ASCII code of one
# character as a base-4 number. Parse $_ four characters at a
# time, convert the base-4 value to decimal, and replace the
# four-character substring with the ASCII character it represents:
s!(.)(.)(.)(.)!chr( 64*$1 + 16*$2 + 4*$3 + $4)!gex;
# Now $_ is literally: qq/print"Just Another Perl Hacker\n"/
eval # execute the print command, which is the value of $_

[...snip...]============================================================
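
For anyone who finds the first substitution hard to read, here is a rough longhand sketch of what I believe it does. The names helix_to_digits, %digit and $sample are my own, and the 12-rung helix is a made-up sample, not part of MeowChow's script. It uses an ordinary counter where the original uses $-.

```perl
use strict;
use warnings;

# Same mapping as the original hash slice: A=0 C=1 G=2 T=3.
my %digit;
@digit{qw(A C G T)} = 0 .. 3;

# Longhand version of the first substitution. For each rung of the helix,
# keep the left base in lines 0-8, the right base in lines 9-17, and so on.
sub helix_to_digits {
    my ($helix) = @_;
    my ($digits, $line_no) = ('', 0);
    for my $line (split /\n/, $helix) {
        # Skip lines without two bases; they do not advance the counter.
        my ($left, $right) = $line =~ /(\w).*(\w)/ or next;
        my $use_right = int($line_no++ / 9) % 2;
        $digits .= $digit{ $use_right ? $right : $left };
    }
    return $digits;
}

# Hypothetical 12-rung sample (not from MeowChow's script).
my $sample = join "\n", qw(C-G A-T G-C G-C C-G A-T A-T C-G C-G G-C T-A T-A);

my $digits = helix_to_digits($sample);
print "$digits\n";    # 102210011100
```

The first nine rungs contribute their left base and the last three their right base. Read four digits at a time, as the second substitution does, 1022 1001 1100 are the codes 74, 65 and 80, i.e. "JAP".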

Further explanation:
The value of $_ after the first substitution looks like this:
13001302122112321310020210221311130313100200100112321233131012201...

Taking that four bytes at a time and converting base-4 to base-10:

1300 = 64*1 + 16*3 + 4*0 + 0 = 112 (chr 112 = "p")
1302 = 64*1 + 16*3 + 4*0 + 2 = 114 (chr 114 = "r")
1221 = 64*1 + 16*2 + 4*2 + 1 = 105 (chr 105 = "i")

(the first three letters of "print" which is the first part of the
command that will be eval'd).
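
The same arithmetic can be checked with a few lines of Perl. This is only a sketch of the conversion, written as a loop rather than as MeowChow's one-line substitution. base4_to_text is a name I made up, and $prefix is just the first 20 digits of the string quoted above.

```perl
use strict;
use warnings;

# Convert a string of base-4 digits to text, four digits per character.
sub base4_to_text {
    my ($digits) = @_;
    my $text = '';
    while ($digits =~ /([0-3]{4})/g) {
        my $code = 0;
        # Accumulate the value digit by digit: code = code*4 + digit.
        $code = $code * 4 + $_ for split //, $1;
        $text .= chr $code;
    }
    return $text;
}

my $prefix = '13001302122112321310';   # first 20 digits of $_
print base4_to_text($prefix), "\n";    # prints "print"
```

Multiplying by 4 at each step gives the same result as the explicit 64/16/4/1 weights in the original substitution.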

Now on to Erudil's camel code!
 
