As a learning exercise, I'm trying to understand a number of the JAPH
examples on CPAN (http://www.cpan.org/misc/japh). One script
(attributed to MeowChow at PerlMonks) is interesting - it looks like a
piece of a DNA molecule, and when run it prints the requisite "Just
Another Perl Hacker".
I figured out how this works, and thought I would post it here in case
anyone else is interested. I've reformatted the "processing" part of
the script (at the end) for readability and have added comments to
explain what's happening, but otherwise I have not changed MeowChow's
code.
[...snip...]============================================================
C---G
C--G
CG
AT
T--A
C---G
G----C
G----C
A----T
T---A
G--C
CG
TA
G--C
A---T
G----C
A----T
G----C
.;
# Adenine=0 Cytosine=1 Guanine=2 Thymine=3
@_{A => C => G => T =>} = 0..3;
# Parse the molecule, ignoring all but the nucleotides.
# Every 9 lines, switch whether the first or second nucleotide
# is "significant" (used). Replace the significant nucleotide
# with its numeric equivalent from the above hash.
s!.*(\w).*(\w).*\n!$_{ ($- ++ /9 % 2) ? $2 : $1}!gex;
# Now $_ is a long string of digits 0-3, e.g. 130013021221...
# Each four bytes of $_ represents the ASCII code of a character
# expressed in base-4 numeric notation. Parse $_ four bytes at
# a time and convert the base-4 value to decimal and replace the
# four-byte substring with the ASCII character it represents:
s!(.)(.)(.)(.)!chr( 64*$1 + 16*$2 + 4*$3 + $4)!gex;
# Now $_ is literally: qq/print"Just Another Perl Hacker\n"/
eval $_; # execute the print command, which is the value of $_
[...snip...]============================================================
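To make the first substitution easier to follow, here is a minimal sketch of the same selection logic written long-hand. It is not MeowChow's code: the names %digit_for, strand_to_digits and $line_no are made up for this illustration, an ordinary counter stands in for $-, and the three-line input is an invented sample rather than part of the real molecule.

```perl
use strict;
use warnings;

# Same mapping as @_{A => C => G => T =>} = 0..3, but in a named hash
# (%digit_for is a name chosen for this sketch).
my %digit_for;
@digit_for{qw(A C G T)} = 0 .. 3;

# Turn a molecule into base-4 digits: use the first letter of each line
# for lines 0-8, the second letter for lines 9-17, the first again for
# lines 18-26, and so on.
sub strand_to_digits {
    my ($molecule) = @_;
    my $line_no = 0;    # stands in for $- in the original
    my $digits  = '';
    for my $line (split /\n/, $molecule) {
        # Skip any line that does not contain two letters.
        my ($first, $second) = $line =~ /(\w)\W*(\w)/ or next;
        # int(n / 9) % 2 is 0 for the first nine lines, 1 for the next nine.
        my $use_second = int($line_no++ / 9) % 2;
        $digits .= $digit_for{ $use_second ? $second : $first };
    }
    return $digits;
}

# Invented three-line sample: the first letters are C, T, A.
print strand_to_digits("C---G\nT--A\nAT\n"), "\n";    # prints 130
```

The original gets the same effect in one expression because % works on integers, so $-++/9%2 only changes value every ninth line.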
Further explanation:
The value of $_ after the first substitution looks like this:
13001302122112321310020210221311130313100200100112321233131012201...
Taking that four bytes at a time and converting base-4 to base-10:
1300 = 64*1 + 16*3 + 4*0 + 0 = 112 (chr 112 = "p")
1302 = 64*1 + 16*3 + 4*0 + 2 = 114 (chr 114 = "r")
1221 = 64*1 + 16*2 + 4*2 + 1 = 105 (chr 105 = "i")
(the first three letters of "print" which is the first part of the
command that will be eval'd).
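If you want to check the arithmetic yourself, the following small sketch applies the same four-digits-at-a-time conversion to the first 64 digits of the string quoted above. The sub name decode_base4 and the variable $prefix are my own, chosen for illustration; only the substitution itself mirrors the original.

```perl
use strict;
use warnings;

# Decode a string of base-4 digits, four digits per character.
# (decode_base4 is a name invented for this sketch.)
sub decode_base4 {
    my ($digits) = @_;
    (my $text = $digits)
        =~ s/([0-3])([0-3])([0-3])([0-3])/chr(64*$1 + 16*$2 + 4*$3 + $4)/ge;
    return $text;
}

# First 64 digits of the value of $_ shown above (16 characters' worth).
my $prefix = '1300130212211232131002021022131113031310020010011232123313101220';
print decode_base4($prefix), "\n";    # prints: print"Just Anoth
```

The /e modifier is what makes the replacement side run as Perl code rather than being treated as a literal string.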
Now on to Erudil's camel code!