Anyone care to explain this one?

David Liang · Sep 11, 2009

So, on my machine this gives me

$ echo "abc" | perl -pe 'tr/a-z/a-m/cd'
abck

From reading the man pages it seems to me it should have deleted the
complement of a-z unless a character is in the replacement list, but
where did the "k" come from?

Even more perplexing is

$ echo "abc123op" | perl -pe 'tr/a-z/0-k/cd'
abcabcop:

My LC_COLLATE is "C", LANG is "en_US.utf8", and I'm running Perl
5.10.0.

Alan Curry · Sep 11, 2009

> So, on my machine this gives me
>
> $ echo "abc" | perl -pe 'tr/a-z/a-m/cd'
> abck
>
> From reading the man pages it seems to me it should have deleted the
> complement of a-z unless a character is in the replacement list, but
> where did the "k" come from?

echo "abc" generates 4 characters including the newline. So you have the
equivalent of

perl -e '$_ = "abc\n" ; tr/a-z/a-m/cd; print'

Notice that your "abck" was not followed by a newline on perl's stdout! The
result when I ran it actually looked like this:

$ echo "abc" | perl -pe 'tr/a-z/a-m/cd'
abck$

with the shell prompt glued to the k.

Why k? Well, what is the complement of the set a-z? It's the set of all
characters that aren't a-z. The first character that's not in a-z is "\0".
The next is "\1", then "\2", etc. So another equivalent of your original
operation, with the complement spelled out, is:

perl -e '$_ = "abc\n" ; tr/\0-`{-\377/a-m/d; print'

(I'm not sure \377 is the proper upper limit in this age of large charsets,
but you get the idea.) The "`" is the character before "a" in ASCII, and the
"{" is the character after "z".
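As a quick sanity check of that expansion, here is a small Python sketch (Python only because it is easy to run; it is not Perl's implementation). It assumes the same 0-255 character range as the \377 guess above, and the variable names are made up for illustration.

```python
# Assumption: characters are single bytes, 0..255 (i.e. up to \377).
lowercase = set(range(ord("a"), ord("z") + 1))

# The complement of a-z: every code point in range that is not a-z.
complement = [c for c in range(256) if c not in lowercase]

# The two explicit ranges \0-` and {-\377.
low = list(range(0, ord("`") + 1))
high = list(range(ord("{"), 0o377 + 1))

print(complement == low + high)  # True
print(len(complement))           # 230
```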

So what happened? The 13 replacement characters a-m were matched up against
the first 13 characters in the search list:

\0  \1  \2  \3  \4  \5  \6  \7  \10 \11 \12 \13 \14
a   b   c   d   e   f   g   h   i   j   k   l   m

\12 (octal 12, decimal 10) is also known as \n, the newline character. So it
got translated to k. The "abc" input characters didn't match anything in the
search list (they belong to the a-z set that was complemented out) so they
pass through unchanged. If you had provided any input characters that were
neither a-z nor \0-\14 they would have been matched and removed because of
the /d modifier.
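The pairing and deletion described above can be modelled in a few lines of Python. This is only a sketch of the behaviour as explained in this post, not how Perl implements tr///; the function name tr_complement_delete and the limit of 256 code points are assumptions made for the example.

```python
import string


def tr_complement_delete(text, search, replacement, limit=256):
    """Rough model of tr/SEARCH/REPLACEMENT/cd (hypothetical helper)."""
    search_set = set(search)
    # /c: the effective search list is every character below `limit`
    # that is NOT in SEARCH, in ascending order.
    complement = [chr(c) for c in range(limit) if chr(c) not in search_set]
    # Pair the effective search list with the replacement list by position;
    # zip stops at the shorter list, so later characters get no replacement.
    table = dict(zip(complement, replacement))
    out = []
    for ch in text:
        if ch in search_set:
            out.append(ch)         # not in the effective search list: unchanged
        elif ch in table:
            out.append(table[ch])  # matched and has a replacement
        # otherwise: matched but no replacement, so /d deletes it
    return "".join(out)


a_to_z = string.ascii_lowercase
a_to_m = a_to_z[:13]

print(repr(tr_complement_delete("abc\n", a_to_z, a_to_m)))      # 'abck'
print(repr(tr_complement_delete("abc XYZ\n", a_to_z, a_to_m)))  # 'abck'
```

The second call shows the deletion case: the space and the uppercase letters are in the complement but lie past the first 13 positions, so they are dropped.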

I don't know if it would ever be a good idea to use the /c modifier, the /d
modifier, and a non-empty replacement list all in a single tr/// operation.
Having explained in detail what it did and why, it seems like even if that's
what you wanted to do, you should find a less obfuscated way to do it.

> Even more perplexing is
>
> $ echo "abc123op" | perl -pe 'tr/a-z/0-k/cd'
> abcabcop:

Just like above, this means a-z pass through unchanged, but this time the
replacement list is much longer. "0" is "\x30" and "k" is "\x6b" in ASCII, so
you have input characters "\0" through "\x3b" being translated to 0-k. This
happens to include 0-9 ("\x30" through "\x39") being translated to "\x60"
through "\x69" ("`" followed by a-i), which is why your "123" came out as
"abc". And the newline became a colon this time. The complete
translation table you've asked for is:

\0  \1  \2  \3  \4  \5  \6  \7  \10 \11 \12 \13 \14 \15 \16 \17
0   1   2   3   4   5   6   7   8   9   :   ;   <   =   >   ?

\20 \21 \22 \23 \24 \25 \26 \27 \30 \31 \32 \33 \34 \35 \36 \37
@   A   B   C   D   E   F   G   H   I   J   K   L   M   N   O

SPC !   "   #   $   %   &   '   (   )   *   +   ,   -   .   /
P   Q   R   S   T   U   V   W   X   Y   Z   [   \   ]   ^   _

0   1   2   3   4   5   6   7   8   9   :   ;
`   a   b   c   d   e   f   g   h   i   j   k

and anything in the input that's neither a-z nor \0-\x3b would be deleted,
but once again you didn't include any of those.
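To spot-check a few entries of that table, the same idea can be written out in Python. Again this is a sketch under the assumption of a 0-255 character range, not Perl's own code; char_range and translate are hypothetical helper names.

```python
def char_range(first, last):
    """Expand a range such as 0-k into a list of characters."""
    return [chr(c) for c in range(ord(first), ord(last) + 1)]


search = set(char_range("a", "z"))
replacement = char_range("0", "k")

# Complement of a-z within 0..255, paired position by position with 0-k.
complement = [chr(c) for c in range(256) if chr(c) not in search]
table = dict(zip(complement, replacement))


def translate(text):
    # a-z pass through; mapped characters are replaced; the rest are deleted.
    return "".join(ch if ch in search else table.get(ch, "") for ch in text)


print(len(replacement))                                  # 60
print(table["\n"], table["0"], table["1"], table[";"])   # : ` a k
print(translate("abc123op\n"))                           # abcabcop:
```

Anything past ";" (for example "<") has no entry in the table, which is the deletion case mentioned above.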

David Liang · Sep 11, 2009

Ahh... Thanks for that elucidating explanation!

David Liang · Sep 19, 2009

> I don't know if it would ever be a good idea to use the /c modifier, the /d
> modifier, and a non-empty replacement list all in a single tr/// operation.
> Having explained in detail what it did and why, it seems like even if that's
> what you wanted to do, you should find a less obfuscated way to do it.

I was trying out those strange cases because I was writing a
semi-clone of the transliteration operator for Python:

http://github.com/bmdavll/StringTransform

Thanks again for the explanation--it really cleared up the whole
complements deal for me.
