Anyone care to explain this one?

David Liang

So, on my machine this gives me

$ echo "abc" | perl -pe 'tr/a-z/a-m/cd'
abck

From reading the man pages it seems to me it should have deleted the
complement of a-z unless a character is in the replacement list, but
where did the "k" come from?

Even more perplexing is

$ echo "abc123op" | perl -pe 'tr/a-z/0-k/cd'
abcabcop:

My LC_COLLATE is "C", LANG is "en_US.utf8", and I'm running Perl
5.10.0.
 
Alan Curry

> So, on my machine this gives me
>
> $ echo "abc" | perl -pe 'tr/a-z/a-m/cd'
> abck
>
> From reading the man pages it seems to me it should have deleted the
> complement of a-z unless a character is in the replacement list, but
> where did the "k" come from?

echo "abc" generates 4 characters including the newline. So you have the
equivalent of

perl -e '$_ = "abc\n" ; tr/a-z/a-m/cd; print'

Notice that your "abck" was not followed by a newline on perl's stdout! The
result when I ran it actually looked like this:

$ echo "abc" | perl -pe 'tr/a-z/a-m/cd'
abck$

with the shell prompt glued to the k.

Why k? Well, what is the complement of the set a-z? It's the set of all
characters that aren't a-z. The first character that's not in a-z is "\0".
The next is "\1", then "\2", etc. So a further equivalent of your original
operation is:

perl -e '$_ = "abc\n" ; tr/\0-`{-\377/a-m/d; print'

(I'm not sure \377 is the proper upper limit in this age of large charsets,
but you get the idea.) The "`" is the character before "a" in ASCII, and the
"{" is the character after "z".

So what happened? The 13 replacement characters a-m were matched up against
the first 13 characters in the search list:

\0  \1  \2  \3  \4  \5  \6  \7  \10 \11 \12 \13 \14
a   b   c   d   e   f   g   h   i   j   k   l   m

\12 (octal 12, decimal 10) is also known as \n, the newline character. So it
got translated to k. The "abc" input characters didn't match anything in the
search list (they belong to the a-z set that was complemented out) so they
pass through unchanged. If you had provided any input characters that were
neither a-z nor \0-\14 they would have been matched and removed because of
the /d modifier.

I don't know if it would ever be a good idea to use the /c modifier, the /d
modifier, and a non-empty replacement list all in a single tr/// operation.
Having explained in detail what it did and why, it seems like even if that's
what you wanted to do, you should find a less obfuscated way to do it.

> Even more perplexing is
>
> $ echo "abc123op" | perl -pe 'tr/a-z/0-k/cd'
> abcabcop:

Just like above, this means a-z pass through unchanged, but this time the
replacement list is much longer. "0" is "\x30" and "k" is "\x6b" in ASCII, so
the replacement list has 60 characters, and you have input characters "\0"
through "\x3b" being translated to 0-k. This happens to include 0-9 ("\x30"
through "\x39") being translated to "\x60" through "\x69" ("`" and a-i), which
is why your 123 came out as abc. And the newline became a colon this time. The
complete translation table you've asked for is:

\0  \1  \2  \3  \4  \5  \6  \7  \10 \11 \12 \13 \14 \15 \16 \17
0   1   2   3   4   5   6   7   8   9   :   ;   <   =   >   ?

\20 \21 \22 \23 \24 \25 \26 \27 \30 \31 \32 \33 \34 \35 \36 \37
@   A   B   C   D   E   F   G   H   I   J   K   L   M   N   O

SPC ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ;
P   Q R S T U V W X Y Z [ \ ] ^ _ ` a b c d e f g h i j k

and anything in the input that's neither a-z nor \0-\x3b would be deleted,
but once again you didn't include any of those.
 
David Liang

> I don't know if it would ever be a good idea to use the /c modifier, the /d
> modifier, and a non-empty replacement list all in a single tr/// operation.
> Having explained in detail what it did and why, it seems like even if that's
> what you wanted to do, you should find a less obfuscated way to do it.

I was trying out those strange cases because I was writing a semi-
clone of the transliteration operator for Python:

http://github.com/bmdavll/StringTransform

Thanks again for the explanation--it really cleared up the whole
complements deal for me.
 
