"un-meta" the control characters

Discussion in 'Perl Misc' started by Paul Lalli, Nov 2, 2009.

  1. Paul Lalli

    Paul Lalli Guest

    A coworker just presented me with this task. I came up with two
    solutions, but I don't like either of them. He has a text document
    and wants to scan it for characters such as newline, tab, form feed,
    carriage return, vertical tab. If found, he wants to replace them
    with their typical representation (ie, \n, \t, \f, \r, \v).

    I first gave him the obvious:
    $string =~ s/\n/\\n/;
    $string =~ s/\t/\\t/;
    $string =~ s/\f/\\f/;
    $string =~ s/\r/\\r/;
    $string =~ s/\v/\\v/;

    which I don't like because of how much copy/paste is involved. Then I
    came up with:

    for (qw/n t f r v/) {
    my $meta = eval("\\$_");
    $string =~ s/$meta/\\$_/;
    }

    which I don't like, because the comment he'd have to put in the code
    to explain it would be longer than the code itself, or the first
    version.

    So can anyone think of a better way? Is there any kind of intrinsic
    link between a newline character and the letter 'n' that could be used
    to go "backwards" here?

    Thanks,
    Paul Lalli
     
    Paul Lalli, Nov 2, 2009
    #1

  2. Uri Guttman

    Uri Guttman Guest

    >>>>> "PL" == Paul Lalli <> writes:

    PL> A coworker just presented me with this task. I came up with two
    PL> solutions, but I don't like either of them. He has a text document
    PL> and wants to scan it for characters such as newline, tab, form feed,
    PL> carriage return, vertical tab. If found, he wants to replace them
    PL> with their typical representation (ie, \n, \t, \f, \r, \v).

    PL> I first gave him the obvious:
    PL> $string =~ s/\n/\\n/;
    PL> $string =~ s/\t/\\t/;
    PL> $string =~ s/\f/\\f/;
    PL> $string =~ s/\r/\\r/;
    PL> $string =~ s/\v/\\v/;

    PL> which I don't like because of how much copy/paste is involved. Then I
    PL> came up with:

    use a hash table for the conversion:

    my %controls = (
    "\n" => '\\n',
    "\t" => '\\t',
    "\r" => '\\r',
    "\f" => '\\f',
    "\v" => '\\v',
    ) ;

    $string =~ s/([\n\t\r\f\v])/$controls{$1}/g;

    and if you want to get anal about dups of the chars do this:

    my @controls = qw( n t r f v ) ;
    my %control_to_escape = map { eval( qq{"\\$_"} ) => "\\$_" } @controls ;

    my $controls_re = '[' . join( '', map "\\$_", @controls ) . ']' ;

    $string =~ s/($controls_re)/$control_to_escape{$1}/g;

    see ma! only one use of the actual control letters!
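
A minimal runnable sketch of the hash-table approach above, for anyone who wants to try it. The helper name escape_controls and the variable names are made up for this illustration, and "\cK" is used for the vertical tab on the assumption that "\v" is not available as a string escape. The character class is built from the hash keys, so each control character is still listed only once.

```perl
use strict;
use warnings;

# Map each control character to its printable escape sequence.
# "\cK" (Ctrl-K, 0x0B) stands in for the vertical tab.
my %escape_for = (
    "\n"  => '\n',
    "\t"  => '\t',
    "\r"  => '\r',
    "\f"  => '\f',
    "\cK" => '\v',
);

# Build a character class such as [\x0A\x09...] from the hash keys.
my $class       = join '', map { sprintf('\\x%02X', ord($_)) } keys %escape_for;
my $controls_re = qr/([$class])/;

# escape_controls is a hypothetical helper name, not from the thread.
sub escape_controls {
    my ($string) = @_;
    $string =~ s/$controls_re/$escape_for{$1}/g;
    return $string;
}

print escape_controls("line 1\tsome\nline 2\cK\r\n"), "\n";
```

Using hex escapes in the class avoids any question about how a raw control character is read inside a pattern.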

    uri

    --
    Uri Guttman ------ -------- http://www.sysarch.com --
    ----- Perl Code Review , Architecture, Development, Training, Support ------
    --------- Gourmet Hot Cocoa Mix ---- http://bestfriendscocoa.com ---------
     
    Uri Guttman, Nov 2, 2009
    #2

  3. >>>>> "Uri" == Uri Guttman <> writes:

    >>>>> "PL" == Paul Lalli <> writes:

    PL> A coworker just presented me with this task. I came up with two
    PL> solutions, but I don't like either of them. He has a text document
    PL> and wants to scan it for characters such as newline, tab, form feed,
    PL> carriage return, vertical tab. If found, he wants to replace them
    PL> with their typical representation (ie, \n, \t, \f, \r, \v).

    PL> I first gave him the obvious:
    PL> $string =~ s/\n/\\n/;
    PL> $string =~ s/\t/\\t/;
    PL> $string =~ s/\f/\\f/;
    PL> $string =~ s/\r/\\r/;
    PL> $string =~ s/\v/\\v/;

    PL> which I don't like because of how much copy/paste is involved. Then I
    PL> came up with:

    Uri> use a hash table for the conversion:

    Uri> my %controls = (
    Uri> "\n" => '\\n',
    Uri> "\t" => '\\t',
    Uri> "\r" => '\\r',
    Uri> "\f" => '\\f',
    Uri> "\v" => '\\v',
    Uri> ) ;

    Just to scare people:

    my %controls = (
    "\n" => '\n',
    "\t" => '\t',
    "\r" => '\r',
    "\f" => '\f',
    "\v" => '\v',
    );

    Ok, that's downright evil. :)
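
The reason the single-backslash table is equivalent: inside single quotes a backslash is literal unless it precedes another backslash or the closing delimiter. A short check, with same_escape as a hypothetical helper name used only here:

```perl
use strict;
use warnings;

# Both spellings yield the same two characters: a backslash and an "n".
my $short = '\n';
my $long  = '\\n';

# same_escape is a hypothetical helper used only for this illustration.
sub same_escape {
    my ($x, $y) = @_;
    return ($x eq $y && length($x) == 2) ? 1 : 0;
}

print same_escape($short, $long) ? "identical\n" : "different\n";
```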

    print "Just another Perl hacker,"; # the original

    --
    Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
    <> <URL:http://www.stonehenge.com/merlyn/>
    Smalltalk/Perl/Unix consulting, Technical writing, Comedy, etc. etc.
    See http://methodsandmessages.vox.com/ for Smalltalk and Seaside discussion
     
    Randal L. Schwartz, Nov 2, 2009
    #3
  4. Paul Lalli

    Guest

    On Mon, 2 Nov 2009 11:07:56 -0800 (PST), Paul Lalli <> wrote:

    >A coworker just presented me with this task. I came up with two
    >solutions, but I don't like either of them. He has a text document
    >and wants to scan it for characters such as newline, tab, form feed,
    >carriage return, vertical tab. If found, he wants to replace them
    >with their typical representation (ie, \n, \t, \f, \r, \v).
    >
    >I first gave him the obvious:
    >$string =~ s/\n/\\n/;
    >$string =~ s/\t/\\t/;
    >$string =~ s/\f/\\f/;
    >$string =~ s/\r/\\r/;
    >$string =~ s/\v/\\v/;
    >
    >which I don't like because of how much copy/paste is involved. Then I
    >came up with:
    >
    >for (qw/n t f r v/) {
    > my $meta = eval("\\$_");
    > $string =~ s/$meta/\\$_/;
    >}
    >
    >which I don't like, because the comment he'd have to put in the code
    >to explain it would be longer than the code itself, or the first
    >version.
    >
    >So can anyone think of a better way? Is there any kind of intrinsic
    >link between a newline character and the letter 'n' that could be used
    >to go "backwards" here?
    >


    Yet another way..

    use strict;
    use warnings;

    my %translation = (
    '\n'=>"\n",
    '\t'=>"\t",
    '\f'=>"\f",
    '\r'=>"\r",
    # ,'\v'=>"\v" - no 'v' for 'm'e, vt?
    );

    my $sample = "line 1\tsome\nline 2\t\t\f\n\rline 3\n";

    while (my ($literal,$actual) = each %translation) {
    $sample =~ s/$actual/$literal/eg;
    }

    print $sample;

    __END__

    -sln
     
    , Nov 2, 2009
    #4
  5. Paul Lalli

    Guest

    On Mon, 02 Nov 2009 12:21:06 -0800, wrote:

    >On Mon, 2 Nov 2009 11:07:56 -0800 (PST), Paul Lalli <> wrote:
    >
    >while (my ($literal,$actual) = each %translation) {
    > $sample =~ s/$actual/$literal/eg;

    $sample =~ s/$actual/$literal/g;
    -sln
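
For what it is worth, the /e version and the corrected version should give the same result here: with /e the replacement is the expression $literal, which simply evaluates to the variable's value. A small comparison, where with_e and without_e are hypothetical names for this sketch:

```perl
use strict;
use warnings;

# with_e and without_e are hypothetical names for this comparison.
sub with_e {
    my ($s, $actual, $literal) = @_;
    $s =~ s/$actual/$literal/eg;   # replacement is the expression $literal
    return $s;
}

sub without_e {
    my ($s, $actual, $literal) = @_;
    $s =~ s/$actual/$literal/g;    # replacement is an interpolated string
    return $s;
}

my $sample = "line 1\tsome\nline 2\n";
print with_e($sample, "\n", '\n') eq without_e($sample, "\n", '\n')
    ? "same result\n"
    : "different result\n";
```

Dropping /e is still the cleaner choice, since nothing needs to be evaluated.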
     
    , Nov 2, 2009
    #5
  6. Paul Lalli wrote:
    > A coworker just presented me with this task. I came up with two
    > solutions, but I don't like either of them. He has a text document
    > and wants to scan it for characters such as newline, tab, form feed,
    > carriage return, vertical tab. If found, he wants to replace them
    > with their typical representation (ie, \n, \t, \f, \r, \v).
    >
    > I first gave him the obvious:
    > $string =~ s/\n/\\n/;
    > $string =~ s/\t/\\t/;
    > $string =~ s/\f/\\f/;
    > $string =~ s/\r/\\r/;
    > $string =~ s/\v/\\v/;


    Perl doesn't have a "\v" character:

    $string =~ s/\cK/\\v/;

    Or:

    $string =~ s/\13/\\v/;

    Or:

    $string =~ s/\xB/\\v/;
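
A sketch of the two pitfalls behind this, assuming Perl 5.10 or later: in a double-quoted string "\v" is just the letter v, and in a pattern \v matches any vertical whitespace (newline and form feed included), not only the vertical tab. escape_vt is a hypothetical helper name.

```perl
use strict;
use warnings;

my $vt = "\cK";   # vertical tab, 0x0B; "\x0B" and "\013" are the same character

my $not_vt;
{
    no warnings 'misc';   # silence "Unrecognized escape \v passed through"
    $not_vt = "\v";       # not a string escape: this is the letter "v"
}

# escape_vt is a hypothetical helper: it targets 0x0B explicitly.
sub escape_vt {
    my ($string) = @_;
    $string =~ s/\x0B/\\v/g;
    return $string;
}

print "string \"\\v\" is the letter v\n" if $not_vt eq 'v';
print "pattern \\v also matches a newline\n" if "\n" =~ /\v/;
print escape_vt("a${vt}b"), "\n";
```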




    John
    --
    The programmer is fighting against the two most
    destructive forces in the universe: entropy and
    human stupidity. -- Damian Conway
     
    John W. Krahn, Nov 2, 2009
    #6
  7. C.DeRykus

    C.DeRykus Guest

    On Nov 2, 11:07 am, Paul Lalli <> wrote:
    > A coworker just presented me with this task.  I came up with two
    > solutions, but I don't like either of them.  He has a text document
    > and wants to scan it for characters such as newline, tab, form feed,
    > carriage return, vertical tab.  If found, he wants to replace them
    > with their typical representation (ie, \n, \t, \f, \r, \v).
    >
    > I first gave him the obvious:
    > $string =~ s/\n/\\n/;
    > $string =~ s/\t/\\t/;
    > $string =~ s/\f/\\f/;
    > $string =~ s/\r/\\r/;
    > $string =~ s/\v/\\v/;
    >
    > which I don't like because of how much copy/paste is involved.  Then I
    > came up with:
    >
    > for (qw/n t f r v/) {
    >    my $meta = eval("\\$_");
    >    $string =~ s/$meta/\\$_/;
    >
    > }
    > ...



    Did that work? I don't understand why the eval is needed
    at all:

    my $string = "1\n 2\t 3\f 4\r 5\cK";
    for (qw/n t f r cK/) {
    my $meta = "\\$_";
    $string =~ s/$meta/\\$_/;
    }
    print $string; # 1\n 2\t 3\f 4\r 5\cK
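
On the "did that work?" question: the string handed to eval in the original loop is the two characters \n, which Perl parses as code (the reference operator applied to a bareword), not as a string literal, so it should not produce a newline. Wrapping the text in quotes before evaluating does. A sketch, with eval_bare and eval_quoted as hypothetical names:

```perl
use strict;
use warnings;

# eval_bare mirrors the original loop: the evaluated code is \n, \t, ...
sub eval_bare {
    my ($letter) = @_;
    my $result = eval "\\$letter";
    return $result;
}

# eval_quoted evaluates "\n", "\t", ... as double-quoted string literals.
sub eval_quoted {
    my ($letter) = @_;
    return eval qq{"\\$letter"};
}

my $bare = eval_bare('n');
print "bare eval gave a newline? ",
    ((defined $bare && !ref $bare && $bare eq "\n") ? "yes" : "no"), "\n";
print "quoted eval gave a newline? ",
    (eval_quoted('n') eq "\n" ? "yes" : "no"), "\n";
```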

    --
    Charles DeRykus
     
    C.DeRykus, Nov 3, 2009
    #7
  8. >>>>> "Ben" == Ben Morrow <> writes:

    Ben> For extra added evil:

    Ben> my $bs = "\\";
    Ben> $string =~ s/$bs$_/$bs$_/g for qw/n r t f/;

    And I thought *I* was being bad.

    --
    Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
    <> <URL:http://www.stonehenge.com/merlyn/>
    Smalltalk/Perl/Unix consulting, Technical writing, Comedy, etc. etc.
    See http://methodsandmessages.vox.com/ for Smalltalk and Seaside discussion
     
    Randal L. Schwartz, Nov 3, 2009
    #8
  9. C.DeRykus

    C.DeRykus Guest

    On Nov 3, 11:18 am, Ben Morrow <> wrote:
    > Quoth "C.DeRykus" <>:
    >
    >
    >
    > > On Nov 2, 11:07 am, Paul Lalli <> wrote:

    >
    > > > for (qw/n t f r v/) {
    > > >    my $meta = eval("\\$_");
    > > >    $string =~ s/$meta/\\$_/;

    >
    > > > }

    >
    > > Did that work? I don't understand why the eval is needed
    > > at all:

    >
    > > my $string = "1\n 2\t 3\f 4\r 5\cK";
    > > for (qw/n t f r cK/) {
    > >     my $meta = "\\$_";
    > >     $string =~ s/$meta/\\$_/;
    > > }
    > > print $string;   #  1\n 2\t 3\f  4\r  5\cK

    >
    > That's... evil. It relies on the fact that regexes undergo two separate
    > expansion phases, and requires that variable expansion happens in the
    > first phase but other qqish escapes are expanded in the second. I'm not
    > entirely convinced that's documented behaviour: anyone care to dig out
    > perlre and prove it one way or the other?
    >
    > For extra added evil:
    >
    >     my $bs = "\\";
    >     $string =~ s/$bs$_/$bs$_/g for qw/n r t f/;
    >


    Perl magic is evil? Say it ain't so :)

    I didn't spot a full explanation in perlre but I see perlop
    steps through the compilation in "gory details of parsing
    quoted constructs" and ends with what happens at runtime
    in "parsing regular expressions".

    This closely mirrors Chapter 7's section - Perl Regular
    Expressions in J.Friedl's "Mastering Regular Expressions"
    1st ed.
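
A small illustration of the two phases discussed above, offered as a sketch rather than a statement of documented guarantees. The variable holds a backslash and a letter; interpolation puts those two characters into the pattern text, and only then does the regex compiler read them as an escape. The replacement side is interpolated but never regex-compiled, so the same two characters come out verbatim. matches_newline is a hypothetical helper name.

```perl
use strict;
use warnings;

# $meta holds two characters: a backslash and the letter n.
my $meta = "\\n";

# After interpolation the regex compiler sees \n and treats it as a newline.
my $re = qr/$meta/;

# matches_newline is a hypothetical helper for the demonstration.
sub matches_newline {
    my ($pattern) = @_;
    return "\n" =~ $pattern ? 1 : 0;
}

print "length of \$meta: ", length($meta), "\n";
print "matches a newline: ", matches_newline($re), "\n";

# Pattern side: newline. Replacement side: literal backslash plus n.
(my $out = "a\nb") =~ s/$meta/$meta/g;
print "$out\n";
```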


    --
    Charles DeRykus
     
    C.DeRykus, Nov 4, 2009
    #9