One-liners: single quotes; altering first line only; printing the changes?

Discussion in 'Perl Misc' started by Adam Funk, May 20, 2009.

  1. Adam Funk

    Adam Funk Guest

    I had some very large RDF-XML files that had been incorrectly
    generated with the prolog

    <?xml version='1.0' encoding='UTF8'?>

    which I wanted to change to

    <?xml version='1.0' encoding='UTF-8'?>

    so I used the following command.

    perl -pi.bak -e 's!^(<\?xml version=.1\.0. encoding=.)UTF8(.\?\>)!\1UTF-8\2!' *.rdf


    It worked, but I have three questions about doing it better.

    1. Is there any way to specify single quotes (') in the pattern? (I
    realize this is at least as much of a shell problem as a Perl
    problem; this is in bash on GNU/Linux.)

    2. Is it possible to tell the command to look at the first line of
    each file only? (These were very large files.)

    3. Is it possible to make a perl -i command print to STDOUT the
    changes it makes (and only the changed lines)?

    Thanks,
    Adam


    --
    | _
    | ( ) ASCII Ribbon Campaign
    | X Against HTML email & news
    | / \ www.asciiribbon.org
     
    Adam Funk, May 20, 2009
    #1

  2. Re: One-liners: single quotes; altering first line only; printing the changes?

    Adam Funk wrote:
    > I had some very large RDF-XML files that had been incorrectly
    > generated with the prolog
    >
    > <?xml version='1.0' encoding='UTF8'?>
    >
    > which I wanted to change to
    >
    > <?xml version='1.0' encoding='UTF-8'?>
    >
    > so I used the following command.
    >
    > perl -pi.bak -e 's!^(<\?xml version=.1\.0. encoding=.)UTF8(.\?\>)!\1UTF-8\2!' *.rdf


    You should use $1 and $2 in the replacement string instead of \1 and \2.


    > It worked, but I have three questions about doing it better.
    >
    > 1. Is there any way to specify single quotes (') in the pattern? (I
    > realize this is at least as much of a shell problem as a Perl
    > problem; this is in bash on GNU/Linux.)


    $ perl -le'print "\047" x 10'
    ''''''''''

    > 2. Is it possible to tell the command to look at the first line of
    > each file only? (These were very large files.)


    No. Unless you only want one line left in the new files.


    > 3. Is it possible to make a perl -i command print to STDOUT the
    > changes it makes (and only the changed lines)?


    Yes, but only if you explicitly use the STDOUT filehandle because with
    the -i switch the default output filehandle is ARGVOUT.



    John
    --
    Those people who think they know everything are a great
    annoyance to those of us who do. -- Isaac Asimov
     
    John W. Krahn, May 20, 2009
    #2

  3. smallpond

    smallpond Guest

    On May 20, 7:20 am, Adam Funk wrote:
    > I had some very large RDF-XML files that had been incorrectly
    > generated with the prolog
    >
    > <?xml version='1.0' encoding='UTF8'?>
    >
    > which I wanted to change to
    >
    > <?xml version='1.0' encoding='UTF-8'?>
    >
    > so I used the following command.
    >
    > perl -pi.bak -e 's!^(<\?xml version=.1\.0. encoding=.)UTF8(.\?\>)!\1UTF-8\2!' *.rdf
    >
    > It worked, but I have three questions about doing it better.
    >


    >
    > 2. Is it possible to tell the command to look at the first line of
    >    each file only?  (These were very large files.)
    >


    s/UTF8/UTF-8/ if 1 .. 1; # use the line range op
     
    smallpond, May 20, 2009
    #3
  4. Re: One-liners: single quotes; altering first line only; printing the changes?

    "John W. Krahn" writes:

    >> 2. Is it possible to tell the command to look at the first line of
    >> each file only? (These were very large files.)

    >
    > No. Unless you only want one line left in the new files.


    You could do something like:

    perl -pi -e 'close ARGV if eof; next if ($. > 1)..0; s/.../.../;'

    it would still read every line, but only the first would be
    processed. Tested, but I'm not sure I would recommend it.

    //Makholm
     
    Peter Makholm, May 20, 2009
    #4
  5. Re: One-liners: single quotes; altering first line only; printing the changes?

    smallpond writes:

    >> 2. Is it possible to tell the command to look at the first line of
    >>    each file only?  (These were very large files.)
    >>

    >
    > s/UTF8/UTF-8/ if 1 .. 1; # use the line range op


    Only changes the first line in the first file. You have to reset $. at
    some point. But this works:

    perl -pi -e 's/.../.../ if 1..1; close ARGV if eof;' *

    I don't know why I insisted on using next which forced me to close
    before the substitution.

    //Makholm
     
    Peter Makholm, May 20, 2009
    #5
  6. Re: One-liners: single quotes; altering first line only; printing the changes?

    Adam Funk wrote:
    > I had some very large RDF-XML files that had been incorrectly
    > generated with the prolog
    >
    ><?xml version='1.0' encoding='UTF8'?>
    >
    > which I wanted to change to
    >
    ><?xml version='1.0' encoding='UTF-8'?>
    >
    > so I used the following command.
    >
    > perl -pi.bak -e 's!^(<\?xml version=.1\.0. encoding=.)UTF8(.\?\>)!\1UTF-8\2!' *.rdf
    >
    >
    > It worked, but I have three questions about doing it better.
    >
    > 1. Is there any way to specify single quotes (') in the pattern? (I
    > realize this is at least as much of a shell problem as a Perl
    > problem; this is in bash on GNU/Linux.)



    Single quotes won't bother the shell if you use double quotes on
    the argument instead of single quotes.

    Most people stick with slash for the delimiter unless slash is
    part of their pattern.

    perl -pi.bak -e "s/\Q<?xml version='1.0' encoding='UTF8'?>/<?xml version='1.0' encoding='UTF-8'?>/"


    You should use $1 instead of \1 in replacement strings.

    You could use an escape sequence for the single quote character
    (see "Quote and Quote-like Operators" in perlop):

    perl -pi.bak -e 's!^(<\?xml version=\x271\.0\x27 encoding=\x27)UTF8(\x27\?\>)!$1UTF-8$2!'


    > 2. Is it possible to tell the command to look at the first line of
    > each file only? (These were very large files.)



    That depends on what you mean by "look at".

    If you mean: only attempt the s/// on the 1st line,
    then yes, that is easy to do:

    s/// if $. == 1;

    If you mean: process only the 1st line, then no, you can't do that
    in conjunction with in-place editing or you'd end up with a 1-line file.


    > 3. Is it possible to make a perl -i command print to STDOUT the
    > changes it makes (and only the changed lines)?



    Not in the general case, but you can for this particular case:

    perl -pi.bak -e "print STDOUT if s/\Q<?xml version='1.0' encoding='UTF8'?>/<?xml version='1.0' encoding='UTF-8'?>/"


    --
    Tad McClellan
    email: perl -le "print scalar reverse qq/moc.noitatibaher\100cmdat/"
     
    Tad J McClellan, May 20, 2009
    #6
  7. Re: One-liners: single quotes; altering first line only; printing the changes?

    On 2009-05-20 11:20, Adam Funk wrote:
    > perl -pi.bak -e 's!^(<\?xml version=.1\.0. encoding=.)UTF8(.\?\>)!\1UTF-8\2!' *.rdf
    >
    >
    > It worked, but I have three questions about doing it better.
    >
    > 1. Is there any way to specify single quotes (') in the pattern? (I
    > realize this is at least as much of a shell problem as a Perl
    > problem; this is in bash on GNU/Linux.)


    \x{27} (or \047 if you prefer to think in octal) should work.


    > 2. Is it possible to tell the command to look at the first line of
    > each file only? (These were very large files.)


    You have to read and copy the whole file to change a line in it - that's
    unavoidable. You can only do the substitution on line 1 with something
    like

    $. == 1 and s!...!...!

    I'm not sure if this is much of a speedup.

    > 3. Is it possible to make a perl -i command print to STDOUT the
    > changes it makes (and only the changed lines)?


    I haven't tested it, but the docs say that the output file is
    "selected". I take that to mean that STDOUT is unchanged and

    print STDOUT "whatever"

    should do what you want.

    hp
     
    Peter J. Holzer, May 20, 2009
    #7
  8. Adam Funk

    Adam Funk Guest

    Re: One-liners: single quotes; altering first line only; printing the changes?

    On 2009-05-20, John W. Krahn wrote:

    > Adam Funk wrote:


    >> perl -pi.bak -e 's!^(<\?xml version=.1\.0. encoding=.)UTF8(.\?\>)!\1UTF-8\2!' *.rdf

    >
    > You should use $1 and $2 in the replacement string instead of \1 and \2.
    >
    >
    >> It worked, but I have three questions about doing it better.
    >>
    >> 1. Is there any way to specify single quotes (') in the pattern? (I
    >> realize this is at least as much of a shell problem as a Perl
    >> problem; this is in bash on GNU/Linux.)

    >
    > $ perl -le'print "\047" x 10'
    > ''''''''''


    Aha, thanks!


    --
    It is probable that television drama of high caliber and produced by
    first-rate artists will materially raise the level of dramatic taste
    of the nation. (David Sarnoff, CEO of RCA, 1939; in Stoll 1995)
     
    Adam Funk, May 20, 2009
    #8
  9. Adam Funk

    Adam Funk Guest

    Re: One-liners: single quotes; altering first line only; printing the changes?

    On 2009-05-20, Tad J McClellan wrote:

    > Single quotes won't bother the shell if you use double quotes on
    > the argument instead of single quotes.


    They sure wouldn't work for me.

    > Most people stick with slash for the delimiter unless slash is
    > part of their pattern.
    >
    > perl -pi.bak -e "s/\Q<?xml version='1.0' encoding='UTF8'?>/<?xml version='1.0' encoding='UTF-8'?>/"


    I know. I tend to use "!" with XML files because I often need the "/"
    in the "<.../>".

    > You should use $1 instead of \1 in replacement strings.


    [reminds himself again]

    > You could use an escape sequence for the single quote character
    > (see "Quote and Quote-like Operators" in perlop):
    >
    > perl -pi.bak -e 's!^(<\?xml version=\x271\.0\x27 encoding=\x27)UTF8(\x27\?\>)!$1UTF-8$2!'


    Thanks.

    >> 2. Is it possible to tell the command to look at the first line of
    >> each file only? (These were very large files.)

    >
    >
    > That depends on what you mean by "look at".
    >
    > If you mean: only attempt the s/// on the 1st line,
    > then yes, that is easy to do:
    >
    > s/// if $. == 1;


    That's what I mean. If I'd thought of that, I wouldn't have
    conglomerated such a long substitution pattern (and felt the need to
    check that the file sizes all differed by 1 afterwards).

    >> 3. Is it possible to make a perl -i command print to STDOUT the
    >> changes it makes (and only the changed lines)?

    >
    >
    > Not in the general case, but you can for this particular case:
    >
    > perl -pi.bak -e "print STDOUT if s/\Q<?xml version='1.0' encoding='UTF8'?>/<?xml version='1.0' encoding='UTF-8'?>/"


    I'll take a look at that. Thanks.


    --
    When Elaine turned 11, her mother sent her to train under
    Donald Knuth in his mountain hideaway. [XKCD 342]
     
    Adam Funk, May 20, 2009
    #9
  10. Adam Funk

    Adam Funk Guest

    Re: One-liners: single quotes; altering first line only; printing the changes?

    On 2009-05-20, Peter J. Holzer wrote:

    > On 2009-05-20 11:20, Adam Funk wrote:
    >> perl -pi.bak -e 's!^(<\?xml version=.1\.0. encoding=.)UTF8(.\?\>)!\1UTF-8\2!' *.rdf
    >>
    >>
    >> It worked, but I have three questions about doing it better.
    >>
    >> 1. Is there any way to specify single quotes (') in the pattern? (I
    >> realize this is at least as much of a shell problem as a Perl
    >> problem; this is in bash on GNU/Linux.)

    >
    > \x{27} (or \047 if you prefer to think in octal) should work.


    Right, thanks.

    >> 2. Is it possible to tell the command to look at the first line of
    >> each file only? (These were very large files.)

    >
    > You have to read and copy the whole file to change a line in it - that's
    > unavoidable. You can only do the substitution on line 1 with something
    > like
    >
    > $. == 1 and s!...!...!
    >
    > I'm not sure if this is much of a speedup.


    I'd expect it to speed things up a bit, since the first test should be
    faster than the regex-matching. It would also eliminate the risk of
    changing anything past the XML prolog.


    --
    It is probable that television drama of high caliber and produced by
    first-rate artists will materially raise the level of dramatic taste
    of the nation. (David Sarnoff, CEO of RCA, 1939; in Stoll 1995)
     
    Adam Funk, May 20, 2009
    #10