parse unix-style difference reporting

Discussion in 'Perl' started by Liang, Dec 30, 2003.

  1. Liang

    Liang Guest

    Hi all,

    I want to diff two files, or two versions of one file, and parse the output
    to get a summary of how many lines were replaced, added, or deleted between
    the two.

    From diff/cleardiff, I know the output has a style like:
    15a16, 15,17d3, 18c19,21 etc.

    Does anyone know how to parse this output to generate a summary?

    Thanks in advance,
    Liang
    Liang, Dec 30, 2003
    #1

  2. In article <bsql5f$gsk$>,
    "Liang" <> wrote:

    > Hi all,
    >
    > I want to diff two files, or two versions of one file, and parse the output
    > to get a summary of how many lines were replaced, added, or deleted between
    > the two.
    >
    > From diff/cleardiff, I know the output has a style like:
    > 15a16, 15,17d3, 18c19,21 etc.
    >
    > Does anyone know how to parse this output to generate a summary?


    You can use "diff -c" and count the number of "+", "-", and "!" lines.
    Or use the "comm" command and count the number of lines.
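
A minimal sketch of the line-counting idea, assuming GNU-style context output from "diff -c". In that format, to the best of my knowledge, added lines start with "+ ", deleted lines with "- ", and changed lines with "! " (the "<" and ">" markers belong to the default format). Changed lines are listed once in the old-file half of a hunk and once in the new-file half, so the sketch counts them separately. The function name and the result keys are made up for illustration.

```python
import re

# Hunk section headers in context format, e.g. "*** 1,4 ****" and "--- 1,4 ----".
OLD_HUNK = re.compile(r"^\*\*\* \d+(?:,\d+)? \*\*\*\*$")
NEW_HUNK = re.compile(r"^--- \d+(?:,\d+)? ----$")

def count_context_markers(lines):
    """Count marker lines in context-format diff output (hypothetical helper)."""
    counts = {"added": 0, "deleted": 0, "changed_old": 0, "changed_new": 0}
    section = None  # "old" or "new" once we are inside a hunk
    for raw in lines:
        line = raw.rstrip("\n")
        if OLD_HUNK.match(line):
            section = "old"
        elif NEW_HUNK.match(line):
            section = "new"
        elif section is None:
            continue  # file header lines before the first hunk
        elif line.startswith("+ "):
            counts["added"] += 1
        elif line.startswith("- "):
            counts["deleted"] += 1
        elif line.startswith("! "):
            counts["changed_" + section] += 1
    return counts
```

Feeding it the captured output of "diff -c old new", one line per list element, gives the four totals. Requiring a space after the marker keeps the "--- file" and "*** file" header lines from being counted.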

    --
    Barry Margolin,
    Arlington, MA
    Barry Margolin, Dec 30, 2003
    #2

  3. Liang wrote:
    > I want to diff two files, or two versions of one file, and parse the output
    > to get a summary of how many lines were replaced, added, or deleted between
    > the two.
    >
    > From diff/cleardiff, I know the output has a style like:
    > 15a16, 15,17d3, 18c19,21 etc.
    >
    > Does anyone know how to parse this output to generate a summary?


    It isn't very hard to work it out, is it?

    Each item conceptually has four numbers and an operation code:

    N1,N2 op N3,N4

    When there is just one number on one side of the operation, the values
    N1 and N2, or N3 and N4, are the same.

    Inserts are easy: there's always a single number on the LHS, and the
    number of lines inserted is N4-N3+1.

    Similarly, deletes are easy: there's always a single number on the RHS
    of the operator, and the number of lines deleted is N2-N1+1.

    Number of lines replaced has two parts to the value - the number of
    lines removed and the number replacing the removed lines. Depending
    on your viewpoint, you can either choose to count the two values
    separately (number removed NR = N2-N1+1, number inserted NI =
    N4-N3+1), or you can be cleverer about the calculation and decide that
    when NR > NI, then you have NI changed lines and NR-NI deleted lines,
    and that when NR < NI, you have NR changed lines and NI-NR inserted
    lines. When NR = NI, you have NR (or NI) changed lines, of course.
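
The arithmetic above can be sketched in a few lines. This is only an illustration, assuming the default ("normal") diff format in which each change command sits on its own line. It uses the "cleverer" accounting for replacements, and the names CHANGE_CMD and summarize are invented for this sketch.

```python
import re

# Change commands in normal diff output: "15a16", "15,17d3", "18c19,21".
CHANGE_CMD = re.compile(r"^(\d+)(?:,(\d+))?([acd])(\d+)(?:,(\d+))?$")

def summarize(diff_lines):
    """Total added, deleted and changed lines from normal-format diff output."""
    totals = {"added": 0, "deleted": 0, "changed": 0}
    for line in diff_lines:
        m = CHANGE_CMD.match(line.strip())
        if not m:
            continue  # "<", ">" and "---" content lines are skipped
        n1, n2, op, n3, n4 = m.groups()
        n1, n3 = int(n1), int(n3)
        n2 = int(n2) if n2 else n1  # a single number means N2 == N1
        n4 = int(n4) if n4 else n3  # likewise N4 == N3
        removed = n2 - n1 + 1       # NR
        inserted = n4 - n3 + 1      # NI
        if op == "a":
            totals["added"] += inserted
        elif op == "d":
            totals["deleted"] += removed
        else:
            # Replacement: min(NR, NI) changed lines, the surplus on either
            # side counted as plain deletions or insertions.
            common = min(removed, inserted)
            totals["changed"] += common
            totals["deleted"] += removed - common
            totals["added"] += inserted - common
    return totals

print(summarize(["15a16", "15,17d3", "18c19,21"]))
# prints {'added': 3, 'deleted': 3, 'changed': 1}
```

For the three commands from the original question, 15a16 adds one line, 15,17d3 deletes three, and 18c19,21 replaces one line with three, which counts as one changed line plus two added.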

    That took me five minutes to think and type - how long would it have
    taken you to do it? (And cross-posted too?)

    --
    Jonathan Leffler #include <disclaimer.h>
    Guardian of DBD::Informix v2003.04 -- http://dbi.perl.org/
    Jonathan Leffler, Dec 30, 2003
    #3
  4. Thomas Dickey, Dec 30, 2003
    #4
  5. Liang

    Liang Guest

    >
    > You can use "diff -c" and count the number of "+", "-", and "!" lines.
    > Or use the "comm" command and count the number of lines.
    >

    Marvellous! This is the simplest solution.

    Happy New Year!

    > --
    > Barry Margolin,
    > Arlington, MA
    Liang, Dec 31, 2003
    #5