count word repetitions for each pair of lines

Discussion in 'Perl Misc' started by tony, Jul 23, 2003.

  1. tony

    I wonder if anyone has a script that could do the following?

    -Read a file containing ordinary text (e.g. a news story);
    -Count word type repetitions for each pair of lines, disregarding
    numbers and ignoring case. A valid repetition is a word type that
    occurs in both lines of the pair. A word that occurs many times in
    only one line of the pair must be disregarded;
    -Print the repetition count for each pair of lines in the formats
    shown below;

    For instance:

    input:
    cat: cat is sitting on 2 mats.
    Dog: dog is sitting.
    dog, CAT and 2 mATs.

    output format 1:
    #format:
    #[line][line][repetitions][words repeated]
    [1][2][2][is,sitting]
    [1][3][2][cat,mats]
    [2][3][1][dog]

    output format 2:
    #matrix format:
    #[line 1]:[1 & 1][1 & 2][1 & 3]
    #[line 2]:[2 & 1][2 & 2][2 & 3]
    #[line 3]:[3 & 1][3 & 2][3 & 3]
    [1]:[0][2][2]
    [2]:[2][0][1]
    [3]:[2][1][0]

    thanks very much indeed

    tony berber

    Catholic University of Sao Paulo, Brazil
    Applied Linguistics Postgraduate Program
    tony4 at uol.com.br
     
    tony, Jul 23, 2003
    #1

  2. tony wrote:
    > I wonder if anyone has a script that could do the following?
    >
    > -Read a file containing a regular text (e.g. a news story);
    > -Count word type repetitions for each pair of lines, disregarding
    > numbers and ignoring case. A valid repetition is a word type that
    > occurs in both lines of the pair. Words that occur many times in one
    > line of the pair only must be disregarded;
    > -Print out the count for word repetitions for each sentence pair in
    > the formats shown below ;


    Pretty simple:
    - split() both lines into arrays of words, lowercase them, and filter
    out numbers and similar unwanted stuff
    - then apply the solution from the FAQ (perlfaq4)
    "How do I compute the difference of two arrays? How do I compute
    the intersection of two arrays?"
    - and then just print the result
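
    A minimal sketch of those steps, not a definitive implementation. It
    assumes that a "word" is a run of ASCII letters (so "don't" would
    count as two words), and it uses hash lookups in the spirit of the
    FAQ's intersection recipe rather than copying it. The helper names
    word_types and shared_words are made up for illustration, and the
    sample lines from the original post are hardcoded in place of a file:

    ```perl
    #!/usr/bin/perl
    use strict;
    use warnings;

    # Return the set of distinct word types on a line as a hash ref.
    # The line is lowercased, and digits and punctuation are skipped
    # because only runs of ASCII letters are matched (an assumption).
    sub word_types {
        my ($line) = @_;
        my %seen;
        $seen{$_} = 1 for lc($line) =~ /[a-z]+/g;
        return \%seen;
    }

    # Return the sorted list of word types present in both sets.
    sub shared_words {
        my ($x, $y) = @_;
        my @common = sort grep { exists $y->{$_} } keys %$x;
        return @common;
    }

    # Sample input from the original post.
    my @lines = (
        'cat: cat is sitting on 2 mats.',
        'Dog: dog is sitting.',
        'dog, CAT and 2 mATs.',
    );

    my @types = map { word_types($_) } @lines;

    # Compare each unordered pair of lines once and fill both halves
    # of the symmetric matrix; the diagonal stays 0 as in the example.
    my @report;
    my @matrix = map { [ (0) x @lines ] } @lines;
    for my $i (0 .. $#lines) {
        for my $j ($i + 1 .. $#lines) {
            my @common = shared_words($types[$i], $types[$j]);
            $matrix[$i][$j] = $matrix[$j][$i] = scalar @common;
            push @report, sprintf '[%d][%d][%d][%s]',
                $i + 1, $j + 1, scalar @common, join(',', @common);
        }
    }

    # Format 1: [line][line][repetitions][words repeated]
    print "$_\n" for @report;

    # Format 2: one matrix row per line.
    for my $i (0 .. $#lines) {
        print '[', $i + 1, ']:', join('', map { "[$_]" } @{ $matrix[$i] }), "\n";
    }
    ```

    To read the text from a file named on the command line instead,
    replace the hardcoded list with: chomp(my @lines = <>);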

    Where's the problem?

    jue
     
    Jürgen Exner, Jul 23, 2003
    #2
