summarize text

Discussion in 'Python' started by robin, May 29, 2006.

  1. robin

    robin Guest

    hello list,

    does anyone know of a library which permits to summarise text? i've
    been looking at nltk but haven't found anything yet. any help would be
    very welcome.
    thank you all in advance,

    robin
    robin, May 29, 2006
    #1

  2. Tim Chase

    Tim Chase Guest

    > does anyone know of a library which permits to summarise text?
    > i've been looking at nltk but haven't found anything yet. any
    > help would be very welcome.


    Well, summarizing text is one of those things that generally
    takes a brain-cell or two to do. Automating the process would
    require doing it either smartly (some sort of
    neural-net/NLP/Markov-chain technology, which is a non-trivial
    task--something one might consider braving in the 3rd or 4th-year
    of a university computer-science program), or doing it fairly
    dumbly. As an example of a "dumb" solution, you can use regexps
    to trim off the first few words and the last few words and call
    that a "summary":

    >>> import re
    >>> r = re.compile(r'^(.{8}.*?\b)\s.*\s(\b.{8}.*?)', re.DOTALL)
    >>> s = """This is the first line
    ... and it has a second line
    ... and a third line
    ... and the last line is the fourth line."""
    >>> result = r.sub(r"\1...\2", s.strip())
    >>> result
    'This is the...fourth line.'

    You can adjust the "{8}" portions for more or less
    leader/trailing context characters.

    The regexp might need a bit of tweaking for somewhat short
    strings, but if they're fairly short, one might not need to
    summarize them ;)
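
    One possible shape for that tweak, as a rough sketch rather than a
    tested recipe: wrap the same regexp in a function, make the two
    "{8}" counts parameters, and fall back to the original text when the
    "summary" would not actually be shorter. The names summarize, lead
    and trail are made up for this example.

```python
import re


def summarize(text, lead=8, trail=8):
    """Keep about `lead` leading and `trail` trailing characters,
    each extended to a word boundary, joined by '...'.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    text = text.strip()
    # Same pattern as in the session above, with the two {8} counts
    # replaced by the lead/trail arguments.
    pattern = r'^(.{%d}.*?\b)\s.*\s(\b.{%d}.*?)' % (lead, trail)
    summary = re.sub(pattern, r'\1...\2', text, count=1, flags=re.DOTALL)
    # Strings too short to match come back from re.sub unchanged; also
    # refuse any "summary" that is not shorter than the input.
    return summary if len(summary) < len(text) else text
```

    With the default arguments this should behave like the session
    above on longer text, and hand short strings back untouched.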

    -tkc
    Tim Chase, May 29, 2006
    #2

  3. gene tani

    gene tani Guest

    gene tani, May 29, 2006
    #3
  4. robin

    robin Guest

    thanks for all your replies. lemur looks pretty interesting!
    robin

    gene tani wrote:
    > robin wrote:
    > > hello list,
    > >
    > > does anyone know of a library which permits to summarise text? i've
    > > been looking at nltk but haven't found anything yet. any help would be

    >
    > unclear what you're asking, maybe look at:
    > http://www.cs.waikato.ac.nz/~ml/weka/index.html
    >
    > http://www.kdnuggets.com/software/suites.html
    > http://www.ailab.si/orange
    >
    > http://mallet.cs.umass.edu/index.php/Main_Page
    > http://minorthird.sourceforge.net/
    > http://www.dia.uniroma3.it/db/roadRunner/
    >
    > http://www.lemurproject.org/
    robin, May 31, 2006
    #4
  5. Lawrence D'Oliveiro

    ... sorry, I thought you said "summarize Proust".

    :)
    Lawrence D'Oliveiro, Jun 5, 2006
    #5