Python parser

Discussion in 'Python' started by Clarendon, Mar 2, 2009.

  1. Clarendon

    Clarendon Guest

    Can somebody recommend a good parser that can be used in Python
    programs? I need a parser with a large grammar that can cover a large
    amount of random text.

    Thank you very much.
    Clarendon, Mar 2, 2009
    #1

  2. Lie Ryan

    Lie Ryan Guest

    Clarendon wrote:
    > Can somebody recommend a good parser that can be used in Python
    > programs?


    Do you want a parser that can parse Python source code, or a parser that
    works in Python? If the latter, pyparsing is a popular choice. PLY is
    another. There are many choices:
    http://nedbatchelder.com/text/python-parsers.html

    For simple parsing, the re module might be enough.
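
    As a rough illustration of the "re module might be enough" case (not
    from the original post), here is a minimal tokenizer sketch. The token
    names and the tiny expression language are made up for the example;
    only the standard-library re calls are real.

```python
import re

# Token patterns for a tiny, made-up expression language (hypothetical).
# Order matters: earlier alternatives win, and ERROR catches anything else.
TOKEN_SPEC = [
    ("NUMBER", r"\d+(?:\.\d+)?"),
    ("NAME", r"[A-Za-z_]\w*"),
    ("OP", r"[+\-*/=()]"),
    ("SKIP", r"\s+"),
    ("ERROR", r"."),
]
TOKEN_RE = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC)
)


def tokenize(text):
    """Return a list of (kind, value) pairs; raise ValueError on stray characters."""
    tokens = []
    for match in TOKEN_RE.finditer(text):
        kind = match.lastgroup
        if kind == "SKIP":
            continue  # whitespace carries no meaning here
        if kind == "ERROR":
            raise ValueError(
                f"unexpected character {match.group()!r} at position {match.start()}"
            )
        tokens.append((kind, match.group()))
    return tokens


if __name__ == "__main__":
    print(tokenize("total = price * 3 + 1.5"))
```

    This only splits input into tokens. Anything with nesting or recursion
    is where a real parser library such as pyparsing or PLY starts to pay
    off.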

    > I need a parser with a large grammar that can cover a large
    > amount of random text.


    Random text? Uh... what's the purpose of parsing random text?
    Lie Ryan, Mar 2, 2009
    #2

  3. Clarendon

    Clarendon Guest

    Thank you, Lie and Andrew for your help.

    I have studied NLTK quite closely but its parsers seem to be only for
    demo. It has a very limited grammar set, and even a parser that is
    supposed to be "large" does not have enough grammar to cover common
    words like "I".

    I need to parse a large amount of texts collected from the web (around
    a couple hundred sentences at a time) very quickly, so I need a parser
    with a broad scope of grammar, enough to cover all these texts. This
    is what I mean by 'random'.

    An advanced programmer has advised me that Python is rather slow in
    processing large data, and so there are not many parsers written in
    Python. He recommends that I use Jython to use parsers written in
    Java. What are your views about this?

    Thank you very much.
    Clarendon, Mar 2, 2009
    #3
  4. Robert Kern

    Robert Kern Guest

    On 2009-03-02 16:14, Clarendon wrote:
    > Thank you, Lie and Andrew for your help.
    >
    > I have studied NLTK quite closely but its parsers seem to be only for
    > demo. It has a very limited grammar set, and even a parser that is
    > supposed to be "large" does not have enough grammar to cover common
    > words like "I".
    >
    > I need to parse a large amount of texts collected from the web (around
    > a couple hundred sentences at a time) very quickly, so I need a parser
    > with a broad scope of grammar, enough to cover all these texts. This
    > is what I mean by 'random'.
    >
    > An advanced programmer has advised me that Python is rather slow in
    > processing large data, and so there are not many parsers written in
    > Python. He recommends that I use Jython to use parsers written in
    > Java. What are your views about this?


    Let me clarify your request: you are asking for a parser of the English
    language, yes? Not just parsers in general? Not many English-language parsers
    are written in *any* language.

    AFAIK, there is no English-language parser written in Python beyond those
    available in NLTK. There are probably none (in any language) which will robustly
    parse all of the grammatically correct English texts you will encounter by
    scraping the web, much less all of the incorrect English you will encounter.

    Python can be rather slow for certain kinds of processing of large volumes (and
    really quite speedy for others). In this case, it's neither here nor there; the
    algorithms are reasonably slow in any language.

    You may try your luck with link-grammar, which is implemented in C:

    http://www.abisource.com/projects/link-grammar/

    Or The Stanford Parser, implemented in Java:

    http://nlp.stanford.edu/software/lex-parser.shtml

    --
    Robert Kern

    "I have come to believe that the whole world is an enigma, a harmless enigma
    that is made terrible by our own mad attempt to interpret it as though it had
    an underlying truth."
    -- Umberto Eco
    Robert Kern, Mar 2, 2009
    #4
  5. Gabriel Genellina

    On Tue, 03 Mar 2009 22:39:19 -0200, Alan G Isaac <> wrote:

    > This reminds me: the SimpleParse developers ran into
    > some troubles porting to Python 2.6. It would be
    > great if someone could give them a hand.


    Do you mean the simpleparser project in Sourceforge? Latest alpha released
    in 2003? Or what?

    --
    Gabriel Genellina
    Gabriel Genellina, Mar 4, 2009
    #5
  6. Kay Schluehr

    Kay Schluehr Guest

    On 2 Mar., 23:14, Clarendon <> wrote:
    > Thank you, Lie and Andrew for your help.
    >
    > I have studied NLTK quite closely but its parsers seem to be only for
    > demo. It has a very limited grammar set, and even a parser that is
    > supposed to be "large" does not have enough grammar to cover common
    > words like "I".
    >
    > I need to parse a large amount of texts collected from the web (around
    > a couple hundred sentences at a time) very quickly, so I need a parser
    > with a broad scope of grammar, enough to cover all these texts. This
    > is what I mean by 'random'.
    >
    > An advanced programmer has advised me that Python is rather slow in
    > processing large data, and so there are not many parsers written in
    > Python. He recommends that I use Jython to use parsers written in
    > Java. What are your views about this?
    >
    > Thank you very much.


    You'll most likely need a GLR parser.

    There is

    http://www.lava.net/~newsham/pyggy/

    which I tried once and found to be broken.

    Then there is the Spark toolkit

    http://pages.cpsc.ucalgary.ca/~aycock/spark/

    I checked it out years ago and found it was very slow.

    Then there is Bison, which can be used with a %glr-parser declaration
    and the PyBison bindings

    http://www.freenet.org.nz/python/pybison/

    Bison might be solid and fast. I can't say anything about the quality
    of the bindings though.
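
    To make the ambiguity point concrete, here is a minimal sketch of why
    natural-language text calls for a general parser. It is not GLR: it
    uses the simpler CKY chart algorithm, which also copes with ambiguous
    grammars. The grammar and lexicon are toy assumptions invented for the
    example, not taken from any of the tools above.

```python
from collections import defaultdict

# Toy grammar in Chomsky normal form (hypothetical, for illustration only).
# Each rule is (parent, left_child, right_child).
BINARY_RULES = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),
    ("NP", "Det", "N"),
    ("NP", "NP", "PP"),
    ("PP", "P", "NP"),
]

# Toy lexicon: word -> possible categories. Words not listed are unknown.
LEXICON = {
    "I": ["NP"],
    "saw": ["V"],
    "the": ["Det"],
    "man": ["N"],
    "telescope": ["N"],
    "with": ["P"],
}


def count_parses(words, start="S"):
    """Count the distinct parse trees for `words` using a CKY chart."""
    n = len(words)
    # chart[(i, j)][symbol] = number of trees for words[i:j] rooted at symbol
    chart = defaultdict(lambda: defaultdict(int))
    for i, word in enumerate(words):
        for symbol in LEXICON.get(word, []):
            chart[(i, i + 1)][symbol] += 1
    # Build longer spans from every split point of shorter ones.
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length
            for k in range(i + 1, j):
                for parent, left, right in BINARY_RULES:
                    ways = chart[(i, k)][left] * chart[(k, j)][right]
                    if ways:
                        chart[(i, j)][parent] += ways
    return chart[(0, n)][start]


if __name__ == "__main__":
    sentence = "I saw the man with the telescope".split()
    print(count_parses(sentence))
```

    The sentence in the example is ambiguous because "with the telescope"
    can attach to either the verb or the noun, so the chart holds more
    than one tree. A word missing from the lexicon yields no parse at all,
    which is the coverage problem described earlier in the thread. The
    three nested loops over span positions are also why this kind of
    parsing is slow in any language.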
    Kay Schluehr, Mar 4, 2009
    #6
