Wikipedia parser...

Discussion in 'Java' started by boris, Sep 6, 2011.

  1. boris

    boris Guest

    Hi all,
    Can anyone recommend a working Wikipedia parser (to text or HTML)? I've
    tried to get wikitext working, but it looks like it has some problems.
    Thanks.
     
    boris, Sep 6, 2011
    #1

  2. On 06/09/2011 3:03 PM, boris wrote:
    > hi all,
    > can anyone recommend working! wikipedia parser (to text or html). I've
    > tried to get wikitext working, but it looks like it has some problems..
    > thanks.


    So when you tried "java wikitext parser" in Google, what results did you
    get and which parsers did you try?
     
    Travers Naran, Sep 7, 2011
    #2

  3. Roedy Green

    Roedy Green Guest

    On Tue, 06 Sep 2011 18:03:29 -0400, boris wrote, quoted or indirectly quoted
    someone who said:

    >hi all,
    >can anyone recommend working! wikipedia parser (to text or html). I've
    >tried to get wikitext working, but it looks like it has some problems..
    >thanks.


    If there is only a set of things you are trying to extract, just
    download the page with http://mindprod.com/products1.html#HTTP

    Then pick out what you want with regexes and indexOf.

    See http://mindprod.com/jgloss/regex.html
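
    As a rough illustration of the "download, then regex and indexOf" approach, here is a minimal sketch. It assumes Java 11 or later and uses the JDK's own HttpClient in place of the HTTP package linked above. The class name PageScraper and its methods are hypothetical, and the href pattern is only good enough for simple pages; it is not a real HTML parser.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
class PageScraper {
    // Captures the value of every double-quoted href attribute.
    private static final Pattern HREF = Pattern.compile("href=\"([^\"]*)\"");

    /** Downloads a page body as a string using the JDK HttpClient (Java 11+). */
    static String download(String url) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).build();
        return client.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }

    /** indexOf-based extraction: text between the first <title> and </title>, or null. */
    static String extractTitle(String html) {
        int start = html.indexOf("<title>");
        if (start < 0) {
            return null;
        }
        start += "<title>".length();
        int end = html.indexOf("</title>", start);
        if (end < 0) {
            return null;
        }
        return html.substring(start, end).trim();
    }

    /** Regex-based extraction: all href values, in document order. */
    static List<String> extractLinks(String html) {
        List<String> links = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            links.add(m.group(1));
        }
        return links;
    }

    public static void main(String[] args) throws Exception {
        // With a URL argument the page is downloaded; otherwise a canned page is used.
        String html = args.length > 0
                ? download(args[0])
                : "<html><head><title> Java </title></head>"
                  + "<body><a href=\"/wiki/JVM\">JVM</a></body></html>";
        System.out.println(extractTitle(html));
        System.out.println(extractLinks(html));
    }
}
```

    This only works when the things to extract sit in predictable places in the page; anything nested or irregular quickly outgrows regexes and indexOf.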
    --
    Roedy Green Canadian Mind Products
    http://mindprod.com
    The modern conservative is engaged in one of man's oldest exercises in moral philosophy; that is,
    the search for a superior moral justification for selfishness.
    ~ John Kenneth Galbraith (born: 1908-10-15 died: 2006-04-29 at age: 97)
     
    Roedy Green, Sep 8, 2011
    #3
  4. Roedy Green

    Roedy Green Guest

    On Tue, 06 Sep 2011 18:03:29 -0400, boris wrote, quoted or indirectly quoted
    someone who said:

    >hi all,
    >can anyone recommend working! wikipedia parser (to text or html). I've
    >tried to get wikitext working, but it looks like it has some problems..
    >thanks.


    Have a look at http://mindprod.com/applet/americantax.html
    There are screenscrapers for each state to extract sales tax
    information. You can use one as a starting point for what you need.
     
    Roedy Green, Sep 8, 2011
    #4
  5. Arne Vajhøj

    Arne Vajhøj Guest

    On 9/8/2011 12:29 AM, Roedy Green wrote:
    > On Tue, 06 Sep 2011 18:03:29 -0400, boris wrote, quoted or indirectly quoted
    > someone who said:
    >> can anyone recommend working! wikipedia parser (to text or html). I've
    >> tried to get wikitext working, but it looks like it has some problems..
    >> thanks.

    >
    > If there is only a set of things you are trying to extract, just
    > download the page with http://mindprod.com/products1.html#HTTP


    He is not saying anything about needing help with HTTP
    requests.

    > Then pick out what you want with regexes and indexOf.
    >
    > See http://mindprod.com/jgloss/regex.html


    It will obviously work, but it is a DIY approach.
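
    To give a sense of what the DIY route looks like when applied to wiki markup itself, here is a minimal sketch. It assumes only three constructs need handling: [[links]], ''italic''/'''bold''' quotes, and == headings ==. The class name WikiTextStripper and its method are hypothetical. Templates, tables, references, and nested markup are not handled at all.

```java
// Hypothetical helper class, for illustration only.
class WikiTextStripper {
    /** Strips a small subset of wiki markup, leaving plain text. */
    static String toPlainText(String wikitext) {
        String s = wikitext;
        // [[Target|label]] becomes "label"
        s = s.replaceAll("\\[\\[[^\\]|]*\\|([^\\]]*)\\]\\]", "$1");
        // [[Target]] becomes "Target"
        s = s.replaceAll("\\[\\[([^\\]|]*)\\]\\]", "$1");
        // ''italic'' and '''bold''': drop any run of two or more apostrophes
        s = s.replaceAll("'{2,}", "");
        // == Heading == becomes "Heading" (handled line by line)
        s = s.replaceAll("(?m)^=+[ \\t]*(.*?)[ \\t]*=+[ \\t]*$", "$1");
        return s;
    }

    public static void main(String[] args) {
        String sample = "== History ==\n"
                + "'''Java''' is a [[programming language|language]] from [[Sun Microsystems]].";
        System.out.println(toPlainText(sample));
    }
}
```

    Each construct that real articles use (templates, infoboxes, tables, <ref> tags) would need its own rule, which is why a ready-made parser is usually the better choice for whole pages.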

    Arne
     
    Arne Vajhøj, Sep 9, 2011
    #5
  6. Arne Vajhøj

    Arne Vajhøj Guest

    On 9/8/2011 12:31 AM, Roedy Green wrote:
    > On Tue, 06 Sep 2011 18:03:29 -0400, boris wrote, quoted or indirectly quoted
    > someone who said:
    >> can anyone recommend working! wikipedia parser (to text or html). I've
    >> tried to get wikitext working, but it looks like it has some problems..
    >> thanks.

    >
    > have a look at http://mindprod.com/applet/americantax.html
    > There are screenscrapers for each state to extract sales tax
    > information. You can use one as a starting point for what you need.


    Do any of them use wiki markup?

    Arne
     
    Arne Vajhøj, Sep 9, 2011
    #6