python-parser running Beautiful Soup only spits out one line of 10. What have I gotten wrong here?

Discussion in 'Python' started by Martin Kaspar, Dec 25, 2010.

  1. Hello dear Community,


    I am trying to get a scraper up and running, but I keep running into
    problems.

    When I try what I have learned so far, I only get:
    <strong>Schuldaten</strong>

    Here is the code that I used:

    import urllib2
    from BeautifulSoup import BeautifulSoup
    page = urllib2.urlopen("http://www.schulministerium.nrw.de/BP/SchuleSuchen?action=799.601437941842&SchulAdresseMapDO=142323")
    soup = BeautifulSoup(page)
    table = soup.find('table' ,attrs={'class':'bp_ergebnis_tab_info'})
    first_td = soup.find('td')
    text = first_td.renderContents()
    trimmed_text = text.strip()
    print trimmed_text


    I run it in the template at http://scraperwiki.com/scrapers/new/python

    See the target: http://www.schulministerium.nrw.de/BP/SchuleSuchen?action=799.601437941842&SchulAdresseMapDO=142323

    What have I gotten wrong?

    Can anybody review the code?

    Many thanks in advance.

    Regards,
    matze
    Martin Kaspar, Dec 25, 2010
    #1
  2. John Nagle Guest

    Re: python-parser running Beautiful Soup only spits out one line of 10. What have I gotten wrong here?

    Your program is doing what you asked it to do. It finds the
    first table with class 'bp_ergebnis_tab_info'. Then it ignores
    that result. Then it finds the first "td" item in the document,
    and prints the contents of that. Then it exits. What did
    you want it to do?
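
    To make the difference concrete, here is a minimal sketch of searching
    from the whole document versus searching from the table. It assumes
    Python 3 and the newer bs4 package rather than the BeautifulSoup 3
    module used in this thread, and the HTML string is a made-up stand-in,
    not the real page.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the real page: some other table comes first,
# followed by the table that actually holds the data.
HTML = """
<table class="layout">
  <tr><td><strong>Schuldaten</strong></td></tr>
</table>
<table class="bp_ergebnis_tab_info">
  <tr><td>Schulnummer</td><td>142323</td></tr>
</table>
"""

soup = BeautifulSoup(HTML, "html.parser")
table = soup.find("table", attrs={"class": "bp_ergebnis_tab_info"})

# Searching from `soup` returns the first <td> anywhere in the document.
first_td_in_document = soup.find("td")

# Searching from `table` returns the first <td> inside that table only.
first_td_in_table = table.find("td")

print(first_td_in_document.decode_contents().strip())
print(first_td_in_table.decode_contents().strip())
```

    In bs4, decode_contents() plays the role that renderContents() plays in
    the original code, except that it returns text instead of bytes.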

    Try this. It prints out the TD items on each
    row of the table, in order.

    import urllib2
    from BeautifulSoup import BeautifulSoup
    page = urllib2.urlopen("http://www.schulministerium.nrw.de/BP/SchuleSuchen?action=799.601437941842&SchulAdresseMapDO=142323")
    soup = BeautifulSoup(page)
    table = soup.find('table' ,attrs={'class':'bp_ergebnis_tab_info'})
    for row in table.findAll('tr') : # for all TR items (table rows)
        for td in row.findAll('td') : # for TD items in row
            text = td.renderContents().strip()
            print(text)
        print('-----') # mark end of row
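
    For anyone reading this on Python 3, the same row-by-row loop might look
    roughly like the sketch below. It assumes the bs4 package; the helper
    name extract_rows and the sample HTML are invented for illustration and
    do not reflect the real page. It also returns plain cell text via
    get_text(), which drops inner markup such as <strong>, unlike
    renderContents() above.

```python
from bs4 import BeautifulSoup


def extract_rows(html, table_class):
    """Return the stripped text of every <td>, grouped by table row.

    `extract_rows` is a hypothetical helper, not part of Beautiful Soup.
    """
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", attrs={"class": table_class})
    if table is None:
        # No matching table: return an empty result instead of crashing.
        return []
    rows = []
    for tr in table.find_all("tr"):
        cells = [td.get_text(strip=True) for td in tr.find_all("td")]
        if cells:  # skip rows without <td> cells, e.g. header rows
            rows.append(cells)
    return rows


# Made-up sample markup standing in for the downloaded page.
SAMPLE = """
<table class="bp_ergebnis_tab_info">
  <tr><th>Feld</th><th>Wert</th></tr>
  <tr><td><strong>Schuldaten</strong></td><td>Beispielschule</td></tr>
  <tr><td>Ort</td><td> Musterstadt </td></tr>
</table>
"""

rows = extract_rows(SAMPLE, "bp_ergebnis_tab_info")
for cells in rows:
    print(" | ".join(cells))
    print("-----")  # mark end of row
```

    To run it against the live page, the html argument could come from
    urllib.request.urlopen(url).read(), which is the Python 3 counterpart
    of urllib2.urlopen.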

    John Nagle

    John Nagle, Dec 25, 2010
    #2