Number of Packages in the "cheeseshop"

Discussion in 'Python' started by Michael Rudolf, Mar 5, 2009.

  1. Hi, I just wondered how many Packages are in the Python Package Index.

    I could not find any counter, but I found that there is a category
    overview on http://pypi.python.org/pypi?:action=browse .

    A quick look at the HTML told me that the number of Packages per
    Category is listed surrounded by parentheses, at most one per line.

    So I counted them:

    import urllib
    sum=0
    for t in urllib.urlopen('http://pypi.python.org/pypi?%3Aaction=browse'):
        t=t.split('(')[-1].split(')')[0]
        try:
            sum += int(t)
        except ValueError:
            pass # print "OMG cannot convert %s to int" % t
    print "sum is: %s" % sum

    Which yields: sum is: 31670

    That would be around half the weight of CPAN, which would be a
    not-so-bad result ;)

    My Questions:
    a) Are there packages listed in multiple Categories, which would break
    my counting?
    b) Did I make some other mistake(s)?
    c) is there a counter which yields the current number of PyPI-Packages?

    PS: Please excuse my bad English, I am not a native speaker.

    THX, Michael
    Michael Rudolf, Mar 5, 2009
    #1

  2. Michael Rudolf

    John Machin Guest

    On Mar 5, 9:40 pm, Michael Rudolf wrote:
    > Hi, I just wondered how many Packages are in the Python Package Index.
    >
    > I could not find any counter,


    Main page (http://pypi.python.org/pypi), right at the top:
    """
    The Python Package Index is a repository of software for the Python
    programming language. There are currently 5883 packages here.
    """

    The devs must have read your posting and slammed in a quick fix ;-)
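For anyone who wants that number in a script, here is a minimal sketch (Python 3) that pulls the count out of the sentence quoted above. It works on the quoted text only; whether the live page still uses exactly this wording is an assumption, and the variable names are illustrative.

```python
import re

# The front-page sentence as quoted in this thread.
text = ("The Python Package Index is a repository of software for the Python "
        "programming language. There are currently 5883 packages here.")

# Capture the integer between "There are currently" and "packages".
match = re.search(r"There are currently (\d+) packages", text)
count = int(match.group(1)) if match else None
print(count)
```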

    > but I found that there is a category
    > overview on http://pypi.python.org/pypi?%3Aaction=browse.
    >
    > A quick look at the HTML told me that the number of Packages per
    > Category is listed surrounded by parentheses, at most one per line.
    >
    > So I counted them:
    >
    > import urllib
    > sum=0
    > for t in urllib.urlopen('http://pypi.python.org/pypi?%3Aaction=browse'):
    >     t=t.split('(')[-1].split(')')[0]


    That statement is a thing of beauty and a joy forever. I wonder what
    it does.

    >     try:
    >         sum += int(t)
    >     except ValueError:
    >         pass # print "OMG cannot convert %s to int" % t
    > print "sum is: %s" % sum
    >
    > Which yields: sum is: 31670
    >
    > That would be around half the weight of CPAN, which would be a
    > not-so-bad result ;)
    >
    > My Questions:
    > a) Are there packages listed in multiple Categories, which would break
    > my counting?


    Next you'll be asking if items are listed in multiple categories on
    eBay :)

    Have you considered looking at the listing for some individual
    packages? Here's one:

    # Categories

    * Development Status :: 5 - Production/Stable
    * Intended Audience :: Developers
    * License :: OSI Approved :: BSD License
    * Operating System :: OS Independent
    * Programming Language :: Python
    * Topic :: Database
    * Topic :: Internet :: WWW/HTTP :: Dynamic Content :: CGI Tools/Libraries
    * Topic :: Office/Business :: Financial :: Spreadsheet
    * Topic :: Software Development :: Libraries :: Python Modules

    So that's 9 categories. And 4 topics -- that'd be "keyword spamming"
    on eBay :)
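To make the overcounting concrete, here is a small sketch (Python 3) with made-up data: the package names and their category lists are hypothetical, chosen only to show that summing per-category counts counts each package once per category it is listed in.

```python
from collections import Counter

# Hypothetical packages and the categories each one is listed under.
packages = {
    "alpha": ["Topic :: Database", "Intended Audience :: Developers"],
    "beta": ["Topic :: Database"],
    "gamma": ["Intended Audience :: Developers", "Topic :: Internet",
              "Topic :: Database"],
}

# Per-category totals, like the numbers in parentheses on the browse page.
per_category = Counter(cat for cats in packages.values() for cat in cats)

summed = sum(per_category.values())  # what the counting script adds up
distinct = len(packages)             # the real number of packages

print(per_category)
print("sum of category counts:", summed)
print("distinct packages:", distinct)
```

Here the sum is 6 although there are only 3 packages. The figures in this thread (31670 summed versus 5883 on the main page) would fit an average of a bit more than five category listings per package, if both numbers describe the same set of packages.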

    > b) Did I make some other mistake(s)?


    Yes.

    > c) is there a counter which yields the current number of PyPI-Packages?


    Yes.
    >
    > PS: Please excuse my bad English, I am not a native speaker.
    >
    > THX, Michael
    John Machin, Mar 5, 2009
    #2

  3. On Thu, 5 Mar 2009 05:38:58 -0800 (PST),
    John Machin wrote:

    > Main page (http://pypi.python.org/pypi), right at the top:
    > """
    > The Python Package Index is a repository of software for the Python
    > programming language. There are currently 5883 packages here.
    > """


    Ooops... totally missed that... must have been blind, sorry.
    Thank you.

    > > for t in \
    > > urllib.urlopen('http://pypi.python.org/pypi?%3Aaction=browse'):
    > >     t=t.split('(')[-1].split(')')[0]

    > That statement is a thing of beauty and a joy forever. I wonder what
    > it does.

    It extracts everything between the parentheses on each line, as long as
    there is exactly one '(' and one ')' in it (true for that site).

    I didn't want to parse the HTML or write a regex for such a simple job.
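A short sketch (Python 3) of what the split idiom returns on a few inputs, next to a regex alternative. The sample lines are invented for illustration, not copied from the real page, and the function names are hypothetical.

```python
import re

def count_via_split(line):
    """The idiom from the script: text after the last '(' up to the next ')'."""
    return line.split('(')[-1].split(')')[0]

def count_via_regex(line):
    """Digits of the last parenthesized integer on the line, or None."""
    matches = re.findall(r'\((\d+)\)', line)
    return matches[-1] if matches else None

# Invented sample lines.
samples = [
    '<a href="/x">Database</a> (412)',  # one pair of parentheses
    '<p>no parentheses here</p>',       # none at all
    'f(x) and (12) then (oops)',        # several pairs
]

for line in samples:
    print(repr(count_via_split(line)), repr(count_via_regex(line)))
```

With no parentheses the split version returns the whole line, so int() raises ValueError and the try/except in the script skips it. With several pairs it picks the last one, whatever it contains, while the regex only accepts parenthesized integers.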

    Anyways, sorry for that stupid post and thanks for pointing out that
    there actually *is* a counter.
    Next time I will readjust my caffeine-in-blood-level before posting. ;)

    Michael
    Michael Rudolf, Mar 5, 2009
    #3
  4. Gerard Flanagan, Mar 5, 2009
    #4
