Re: Vote tallying...

Discussion in 'Python' started by Stefan Behnel, Jan 18, 2013.

  1. Andrew Robinson, 18.01.2013 00:59:
    > I have a problem which may fit in a mysql database


    Everything fits in a MySQL database - not a reason to use it, though. Py2.5
    and later ship with sqlite3 and if you go for an external database, why use
    MySQL if you can have PostgreSQL for the same price?
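
    For illustration only, here is a minimal sqlite3 sketch. The table layout
    and the apply_batch() helper are assumptions of mine, not something from
    your setup; the IDs are taken from your XML example, except AB12CD, which
    is made up.

```python
import sqlite3

# In-memory database for illustration; a real archive would pass a file path.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE votes ("
    " content_id TEXT NOT NULL,"
    " voter_id TEXT NOT NULL,"
    " vote INTEGER NOT NULL CHECK (vote BETWEEN 1 AND 10),"
    " PRIMARY KEY (content_id, voter_id))"
)

def apply_batch(conn, content_id, batch):
    """Insert new votes; replace the vote of anyone who has voted before."""
    conn.executemany(
        "INSERT OR REPLACE INTO votes (content_id, voter_id, vote)"
        " VALUES (?, ?, ?)",
        [(content_id, voter, vote) for voter, vote in batch],
    )
    conn.commit()

apply_batch(conn, "12345A3", [("FF3424", 10), ("F713A4", 1), ("12312234", 3)])
# Second batch: one changed vote (F713A4) and one new voter (hypothetical ID).
apply_batch(conn, "12345A3", [("F713A4", 7), ("AB12CD", 5)])

count, total = conn.execute(
    "SELECT COUNT(*), SUM(vote) FROM votes WHERE content_id = ?",
    ("12345A3",),
).fetchone()
```

    The primary key on (content_id, voter_id) is what lets the database find
    the few changed votes, so the script does not have to sort through the
    whole list itself.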


    > but which I only have
    > python as an alternate tool to solve... so I'd like to hear some opinions...
    >
    > I'm building an experimental content management program on a standard Linux
    > Web server.
    > And I'm needing to keep track of archived votes and their voters -- for years.
    >
    > Periodically, a python program could be given a batch of new votes removed
    > from the database, and some associated comments, which are no longer
    > real-time necessary; and then a python script needs to take that batch of
    > votes, and apply them to an appropriate archive file. It's important to
    > note that it won't just be appending new votes, it will be sorting through
    > a list of tens of thousands of votes, and changing a *few* of them, and
    > appending the rest.
    >
    > XML may not be the ideal solution, but I am easily able to see how it might
    > work. I imagine a file like the following might be inefficient, but
    > capable of solving the problem:
    >
    > <?xml version="1.0"?>
    > <data>
    >
    > <identify>
    > <contentid>12345A3</contentid>
    > <authorid>FF734B5D</authorid>
    > <permissions>7FBED</permissions>
    > <chapter>The woodstock games</chapter>
    > </identify>
    >
    > <comments>
    > <comment id="FF53524" date="2013.01.12">I think you're on drugs,
    > man.!</comment>
    > <comment id="unregistered" date="2013.01.12">It would have been
    > better if they didn't wake up in the morning.</comment>
    > </comments>
    >
    > <votes>
    > <v id="FF3424">10</v>
    > <v id="F713A4">1</v>
    > <v id="12312234">3</v>
    > </votes>
    > </data>
    >
    > The questions I have are, is using XML for vote recording going to be slow
    > compared to other stock solutions that Python may have to offer? The voter
    > IDs are unique, 32 bits long, and the votes are only from 1 to 10 (4
    > bits). I'm free to use any import that comes with Python 2.5, so if
    > there's something better than XML, I'm interested.
    >
    > And secondly, how likely is this to still work once the vote count reaches
    > 10 million?
    > Is an XML file with millions of entries something someone has already tried
    > successfully?


    Sure. However, XML files are rather static and meant to be processed from
    start to end on each run. That cost adds up if the changes are small and
    local while the file keeps growing. You seem to propose one file per
    article, which might work. A single article's file is unlikely to become
    too large to process, and Python's cElementTree is a very fast XML processor.
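
    A rough sketch of what the per-article update could look like with
    ElementTree. The apply_batch() helper and the voter ID AB12CD are made up
    for illustration, and the document is trimmed down to the <votes> part of
    your example.

```python
import xml.etree.ElementTree as ET  # on Python 2.5: xml.etree.cElementTree

ARCHIVE = """<?xml version="1.0"?>
<data>
<votes>
<v id="FF3424">10</v>
<v id="F713A4">1</v>
<v id="12312234">3</v>
</votes>
</data>"""

def apply_batch(root, batch):
    """Change the votes of known voters and append the rest."""
    votes = root.find("votes")
    # Index the existing <v> elements by voter ID in one pass.
    by_id = dict((v.get("id"), v) for v in votes)
    for voter_id, vote in batch.items():
        elem = by_id.get(voter_id)
        if elem is None:
            # Unknown voter: append a new <v> element.
            elem = ET.SubElement(votes, "v", id=voter_id)
        elem.text = str(vote)

root = ET.fromstring(ARCHIVE)
apply_batch(root, {"F713A4": 7, "AB12CD": 5})
result = dict((v.get("id"), int(v.text)) for v in root.find("votes"))
```

    With a real file you would use ET.parse(path) and tree.write(path)
    instead of the string. Either way the whole file gets read and written
    on each run, which is the cost mentioned above.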

    However, your problem sounds a lot like you could map it to one of the dbm
    databases that Python ships with. They work like dicts, just on disk.
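
    A minimal sketch of the dict-like interface, assuming a throwaway file in
    a temp directory. Note that the generic module is called anydbm in
    Python 2.x and dbm in Python 3; the sketch uses the Python 3 name.

```python
import dbm  # called anydbm on Python 2.x
import os
import tempfile

# Hypothetical location; a real archive would live somewhere permanent.
path = os.path.join(tempfile.mkdtemp(), "votes")

db = dbm.open(path, "c")     # "c": create the database if it does not exist
try:
    db[b"FF3424"] = b"10"    # keys and values are byte strings
    db[b"F713A4"] = b"1"
    db[b"F713A4"] = b"7"     # changing a vote touches only this one record
finally:
    db.close()

db = dbm.open(path, "r")     # reopen read-only: the data is on disk
try:
    stored = dict((key, int(db[key])) for key in db.keys())
finally:
    db.close()
```

    Unlike the XML file, an update here does not require reading or writing
    the records that did not change.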

    IIUC, you want to keep track of comments and their associated votes, maybe
    also keep a top-N list of the highest voted comments. So, keep each comment
    and its votes in a dbm record, referenced by the comment's ID (which, I
    assume, you keep a list of in the article that it comments on). You can use
    pickle (see the shelve module) or JSON or whatever you like for storing
    that record. Then, on each vote update, look up the comment, change its
    votes and store it back. If you keep a top-N list for an article, update it
    at the same time. Consider storing it either as part of the article or in
    another record referenced by the article, depending on how you normally
    access it. You can also store the votes independently of the comment (i.e. in
    a separate record for each comment), in case you don't normally care about
    the votes but read the comments frequently. It's just a matter of adding an
    indirection for things that you use less frequently and/or that you use in
    more than one place (not in your case, where comments and votes are unique
    to an article).
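
    To make that concrete, here is a sketch using shelve (pickled records on
    top of dbm). The record layout, the add_votes() and top_n() helpers and
    the comment ID C0FFEE are assumptions for illustration; the top-N list is
    simply recomputed here rather than stored.

```python
import heapq
import os
import shelve
import tempfile

# Hypothetical location; a real archive would live somewhere permanent.
path = os.path.join(tempfile.mkdtemp(), "comments")

def add_votes(db, comment_id, new_votes):
    """Look up a comment record, merge in a batch of votes, store it back."""
    record = db.get(comment_id, {"text": "", "votes": {}})
    record["votes"].update(new_votes)   # changes known voters, adds new ones
    db[comment_id] = record             # reassign so shelve writes it to disk

def vote_total(db, comment_id):
    """Sum of all votes stored for one comment."""
    return sum(db[comment_id]["votes"].values())

def top_n(db, comment_ids, n):
    """Return the n comment IDs with the highest vote totals."""
    return heapq.nlargest(n, comment_ids, key=lambda cid: vote_total(db, cid))

db = shelve.open(path)
try:
    add_votes(db, "FF53524", {"FF3424": 10, "F713A4": 1})
    add_votes(db, "C0FFEE", {"12312234": 3})
    add_votes(db, "FF53524", {"F713A4": 7})   # one voter changes their vote
    # The article is assumed to keep the list of its comment IDs.
    best = top_n(db, ["FF53524", "C0FFEE"], 1)
    total = vote_total(db, "FF53524")
finally:
    db.close()
```

    Storing the votes in a separate record per comment would only change the
    key used in add_votes(), e.g. a "votes:" prefix on the comment ID.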

    You see, lots of options, even just using the stdlib...

    Stefan
    Stefan Behnel, Jan 18, 2013