simultaneous multiple requests to very simple database

Discussion in 'Python' started by Eric S. Johansson, Jan 18, 2005.

  1. I have an application where I need a very simple database, effectively a
    very large dictionary. The very large dictionary must be accessed from
    multiple processes simultaneously. I need to be able to lock records
    within the very large dictionary when records are written to. Estimated
    number of records will be in the ballpark of 50,000 to 100,000 in its
    early phase and 10 times that in the future. Each record will run about
    100 to 150 bytes.

    Speed is not a huge concern, although I must complete processing in less
    than 90 seconds. The longer the delay, however, the greater the number of
    processes that must run in parallel to keep the throughput up.
    It's the usual trade-off we have all come to know and love.

    It is not necessary for the dictionary to persist beyond the life of the
    parent process although I have another project coming up in which this
    would be a good idea.

    At this point, I know there will be some kind souls suggesting various
    SQL solutions. While I appreciate the idea, unfortunately I do not have
    time to puzzle out yet another component. Someday I will figure it out
    because I really like what I see with SQLite but unfortunately, today
    is not that day (unless they will give me their work, home and cell
    phone numbers so I can call when I am stuck. ;-)

    So the solutions that come to mind are some form of dictionary in shared
    memory with a locking semaphore scoreboard, or a multithreaded process
    containing a single database (Python native dictionary, metakit, gdbm??)
    with all of my processes speaking to it using xmlrpc. That leaves me
    with the question of how to make a multithreaded server using stock xmlrpc.

    So feedback and pointers to information would be most welcome. I'm
    still exploring the idea so I am open to any and all suggestions (except
    maybe SQL :)

    ---eric
    Eric S. Johansson, Jan 18, 2005
    #1
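
A minimal sketch of the first idea in the post above (a dictionary shared by several processes, guarded by a lock), assuming Python 3 on a POSIX system. It uses `multiprocessing.Manager`, which keeps the dictionary in a helper process and hands out proxies rather than using true shared memory, and it uses one coarse lock for the whole dictionary rather than a per-record scoreboard. The names `bump` and `run_demo` are made up for illustration.

```python
import multiprocessing as mp


def bump(shared, lock, key, times):
    """Increment shared[key] `times` times; each read-modify-write holds the lock."""
    for _ in range(times):
        with lock:
            shared[key] = shared.get(key, 0) + 1


def run_demo(workers=4, times=50):
    # Assumption: the "fork" start method (POSIX only), so children inherit
    # the proxies directly and this module is not re-imported in each child.
    ctx = mp.get_context("fork")
    with ctx.Manager() as manager:
        shared = manager.dict()  # lives in the manager process, reached via proxies
        lock = manager.Lock()    # one coarse lock for the whole dictionary
        procs = [
            ctx.Process(target=bump, args=(shared, lock, "hits", times))
            for _ in range(workers)
        ]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        return dict(shared)      # copy out before the manager shuts down


if __name__ == "__main__":
    print(run_demo())
```

Every access goes through the manager process, so this trades speed for simplicity; whether that fits inside the 90-second budget would have to be measured.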

  2. "Eric S. Johansson" <> wrote in message
    news:...
    <snip>
    > At this point, I know there will be some kind souls suggesting various
    > SQL solutions. While I appreciate the idea, unfortunately I do not have
    > time to puzzle out yet another component. Someday I will figure it out
    > because I really like what I see with SQLite but unfortunately, today
    > is not that day (unless they will give me their work, home and cell
    > phone numbers so I can call when I am stuck. ;-)

    <snip>

    Forgive me if this reply sounds a bit glib. But I do mean it without malice.

    Do you seriously expect to write your own (database) solution and that this
    will save you time and effort over learning an existing (SQL) solution?

    Because -
    If you are seeking to "save time" on "puzzles", you are certainly going
    about it the wrong way.

    Best of luck
    Thomas Bartkus
    Thomas Bartkus, Jan 18, 2005
    #2

  3. Thomas Bartkus wrote:
    > "Eric S. Johansson" <> wrote in message
    > news:...
    > <snip>
    >
    >>At this point, I know there will be some kind souls suggesting various
    >>SQL solutions. While I appreciate the idea, unfortunately I do not have
    >>time to puzzle out yet another component. Someday I will figure it out
    >>because I really like what I see with SQLite but unfortunately, today
    >>is not that day (unless they will give me their work, home and cell
    >>phone numbers so I can call when I am stuck. ;-)

    >
    > <snip>
    >
    > Forgive me if this reply sounds a bit glib. But I do mean it without malice.


    understood and taken in that spirit.

    > Do you seriously expect to write your own (database) solution and that this
    > will save you time and effort over learning an existing (SQL) solution?
    >
    > Because -
    > If you are seeking to "save time" on "puzzles", you are certainly going
    > about it the wrong way.


    One thing I learned a long time ago was to respect the nagging voice in
    the back of my head that says "there is something wrong". Right now
    with databases, that voice is not nagging but screaming. So I made my
    query to try and prove that intuition wrong. So far, that has not happened.

    When I look at databases, I see a bunch of very good solutions that are
    either overly complex or heavyweight on one hand and very nice and
    simple but unable to deal with concurrency on the other. Two sets of
    point solutions that try to stretch themselves and the developers to fit
    other application contexts.

    99.9 percent of what I do (and I suspect this could be true for others)
    could be satisfied by a slightly enhanced super dictionary with
    record-level locking. But the database world does not fit this model. It has
    a great deal more complication than what is frequently necessary.

    If I ever find the time, I will try to build such a beast probably
    around Metakit. The only reason for reluctance is that I have spent too
    many hours tracking down concurrency problems at the OS level way too
    many years ago and so I do not create multithreaded applications lightly.

    So in conclusion, my only reason for querying was to see if I was
    missing a solution. So far, I have not found any worth using because
    they add orders of magnitude more complexity than simple dbm with file
    locking. Obviously, the simple solution has horrible performance, but right
    now I need simplicity of implementation.

    thanks for your commentary.

    ---eric
    Eric S. Johansson, Jan 18, 2005
    #3
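
The "simple dbm with file locking" baseline mentioned above might look roughly like the sketch below, assuming Python 3 and a POSIX system (`fcntl` is not available on Windows). The helper names `locked_db` and `update_or_create` and the side `.lock` file are assumptions for illustration, not anything from the thread. The lock covers the whole database, which is exactly why the performance is poor: only one process can touch the store at a time.

```python
import dbm
import fcntl
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def locked_db(path):
    """Open a dbm file while holding an exclusive lock on a side lock file."""
    with open(path + ".lock", "w") as lock_file:
        fcntl.flock(lock_file, fcntl.LOCK_EX)  # blocks until no one else holds it
        with dbm.open(path, "c") as db:        # "c": create the file if missing
            yield db
        # db is closed first; closing lock_file then releases the lock


def update_or_create(path, key, modify, default):
    """Apply modify() to an existing record, or create it with default."""
    with locked_db(path) as db:
        db[key] = modify(db[key]) if key in db else default
        return db[key]  # dbm returns values as bytes


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "records")
    update_or_create(path, "spam", lambda v: v + b"!", b"eggs")
    print(update_or_create(path, "spam", lambda v: v + b"!", b"eggs"))
```

Opening and closing the database on every call keeps the lock window short and the code simple, at the cost of repeated open overhead.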
  4. On Tue, 18 Jan 2005 17:33:26 -0500, Eric S. Johansson wrote:

    > When I look at databases, I see a bunch of very good solutions that are
    > either overly complex or heavyweight on one hand and very nice and simple
    > but unable to deal with concurrency on the other. two sets of point
    > solutions that try to stretch themselves and the developers to fit other
    > application contexts.
    >


    Have you considered SQLite/pySQLite?

    --
    Ricardo
    Ricardo Bugalho, Jan 18, 2005
    #4
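
For reference, a hedged sketch of what using SQLite as a plain key-value store could look like with Python 3's standard `sqlite3` module. The table name `kv` and the helpers `open_store` and `update_record` are hypothetical. SQLite allows many readers but only one writer at a time for the whole database file, so `BEGIN IMMEDIATE` here serializes writers; it does not give record-level locking.

```python
import os
import sqlite3
import tempfile


def open_store(path, timeout=30.0):
    # isolation_level=None: autocommit mode, so BEGIN/COMMIT are issued by hand.
    # timeout: how long to wait for another writer before raising an error.
    conn = sqlite3.connect(path, timeout=timeout, isolation_level=None)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS kv (key TEXT PRIMARY KEY, value TEXT NOT NULL)"
    )
    return conn


def update_record(conn, key, modify, default):
    """Apply modify() to the stored value, or create the record with default."""
    conn.execute("BEGIN IMMEDIATE")  # take the database write lock up front
    try:
        row = conn.execute("SELECT value FROM kv WHERE key = ?", (key,)).fetchone()
        value = default if row is None else modify(row[0])
        conn.execute(
            "INSERT OR REPLACE INTO kv (key, value) VALUES (?, ?)", (key, value)
        )
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise
    return value


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "records.db")
    conn = open_store(path)
    bump = lambda v: str(int(v) + 1)
    update_record(conn, "hits", bump, "1")
    print(update_record(conn, "hits", bump, "1"))
```

Each process would open its own connection to the same file; whether the single-writer limit matters depends on how long each write transaction stays open.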
  5. On Tue, 18 Jan 2005 17:33:26 -0500, Eric S. Johansson <> wrote:
    > So in conclusion, my only reason for querying was to see if I was
    > missing a solution. So far, I have not found any worth using because
    > they add orders of magnitude more complexity than simple dbm with file
    > locking. Obviously, the simple solution has horrible performance, but right
    > now I need simplicity of implementation.
    >
    > thanks for your commentary.


    Maybe you can just get the best of both worlds.

    Have a look at SQLObject. You can ignore the fact that underneath the
    SQLObject there's a postgres (or mysql, or whatever) database, and get
    OO based persistence.

    SQLObject is crippled in that there are degrees of freedom that SQL
    gives you that SQLObject takes away/makes hard to use, but what you're
    trying to do, and what most people actually do with databases, can be
    easily wrapped around with a simple, pythonic wrapper.

    It even has a .createTable() function for those times when you don't
    even want to log into the database.

    Regards,
    Stephen Thorne.
    Stephen Thorne, Jan 18, 2005
    #5
  6. Ricardo Bugalho wrote:
    > On Tue, 18 Jan 2005 17:33:26 -0500, Eric S. Johansson wrote:
    >
    >
    >>When I look at databases, I see a bunch of very good solutions that are
    >>either overly complex or heavyweight on one hand and very nice and simple
    >>but unable to deal with concurrency on the other. two sets of point
    >>solutions that try to stretch themselves and the developers to fit other
    >>application contexts.
    >>

    >
    >
    > Have you considered SQLite/pySQLite?


    yep and apparently it won't work

    http://www.sqlite.org/faq.html#q7

    if I had record level locking, the code would do a very common pattern like:

    if record present:
        lock record
        modify record
        release lock
    else:
        create record atomically (actual method TBD)

    If I read their opinion correctly, the SQLite folks are wrong in thinking that
    only large applications need massive concurrency. Small applications need
    significant to massive concurrency for very tiny windows on very little
    data.

    but I do appreciate the pointer.
    Eric S. Johansson, Jan 18, 2005
    #6
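
The lock/modify/release-or-create pattern described above can be sketched for threads inside a single process as follows. This is an illustration only, under stated assumptions: Python 3, records are never deleted, and the class name `LockedDict` and its methods are invented here. It does not by itself solve the multi-process case; it would sit inside whatever server process owns the data.

```python
import threading


class LockedDict:
    """Dictionary with one lock per record, plus a table lock guarding creation."""

    def __init__(self):
        self._data = {}
        self._locks = {}
        self._table_lock = threading.Lock()

    def update_or_create(self, key, modify, make_default):
        # Look up the record, or create it atomically, under the table lock.
        with self._table_lock:
            if key not in self._data:
                self._data[key] = make_default()
                self._locks[key] = threading.Lock()
                return self._data[key]
            record_lock = self._locks[key]
        # Record exists: only this one record is locked while it is modified.
        with record_lock:
            self._data[key] = modify(self._data[key])
            return self._data[key]

    def get(self, key, default=None):
        return self._data.get(key, default)


if __name__ == "__main__":
    store = LockedDict()

    def worker():
        for _ in range(1000):
            store.update_or_create("hits", lambda v: v + 1, lambda: 1)

    threads = [threading.Thread(target=worker) for _ in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print(store.get("hits"))
```

The table lock is held only for the lookup or the creation, so slow modifications of one record do not block work on other records.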
  7. "Eric S. Johansson" <> wrote in message
    news:...
    <snip>
    > 99.9 percent of what I do (and I suspect this could be true for others)
    > could be satisfied by a slightly enhanced super dictionary with
    > record-level locking.


    BUT - Did you not mention! :
    > Estimated number of records will be in the ballpark of 50,000 to
    > 100,000 in its early phase and 10 times that in the future. Each
    > record will run about 100 to 150 bytes.

    ..
    And
    > The very large dictionary must be accessed from
    > multiple processes simultaneously


    And
    > I need to be able to lock records
    > within the very large dictionary when records are written to


    And
    > although I must complete processing in less than 90 seconds.


    And - the hole in the bottom of the hull -
    all of the above using "a slightly enhanced super dictionary".

    *Super* dictionary??? *Slightly* enhanced???
    Have you attempted any feasibility tests? Are you running a Cray?

    There are many database systems available, and Python (probably) has free
    bindings to every one of them. Whichever one you might choose, it would add
    simplicity, not complexity to what you are attempting. The problems you
    mention are precisely those that databases are meant to solve. The only
    tough (impossible?) requirement you have is that you don't want to use one.

    When you write that "super dictionary", be sure to post code!
    I could use one of those myself.
    Thomas Bartkus
    Thomas Bartkus, Jan 18, 2005
    #7
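
On the feasibility question: the raw payload is small by itself, since 100,000 records at 150 bytes each is 15 MB and 1,000,000 records is 150 MB, before Python's per-object overhead. A rough way to check an in-memory dictionary of that size is sketched below, assuming Python 3; the function names and the key format are invented for the test, and no timings are claimed here because they depend on the machine.

```python
import os
import time


def build(n, record_size=150):
    """Build a dict of n records, each a random payload of record_size bytes."""
    return {f"key{i:07d}": os.urandom(record_size) for i in range(n)}


def time_lookups(table, n_lookups=100_000):
    """Return the seconds taken to look up up to n_lookups existing keys."""
    keys = list(table)[:n_lookups]
    start = time.perf_counter()
    for k in keys:
        table[k]
    return time.perf_counter() - start


if __name__ == "__main__":
    n = 100_000  # upper end of the early-phase estimate; raise to 1_000_000 later
    start = time.perf_counter()
    table = build(n)
    print(f"built {len(table):,} records in {time.perf_counter() - start:.2f}s")
    print(f"raw payload: {n * 150 / 1e6:.0f} MB")
    print(f"{min(n, 100_000):,} lookups: {time_lookups(table):.3f}s")
```

This measures only the single-process dictionary; locking and inter-process transport would add their own cost and need separate measurement.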
  8. Thomas Bartkus wrote:
    > When you write that "super dictionary", be sure to post code!
    > I could use one of those myself.


    hmmm it looks like you have just flung down the gauntlet of "put up or
    quityerwhinging". I need to get the crude implementation done first but
    I think I can do it if I can find a good XMLRPC multithreading framework.

    ---eric
    Eric S. Johansson, Jan 19, 2005
    #8
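
One way to get a multithreaded server out of the stock XML-RPC module is to mix in `ThreadingMixIn`, so each request is handled in its own thread. The sketch below assumes Python 3 module names (`xmlrpc.server`, `socketserver`); in the Python 2 of 2005 the same classes lived in `SimpleXMLRPCServer` and `SocketServer`. The `DictService` class and its methods are hypothetical, and it uses a single lock for the whole dictionary rather than record-level locks.

```python
import threading
import xmlrpc.client
from socketserver import ThreadingMixIn
from xmlrpc.server import SimpleXMLRPCServer


class ThreadedXMLRPCServer(ThreadingMixIn, SimpleXMLRPCServer):
    """SimpleXMLRPCServer that handles each request in its own thread."""
    daemon_threads = True


class DictService:
    """Hypothetical service: one dictionary guarded by one lock."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def put(self, key, value):
        with self._lock:
            self._data[key] = value
        return True  # XML-RPC cannot marshal None unless allow_none is set

    def get(self, key):
        with self._lock:
            return self._data.get(key, "")  # empty string stands in for "missing"

    def incr(self, key):
        with self._lock:
            self._data[key] = self._data.get(key, 0) + 1
            return self._data[key]


def start_server(host="127.0.0.1", port=0):
    """Start the server in a background thread; port 0 picks a free port."""
    server = ThreadedXMLRPCServer((host, port), logRequests=False)
    server.register_instance(DictService())
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


if __name__ == "__main__":
    server = start_server()
    port = server.server_address[1]
    proxy = xmlrpc.client.ServerProxy(f"http://127.0.0.1:{port}")
    proxy.put("spam", "eggs")
    print(proxy.get("spam"))
    server.shutdown()
    server.server_close()
```

Client processes would each create their own `ServerProxy`; the read-modify-write happens on the server side (as in `incr`) so that it stays atomic.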
  9. On Tue, 18 Jan 2005 11:26:46 -0500, Eric S. Johansson wrote:

    > So the solutions that come to mind are some form of dictionary in shared
    > memory with locking semaphore scoreboard or a multithreaded process
    > containing a single database (Python native dictionary, metakit, gdbm??)
    > and have all of my processes speak to it using xmlrpc which leaves me
    > with the question of how to make a multithreaded server using stock
    > xmlrpc.


    Another solution might be to store the records as files in a directory,
    and use file locking to control access to the files (careful over NFS!).

    You might also consider berkeley db, which is a simple database to add to
    an application, (and which I believe supports locks), but I must admit I'm
    not a fan of the library.

    I assume that the bottleneck is processing the records, otherwise this all
    seems a bit academic.

    Jeremy
    Jeremy Sanders, Jan 19, 2005
    #9
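
The files-in-a-directory idea above could be sketched like this, assuming Python 3 on a POSIX system with a local filesystem (as the post warns, file locking over NFS needs care). The function names are invented; keys are assumed to be safe file names. Creation is made atomic by writing a temporary file and hard-linking it to the final name, which fails if the record already exists; modification holds an exclusive `flock` on the record's own file.

```python
import fcntl
import os
import tempfile


def _path(directory, key):
    # Assumption: keys are safe file names; real keys may need escaping or hashing.
    return os.path.join(directory, key)


def create_record(directory, key, data):
    """Create a record atomically; return False if it already exists."""
    fd, tmp = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        os.link(tmp, _path(directory, key))  # raises if the name is already taken
        return True
    except FileExistsError:
        return False
    finally:
        os.unlink(tmp)  # the record (if created) survives under its final name


def modify_record(directory, key, modify):
    """Rewrite one record in place while holding an exclusive lock on its file."""
    with open(_path(directory, key), "r+b") as f:
        fcntl.flock(f, fcntl.LOCK_EX)  # blocks until other holders release
        new = modify(f.read())
        f.seek(0)
        f.write(new)
        f.truncate()
        f.flush()
        # closing the file releases the lock
    return new


if __name__ == "__main__":
    d = tempfile.mkdtemp()
    print(create_record(d, "alice", b"1"))
    print(modify_record(d, "alice", lambda b: b + b"0"))
```

With 100,000 or more records in one directory, lookup speed depends on the filesystem; splitting keys across subdirectories is a common workaround.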
  10. Tom Loredo wrote:

    Just learned of this today, so I don't know enough details to judge
    its suitability for you:

    Durus
    http://www.mems-exchange.org/software/durus/

    It does not do locking, but alleges to be compact and easy to
    understand, so perhaps you could modify it to meet your needs,
    or find some other way to handle that requirement.

    -Tom

    --

    To respond by email, replace "somewhere" with "astro" in the
    return address.
    Tom Loredo, Jan 24, 2005
    #10
  11. Guest wrote:

    I agree with you, there's a crying need for something like that and
    there's no single "one obvious way to do it" answer.

    Have you looked at bsddb? See also www.sleepycat.com.
    , Jan 24, 2005
    #11