Question about 'remote objects'

Discussion in 'Python' started by Frank Millman, Dec 9, 2009.

  1. Hi all

    I am writing a multi-user business/accounting application. It is getting
    rather complex and I am looking at how to, not exactly simplify it, but find
    a way to manage the complexity.

    I have realised that it is logically made up of a number of services -
    database service with connection to database
    workflow engine for business processes
    services manager to handle automated services, such as web services
    client manager to service logins from client workstations
    possibly others

    I have made a start by splitting some of these off into separate modules and
    running them in their own threads.

    I am concerned about scalability if they are all running on the same
    machine, so I am looking into how to enable these services to run on
    separate servers if required.

    My first thought was to look into Pyro. It seems quite nice. One concern I
    had was that it creates a separate thread for each object made available by
    the server. My database server creates separate objects for each instance of
    a row read in from the database, and with multiple users running multiple
    applications, with each one opening multiple tables, this could run into
    hundreds, so I was not sure if that would work.

    Then I read that the multiprocessing module allows processes to be spread
    across multiple servers. The documentation is not as clear as Pyro's, but it
    looks as if it could do what I want. I assume it would use processes rather
    than threads to make multiple objects available, but I don't know if there
    is a practical limit.

    Then I thought that, instead of the database server exposing each object
    remotely, I could create one 'proxy' object on the server through which all
    clients would communicate, and it in turn would communicate with each
    instance locally.

    That felt more manageable, but then I thought - why bother with remote
    objects at all? Why not just run a SocketServer on the database server, and
    design a mini-protocol to allow clients to make requests and receive
    results. This is a technology I am already comfortable with, as this is how
    I handle client workstation logins. If I did go this route, I could apply
    the same principle to all the services.
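
    For illustration only, a mini-protocol of this kind might look like the
    sketch below. It assumes Python 3, where the module is spelled
    `socketserver` (it was `SocketServer` in Python 2), and uses one JSON
    object per line as the wire format. The `HANDLERS` table, the "op"/"args"
    field names and the `call` helper are made up for the example; they are
    not taken from the application described above.

```python
import json
import socket
import socketserver
import threading

# Hypothetical operations exposed by the service: name -> callable.
HANDLERS = {
    "ping": lambda: "pong",
    "add": lambda a, b: a + b,
}


class RequestHandler(socketserver.StreamRequestHandler):
    """Reads one JSON request per line and writes one JSON reply per line."""

    def handle(self):
        for line in self.rfile:
            try:
                request = json.loads(line)
                result = HANDLERS[request["op"]](**request.get("args", {}))
                reply = {"ok": True, "result": result}
            except Exception as exc:  # report any failure back to the client
                reply = {"ok": False, "error": repr(exc)}
            self.wfile.write(json.dumps(reply).encode("utf-8") + b"\n")


class Server(socketserver.ThreadingTCPServer):
    allow_reuse_address = True
    daemon_threads = True


def call(address, op, **args):
    """Open a connection, send a single request and return the decoded reply."""
    with socket.create_connection(address, timeout=5) as sock:
        with sock.makefile("rwb") as stream:
            request = {"op": op, "args": args}
            stream.write(json.dumps(request).encode("utf-8") + b"\n")
            stream.flush()
            return json.loads(stream.readline())


def demo():
    # Port 0 asks the OS for any free port, so the sketch runs anywhere.
    with Server(("127.0.0.1", 0), RequestHandler) as server:
        threading.Thread(target=server.serve_forever, daemon=True).start()
        try:
            return call(server.server_address, "add", a=2, b=3)
        finally:
            server.shutdown()


if __name__ == "__main__":
    print(demo())
```

    Even this small version has to settle message framing, error reporting
    and encoding, which is the part a remote-object library would otherwise
    take care of.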

    I don't have the experience to make an informed decision at this point, so I
    thought I would see if there is any consensus on the best way to go from
    here.

    Is there any particular benefit in using remote objects as opposed to
    writing a SocketServer?

    Any advice will be much appreciated.

    Thanks

    Frank Millman
     
    Frank Millman, Dec 9, 2009
    #1

  2. "Frank Millman" writes:

    > Hi all
    >
    > I am writing a multi-user business/accounting application. It is getting
    > rather complex and I am looking at how to, not exactly simplify it, but find
    > a way to manage the complexity.
    >
    > I have realised that it is logically made up of a number of services -
    > database service with connection to database
    > workflow engine for business processes
    > services manager to handle automated services, such as web services
    > client manager to service logins from client workstations
    > possibly others
    >
    > I have made a start by splitting some of these off into separate modules and
    > running them in their own threads.
    >
    > I am concerned about scalability if they are all running on the same
    > machine, so I am looking into how to enable these services to run on
    > separate servers if required.


    Have you finished the application already?

    At my job we're serving over 1M web requests a month, processing 15k+
    uploads, and searching through 5M+ database records a day. We're still
    running on 3 boxes. You can get a lot out of your machines before you
    have to think about the complex task of scaling/distributing.


    > My first thought was to look into Pyro. It seems quite nice. One concern I
    > had was that it creates a separate thread for each object made available by
    > the server. My database server creates separate objects for each instance of
    > a row read in from the database, and with multiple users running multiple
    > applications, with each one opening multiple tables, this could run into
    > hundreds, so I was not sure if that would work.


    It probably will work.

    Pyro is a very nice framework and one that I've built a few applications
    on. It has a lot of flexible design patterns available. Just look in
    the examples included with the distribution.

    >
    > Then I read that the multiprocessing module allows processes to be spread
    > across multiple servers. The documentation is not as clear as Pyro's, but it
    > looks as if it could do what I want. I assume it would use processes rather
    > than threads to make multiple objects available, but I don't know if there
    > is a practical limit.


    There is a theoretical limit to all of the resources on a machine.
    Threads don't live outside of that limit. They just have a speedier
    start-up time and are able to communicate with one another in a single
    process. It doesn't sound like that will buy you a whole lot in your
    application.

    You can spawn as many processes as you need.

    >
    > Then I thought that, instead of the database server exposing each object
    > remotely, I could create one 'proxy' object on the server through which all
    > clients would communicate, and it in turn would communicate with each
    > instance locally.
    >
    > That felt more manageable, but then I thought - why bother with remote
    > objects at all? Why not just run a SocketServer on the database server, and
    > design a mini-protocol to allow clients to make requests and receive
    > results. This is a technology I am already comfortable with, as this is how
    > I handle client workstation logins. If I did go this route, I could apply
    > the same principle to all the services.


    Because unless you wrote your own database or are using some arcane
    relic, it should already have its own configurable socket interface?

    >
    > I don't have the experience to make an informed decision at this point, so I
    > thought I would see if there is any consensus on the best way to go from
    > here.


    Finish building the application.

    Do the benchmarks. Profile. Optimize.

    Find the clear boundaries of each component.

    Build an API along those boundaries.

    Add a network layer in front of the boundaries. Pyro is a good choice,
    twisted is also good. Roll your own if you think you can do better or
    it would fit your project's needs.
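
    As a rough sketch of the last two steps, assuming Python 3 and the
    standard library's XML-RPC modules as the network layer (Pyro or Twisted
    could sit in the same place): the component gets a small API that takes
    and returns plain data, and the calling code does not care whether it is
    handed the local object or a remote proxy. `CustomerService`,
    `get_customer` and `report` are hypothetical names for the example.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


class CustomerService:
    """Hypothetical component API: plain data in, plain data out."""

    def __init__(self):
        self._rows = {1: {"name": "Acme", "balance": 120}}

    def get_customer(self, customer_id):
        return self._rows[customer_id]


def report(service, customer_id):
    """Caller code: works with the local object or with a remote proxy."""
    row = service.get_customer(customer_id)
    return "%s owes %d" % (row["name"], row["balance"])


def demo():
    local = CustomerService()
    # Network layer added in front of the same API; port 0 picks a free port.
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_instance(local)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        host, port = server.server_address
        remote = ServerProxy("http://%s:%d" % (host, port))
        return report(local, 1), report(remote, 1)
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```

    Because the boundary only passes simple values, the network layer can be
    added (or removed again) without touching `report`.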

    > Is there any particular benefit in using remote objects as opposed to
    > writing a SocketServer?


    Abstraction. Pyro is just an abstraction over an RPC mechanism.
    Nothing special about it. Twisted has libraries to do the same thing.
    Writing your own socket-level code can be messy if you don't do it
    right.

    >
    > Any advice will be much appreciated.
    >
    > Thanks
    >
    > Frank Millman


    Best of luck.
     
    J Kenneth King, Dec 9, 2009
    #2

  3. > I am writing a multi-user business/accounting application. It is getting
    > rather complex and I am looking at how to, not exactly simplify it, but
    > find a way to manage the complexity.
    >
    > I have realised that it is logically made up of a number of services -
    > database service with connection to database
    > workflow engine for business processes
    > services manager to handle automated services, such as web services
    > client manager to service logins from client workstations
    > possibly others
    >
    > I have made a start by splitting some of these off into separate modules
    > and running them in their own threads.


    I wouldn't do that. Creating threads (or processes) with potentially
    interacting components ramps up complexity a great deal, with little if any
    benefit at your current stage, and only a vague possibility that scaling
    issues appear and can be remedied by that.

    Instead, use threading or multi-processing to create various instances of
    your application that synchronize only over the DB, using locks where it is
    needed.
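
    A minimal sketch of that idea, assuming Python 3 and using SQLite purely
    as a stand-in database: several identical workers share no state in
    memory and coordinate only through transactions on the database. The
    `account` table and the function names are invented for the example; a
    real application would use its own schema and its own database's locking.

```python
import os
import sqlite3
import tempfile
import threading


def make_db(path):
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
    conn.execute("INSERT INTO account (id, balance) VALUES (1, 0)")
    conn.commit()
    conn.close()


def worker(path, deposits):
    """One application instance: its own connection, no shared Python state."""
    # isolation_level=None means we issue BEGIN/COMMIT ourselves.
    conn = sqlite3.connect(path, timeout=30, isolation_level=None)
    try:
        for _ in range(deposits):
            conn.execute("BEGIN IMMEDIATE")  # take the write lock up front
            (balance,) = conn.execute(
                "SELECT balance FROM account WHERE id = 1"
            ).fetchone()
            conn.execute(
                "UPDATE account SET balance = ? WHERE id = 1", (balance + 1,)
            )
            conn.execute("COMMIT")
    finally:
        conn.close()


def run_instances(instances=4, deposits=25):
    """Run several workers against one database and return the final balance."""
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "app.db")
        make_db(path)
        threads = [
            threading.Thread(target=worker, args=(path, deposits))
            for _ in range(instances)
        ]
        for thread in threads:
            thread.start()
        for thread in threads:
            thread.join()
        conn = sqlite3.connect(path)
        (balance,) = conn.execute(
            "SELECT balance FROM account WHERE id = 1"
        ).fetchone()
        conn.close()
        return balance


if __name__ == "__main__":
    print(run_instances())
```

    The read-modify-write inside each transaction is deliberately naive; it
    only stays correct because the database serialises the writers.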

    Introducing RPC of whatever kind to your design will make you lose a lot of
    the power and flexibility code-wise, because all of a sudden you can only
    pass data, not behavior around.
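
    A small illustration of the "data, not behavior" point, assuming Python 3
    and two common wire formats from the standard library. A plain row
    survives serialization; a function object does not. The `serialisable`
    helper and the sample values are made up for the example.

```python
import json
import pickle


def serialisable(value, dumps):
    """Return True if `dumps` can turn `value` into something sendable."""
    try:
        dumps(value)
        return True
    except Exception:
        return False


row = {"customer": "Acme", "balance": 120}   # plain data
discount = lambda amount: amount * 0.9       # behavior

results = {
    "json row": serialisable(row, json.dumps),
    "json function": serialisable(discount, json.dumps),
    "pickle row": serialisable(row, pickle.dumps),
    "pickle lambda": serialisable(discount, pickle.dumps),
}

if __name__ == "__main__":
    for name, ok in results.items():
        print(name, ok)
```

    Pickle can send a reference to a named, importable function, but the
    receiving side still needs the same code installed, so behavior is
    looked up there rather than transported.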

    And as J Kenneth already said, deal with performance issues when they
    crop up.

    At work, we had a design with a whole bunch of separated XMLRPC-connected
    services, all of them restricting their access to only certain sub-schemas
    of the DB. This was done so that we could run them on separate servers if
    we wanted, with several databases. The creators of that system had the same
    fears as you.

    Guess what? Most of the time, the system was busy serializing and
    de-serializing XML for RPC calls. We had no referential integrity
    between schemas, and no single transactions around HTTP requests, which
    didn't exactly make crap out of our data, but the occasional hiccup was in
    there. And because of the limited RPC interface, a great deal of code
    consisted of passing around dicts, lists and strings - with no rich
    OO interface of any kind.

    Once we got rid of these premature optimizations, the system improved in
    performance, and the code-base was open to a *lot* of cleaning up that is
    still under way, but already massively improved the design.

    Diez
     
    Diez B. Roggisch, Dec 9, 2009
    #3
  4. On 9-12-2009 13:56, Frank Millman wrote:
    > My first thought was to look into Pyro. It seems quite nice. One concern I
    > had was that it creates a separate thread for each object made available by
    > the server.


    It doesn't. Pyro creates a thread for every active proxy connection.
    You can register thousands of objects on the server; as long as your
    client programs only access a fraction of those at the same time, you
    will only have as many threads as there are connected proxies in your
    client programs.

    This behavior can be tuned a little as well:
    - you can tell Pyro to not use threading at all
    (that will hurt concurrency a lot though)
    - you can limit the number of proxies that can be connected
    to the daemon at a time.


    > Then I thought that, instead of the database server exposing each object
    > remotely, I could create one 'proxy' object on the server through which all
    > clients would communicate, and it in turn would communicate with each
    > instance locally.


    I think that this is the better design in general: access large amounts
    of remote objects not individually, but as a batch. Lots of small remote
    calls are slow. A few larger calls are more efficient.
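
    To make the difference concrete, here is a toy sketch (Python 3, no real
    networking) that only counts round trips. `CountingTransport` is a
    hypothetical stand-in for whatever remote layer is used; it is not Pyro
    code.

```python
class CountingTransport:
    """Stand-in for a network link; counts round trips instead of sending."""

    def __init__(self, rows):
        self.rows = rows
        self.round_trips = 0

    def get_row(self, row_id):
        # One remote call per row.
        self.round_trips += 1
        return self.rows[row_id]

    def get_rows(self, row_ids):
        # One remote call for the whole batch.
        self.round_trips += 1
        return [self.rows[i] for i in row_ids]


def compare(n=500):
    """Fetch the same n rows both ways and return (chatty, batched) trips."""
    rows = {i: {"id": i, "amount": i * 10} for i in range(n)}

    chatty = CountingTransport(rows)
    one_by_one = [chatty.get_row(i) for i in range(n)]

    batched = CountingTransport(rows)
    in_one_call = batched.get_rows(list(range(n)))

    assert one_by_one == in_one_call  # same data either way
    return chatty.round_trips, batched.round_trips


if __name__ == "__main__":
    print(compare())
```

    Each round trip pays network latency once, so the per-row version pays
    it n times for the same data.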

    > Is there any particular benefit in using remote objects as opposed to
    > writing a SocketServer?


    It saves you reinventing the wheel and dealing with all its problems
    again, problems that have been solved already in existing remote object
    libraries such as Pyro. Think about it: do you want to spend time
    implementing a stable, well defined communication protocol, or do you
    want to spend time building your actual application logic?

    Regards,
    Irmen.
     
    Irmen de Jong, Dec 9, 2009
    #4
