Question about 'remote objects'

Frank Millman

Hi all

I am writing a multi-user business/accounting application. It is getting
rather complex and I am looking at how to, not exactly simplify it, but find
a way to manage the complexity.

I have realised that it is logically made up of a number of services -
- database service with connection to database
- workflow engine for business processes
- services manager to handle automated services, such as web services
- client manager to service logins from client workstations
- possibly others

I have made a start by splitting some of these off into separate modules and
running them in their own threads.

I am concerned about scalability if they are all running on the same
machine, so I am looking into how to enable these services to run on
separate servers if required.

My first thought was to look into Pyro. It seems quite nice. One concern I
had was that it creates a separate thread for each object made available by
the server. My database server creates separate objects for each instance of
a row read in from the database, and with multiple users running multiple
applications, with each one opening multiple tables, this could run into
hundreds, so I was not sure if that would work.

Then I read that the multiprocessing module allows processes to be spread
across multiple servers. The documentation is not as clear as Pyro's, but it
looks as if it could do what I want. I assume it would use processes rather
than threads to make multiple objects available, but I don't know if there
is a practical limit.

Then I thought that, instead of the database server exposing each object
remotely, I could create one 'proxy' object on the server through which all
clients would communicate, and it in turn would communicate with each
instance locally.

That felt more manageable, but then I thought - why bother with remote
objects at all? Why not just run a SocketServer on the database server, and
design a mini-protocol to allow clients to make requests and receive
results. This is a technology I am already comfortable with, as this is how
I handle client workstation logins. If I did go this route, I could apply
the same principle to all the services.

I don't have the experience to make an informed decision at this point, so I
thought I would see if there is any consensus on the best way to go from
here.

Is there any particular benefit in using remote objects as opposed to
writing a SocketServer?

Any advice will be much appreciated.

Thanks

Frank Millman
 
J Kenneth King

Frank Millman said:
Hi all

I am writing a multi-user business/accounting application. It is getting
rather complex and I am looking at how to, not exactly simplify it, but find
a way to manage the complexity.

I have realised that it is logically made up of a number of services -
- database service with connection to database
- workflow engine for business processes
- services manager to handle automated services, such as web services
- client manager to service logins from client workstations
- possibly others

I have made a start by splitting some of these off into separate modules and
running them in their own threads.

I am concerned about scalability if they are all running on the same
machine, so I am looking into how to enable these services to run on
separate servers if required.

Have you finished the application already?

At my job we're still serving over 1M web requests (a month),
processing 15k+ uploads, and searching through 5M+ database records
a day. We're still running on 3 boxes. You can get a lot out of your
machines before you have to think about the complex task of
scaling/distributing.

My first thought was to look into Pyro. It seems quite nice. One concern I
had was that it creates a separate thread for each object made available by
the server. My database server creates separate objects for each instance of
a row read in from the database, and with multiple users running multiple
applications, with each one opening multiple tables, this could run into
hundreds, so I was not sure if that would work.

It probably will work.

Pyro is a very nice framework and one that I've built a few applications
on. It has a lot of flexible design patterns available. Just look in
the examples included with the distribution.

Then I read that the multiprocessing module allows processes to be spread
across multiple servers. The documentation is not as clear as Pyro's, but it
looks as if it could do what I want. I assume it would use processes rather
than threads to make multiple objects available, but I don't know if there
is a practical limit.

There is a theoretical limit to all of the resources on a machine.
Threads don't live outside of that limit. They just have a speedier
start-up time and are able to communicate with one another in a single
process. It doesn't sound like that will buy you a whole lot in your
application.

You can spawn as many processes as you need.

Then I thought that, instead of the database server exposing each object
remotely, I could create one 'proxy' object on the server through which all
clients would communicate, and it in turn would communicate with each
instance locally.

That felt more manageable, but then I thought - why bother with remote
objects at all? Why not just run a SocketServer on the database server, and
design a mini-protocol to allow clients to make requests and receive
results. This is a technology I am already comfortable with, as this is how
I handle client workstation logins. If I did go this route, I could apply
the same principle to all the services.

Because unless you wrote your own database or are using some arcane
relic, it should already have its own configurable socket interface.

I don't have the experience to make an informed decision at this point, so I
thought I would see if there is any consensus on the best way to go from
here.

Finish building the application.

Do the benchmarks. Profile. Optimize.

Find the clear boundaries of each component.

Build an API along those boundaries.

Add a network layer in front of the boundaries. Pyro is a good choice,
Twisted is also good. Roll your own if you think you can do better or
it would fit your project's needs.

Is there any particular benefit in using remote objects as opposed to
writing a SocketServer?

Abstraction. Pyro is just an abstraction over an RPC mechanism.
Nothing special about it. Twisted has libraries to do the same thing.
Writing your own socket-level code can be messy if you don't do it
right.

Any advice will be much appreciated.

Thanks

Frank Millman

Best of luck.
 
Diez B. Roggisch

I am writing a multi-user business/accounting application. It is getting
rather complex and I am looking at how to, not exactly simplify it, but
find a way to manage the complexity.

I have realised that it is logically made up of a number of services -
- database service with connection to database
- workflow engine for business processes
- services manager to handle automated services, such as web services
- client manager to service logins from client workstations
- possibly others

I have made a start by splitting some of these off into separate modules
and running them in their own threads.

I wouldn't do that. Creating threads (or processes) with potentially
interacting components ramps up complexity a great deal, with little if any
benefit at your current stage, and only a vague possibility that scaling
issues appear and can be remedied by that.

Instead, use threading or multi-processing to create various instances of
your application that synchronize only over the DB, using locks where it is
needed.
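
A rough sketch of what "synchronize only over the DB" could look like, assuming SQLite stands in for the real database and threads stand in for application instances. The `account` table and the `post_payment`, `worker` and `run_demo` names are made up for illustration. Each instance opens its own connection and shares no in-memory state with the others; the only coordination is the write lock taken by the transaction.

```python
import os
import sqlite3
import tempfile
import threading

# Throwaway database file for the demo (assumption: any shared DB would do).
DB_PATH = os.path.join(tempfile.mkdtemp(), "ledger.db")


def connect():
    # isolation_level=None lets us issue BEGIN/COMMIT ourselves;
    # timeout makes a writer wait for the lock instead of failing at once.
    return sqlite3.connect(DB_PATH, timeout=30, isolation_level=None)


def setup():
    conn = connect()
    conn.execute(
        "CREATE TABLE IF NOT EXISTS account (id INTEGER PRIMARY KEY, balance INTEGER)"
    )
    conn.execute("INSERT OR REPLACE INTO account VALUES (1, 0)")
    conn.close()


def post_payment(conn, amount):
    # BEGIN IMMEDIATE takes the write lock up front, so the
    # read-modify-write below cannot interleave with another instance.
    conn.execute("BEGIN IMMEDIATE")
    try:
        (balance,) = conn.execute(
            "SELECT balance FROM account WHERE id = 1"
        ).fetchone()
        conn.execute(
            "UPDATE account SET balance = ? WHERE id = 1", (balance + amount,)
        )
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise


def worker(n_payments):
    conn = connect()  # one connection per instance, nothing shared in memory
    for _ in range(n_payments):
        post_payment(conn, 1)
    conn.close()


def run_demo(n_workers=4, n_payments=25):
    setup()
    threads = [
        threading.Thread(target=worker, args=(n_payments,))
        for _ in range(n_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    conn = connect()
    (balance,) = conn.execute("SELECT balance FROM account WHERE id = 1").fetchone()
    conn.close()
    return balance
```

Calling `run_demo()` should return the number of workers times the number of payments, since no update is lost. With a server database the same pattern would typically use row locks such as `SELECT ... FOR UPDATE` instead of a whole-database write lock.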

Introducing RPC of whatever kind to your design will make you lose a lot of
the power and flexibility code-wise, because all of a sudden you can only
pass data, not behavior around.
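
To make the "data, not behavior" point concrete, here is a small sketch using Python 3's standard `xmlrpc.client` module. The `Invoice` class and the `round_trip` helper are hypothetical. The object is marshalled and unmarshalled locally, the way it would be on either side of an XML-RPC call, without any network involved.

```python
import xmlrpc.client


class Invoice:
    """Hypothetical business object with both data and behavior."""

    def __init__(self, number, amount_cents):
        self.number = number
        self.amount_cents = amount_cents

    def total_with_tax(self, rate_percent):
        return self.amount_cents * (100 + rate_percent) // 100


def round_trip(obj):
    # Marshal the object as an XML-RPC parameter, then unmarshal it,
    # as would happen between the caller and the service.
    wire = xmlrpc.client.dumps((obj,))
    params, _method = xmlrpc.client.loads(wire)
    return params[0]


invoice = Invoice("INV-1", 10000)
received = round_trip(invoice)

print(type(received).__name__)              # the receiver gets a plain dict
print(hasattr(received, "total_with_tax"))  # the method did not travel
```

The instance attributes survive as a dictionary, but `total_with_tax` is gone, so the receiving side has to reimplement or look up the behavior itself.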

And as J Kenneth already said, deal with performance issues when they crop
up.

At work, we had a design with a whole bunch of separated XMLRPC-connected
services, all of them restricting their access to only certain sub-schemas
of the DB. This was done so that we could run them on separate servers if
we wanted, with several databases. The creators of that system had the same
fears as you.

Guess what? Most of the time the system spent in serializing and
de-serializing XML for making RPC-calls. We had no referential integrity
between schemas, and no single transactions around HTTP-requests, which
didn't exactly make crap out of our data, but the occasional hiccup was in
there. And through the limited RPC-interface, a great deal of code
consisted of passing around dicts, lists and strings - with no rich
OO-interface of whatever kind.

Once we got rid of these premature optimizations, the system improved in
performance, and the code-base was open to a *lot* of cleaning up that is
still under way, but has already massively improved the design.

Diez
 
Irmen de Jong

My first thought was to look into Pyro. It seems quite nice. One concern I
had was that it creates a separate thread for each object made available by
the server.

It doesn't. Pyro creates a thread for every active proxy connection.
You can register thousands of objects on the server. As long as your
client programs only access a fraction of those at the same time, you
will only have as many threads as there are connected proxies in your
client programs.

This behavior can be tuned a little as well:
- you can tell Pyro to not use threading at all
  (that will hurt concurrency a lot though)
- you can limit the number of proxies that can be connected
  to the daemon at a time.

Then I thought that, instead of the database server exposing each object
remotely, I could create one 'proxy' object on the server through which all
clients would communicate, and it in turn would communicate with each
instance locally.

I think that this is the better design in general: access large amounts
of remote objects not individually, but as a batch. Lots of small remote
calls are slow. A few larger calls are more efficient.
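
A minimal sketch of the batching idea in plain Python, with no remoting library involved. `RowService`, `Row` and the `calls` counter are invented for illustration; the counter stands in for network round trips. In a real deployment only the service object would be exposed remotely, and the row objects would stay local to the server.

```python
class Row:
    """Stand-in for one database row object living on the server."""

    def __init__(self, row_id, balance):
        self.row_id = row_id
        self.balance = balance


class RowService:
    """Single server-side object that clients talk to (hypothetical name)."""

    def __init__(self, rows):
        self._rows = {row.row_id: row for row in rows}
        self.calls = 0  # counts simulated remote round trips

    def get_balance(self, row_id):
        # Fine-grained call: one round trip per row.
        self.calls += 1
        return self._rows[row_id].balance

    def get_balances(self, row_ids):
        # Batched call: one round trip for any number of rows,
        # returning plain data that is cheap to serialize.
        self.calls += 1
        return {row_id: self._rows[row_id].balance for row_id in row_ids}


service = RowService(Row(i, i * 10) for i in range(100))
wanted = list(range(100))

one_by_one = {i: service.get_balance(i) for i in wanted}
calls_fine = service.calls

service.calls = 0
batched = service.get_balances(wanted)
calls_batched = service.calls

print(calls_fine, calls_batched)
```

Both approaches return the same data, but the batched version needs a single call where the fine-grained one needs a call per row, and each of those calls would pay the network latency.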
Is there any particular benefit in using remote objects as opposed to
writing a SocketServer?

It saves you reinventing the wheel and dealing with all its problems
again, problems that have been solved already in existing remote object
libraries such as Pyro. Think about it: do you want to spend time
implementing a stable, well defined communication protocol, or do you
want to spend time building your actual application logic?
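
To give a feel for what even the smallest hand-rolled protocol involves, here is a rough sketch on top of the standard `socketserver` module, assuming length-prefixed JSON messages. The framing format, the `HANDLERS` table and all function names are assumptions made for this example, not anything Pyro or the original application uses.

```python
import json
import socket
import socketserver
import struct
import threading

HEADER = struct.Struct("!I")  # 4-byte big-endian length prefix


def recv_exact(sock, n):
    # recv() may return fewer bytes than asked for, so loop until done.
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf.extend(chunk)
    return bytes(buf)


def send_msg(sock, obj):
    payload = json.dumps(obj).encode("utf-8")
    sock.sendall(HEADER.pack(len(payload)) + payload)


def recv_msg(sock):
    (length,) = HEADER.unpack(recv_exact(sock, HEADER.size))
    return json.loads(recv_exact(sock, length).decode("utf-8"))


# Hypothetical request table; a real service would dispatch to its own code.
HANDLERS = {"add": lambda a, b: a + b}


class RequestHandler(socketserver.BaseRequestHandler):
    def handle(self):
        while True:
            try:
                request = recv_msg(self.request)
            except ConnectionError:
                return  # client went away
            try:
                result = HANDLERS[request["op"]](*request["args"])
                reply = {"ok": True, "result": result}
            except Exception as exc:  # report errors instead of dying
                reply = {"ok": False, "error": repr(exc)}
            send_msg(self.request, reply)


def call(address, op, *args):
    with socket.create_connection(address) as sock:
        send_msg(sock, {"op": op, "args": list(args)})
        return recv_msg(sock)


def demo():
    # Port 0 asks the OS for any free port.
    with socketserver.ThreadingTCPServer(("127.0.0.1", 0), RequestHandler) as server:
        server.daemon_threads = True
        threading.Thread(target=server.serve_forever, daemon=True).start()
        try:
            good = call(server.server_address, "add", 2, 3)
            bad = call(server.server_address, "nope")
            return good, bad
        finally:
            server.shutdown()
```

Even this toy version has to deal with message framing, partial reads and error reporting, and it still leaves out authentication, timeouts, message size limits and protocol versioning. Those are the kinds of problems an existing remote object library has already worked through.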

Regards,
Irmen.
 
