simple pub/sub

Steve Howell

Hi, I'm looking for ideas on building a simple architecture that
allows a bunch of independent Python processes to exchange data using
files and perform calculations.

One Python program would be collecting data from boat instruments on a
serial port, then writing that info out to a file, and also
transmitting data to instruments as needed (again through the serial
port). It would be the most complex program in terms of having to
poll file descriptors, or something similar, but I want to limit its
job to pure communication handling.

Then, I would have a bunch of programs doing various calculations on
the input stream. It's totally acceptable if they just sleep a bit,
wake up, see if there is any new data, and if so, do some
calculations, so I will probably just use this:

http://www.dabeaz.com/generators/follow.py
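
The sleep, wake up, and check loop can be sketched as a generator in the spirit of the script linked above. This is a minimal sketch, not the linked script itself: `follow_lines`, `poll_interval`, and `from_start` are names made up for illustration. It assumes the writer appends newline-terminated lines, and it holds back a partial line until the rest arrives.

```python
import os
import time


def follow_lines(path, poll_interval=0.1, from_start=False):
    """Yield complete lines appended to `path`, sleeping while nothing is new."""
    with open(path, "r") as stream:
        if not from_start:
            stream.seek(0, os.SEEK_END)  # skip whatever is already in the file
        pending = ""
        while True:
            chunk = stream.readline()
            if not chunk:
                time.sleep(poll_interval)  # no new data yet, try again later
                continue
            pending += chunk
            if pending.endswith("\n"):  # only hand out finished lines
                yield pending
                pending = ""
```

A calculator process would loop with `for line in follow_lines("readings.log"):` (the file name is hypothetical) and do its calculation on each line.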

Some of the "calculator" programs might be looking for trends in the
data, and when a certain threshold is reached, they will need to
notify the comm program of the new message. To avoid collisions
between all the different calculator programs, I am thinking the
simplest thing is just have them each write to their own output file.

So then the comm program has multiple data sources. It wants to talk
to the serial port, but it also wants to monitor the output files from
the various calculator processes. I want to do this in a mostly
nonblocking way, so I would probably poll on the file descriptors.

The only problems are that select "...cannot be used on regular files
to determine whether a file has grown since it was last read," and it
only works on Unix for files. (I'm using Unix myself, but one of the
other programmers insists on Windows, and is resisting Python as
well.)

So a variation would be that the calculator programs talk to the comm
program via sockets, which seems slightly on the heavy side, but it is
certainly feasible.

I know that something like twisted could probably solve my problem,
but I'm also wondering if there is some solution that is a little more
lightweight and batteries-included.

At the heart of the matter, I just want the programs to be able to do
the following tasks.

1) All programs will occasionally need to write a line of text to
some data stream without having to lock it. (I don't care if the
write blocks.)
2) All programs will need to be able to do a non-blocking read of a
line of text from a data stream. (And there may be multiple
consumers.)
3) The comm program will need to be able to poll the serial port and
input data streams to see which ones are ready.

Any thoughts?
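
For the socket variation, tasks 2 and 3 can be sketched with the standard-library `selectors` module, which also works on Windows as long as every registered object is a socket. This is a sketch under assumptions, not a finished design: `LineReader` and `poll_once` are hypothetical names, and `socket.socketpair()` stands in for a real connection between a calculator and the comm program. On Unix, a serial port's file descriptor could presumably be registered with the same selector.

```python
import selectors
import socket


class LineReader:
    """Buffers bytes from a non-blocking socket and returns complete lines."""

    def __init__(self, sock):
        sock.setblocking(False)
        self.sock = sock
        self.buffer = b""

    def read_lines(self):
        try:
            data = self.sock.recv(4096)
        except BlockingIOError:
            return []  # nothing to read right now
        if not data:
            return None  # the peer closed the connection
        self.buffer += data
        # Everything before the last newline is complete; keep the remainder.
        *lines, self.buffer = self.buffer.split(b"\n")
        return [line.decode() for line in lines]


def poll_once(selector, timeout=0.1):
    """Return (source_name, line) pairs from every stream that is ready."""
    results = []
    for key, _events in selector.select(timeout):
        name, reader = key.data
        lines = reader.read_lines()
        if lines is None:
            selector.unregister(reader.sock)
            reader.sock.close()
            continue
        results.extend((name, line) for line in lines)
    return results


if __name__ == "__main__":
    selector = selectors.DefaultSelector()
    calc_end, comm_end = socket.socketpair()  # stand-in for a real connection
    selector.register(comm_end, selectors.EVENT_READ, ("trend", LineReader(comm_end)))
    calc_end.sendall(b"depth alarm\n")
    print(poll_once(selector, timeout=1.0))
    calc_end.close()
    comm_end.close()
    selector.close()
```

Writers just call `sendall()` with a newline-terminated line, which may block but needs no locking because each calculator has its own connection.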
 
Adam Tauno Williams

This should be pretty easy using multiprocessing. In OpenGroupware
Coils we have a master process
<http://coils.hg.sourceforge.net/hgweb/coils/coils/file/2c7847ef0527/src/coils-master-service.py> that spins up children (workers that each provide a distinct service). It opens a Pipe to each child to send messages to the child, and a Queue from which it reads: all children (clients) write to the Queue and listen on their Pipe. Then the master just listens on its pipe and forwards messages from children to other children.

All children are Service objects
<http://coils.hg.sourceforge.net/hgweb/coils/coils/file/2c7847ef0527/src/coils/core/service.py>. Implementing a new service (a new type of worker) is then as easy as our brutally simple pubsub service: <http://coils.hg.sourceforge.net/hgw...c7847ef0527/src/coils/logic/pubsub/service.py>
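
The Pipe-per-child plus shared-Queue layout described above can be sketched with the standard `multiprocessing` module. This is not the Coils code: `calculator`, `forward`, `run_demo`, the `STOP` sentinel, the threshold, and the `(sender, target, payload)` message shape are all assumptions made for illustration.

```python
import multiprocessing as mp

STOP = "STOP"  # hypothetical shutdown sentinel


def calculator(name, inbox, outbox, threshold=10.0):
    """Worker: receive readings on its Pipe, report alerts on the shared Queue."""
    while True:
        reading = inbox.recv()  # blocks until the master sends something
        if reading == STOP:
            break
        if reading > threshold:
            outbox.put((name, "comm", f"reading {reading} exceeds {threshold}"))


def forward(message, pipes):
    """Master: route a (sender, target, payload) tuple to the target's Pipe."""
    sender, target, payload = message
    conn = pipes.get(target)
    if conn is None:
        return False  # unknown target, so the message is dropped
    conn.send((sender, payload))
    return True


def run_demo():
    """Start one calculator process, feed it two readings, collect its alert."""
    outbox = mp.Queue()
    master_end, worker_end = mp.Pipe()
    proc = mp.Process(target=calculator, args=("trend", worker_end, outbox))
    proc.start()
    for reading in (3.0, 12.5):
        master_end.send(reading)
    alert = outbox.get(timeout=5)
    master_end.send(STOP)
    proc.join()
    return alert
```

`run_demo()` should be called from under an `if __name__ == "__main__":` guard, which Windows requires because child processes re-import the main module there. In a fuller version the master would loop on `outbox.get()` and pass each message to `forward()` with a dict of per-child Pipe ends.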

Hope that helps.
 
Steve Howell

It does indeed! I'm gonna try this approach for now.
 
Aahz

SQLite?
 
Steve Howell


The data is just a stream of instrument readings, so putting it into a
relational database doesn't really buy me much. So far
"multiprocessing" seems to be the way to go.
 
Mike Driscoll

Did you look at the pubsub module at all? See http://pubsub.sourceforge.net/
for more information.

I use it a lot in my wxPython programs.
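
As far as I know, that package dispatches messages to listeners inside a single process, so it would complement an inter-process transport rather than replace one. The pattern itself can be sketched with the standard library alone. `Broker`, `subscribe`, and `publish` below are hypothetical names for illustration, not the pubsub package's API.

```python
from collections import defaultdict


class Broker:
    """Tiny in-process publish/subscribe hub: topic name -> list of callbacks."""

    def __init__(self):
        self._listeners = defaultdict(list)

    def subscribe(self, topic, listener):
        """Register `listener` to be called for every message on `topic`."""
        self._listeners[topic].append(listener)

    def publish(self, topic, **data):
        """Call every listener on `topic`; return how many were called."""
        listeners = list(self._listeners.get(topic, ()))
        for listener in listeners:
            listener(**data)
        return len(listeners)


if __name__ == "__main__":
    broker = Broker()
    broker.subscribe("depth", lambda metres: print(f"depth is {metres} m"))
    broker.publish("depth", metres=4.2)
```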

-------------------
Mike Driscoll

Blog: http://blog.pythonlibrary.org

PyCon 2010 Atlanta Feb 19-21 http://us.pycon.org/
 
bobicanprogram

You should definitely check out the SIMPL toolkit
(http://www.icanprogram.com/simpl). It has some pretty mature Python hooks.
SIMPL has been used to construct data acquisition applications not
unlike the one you describe.

bob
 
