Adding behaviour for managing "task" dependencies

m.pricejones

Hi,

I'm currently writing an animation pipeline in Python which is a
system for controlling the flow of work and assets for a team of
people working on a computer animated film. The system will be fairly
large with a database backend.

One particular problem that I'd like to address is the need for
managing dependent tasks. Often you need to process one particular set
of information before starting on another so when you batch them all
together and send them off to a set of computers to process you need
some way of formulating the dependencies. This preferably ends up
looking a lot like a Makefile in the end with syntax a bit like:

    task task_name (dependencies):
        commands to complete the task
        ...

Where the dependencies are a list of other tasks that need to be
completed first.

However I'd really like to do it in Python, but I'm thinking I might
need to extend Python a bit in order to achieve the new syntax. I've
seen attempts to do this within the Python syntax (SCons and buildIt)
and I'm not a big fan of the way it ends up looking. I've worked at a
place that has written its own language to handle that sort of thing
but then you end up with a language that is good for that but rubbish
at everything else. Python seems like a good basis if I could tweak it
slightly.

One particular point that interests me is the idea of maintaining
compatibility with Python modules so you still have all the
functionality. This makes me think of the "from __future__ import ..."
statements which, if I understand them correctly, can introduce new
syntax like the with_statement, whilst still maintaining compatibility
with older modules?

Is this correct? Can anyone write a syntax changing module or is
__future__ a hard coded special case? I realise I'll have to get into
the C side of things for this. Are there other approaches to this that
I really should be considering instead?

Any thoughts would be most appreciated, though I would like to stress
that I don't think Python should support the syntax I'm proposing; I'd
just like to know if I can extend a copy of it to do that.

Mike
 
Stargaming

Hi,

I'm currently writing an animation pipeline in Python which is a system
for controlling the flow of work and assets for a team of people working
on a computer animated film. The system will be fairly large with a
database backend.

One particular problem that I'd like to address is the need for managing
dependent tasks. Often you need to process one particular set of
information before starting on another so when you batch them all
together and send them off to a set of computers to process you need
some way of formulating the dependencies. This preferably ends up
looking a lot like a Makefile in the end with syntax a bit like:

    task task_name (dependencies):
        commands to complete the task
        ...

Where the dependencies are a list of other tasks that need to be
completed first.

However I'd really like to do it in Python, but I'm thinking I might
need to extend Python a bit in order to achieve the new syntax. I've
seen attempts to do this within the Python syntax (SCons and buildIt)
and I'm not a big fan of the way it ends up looking. I've worked at a
place that has written its own language to handle that sort of thing
but then you end up with a language that is good for that but rubbish at
everything else.

Doesn't seem too bad if you just want to write down dependencies. Better
to have one tool/language doing its job perfectly than a big monster
tool doing lots of jobs pretty badly.

This approach is sometimes described in terms of domain-specific
languages, or `language oriented programming
<http://www.onboard.jetbrains.com/is1/articles/04/10/lop/>`_.

Python seems like a good basis if I could tweak it slightly.

One particular point that interests me is the idea of maintaining
compatibility with Python modules so you still have all the
functionality.

If you disagreed with my statements above, that's a good point. Would
make deserialization of those nearly-Makefiles easier, too.

This makes me think of the "from __future__ import ..."
statements which, if I understand them correctly, can introduce new
syntax like the with_statement, whilst still maintaining compatibility
with older modules?

I might be wrong here but AFAIK those `__future__` imports just set some
'compiler' flags. The documentation isn't saying much about the treatment
of those statements but I guess they're handled internally (since
__future__.py doesn't seem to do anything at all). See the `module
documentation <http://docs.python.org/lib/module-future.html>`_ for
details.

Is this correct? Can anyone write a syntax changing module or is
__future__ a hard coded special case? I realise I'll have to get into
the C side of things for this. Are there other approaches to this that I
really should be considering instead?

Syntax (as in: grammar) is compiled into the Python binary. You can
change the Python source (as it's free) and recompile but, FWIW, I would
not suggest this. You're _forking_ CPython then and well, this would be
an additional project to maintain (merging all updates, fixes, releases
into your fork etc.).

`PyPy <http://codespeak.net/pypy/>`_ could be slightly easier to adjust
but still: extra project.

Another approach could be `Logix <http://livelogix.net/logix/>`_. It's an
"alternate front-end for Python" with the ability to change its syntax
on-the-fly. I don't know how actively maintained it is, though.

Any thoughts would be most appreciated, though I would like to stress
that I don't think Python should support the syntax I'm proposing; I'd
just like to know if I can extend a copy of it to do that.

As stated above, there are several ways to change Python to support the
syntax you want but hey, why do you need this? As far as I understood
you, you want to use *normal* Python functions and add dependencies to
them. First thing popping into my mind is a simple function, e.g.
``depends(source, target)`` or ``depends(target)(source)``. The second
example would have the advantage of adapting neatly into the decorator
syntax::

    def early_task():
        foo()

    @depends(early_task)
    def late_task():
        bar()

The wrapping `depends` function could decide whether `early_task` is done
already or not (and invoke it, if necessary) and finally pass through
the call to `late_task`.

This is just an example of how you could use existing syntax to model
dependencies, there are other ways I guess.
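
A minimal sketch of such a `depends` decorator follows. It is one possible reading of the idea above, not a definitive implementation: the `done` and `result` attributes, the `log` list and the convention that every task (including one without prerequisites) is decorated are all assumptions made for illustration, and prerequisites are called without arguments.

```python
import functools

def depends(*prerequisites):
    """Run every prerequisite first, and run each task body at most once."""
    def decorator(task):
        @functools.wraps(task)
        def wrapper(*args, **kwargs):
            # Prerequisites are themselves wrapped tasks, so calling them
            # is cheap once they have already completed.
            for prerequisite in prerequisites:
                prerequisite()
            if not wrapper.done:
                wrapper.result = task(*args, **kwargs)
                wrapper.done = True
            return wrapper.result
        wrapper.done = False
        wrapper.result = None
        return wrapper
    return decorator

log = []  # records the order in which task bodies actually run

@depends()
def early_task():
    log.append("early")

@depends(early_task)
def late_task():
    log.append("late")

@depends(early_task, late_task)
def final_task():
    log.append("final")

final_task()
print(log)
```

Calling `final_task()` a second time leaves `log` unchanged, because every wrapper remembers that its task has completed. Distributing tasks across machines would need the completion state stored somewhere shared, which this sketch does not attempt.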

HTH,
Stargaming
 
m.pricejones

Stargaming:

Thanks, that's given me plenty to think about. Some wise words. I
guess I should appreciate that with my particular goal there aren't
going to be easy solutions but I definitely don't want to dive down
the wrong track if it can be avoided.

Cheers,
Mike
 
Paddy

You could use a professional job scheduling system such as LSF or Sun's
Grid Engine, both of which support job dependencies (and a whole lot more).

For example, search for 'Job Dependencies' on this page:
http://docs.sun.com/app/docs/doc/820-0699/6nce0ht7s?a=view

- Paddy.
 
Bruno Desthuilliers

David wrote:
You can use syntax like this:

    class MyJob1(Job):
        depends(MyJob2)
        depends(MyJob3)
(snip)
(where 'depends' is a DSL-like construct. See Elixir
(elixir.ematia.de) for an example of how to implement DSL statements
like "depends". Check their implementation of the "belongs_to"
statement.)

+1 on this. I've played a bit with Elixir's 'statements' API, and while
it relies on an (in)famous hack, it does the job.

(snip)
 
David

Any thoughts would be most appreciated, though I would like to stress
that I don't think Python should support the syntax I'm proposing; I'd
just like to know if I can extend a copy of it to do that.

You can use syntax like this:

    class MyJob1(Job):
        depends(MyJob2)
        depends(MyJob3)

Or with quotes (if MyJob2 and MyJob3 could be declared later):

    class MyJob1(Job):
        depends('MyJob2')
        depends('MyJob3')

(where 'depends' is a DSL-like construct. See Elixir
(elixir.ematia.de) for an example of how to implement DSL statements
like "depends". Check their implementation of the "belongs_to"
statement.)

You could also extend your "depends" DSL statement to allow more than
one dep at a time, e.g.:

    class MyJob1(Job):
        depends('MyJob2', 'MyJob3')
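
A rough sketch of how a class-body statement like this can be made to work is shown below. It uses frame inspection via `sys._getframe`, which is a CPython implementation detail; this illustrates the general trick and is not taken from Elixir's source. The `dependencies` attribute name and the extra job names are assumptions for the example.

```python
import sys

def depends(*job_names):
    """Record dependencies in the namespace of the class being defined."""
    # While a class body executes, the caller's frame locals are the
    # namespace that will become the class attributes (CPython-specific).
    class_namespace = sys._getframe(1).f_locals
    class_namespace.setdefault("dependencies", []).extend(job_names)

class Job:
    dependencies = ()  # default for jobs that declare nothing

class MyJob2(Job):
    pass

class MyJob1(Job):
    depends("MyJob2")
    depends("MyJob3", "MyJob4")

print(MyJob1.dependencies)
print(MyJob2.dependencies)
```

Using names rather than class objects keeps forward references possible, at the cost of having to resolve the strings to classes later, for instance through a registry kept by `Job`.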

Whatever approach you use, you should also look into implementing
"topological sort" logic. This lets you resolve programmatically the
order in which inter-dependent tasks should be handled to satisfy their
dependencies.

You may find this module interesting in this regard:

http://pypi.python.org/pypi/topsort
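
For a quick feel of what a topological sort gives you, here is a small sketch. It does not use the package linked above; it relies instead on the standard-library `graphlib` module, which assumes Python 3.9 or later. The task names and the `execution_order` helper are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical pipeline: each task maps to the set of tasks it depends on.
dependencies = {
    "model": set(),
    "animate": {"model"},
    "light": {"model"},
    "render": {"animate", "light"},
}

def execution_order(graph):
    """Return the tasks in an order that respects every dependency."""
    return list(TopologicalSorter(graph).static_order())

order = execution_order(dependencies)
print(order)

# A circular dependency cannot be scheduled and is reported as an error.
try:
    execution_order({"a": {"b"}, "b": {"a"}})
except CycleError as error:
    print("cycle detected:", error.args[1])
```

Here "model" always comes first and "render" last, while the relative order of "animate" and "light" is not fixed because neither depends on the other. Those are exactly the tasks that could be sent to different machines at the same time.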

David.
 
