Erlang style processes for Python

Discussion in 'Python' started by Kay Schluehr, May 10, 2007.

  1. Kay Schluehr

    Kay Schluehr Guest

    Every once in a while, Erlang-style [1] message-passing concurrency [2]
    is discussed for Python. This implies not only Stackless tasklets
    [3] but also process isolation semantics that let the runtime
    easily distribute tasklets (or logical 'processes') across physical
    processes. Syntactically, a tasklet might grow out of a generator by
    reusing the yield keyword for sending messages:

    yield_expr : 'yield' ([testlist] | testlist 'to' testlist)

    where the second form is specific to tasklets (one could also use a
    new keyword like "emit" if this becomes confusing, since the semantics
    are quite different), together with a new keyword for assigning the
    "mailbox", e.g.:

    required_stmt: 'required' ':' suite

    So tasklets could be identified on a lexical level (just like
    generators today) and compiled accordingly. I just wonder about
    sharing semantics. Would copy-on-read / copy-on-write and new opcodes
    be needed? What would happen if sharing isn't dropped at all, and a
    tasklet is instead pickled and separated only on demand, when the
    runtime moves it into another OS-level thread or process? I think it
    would be cleaner to separate it completely, but what are the costs?
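
    To make the "separate it completely" option concrete, here is a minimal
    sketch in present-day Python 3 that needs no new syntax. The Scheduler
    class and its yield protocol are assumptions made for this illustration:
    a bare `yield` stands in for a receive, and `yield (target, message)`
    stands in for the proposed `yield message to target`. Every message is
    copied through pickle on send, so sender and receiver never share an
    object.

```python
import pickle
from collections import deque


class Scheduler:
    """Hypothetical round-robin scheduler for generator-based tasklets.

    Protocol assumed for this sketch (not part of the proposal above):
      message = yield              # receive the next message
      yield (target, message)      # send message to the tasklet named target
    """

    def __init__(self):
        self.tasklets = {}     # name -> generator
        self.mailboxes = {}    # name -> deque of undelivered messages
        self.waiting = set()   # names blocked in a receive
        self.ready = deque()   # (name, value to pass into the generator)

    def spawn(self, name, func, *args):
        self.tasklets[name] = func(*args)
        self.mailboxes[name] = deque()
        self.ready.append((name, None))

    def run(self):
        while self.ready:
            name, value = self.ready.popleft()
            try:
                request = self.tasklets[name].send(value)
            except StopIteration:
                continue                      # tasklet finished
            if request is None:               # tasklet wants to receive
                if self.mailboxes[name]:
                    self.ready.append((name, self.mailboxes[name].popleft()))
                else:
                    self.waiting.add(name)
            else:                             # tasklet wants to send
                target, message = request
                # Full separation: the receiver gets a pickled copy,
                # never a reference to the sender's object.
                copy = pickle.loads(pickle.dumps(message))
                if target in self.waiting:
                    self.waiting.discard(target)
                    self.ready.append((target, copy))
                else:
                    self.mailboxes[target].append(copy)
                self.ready.append((name, None))


def collector(received, count):
    for _ in range(count):
        message = yield                       # receive
        received.append(message)


def producer(payload):
    yield ("collector", payload)              # send
    payload["items"].append("mutated after send")


received = []   # shared with the caller only so the demo can inspect it
payload = {"items": [1, 2, 3]}
scheduler = Scheduler()
scheduler.spawn("collector", collector, received, 1)
scheduler.spawn("producer", producer, payload)
scheduler.run()
```

    The producer's later mutation of payload is not visible in the copy the
    collector received. The price is one pickle round trip per message, which
    is part of the cost the question above asks about.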

    What do you think?

    [1] http://en.wikipedia.org/wiki/Erlang_programming_language
    [2] http://en.wikipedia.org/wiki/Actor_model
    [3] http://www.stackless.com/
     
    Kay Schluehr, May 10, 2007
    #1

  2. Jacob Lee

    Jacob Lee Guest

    Funnily enough, I'm working on a project right now that is designed for
    exactly that: PARLEY, http://osl.cs.uiuc.edu/parley . (An announcement
    should show up in clp-announce as soon as the moderators release it). My
    essential thesis is that syntactic sugar should not be necessary -- that a
    nice library would be sufficient. I do admit that Erlang's pattern
    matching would be nice, although you can get pretty far by using uniform
    message formats that can easily be dispatched on -- the tuple
    (tag, sender, args, kwargs)
    in the case of PARLEY, which maps nicely to instance methods of a
    dispatcher class.
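
    A minimal sketch of how such a uniform tuple can map onto instance
    methods. This is not PARLEY's actual API: the Dispatcher class, the
    `handle_` prefix, and the Counter example are all hypothetical names
    chosen for the illustration.

```python
class Dispatcher:
    """Hypothetical base class: routes (tag, sender, args, kwargs)
    to the method named 'handle_' + tag."""

    def dispatch(self, message):
        tag, sender, args, kwargs = message
        handler = getattr(self, "handle_" + tag, None)
        if handler is None:
            return self.unhandled(tag, sender, args, kwargs)
        return handler(sender, *args, **kwargs)

    def unhandled(self, tag, sender, args, kwargs):
        raise LookupError("no handler for tag %r" % (tag,))


class Counter(Dispatcher):
    def __init__(self):
        self.total = 0

    def handle_add(self, sender, amount, times=1):
        self.total += amount * times
        return self.total

    def handle_reset(self, sender):
        self.total = 0
        return self.total


counter = Counter()
first = counter.dispatch(("add", "client-1", (5,), {}))
second = counter.dispatch(("add", "client-1", (2,), {"times": 3}))
```

    Here first is 5 and second is 11. Dispatching on the tag covers the most
    common use of pattern matching; matching on the shape of args would still
    have to be written by hand inside each handler.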

    The question of sharing among multiple physical processes is interesting.
    Implicit distribution of actors may not even be necessary if it is easy
    enough for two hosts to coordinate with each other. As for the
    general question of assigning actors to tasklets, threads, and processes,
    the limitations of Python and Stackless Python add some
    complications:
    - because of the GIL, actors in the same process do not gain the
    advantage of true parallel computation
    - all tasklet I/O has to be non-blocking
    - tasklets are cooperative, while threads are preemptive
    - communication across processes is slower, has to be serialized, etc.
    - using both threads and tasklets in a single process is tricky
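
    The serialization point in the list above can be checked without starting
    a second process. The sketch below assumes pickle as the wire format
    (an assumption; the list does not name one), and can_cross_process is a
    hypothetical helper name.

```python
import pickle


def can_cross_process(message):
    """Return True if `message` survives a pickle round trip unchanged."""
    try:
        return pickle.loads(pickle.dumps(message)) == message
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


# Plain data in the (tag, sender, args, kwargs) shape serializes fine.
plain = ("add", "client-1", (5,), {"times": 3})

# A message carrying a lambda does not: pickle stores functions by name,
# and a lambda has no importable name.
with_callback = ("on_done", "client-1", (lambda result: result,), {})

plain_ok = can_cross_process(plain)
callback_ok = can_cross_process(with_callback)
```

    So a message format that works between tasklets in one process may need
    to be restricted before it can work between processes.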

    PARLEY currently only works within a single process, though one can choose
    to use either tasklets or threads. My next goal is to figure out I/O, at
    which point I get to tackle the fun question of distribution.

    So far, I've not run into any cases where I've wanted to change the
    interpreter, though I'd be interested in hearing ideas in this direction
    (especially with PyPy as such a tantalizing platform!).
     
    Jacob Lee, May 10, 2007
    #2

  3. Kay Schluehr

    Kay Schluehr Guest

    Syntactic sugar is helpful when you want to control compiler actions. Of
    course you can also do this by means of __special__ attributes, but I
    guess this becomes clutter when you work with certain exposed sections
    in the code.

    Yes, I think so too. It is more interesting to think about what
    might qualify as a message. Destructuring it is not hard in any way,
    and I also have a few concerns about naive pattern matching:

    http://www.fiber-space.de/EasyExtend/doc/gallery/gallery.html#4._Chainlets_and_the_switch-statement

    Actors don't need locking primitives since their data is locked by
    virtue of the actor's definition. That's also why I'm in favour of a
    runtime / compiler based solution. Within the shiny world of actors
    and actresses the GIL has no place. So a thread that runs only actors
    does not need to be blocked or to block other threads, at least not for
    data locking purposes. It is used much like an OS-level process with
    better sharing capabilities (for mailbox addresses and messages).
    Those threads shall not take part in the acquire/release GIL game. They
    might also not be triggered explicitly using the usual threading
    API.
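
    The "locked by definition" idea can be sketched with today's standard
    library, even though such threads still run under the GIL in CPython.
    CounterActor and its message format are hypothetical. The mailbox
    (queue.Queue) does lock internally; the point is only that the actor's
    own state needs no lock, because a single thread ever touches it.

```python
import queue
import threading


class CounterActor(threading.Thread):
    """Hypothetical actor: private state, one mailbox, one thread."""

    _STOP = object()   # sentinel message that ends the actor loop

    def __init__(self):
        super().__init__()
        self.mailbox = queue.Queue()
        self._count = 0              # touched only inside run()

    def send(self, message):
        self.mailbox.put(message)

    def stop(self):
        self.mailbox.put(self._STOP)

    def run(self):
        while True:
            message = self.mailbox.get()
            if message is self._STOP:
                break
            kind, reply_to = message
            if kind == "increment":
                self._count += 1     # no lock: only this thread writes
            elif kind == "get":
                reply_to.put(self._count)


counter = CounterActor()
counter.start()


def client():
    for _ in range(1000):
        counter.send(("increment", None))


clients = [threading.Thread(target=client) for _ in range(4)]
for thread in clients:
    thread.start()
for thread in clients:
    thread.join()

# All increments are already queued, so "get" is processed after them.
reply = queue.Queue()
counter.send(("get", reply))
total = reply.get()

counter.stop()
counter.join()
```

    Four clients send concurrently, yet total is 4000 without any explicit
    lock in the actor code.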

    I guess you mean tantalizing in both of its meanings ;)

    Good luck and inform us when you find interesting results.

    Kay
     
    Kay Schluehr, May 10, 2007
    #3
  4. jkn

    jkn Guest

    jkn, May 10, 2007
    #4
  5. Jacob Lee

    Jacob Lee Guest

    I did look at Candygram. I wasn't so keen on the method of dispatch (a
    dictionary of lambdas that is passed to the receive function). It also
    only works with threads and doesn't communicate across processes.
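
    For comparison, the dispatch style described above looks roughly like
    the following. This is a generic sketch of "a dictionary of lambdas
    passed to a receive function", not Candygram's real API; receive, the
    tags, and the mailbox layout are hypothetical.

```python
from collections import deque


def receive(mailbox, handlers):
    """Pop the next (tag, payload) message and call the matching handler."""
    tag, payload = mailbox.popleft()
    return handlers[tag](payload)


mailbox = deque([("square", 7), ("greet", "Kay")])

# Each call site carries its own table of handlers.
handlers = {
    "square": lambda n: n * n,
    "greet": lambda name: "hello, " + name,
}

results = [receive(mailbox, handlers), receive(mailbox, handlers)]
```

    The handler table has to be rebuilt or passed around at every receive,
    and each handler is limited to a single expression.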

    I definitely used Candygram as a reference point when determining what
    features to hoist from Erlang.
     
    Jacob Lee, May 10, 2007
    #5
  6. Jacob Lee

    Jacob Lee Guest

    Interesting. Scala's pattern matching also looks nice. They have a
    construct called a "case class" which is sort of like an algebraic data
    type in that == compares the actual internal structure of the objects...
    come to think of it, it reminds me of the proposal for named tuples that
    floated around one of the python lists recently.
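
    The comparison can be illustrated with collections.namedtuple, which
    entered the standard library in Python 2.6, after this thread. The
    Message type and its fields are made up for the example.

```python
from collections import namedtuple

# Hypothetical message type with named fields.
Message = namedtuple("Message", ["tag", "sender", "args"])

a = Message("add", "client-1", (5,))
b = Message("add", "client-1", (5,))

same_structure = (a == b)    # equality compares field values
same_object = (a is b)       # two distinct objects

# Destructuring works by position, as with any tuple.
tag, sender, args = a

# Caveat: unlike a Scala case class, a namedtuple also equals a plain
# tuple with the same contents, because the type is not part of ==.
equals_plain_tuple = (a == ("add", "client-1", (5,)))
```

    Here same_structure is True and same_object is False. What is still
    missing compared to Scala is the matching construct itself.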

    There are also a lot of places where Python implicitly shares data, though.
    Global variables are one -- if you disallow those, then each actor has
    to have its own copy of all imported modules. I think the GC is also not
    at all thread-safe. I'm not familiar enough with how the interpreter works
    to judge whether disallowing shared memory would make any of the existing
    obstacles to removing the GIL easier to deal with.

    Certainly, if it's doable, it would be a big win to tackle these problems.
    Thanks!
     
    Jacob Lee, May 10, 2007
    #6
  7. Michael

    Michael Guest

    Have you seen Kamaelia? Some people have noted that Kamaelia seems to have a
    number of similarities to Erlang's model, which seems to come from a common
    background knowledge. (Kamaelia's model is based on a blending of what I
    know from a very basic recasting of CSP, Occam, unix pipelines and async
    hardware verification).

    Home:
    http://kamaelia.sourceforge.net/Home

    Intros:
    http://kamaelia.sourceforge.net/Introduction
    http://kamaelia.sourceforge.net/t/TN-LinuxFormat-Kamaelia.pdf
    http://www.bbc.co.uk/rd/pubs/whp/whp113.shtml
    * http://kamaelia.sourceforge.net/t/TN-LightTechnicalIntroToKamaelia.pdf
    http://kamaelia.sourceforge.net/Docs/NotationForVisualisingAxon

    The one *'d is perhaps the best at the moment.

    Detail:
    http://kamaelia.sourceforge.net/Cookbook
    http://kamaelia.sourceforge.net/Components


    Michael.
     
    Michael, May 11, 2007
    #7