RE: Help me use my Dual Core CPU!

Discussion in 'Python' started by Tim Golden, Sep 12, 2006.

  1. Tim Golden

    Tim Golden Guest

    [Simon Wittber]

    | I write cross platform games in Python, and I'd really like to be able
    | to use this second core (on my machine, and on user's
    | machines) for any new games I might write.

    | I know threads won't help (in CPython at least) so I'm investigating
    | other types of concurrency which I might be able to use. I really like
    | the PyLinda approach

    I find it very elegant. I only wish I had some real-world use for it!

    | Is there any cross platform way to share python objects across
    | processes? (I've found POSH, but it's old, and doesn't appear to be
    | maintained). I could implement my own object space using
    | shared memory, but from what I can see, this is not available on
    | Win32.

    There is a bunch of Python sub-industries dedicated to
    inter-process cooperation. But I don't know if that's
    what you're after. Setting aside the strict shared-memory
    option (for which I don't know of a cross-platform solution),
    you have -- in no particular order:

    + Pyro - http://pyro.sf.net
    + Corba - eg omniorb http://omniorb.sourceforge.net/
    + SPyRO - http://lsc.fie.umich.mx/~sadit/spyro/spyro.html
    + mmap - (built-in module) http://docs.python.org/lib/module-mmap.html
    + twisted - (because it can do everything), esp.
    http://twistedmatrix.com/projects/core/spread
    + Spread - http://www.spread.org/
    + Roll-your-own sockets -
    http://docs.python.org/lib/module-SocketServer.html

    etc. etc. etc.
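    (A taste of the mmap option, since it's the closest of these to raw
    shared memory -- a minimal sketch in modern Python; for true
    cross-process use, each process would open and mmap the same file.
    The file size and contents here are arbitrary:)

```python
import mmap
import os
import tempfile

# Back the shared region with an ordinary file (path is arbitrary).
fd, path = tempfile.mkstemp()
os.write(fd, b"\x00" * 32)

# Two independent views of the same file: a write through one is
# visible through the other.  Across processes, each side would open
# the same path and mmap it the same way.
view_a = mmap.mmap(fd, 32)
view_b = mmap.mmap(fd, 32)

view_a[:5] = b"hello"
print(bytes(view_b[:5]))  # b'hello'

view_a.close()
view_b.close()
os.close(fd)
os.unlink(path)
```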

    But I have the feeling I'm teaching my grandmother... Is that
    the kind of thing you were after? Or not?

    TJG

     
    Tim Golden, Sep 12, 2006
    #1

  2. Tim Golden wrote:
    > + Pyro - http://pyro.sf.net
    > + Corba - eg omniorb http://omniorb.sourceforge.net/
    > + SPyRO - http://lsc.fie.umich.mx/~sadit/spyro/spyro.html
    > + mmap - (built-in module) http://docs.python.org/lib/module-mmap.html
    > + twisted - (because it can do everything), esp.
    > http://twistedmatrix.com/projects/core/spread
    > + Spread - http://www.spread.org/
    > + Roll-your-own sockets -
    > http://docs.python.org/lib/module-SocketServer.html


    For game programming purposes, I was hoping someone could point me to a
    technique for sharing objects across Python processes, preferably
    without any kind of marshal/unmarshal steps. It's a long shot, I know.
    To be viable, I'll need to be able to pass messages between processes
    very quickly. For example, it would be simple to parallelize some
    graphics calculations over two processors with each process sharing a
    common read-only scene data structure; however, a marshalling step in
    this kind of process would be too costly.

    I've used three of the libraries you mention; however, they are not
    well suited to the task I have in mind, though they are of course
    excellent for other server-based programming tasks.

    > But I have the feeling I'm teaching my grandmother... Is that
    > the kind of thing you were after? Or not?


    I'm not familiar with the expression 'teaching my grandmother'. What
    exactly does it mean?

    -Sw.
     
    Simon Wittber, Sep 12, 2006
    #2

  3. Paul Rubin

    Paul Rubin Guest

    "Simon Wittber" <> writes:
    > For game programming purposes, I was hoping someone could point me to a
    > technique for sharing objects across Python processes, preferably
    > without any kind of marshal/unmarshal steps. It's a long shot, I know.
    > To be viable, I'll need to be able to pass messages between processes
    > very quickly. For example, it would be simple to parallelize some
    > graphics calculations over two processors with each process sharing a
    > common read-only scene data structure; however, a marshalling step in
    > this kind of process would be too costly.


    If it's for some specific calculation in the game, the bluntest
    approach is probably to write a C extension that releases the GIL to
    do the calculation, and then use ordinary threads for concurrency.
    Put appropriate locks in the object to synchronize on.
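    (To see that effect from pure Python without writing the extension
    yourself: zlib.compress already releases the GIL inside its C
    routine, so ordinary threads genuinely overlap on a multi-core box.
    A small sketch; the payload size and thread count are arbitrary:)

```python
import os
import threading
import zlib

payload = os.urandom(1 << 20)  # arbitrary 1 MiB workload
results = {}

def worker(i):
    # zlib.compress drops the GIL while it crunches, so these threads
    # really do run in parallel on a multi-core machine.
    results[i] = zlib.compress(payload)

threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Each thread produced a valid compressed copy of the payload.
assert all(zlib.decompress(blob) == payload for blob in results.values())
```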
     
    Paul Rubin, Sep 12, 2006
    #3
  4. In article <>,
    Tim Golden <> wrote:
    .
    .
    .
    >| I know threads won't help (in CPython at least) so I'm investigating
    >| other types of concurrency which I might be able to use. I really like
    >| the PyLinda approach
    >
    >I find it very elegant. I only wish I had some real-world use for it!

    .
    .
    .
    Me, too. I'd love to talk over Linda with other aficionados,
    and/or hunt together for an excuse to use her/it.
     
    Cameron Laird, Sep 12, 2006
    #4
  5. Cameron Laird wrote:
    > In article <>,
    > Tim Golden <> wrote:
    > .
    > .
    > .
    >
    >> | I know threads won't help (in CPython at least) so I'm investigating
    >> | other types of concurrency which I might be able to use. I really like
    >> | the PyLinda approach
    >>
    >> I find it very elegant. I only wish I had some real-world use for it!
    >>

    > .
    > .
    > .
    > Me, too. I'd love to talk over Linda with other aficionados,
    > and/or hunt together for an excuse to use her/it.
    >

    Funny you should mention that. I've had PyLinda opened in firefox for a
    couple days waiting to be read about. I have a large distributed system
    I'm hoping to use PyLinda for, or at least I hope PyLinda will help me with.

    -c



    --

    Carl J. Van Arsdall

    Build and Release
    MontaVista Software
     
    Carl J. Van Arsdall, Sep 12, 2006
    #5
  6. On 9/13/06, Carl J. Van Arsdall <> wrote:
    > Cameron Laird wrote:
    > > Tim Golden <> wrote:


    (...)
    > >> | other types of concurrency which I might be able to use. I really like
    > >> | the PyLinda approach
    > >>


    (...)
    > > .
    > > Me, too. I'd love to talk over Linda with other aficionados,
    > > and/or hunt together for an excuse to use her/it.
    > >

    > Funny you should mention that. I've had PyLinda opened in firefox for a
    > couple days waiting to be read about. I have a large distributed system
    > I'm hoping to use PyLinda for, or at least I hope PyLinda will help me with.


    You might also want to check

    http://www.lindaspaces.com/products/NWS_overview.html

    by the guys who "invented" Linda.


    (The Oz language/Mozart system is a good example of a different and
    very neat approach to concurrency; somewhat similar Python solutions
    can be found at Kamaelia and Candygram. Links and other stuff at:

    http://codepoetics.com/wiki/index.php?title=Topics:CTM_in_other_languages#Concurrency_in_Python
    )

    --
    Ramon Diaz-Uriarte
    Spanish National Cancer Centre (CNIO)
    http://ligarto.org/rdiaz
     
    Ramon Diaz-Uriarte, Sep 13, 2006
    #6
  7. Tim Golden

    Paul Rubin Guest

    (Cameron Laird) writes:
    > Me, too. I'd love to talk over Linda with other aficionados,
    > and/or hunt together for an excuse to use her/it.


    How about an Mnesia-like database for Python? (Mnesia is an embedded
    database for Erlang programs.)

    I see in the PyLinda page that

    * 13/1/05 - Version 0.4
    * Removed SysV shared memory and made Unix Domain Sockets the
    default as they are quicker.

    That sounds to me like the shared memory version was broken somehow.
    I don't see how a socket-based approach needing context switches could
    be faster than a zero-copy, all-userspace approach.

    FWIW, I remember seeing a web page about someone else implementing
    tuplespace in Python for his home Beowulf cluster, but although it was
    written as a C extension, it didn't seem all that serious. Found it:

    http://willware.net/beowulf.pdf
    http://sourceforge.net/projects/linuxtuples
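    (For anyone who hasn't met the model: a tuple space boils down to
    out() to deposit a tuple and in() to withdraw a matching one,
    blocking until a match exists. A toy in-process sketch -- nothing
    like PyLinda's or LinuxTuples' real APIs, and None is my stand-in
    for a wildcard field:)

```python
import threading

class TupleSpace:
    # Toy only: a real tuple space is shared between processes; this
    # just shows the matching and blocking semantics in one process.
    def __init__(self):
        self._tuples = []
        self._cv = threading.Condition()

    def out(self, tup):
        with self._cv:
            self._tuples.append(tup)
            self._cv.notify_all()

    def _find(self, pattern):
        for tup in self._tuples:
            if len(tup) == len(pattern) and all(
                    p is None or p == v for p, v in zip(pattern, tup)):
                return tup
        return None

    def in_(self, pattern):
        # Withdraw a matching tuple, blocking until one is available.
        with self._cv:
            while (tup := self._find(pattern)) is None:
                self._cv.wait()
            self._tuples.remove(tup)
            return tup

space = TupleSpace()
space.out(("job", 1, "render"))
space.out(("job", 2, "physics"))
print(space.in_(("job", None, "render")))  # ('job', 1, 'render')
```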
     
    Paul Rubin, Sep 13, 2006
    #7
  8. Tim Golden

    Paul Rubin Guest

    "Ramon Diaz-Uriarte" <> writes:
    > You might also want to check
    > http://www.lindaspaces.com/products/NWS_overview.html
    > by the guys who "invented" Linda.


    Cool, I guess.

    > (The Oz language/Mozart system is a good example of a different and
    > very neat approach to concurrency; somewhat similar Python solutions
    > can be found at Kamaelia and Candygram. Links and other stuff at:


    I looked at these. Oz/Mozart is a whole nother language, worth
    examining for its ideas, but the implementation is quite slow.
    Kamaelia doesn't attempt concurrency at all. Its main idea is to use
    generators to simulate microthreads. Candygram is a module that lets
    you write code in Python that's sort of like Erlang code, but it uses
    OS threads for the equivalent of Erlang processes. That misses the
    point of Erlang pretty badly, which is that processes are extremely
    lightweight (i.e. normally they are microthreads) so you can have
    millions of them active simultaneously (e.g. one for each active
    telephone connected to a big phone switch).

    Right now I want to check out GHC (the Glasgow Haskell Compiler),
    which may be the closest thing to a "winner":

    - very advanced language, even higher level than Python, once described
    by somebody as "what Python should have become"
    - native code compilation
    - lock-free concurrency using software transactional memory (STM)

    The GHC concurrency stuff is not yet really complete (e.g. the GC
    still stops the world) but there is already a good speedup with
    multiple processors, and the STM approach experimentally outperforms
    the usual locking approach, in addition to being much less bug-prone:

    http://lambda-the-ultimate.org/node/463
    http://research.microsoft.com/users/simonpj/papers/stm/

    This approach might also be of some use in PyPy.
     
    Paul Rubin, Sep 17, 2006
    #8
  9. On 17 Sep 2006 00:55:09 -0700, Paul Rubin
    <"http://phr.cx"@nospam.invalid> wrote:
    > "Ramon Diaz-Uriarte" <> writes:
    > > You might also want to check
    > > http://www.lindaspaces.com/products/NWS_overview.html
    > > by the guys who "invented" Linda.

    >
    > Cool, I guess.



    I've only played a little bit with it, but it does certainly look
    nice. Nick Carriero (from network spaces) also developed a similar
    thing for R, the GNU S statistical programming language (also from the
    above url), and the demonstration I saw of it was _really_ impressive.


    > I looked at these. Oz/Mozart is a whole nother language, worth
    > examining for its ideas, but the implementation is quite slow.


    Yes, that is true. On the plus side, though, the "Concepts,
    techniques, and models of computer programming" book, by Van Roy and
    Haridi, uses Oz/Mozart, so you get a thorough, pedagogical, and
    extended ---900 pages--- "tutorial" of it. But the speed and, so far,
    limited ways of being friendly to other languages can be
    show-stoppers.

    > Kamaelia doesn't attempt concurrency at all. Its main idea is to use
    > generators to simulate microthreads. Candygram is a module that lets
    > you write code in Python that's sort of like Erlang code, but it uses
    > OS threads for the equivalent of Erlang processes. That misses the
    > point of Erlang pretty badly, which is that processes are extremely
    > lightweight (i.e. normally they are microthreads) so you can have
    > millions of them active simultaneously (e.g. one for each active
    > telephone connected to a big phone switch).


    Thanks for the clarification (I had only cursorily looked at them).

    >
    > Right now I want to check out GHC (the Glasgow Haskell Compiler),
    > which may be the closest thing to a "winner":
    >
    > - very advanced language, even higher level than Python, once described
    > by somebody as "what Python should have become"
    > - native code compilation
    > - lock-free concurrency using software transactional memory (STM)


    Thanks for this. I'll check it out!!

    Best,

    R.
     
    Ramon Diaz-Uriarte, Sep 17, 2006
    #9
  10. Paul Rubin wrote:
    > "Ramon Diaz-Uriarte" <> writes:
    > > You might also want to check
    > > http://www.lindaspaces.com/products/NWS_overview.html
    > > by the guys who "invented" Linda.

    >
    > Cool, I guess.
    >
    > > (The Oz language/Mozart system is a good example of a different and
    > > very neat approach to concurrency; somewhat similar Python solutions
    > > can be found at Kamaelia and Candygram. Links and other stuff at:

    >
    > I looked at these. Oz/Mozart is a whole nother language, worth
    > examining for its ideas, but the implementation is quite slow.
    > Kamaelia doesn't attempt concurrency at all. Its main idea is to use
    > generators to simulate microthreads.


    Regarding Kamaelia, that's not been the case for over a year now.

    We've had threaded components as well as generator based ones since
    around last July; however, their API stabilised properly about 4 months
    back. If you use C extensions that release the GIL and are using an OS
    that puts threads on different CPUs then you have genuine concurrency.
    (those are albeit some big caveats, but not uncommon ones in python).

    Also, integrating something as a subprocess is as simple as
    instantiating a component that bridges the subprocess's stdin/stdout
    to Kamaelia's inbox/outbox model, and then just using it. Something
    concrete this is useful for:

    mencoder_options = ("-ovc lavc -oac mp3lame -ffourcc DX50 -lavcopts "
                        "acodec=mp3:vbitrate=200:abitrate=128 "
                        "-vf scale=320:-2 -")
    # assume 'encodingfile' is defined above
    Pipeline(
        DVB_TuneToChannel(channel="BBC ONE", fromDemuxer="MUX1"),
        UnixProcess("mencoder -o " + encodingfile + " " + mencoder_options),
    ).run()

    On a dual CPU machine that code does indeed both use CPUs (as you'd
    want and expect).

    Also, whilst we haven't had the chance to implement OS-level
    process-based components, that doesn't mean we're not interested in
    them; it's just that two people can only focus on so much, so we've
    been focussed on building things using the system rather than
    fleshing out the concurrency. To say we don't attempt concurrency
    implies that we don't want to go down these routes of adding in
    genuine concurrency. (Which is really why I'm replying - that's not
    the case - I do want to go down these routes; it's man-hours, not
    desire, that are the issue.)

    Personally, I'm very much in the camp that says "shared data is
    invariably a bad idea unless you really know what you're doing"
    (largely because it's the most common source of bugs when people are
    trying to do more than one thing at a time). People also
    generally appear to find writing threadsafe code very hard (not
    everyone, just the people who aren't at the top end of the bell curve
    for writing code that does more than one thing at a time).

    This is why Kamaelia is message based (i.e. it's a conscious choice in
    favour), except for certain types of data (where we have a Linda-esque
    type system for more systemic information). The reason for this is to
    keep the average programmer from shooting himself in the foot (with
    a 6 CPU-barrelled shotgun :).

    In terms of how this is *implemented*, however, we have zero copying
    of data (except to/from threads at the moment), so data is shared
    directly, but handed off at a point the user of the system finds
    natural. This approach, we find, tends to encourage arbitration of
    access to shared resources, which IMO is a good (defensive) approach
    to avoiding the problems people have with shared resources.

    But if it turns out our approach sucks for the average programmer, then
    that's a bug, so we'd have to work to fix it. And if new approaches are
    better, we'd welcome implementations since not all problems are screws
    and not all tools are hammers :) (as a result I'd also welcome people
    saying what sucks and why, but preferably based on the system as it is
    today, not as it was :)

    Have fun :)


    Michael.
     
    Michael Sparks, Sep 19, 2006
    #10
  11. Paul Rubin

    Paul Rubin Guest

    "Michael Sparks" <> writes:
    > > Kamaelia doesn't attempt concurrency at all. Its main idea is to use
    > > generators to simulate microthreads.

    >
    > Regarding Kamaelia, that's not been the case for over a year now.
    >
    > We've had threaded components as well as generator based ones since
    > around last July; however, their API stabilised properly about 4 months
    > back. If you use C extensions that release the GIL and are using an OS
    > that puts threads on different CPUs then you have genuine concurrency.
    > (those are albeit some big caveats, but not uncommon ones in python).


    Oh neat, this is good to hear.

    > Personally, I'm very much in the camp that says "shared data is
    > invariably a bad idea unless you really know what you're doing"
    > (largely because it's the most common source of bugs for people where
    > they're trying to do more than one thing at a time). People also
    > generally appear to find writing threadsafe code very hard. (not
    > everyone, just the people who aren't at the top end of the bell curve
    > for writing code that does more than one thing at a time)


    I don't think it's that bad. Yes, free-threaded programs synchronized
    by seat-of-the-pants locking turn into a mess pretty quickly. But
    ordinary programmers write real-world applications with shared data
    all the time, namely database apps. The database server provides the
    synchronization and a good abstraction that lets the programmer not
    have to go crazy thinking about the fine points, while maintaining the
    shared data. The cost is all the query construction, the marshalling
    and demarshalling of the data from Python object formats into byte
    strings, then copying these byte strings around through OS-supplied
    IPC mechanisms involving multiple context switches and sometimes trips
    up and down network protocol stacks even when the client and server
    are on the same machine. This is just silly, and wasteful of the
    efforts of the hardworking chip designers who put that nice cache
    coherence circuitry into our CPU's, to mediate shared data access at
    the sub-instruction level so we don't need all that IPC hair.

    Basically if the hardware gods have blessed us with concurrent cpu's
    sharing memory, it's our nerdly duty to figure out how to use it. We
    need better abstractions than raw locks, but this is hardly new.
    Assembly language programmers made integer-vs-pointer aliasing or
    similar type errors all the time, so we got compiled languages with
    type consistency enforcement. Goto statements turned ancient Fortran
    code into spaghetti, so we got languages with better control
    structures. We found memory allocation bookkeeping too error-prone to
    do by hand in complex programs, so we use garbage collection now. And
    to deal with
    shared data, transactional databases have been a very effective tool
    despite all the inefficiency mentioned above.

    Lately I've been reading about "software transactional memory" (STM),
    a scheme for treating shared memory as if it were a database, without
    using locks except for during updates. In some versions, STM
    transactions are composable, so nowhere near as bug-prone as
    fine-grained locks; and because readers don't need locks (they instead
    have to abort and restart transactions in the rare event of a
    simultaneous update) STM actually performs -faster- than traditional
    locking. I posted a couple of URL's in another thread and will try
    writing a more detailed post sometime. It is pretty neat stuff.
    There are some C libraries for it that it might be possible to port to
    Python.
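    (To make the abort-and-retry idea concrete, here's a toy one-cell
    version in Python -- emphatically not real STM, which tracks whole
    read/write sets across many locations, but it shows readers
    proceeding without locks and validating at commit:)

```python
import threading

class VersionedCell:
    """Toy illustration of STM's abort-and-retry loop for ONE cell.
    Readers never block; writers bump a version; a transaction re-runs
    if the version moved underneath it."""
    def __init__(self, value):
        self._commit_lock = threading.Lock()  # held only while committing
        self.version = 0
        self.value = value

    def atomically(self, fn):
        while True:
            seen_version = self.version     # lock-free read
            seen_value = self.value
            new_value = fn(seen_value)      # compute outside any lock
            with self._commit_lock:
                if self.version == seen_version:  # nobody committed meanwhile
                    self.value = new_value
                    self.version += 1
                    return new_value
            # validation failed: another transaction won the race; retry

cell = VersionedCell(0)

def bump_1000():
    for _ in range(1000):
        cell.atomically(lambda v: v + 1)

threads = [threading.Thread(target=bump_1000) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(cell.value)  # 4000: no lost updates despite the lock-free reads
```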
     
    Paul Rubin, Sep 23, 2006
    #11
  12. Michael

    Michael Guest

    Paul Rubin wrote:

    > "Michael Sparks" <> writes:
    >> > Kamaelia doesn't attempt concurrency at all. Its main idea is to use
    >> > generators to simulate microthreads.

    >>
    >> Regarding Kamaelia, that's not been the case for over a year now.
    >>
    >> We've had threaded components as well as generator based ones since
    >> around last July; however, their API stabilised properly about 4 months
    >> back. If you use C extensions that release the GIL and are using an OS
    >> that puts threads on different CPUs then you have genuine concurrency.
    >> (those are albeit some big caveats, but not uncommon ones in python).

    >
    > Oh neat, this is good to hear.


    :)

    Ironically it was worth mentioning because we made a number of
    optimisations earlier in the year, specifically so that the CPU usage
    of Kamaelia systems was generally much lower, to let us take advantage
    of multiple CPUs where available, for an internal project described here:
    * http://kamaelia.sourceforge.net/KamaeliaMacro.html

    A "look, but can't get" front end here:

    * http://bbc.kamaelia.org/cgi-bin/blog/blog.cgi

    Code here:

    http://svn.sourceforge.net/viewvc/k...lia/Examples/DVB_Systems/Macro.py?view=markup

    If there were (say) 4 CPUs or cores on that system, then the graphline at
    the end could become:

    Graphline(
        SOURCE=DVB_Multiplex(freq,
                             pids["NEWS24"] + pids["BBC ONE"] +
                             pids["CBBC"] + pids["BBC TWO"] + pids["EIT"],
                             feparams),
        DEMUX=DVB_Demuxer({
            600: ["BBCONE"],
            601: ["BBCONE"],
            610: ["BBCTWO"],
            611: ["BBCTWO"],
            620: ["CBBC"],
            621: ["CBBC"],
            640: ["NEWS24"],
            641: ["NEWS24"],
            18: ["BBCONE", "BBCTWO", "CBBC", "NEWS24"],
        }),
        NEWS24=ChannelTranscoder(service_ids["NEWS24"], **params["HI"]),
        BBCONE=ChannelTranscoder(service_ids["BBC ONE"], **params["HI"]),
        BBCTWO=ChannelTranscoder(service_ids["BBC TWO"], **params["HI"]),
        CBBC=ChannelTranscoder(service_ids["CBBC"], **params["HI"]),
        linkages={
            ("SOURCE", "outbox"): ("DEMUX", "inbox"),
            ("DEMUX", "NEWS24"): ("NEWS24", "inbox"),
            ("DEMUX", "BBCONE"): ("BBCONE", "inbox"),
            ("DEMUX", "BBCTWO"): ("BBCTWO", "inbox"),
            ("DEMUX", "CBBC"): ("CBBC", "inbox"),
        }
    ).run()

    And that would naturally take advantage of all 4 CPUs.

    Admittedly this is in a limited scenario right now, and is the exception not
    the rule, but does give an idea of where we'd like to end up, even if at
    the moment, like most people's machines, we default to making the most of a
    single CPU :)

    So, whilst we don't automatically parallelise your code or magically
    run it across multiple CPUs, whether that happens eventually really
    depends on how practical it is. (I must admit I suspect it is doable,
    and will be something worth addressing, and I'll look at it at some
    point if no-one else does :)

    At the moment this means components explicitly working that way (as
    the above ones do, by the way the transcoder works). However I suspect
    explicit parallelisation, or hinting for parallelisation (e.g. via a
    base class, perhaps a metaclass), will be doable and can be intuitive
    and practical :) I might

    >> Personally, I'm very much in the camp that says "shared data is
    >> invariably a bad idea unless you really know what you're doing"
    >> (largely because it's the most common source of bugs for people where
    >> they're trying to do more than one thing at a time). People also
    >> generally appear to find writing threadsafe code very hard. (not
    >> everyone, just the people who aren't at the top end of the bell curve
    >> for writing code that does more than one thing at a time)

    >
    > I don't think it's that bad. Yes, free-threaded programs synchronized
    > by seat-of-the-pants locking turn into a mess pretty quickly.


    That's an agreement with my point (or rather, I'm agreeing with yours).

    > But ordinary programmers write real-world applications with shared data
    > all the time, namely database apps.


    I don't call that shared data because access to the shared data is
    arbitrated by a third party - namely the database. I mean where 2 or more
    people[*] hold a lock on an object and share it - specifically the kind of
    thing you reference above as turning into a mess.

    [*] Sorry, I have a nasty habit of thinking of software as little robots or
    people. I blame usborne books of the early 80s for that :)

    > This is just silly, and wasteful of the
    > efforts of the hardworking chip designers who put that nice cache
    > coherence circuitry into our CPU's, to mediate shared data access at
    > the sub-instruction level so we don't need all that IPC hair.


    Aside from the fact it's enabled millions of programmers to deal with
    shared data by communicating with a database?

    > Basically if the hardware gods have blessed us with concurrent cpu's
    > sharing memory, it's our nerdly duty to figure out how to use it.


    It's also our duty to figure out how to make it easier for the bulk of
    programmers, though. After all, that's the point of an operating system or
    programming language in many respects, or can be the aim of a library :)
    (your mileage may well vary)

    Incidentally, it's worth noting that for the bulk of components (not
    all) we don't do copying between things anymore. We did before some
    optimisations earlier this year, because we took the simplest, most
    literal implementation of the metaphor as the starting point (a postman
    picking up messages from an outbox, walking to another desk/component
    and delivering to the inbox).

    For generator based components we collapse inboxes into outboxes,
    which means that when someone puts a piece of data into an outbox, all
    that's happening is that they're saying "I'm no longer going to use
    this", and the recipient can use it straight away.

    This is free of traditional locks, and at the same time it encourages
    safety because of the real-world metaphor - once I post something
    through a letter box, I can't do anything more with it. This also has
    natural performance benefits. Sure, people can break the rules and odd
    things can happen, but the metaphor encourages people not to do that.

    (For thread based components Queue.Queues are used)
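    (For the curious, the collapsed inbox/outbox idea can be sketched in a
    few lines -- toy code, not the Kamaelia API, purely to show that the
    handoff is a pointer move rather than a copy:)

```python
from collections import deque

# Toy code, NOT the Kamaelia API: the producer's outbox and the
# consumer's inbox are the *same* deque, so "posting" a message is just
# an append -- the object itself is handed over, never copied.

def producer(outbox):
    for i in range(5):
        outbox.append(i)   # post the letter; the sender stops using it
        yield

def doubler(inbox, outbox):
    while True:
        if inbox:
            outbox.append(inbox.popleft() * 2)
        yield

link = deque()   # producer's outbox collapsed into doubler's inbox
sink = deque()
tasks = [producer(link), doubler(link, sink)]

# A bare-bones round-robin scheduler standing in for Kamaelia's.
for _ in range(20):
    for task in list(tasks):
        try:
            next(task)
        except StopIteration:
            tasks.remove(task)

print(list(sink))  # [0, 2, 4, 6, 8]
```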

    > We need better abstractions than raw locks, but this is hardly new.
    > Assembly language programmers made integer-vs-pointer aliasing or
    > similar type errors all the time, so we got compiled languages with
    > type consistency enforcement. Goto statements turned ancient Fortran
    > code into spaghetti, so we got languages with better control
    > structures. We found memory allocation bookkeeping to do by hand in
    > complex programs, so we use garbage collection now.


    I think you're essentially agreeing in principle here or I'm agreeing with
    you in principle.

    > And to deal with
    > shared data, transactional databases have been a very effective tool
    > despite all the inefficiency mentioned above.


    I think I'm still agreeing here :)

    > Lately I've been reading about "software transactional memory" (STM),
    > a scheme for treating shared memory as if it were a database, without
    > using locks except for during updates. In some versions, STM
    > transactions are composable, so nowhere near as bug-prone as
    > fine-grained locks; and because readers don't need locks (they instead
    > have to abort and restart transactions in the rare event of a
    > simultaneous update) STM actually performs -faster- than traditional
    > locking. I posted a couple of URL's in another thread and will try
    > writing a more detailed post sometime. It is pretty neat stuff.
    > There are some C libraries for it that it might be possible to port to
    > Python.


    I've been hearing about it as well, but haven't dug into it. If the
    promises hold up as people hope, I'd like to add support to Kamaelia
    if it makes sense (it probably would, because the co-ordinating
    assistant tracker is one place this would be helpful).

    Whilst we're a component system, my personal aim is to make it easier for
    people to write maintainable software that uses concurrency naturally
    because it's easier. OK, that might be mad, but if the side effect is trying
    things that might make people's lives easier I can live with that :)

    Interestingly, because of some of the discussions in this thread I took a
    look at Erlang. Probably due to its similarities, in passing, to occam,
    there's some stark similarities to Kamaelia there as well - mailboxes, the
    ability for lightweight threads to hibernate, that sort of thing.

    The difference really is that I'm not claiming this is special to the
    particular language (we've got a proof of concept in C++ after all), and
    we're aiming to use metaphors that are accessible, along with a bunch of
    code as a proof of concept that we're finding useful. Whilst this is *soo*
    the wrong place to say it, I'd really like to see a C++ or Java version
    simply to see what has to change in a statically typed environment.

    I suppose the other thing is that I'm not saying we're right, just that
    we're finding it useful, and hoping others do too :) (a couple of years
    ago it was "we don't know if it works, but it might", now at least I'm at
    the stage "This works, at least for us and a few others, it might for you",
    which in my mind is quite a jump :)

    If you do dig out those STM references, I'd be interested :)

    Regards,


    Michael.
     
    Michael, Sep 28, 2006
    #12
  13. Paul Rubin

    Paul Rubin Guest

    Michael <> writes:
    > > But ordinary programmers write real-world applications with shared data
    > > all the time, namely database apps.

    >
    > I don't call that shared data because access to the shared data is
    > arbitrated by a third party - namely the database. I mean where 2 or
    > more people[*] hold a lock on an object and share it - specifically
    > the kind of thing you reference above as turning into a mess.


    Ehhh, I don't see a big difference between having the shared data
    arbitrated by an external process with cumbersome message passing,
    or having it arbitrated by an in-process subroutine or even by support
    built into the language. If you can go for that, I think we agree on
    most other points.


    > > This is just silly, and wasteful of the
    > > efforts of the hardworking chip designers


    > Aside from the fact it's enabled millions of programmers to deal with
    > shared data by communicating with a database?


    Well, sure, but like spreadsheets, its usefulness is that it lets
    people get non-computationally-demanding tasks (of which there are a
    lot) done with relatively little effort. More demanding tasks aren't
    so well served by spreadsheets, and lots of them are using databases
    running on massively powerful and expensive computers when they could
    get by with lighter weight communications mechanisms and thereby get
    the needed performance from much cheaper hardware. That in turn would
    let normal folks run applications that are right now only feasible for
    relatively complex businesses. If you want, I can go into why this is
    important far beyond the nerdy realm of software geekery.

    > For generator based components we collapse inboxes into outboxes
    > which means all that's happening when someone puts a piece of data
    > into an outbox, they're simply saying "I'm no longer going to use
    > this", and the recipient can use it straight away.


    But either you're copying stuff between processes, or you're running
    in-process without multiprocessor concurrency, right?

    > This is traditional-lock free,


    > > Lately I've been reading about "software transactional memory" (STM),


    > I've been hearing about it as well, but not dug into it....
    > If you do dig out those STM references, I'd be interested :)


    They're in the post you responded to:

    http://lambda-the-ultimate.org/node/463
    http://research.microsoft.com/users/simonpj/papers/stm/

    In particular, the one about the GHC implementation is here:

    http://research.microsoft.com/users/simonpj/papers/stm/lock-free-flops06.pdf

    The Wikipedia article is also informative:

    http://en.wikipedia.org/wiki/Transactional_memory
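To give a flavour of the idea, here's a toy sketch of the optimistic
read/validate/commit/retry pattern at the heart of STM. This is purely
illustrative (a single transactional variable, version-number validation) --
real STM systems track whole read/write sets and are far more sophisticated:

```python
import threading

class TVar:
    """Toy transactional variable: optimistic read, validate-and-commit."""
    def __init__(self, value):
        self._value = value
        self._version = 0
        self._lock = threading.Lock()

    def read(self):
        with self._lock:
            return self._value, self._version

    def try_commit(self, seen_version, new_value):
        # Commit succeeds only if nobody wrote since we read.
        with self._lock:
            if self._version != seen_version:
                return False          # conflict: caller must retry
            self._value = new_value
            self._version += 1
            return True

def atomically(tvar, fn):
    """Retry loop: read, compute, attempt commit, retry on conflict."""
    while True:
        value, version = tvar.read()
        if tvar.try_commit(version, fn(value)):
            return

counter = TVar(0)
for _ in range(10):
    atomically(counter, lambda v: v + 1)
```

The point is that the increment function never holds a lock across the
computation; conflicts are detected at commit time and resolved by retrying.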
     
    Paul Rubin, Oct 1, 2006
    #13
  14. Tim Golden

    Michael Guest

    Paul Rubin wrote:

    > Michael <> writes:
    >> > But ordinary programmers write real-world applications with shared data
    >> > all the time, namely database apps.

    >>
    >> I don't call that shared data because access to the shared data is
    >> arbitrated by a third party - namely the database. I mean where 2 or
    >> more people[*] hold a lock on an object and share it - specifically
    >> the kind of thing you reference above as turning into a mess.

    >
    > Ehhh, I don't see a big difference between having the shared data
    > arbitrated by an external process with cumbersome message passing,
    > or having it arbitrated by an in-process subroutine or even by support
    > built into the language. If you can go for that, I think we agree on
    > most other points.


    The difference from my perspective is that there are two (not mutually
    exclusive) options:
    A) Have something arbitrate access and provide useful abstractions
    designed to simplify things for the user.
    B) Use a lower-level abstraction (e.g. built into the language, direct
    calling, etc.)

    The former is aimed at helping the programmer, whereas the latter can be
    aimed at better performance. In which case you're back to the same sort
    of argument about whether assembler, compiled, or dynamic languages are
    a good idea, and I'd always respond with "depends on the problem in
    hand".

    As for why you don't see much difference: I can see why you think that,
    but I personally believe that with A) you can share best practice [1],
    whereas B) means you need to be able to implement best practice
    yourself.

    [1] Which is always an opinion :) (after all, once upon a time people
    thought goto was a good idea :)

    >> > This is just silly, and wasteful of the
    >> > efforts of the hardworking chip designers

    >
    >> Aside from the fact it's enabled millions of programmers to deal with
    >> shared data by communicating with a database?

    >
    > Well, sure, but like spreadsheets, its usefulness is that it lets
    > people get non-computationally-demanding tasks (of which there are a
    > lot) done with relatively little effort. More demanding tasks aren't
    > so well served by spreadsheets, and lots of them are using databases
    > running on massively powerful and expensive computers when they could
    > get by with lighter weight communications mechanisms and thereby get
    > the needed performance from much cheaper hardware. That in turn would
    > let normal folks run applications that are right now only feasible for
    > relatively complex businesses. If you want, I can go into why this is
    > important far beyond the nerdy realm of software geekery.


    I'd personally be interested to hear why you think that. I can think of
    reasons myself, but would be curious to hear yours.

    >> For generator based components we collapse inboxes into outboxes
    >> which means all that's happening when someone puts a piece of data
    >> into an outbox, they're simply saying "I'm no longer going to use
    >> this", and the recipient can use it straight away.

    >
    > But either you're copying stuff between processes, or you're running
    > in-process without multiprocessor concurrency, right?


    For generator components, that's in-process and not multiprocessor
    concurrency, yes.
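
    To make that concrete, here's a minimal sketch of the idea -- this is
    illustrative only, not Kamaelia's actual API. Because everything runs in
    one process, putting an object in an outbox just hands over the
    reference; nothing is copied:

    ```python
    # Illustrative sketch: a producer's outbox is "collapsed into" a
    # consumer's inbox by sharing one list. Names and scheduler are
    # made up for this example, not Kamaelia's real machinery.

    def producer(outbox):
        for i in range(3):
            msg = {"value": i}
            outbox.append(msg)   # "I'm no longer going to use this"
            yield

    def consumer(inbox, results):
        while True:
            while inbox:
                results.append(inbox.pop(0))  # same object, zero copy
            yield

    def run():
        link = []      # producer's outbox == consumer's inbox
        results = []
        p, c = producer(link), consumer(link, results)
        for _ in range(5):       # crude round-robin scheduler
            for gen in (p, c):
                try:
                    next(gen)
                except StopIteration:
                    pass
        return results
    ```

    The discipline is entirely by convention: once a component has put a
    piece of data in its outbox, it must not touch it again.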

    For threaded components we use Queue.Queues, which means essentially the
    reference is copied for most real world data, not the data itself.
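
    For example (queue.Queue in today's Python; it was Queue.Queue at the
    time of this post):

    ```python
    import queue

    q = queue.Queue()
    payload = {"frames": [1, 2, 3]}   # some real-world data

    q.put(payload)                    # only the reference is enqueued
    received = q.get()

    # Both names refer to the same object: the data itself was not copied.
    assert received is payload
    ```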

    One step at a time I suppose really :) One option for interprocess
    sharing we're considering (since POSH looks unsupported, alpha, and
    untested on recent Pythons) is to use memory-mapped files. The thing is,
    that means serialising everything, which could be icky, so it'll have to
    be something we come back to later.
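
    A rough sketch of what that might look like -- the fixed size, the
    4-byte length header, and pickle as the serialiser are all illustrative
    choices for this example, not a design we've settled on:

    ```python
    # Sketch: share a Python object between processes via a memory-mapped
    # file. Layout (made up for this example): 4-byte length header,
    # followed by a pickled payload. Payload must fit in SIZE - 4 bytes.
    import mmap
    import pickle
    import struct

    SIZE = 4096

    def write_shared(path, obj):
        blob = pickle.dumps(obj)
        with open(path, "wb") as f:
            f.truncate(SIZE)               # zero-filled backing file
        with open(path, "r+b") as f:
            mm = mmap.mmap(f.fileno(), SIZE)
            mm[:4] = struct.pack("I", len(blob))   # length header
            mm[4:4 + len(blob)] = blob             # serialised payload
            mm.close()

    def read_shared(path):
        with open(path, "r+b") as f:
            mm = mmap.mmap(f.fileno(), SIZE)
            (length,) = struct.unpack("I", mm[:4])
            obj = pickle.loads(mm[4:4 + length])
            mm.close()
            return obj
    ```

    This shows the serialisation cost clearly: every read and write round-trips
    through pickle, which is exactly the "icky" part mentioned above.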

    (Much of our day to day work on Kamaelia is focussed on solving specific
    problems for work which rolls back into fleshing out the toolkit. It would
    be extremely nice to spend time on solving a particular issue that would
    benefit from optimising interprocess comms).

    If we can make Kamaelia benefit from the work the hardware people have
    done for shared memory, that's great. However, it's interesting to see
    that things like the Cell don't tend to use shared memory, and use this
    style of communications approach instead. What approach will be most
    useful going forward? Dunno :) I'm only claiming we find it useful :)

    >> This is traditional-lock free,

    >
    >> > Lately I've been reading about "software transactional memory" (STM),

    >
    >> I've been hearing about it as well, but not dug into it....
    >> If you do dig out those STM references, I'd be interested :)

    >
    > They're in the post you responded to:


    Sorry, brain fart on my part. Thanks :)


    Michael.
     
    Michael, Oct 1, 2006
    #14