Embedding multiple interpreters

Discussion in 'Python' started by Garthy, Dec 6, 2013.

  1. Garthy

    Garthy Guest

    Hi!

    I hope I've got the right list here- there were a few to choose from. :}

    I am trying to embed Python with multiple interpreters into an existing
    application. I have things working fine with a single interpreter thus
    far. I am running into problems when using multiple interpreters [1] and
    I am presently trying to track down these issues. Can anyone familiar
    with the process of embedding multiple interpreters have a skim of the
    details below and let me know of any obvious problems? If I can get the
    essentials right, then presumably it's just a matter of my tracking down
    any problems with my code.

    I am presently using Python 3.3.3.

    What I am after:

    - Each sub-interpreter will have its own dedicated thread. Each thread
    will have no more than one sub-interpreter. Basically, there is a
    one-to-one mapping between threads and interpreters (some threads are
    unrelated to Python though).

    - The default interpreter in the main thread will never be used,
    although I can explicitly use it if it'll help in some way.

    - Each thread is created and managed outside of Python. This can't be
    readily changed.

    - I have a single internal module I need to be able to use for each
    interpreter.

    - I load scripts into __main__ and create objects from it to bootstrap.

    - I understand that for the most part only a single interpreter will be
    running at a time due to the GIL. This is unfortunate but not a major
    problem.

    - I don't need to share objects between interpreters (if it is even
    possible- I don't know).

    - My fallback if I can't do this is to implement each instance in a
    dedicated *process* rather than per-thread. However, there is a
    significant cost to doing this that I would rather not incur.

    Could I confirm:

    - There is one GIL in a given process, shared amongst all (sub)
    interpreters. There seems some disagreement on this one online, although
    I'm fairly confident that there is only the one GIL.

    - I am using the mod_wsgi source for inspiration. Is there a better
    source for an example of embedding multiple interpreters?

    A query re the doco:

    http://docs.python.org/3/c-api/init.html#gilstate

    "Python supports the creation of additional interpreters (using
    Py_NewInterpreter()), but mixing multiple interpreters and the
    PyGILState_*() API is unsupported."

    Is this actually correct? mod_wsgi seems to do it. Have I misunderstood?

    I've extracted what I have so far from my code into a form that can be
    followed more easily. Hopefully I have not made any mistakes in doing
    so. The essence of what my code calls should be as follows:

    === Global init, run once:

    static PyThreadState *mtstate = NULL;

    PyImport_AppendInittab("myinternalmodule", PyInit_myinternalmodule);
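    // Py_SetProgramName() expects a wide string in static storage; casting a
    // narrow literal to wchar_t * does not convert it.
    static wchar_t program_name[] = L"foo";
    Py_SetProgramName(program_name);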
    Py_InitializeEx(0);
    PyEval_InitThreads();
    mtstate = PyThreadState_Get();
    PyEval_ReleaseThread(mtstate);

    === Global shutdown, run once at end:

    Py_Finalize();

    === Per-interpreter init in main thread before launching child thread:

    (none thus far)

    === Init in dedicated thread for each interpreter:

    // NB: Also protected by a single global non-Python mutex to be sure.

    PyGILState_STATE gil = PyGILState_Ensure();
    PyThreadState *save_tstate = PyThreadState_Swap(NULL);
    state = Py_NewInterpreter();
    PyThreadState_Swap(save_tstate);

    PyObject *mmodule = PyImport_AddModule("__main__");
    Py_INCREF(mmodule);

    PyImport_ImportModule("myinternalmodule");

    PyGILState_Release(gil);

    === Shutdown in dedicated thread for each interpreter:

    // NB: Also protected by the same single global non-Python mutex as in
    // the init.

    PyGILState_STATE gil = PyGILState_Ensure();
    PyThreadState *save_tstate = PyThreadState_Swap(state);
    Py_EndInterpreter(state);
    PyThreadState_Swap(save_tstate);
    PyGILState_Release(gil);

    === Placed at top of scope where calls made to Python C API:

    SafeLock lock;

    === SafeLock implementation:

    class SafeLock
    {
    public:
        SafeLock() { gil = PyGILState_Ensure(); }
        ~SafeLock() { PyGILState_Release(gil); }

    private:
        PyGILState_STATE gil;
    };

    ===

    Does this look roughly right? Have I got the global and per-interpreter
    init and shutdown right? Am I locking correctly in SafeLock- is
    PyGILState_Ensure() and PyGILState_Release() sufficient?

    Is there an authoritative summary of the global and per-interpreter init
    and shutdown somewhere that I have missed? Any resource I should be reading?

    Cheers,
    Garth

    [1] It presently crashes in Py_EndInterpreter() after running through a
    series of tests during the shutdown of the 32nd interpreter I create. I
    don't know if this is significant, but the tests pass for the first 31
    interpreters.
     
    Garthy, Dec 6, 2013
    #1

  2. Garthy wrote:
    > I am running into problems when using multiple interpreters [1] and
    > I am presently trying to track down these issues. Can anyone familiar
    > with the process of embedding multiple interpreters have a skim of the
    > details below and let me know of any obvious problems?


    As far as I know, multiple interpreters in one process is
    not really supported. There *seems* to be partial support for
    it in the code, but there is no way to fully isolate them
    from each other.

    Why do you think you need multiple interpreters, as opposed
    to one interpreter with multiple threads? If you're trying
    to sandbox the threads from each other and/or from the rest
    of the system, be aware that it's extremely difficult to
    securely sandbox Python code. You'd be much safer to run
    each one in its own process and rely on OS-level protections.

    > - I understand that for the most part only a single interpreter will be
    > running at a time due to the GIL.


    Yes, as far as I can tell, there is only one GIL in a given
    process.

    > - I don't need to share objects between interpreters (if it is even
    > possible- I don't know).


    The hard part is *not* sharing objects between interpreters.
    If nothing else, all the builtin type objects, constants, etc.
    will be shared.

    --
    Greg
     
    Gregory Ewing, Dec 6, 2013
    #2

  3. Garthy

    Garthy Guest

    Hi Gregory,

    On 06/12/13 17:28, Gregory Ewing wrote:
    > Garthy wrote:
    >> I am running into problems when using multiple interpreters [1] and I
    >> am presently trying to track down these issues. Can anyone familiar
    >> with the process of embedding multiple interpreters have a skim of the
    >> details below and let me know of any obvious problems?

    >
    > As far as I know, multiple interpreters in one process is
    > not really supported. There *seems* to be partial support for
    > it in the code, but there is no way to fully isolate them
    > from each other.


    That's not good to hear.

    Is there anything confirming that it's an incomplete API insofar as
    multiple interpreters are concerned? Wouldn't this carry consequences
    for say mod_wsgi, which also does this?

    > Why do you think you need multiple interpreters, as opposed
    > to one interpreter with multiple threads? If you're trying
    > to sandbox the threads from each other and/or from the rest
    > of the system, be aware that it's extremely difficult to
    > securely sandbox Python code. You'd be much safer to run
    > each one in its own process and rely on OS-level protections.


    To allow each script to run in its own environment, with minimal chance
    of inadvertent interaction between the environments, whilst allowing
    each script the ability to stall on conditions that will be later met by
    another thread supplying the information, and to fit in with existing
    infrastructure.

    >> - I don't need to share objects between interpreters (if it is even
    >> possible- I don't know).

    >
    > The hard part is *not* sharing objects between interpreters.
    > If nothing else, all the builtin type objects, constants, etc.
    > will be shared.


    I understand. To clarify: I do not need to pass any Python objects I
    create or receive back and forth between different interpreters. I can
    imagine some environments would not react well to this.

    Cheers,
    Garth
     
    Garthy, Dec 6, 2013
    #3
  4. Garthy

    Garthy Guest

    PS. Apologies if any of these messages come through more than once. Most
    lists that I've posted to set reply-to meaning a normal reply can be
    used, but python-list does not seem to. The replies I have sent manually
    to instead don't seem to have appeared. I'm not
    quite sure what is happening- apologies for any blundering around on my
    part trying to figure it out.
     
    Garthy, Dec 6, 2013
    #4
  5. On Fri, Dec 6, 2013 at 7:21 PM, Garthy wrote:
    > PS. Apologies if any of these messages come through more than once. Most
    > lists that I've posted to set reply-to meaning a normal reply can be used,
    > but python-list does not seem to. The replies I have sent manually to
    > instead don't seem to have appeared. I'm not quite
    > sure what is happening- apologies for any blundering around on my part
    > trying to figure it out.


    They are coming through more than once. If you're subscribed to the
    list, sending to should be all you need to do -
    where else are they going?

    ChrisA
     
    Chris Angelico, Dec 6, 2013
    #5
  6. Garthy

    Garthy Guest

    Hi Chris,

    On 06/12/13 19:03, Chris Angelico wrote:
    > On Fri, Dec 6, 2013 at 6:59 PM, Garthy wrote:
    >> Hi Chris (and Michael),

    >
    > Hehe. People often say that to me IRL, addressing me and my brother.
    > But he isn't on python-list, so you clearly mean Michael Torrie, yet
    > my brain still automatically thought you were addressing Michael
    > Angelico :)


    These strange coincidences happen from time to time- it's entertaining
    when they do. :)

    >> To allow each script to run in its own environment, with minimal chance of
    >> inadvertent interaction between the environments, whilst allowing each
    >> script the ability to stall on conditions that will be later met by another
    >> thread supplying the information, and to fit in with existing
    >> infrastructure.

    >
    > Are the scripts written cooperatively, or must you isolate one from
    > another? If you need to isolate them for trust reasons, then there's
    > only one solution, and that's separate processes with completely
    > separate interpreters. But if you're prepared to accept that one
    > thread of execution is capable of mangling another's state, things are
    > a lot easier. You can protect against *inadvertent* interaction much
    > more easily than malicious interference. It may be that you can get
    > away with simply running multiple threads in one interpreter;
    > obviously that would have problems if you need more than one CPU core
    > between them all (hello GIL), but that would really be your first
    > limit. One thread could fiddle with __builtins__ or a standard module
    > and thus harass another thread, but you would know if that's what's
    > going on.


    I think the ideal is completely sandboxed, but it's something that I
    understand I may need to make compromises on. The bare minimum would be
    protection against inadvertent interaction. Better yet would be a setup
    that made such interaction annoyingly difficult, and the ideal would be
    where it was impossible to interfere. My approaching this problem with
    interpreters was based on an assumption that it might provide a
    reasonable level of isolation- perhaps not ideal, but hopefully good enough.
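
    For comparison, here is a minimal sketch of the single-interpreter,
    multi-thread alternative raised above: each script is executed in its own
    globals dict on its own thread. It only guards against inadvertent name
    clashes at module level, not against deliberate interference. The names
    run_script, run_all, SCRIPT_A and SCRIPT_B are made up for illustration.

```python
import threading


def run_script(source, name, results):
    """Execute `source` in a private globals dict and store that dict."""
    # A fresh namespace per script: top-level names cannot collide by accident.
    namespace = {"__name__": name}
    exec(compile(source, name, "exec"), namespace)
    results[name] = namespace


def run_all(scripts):
    """Run every script on its own thread and return their namespaces."""
    results = {}
    threads = [
        threading.Thread(target=run_script, args=(source, name, results))
        for name, source in scripts.items()
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


# Two hypothetical plugin scripts that both use the names `counter` and `bump`.
SCRIPT_A = "counter = 1\ndef bump():\n    return counter + 1\nvalue = bump()\n"
SCRIPT_B = "counter = 100\ndef bump():\n    return counter + 10\nvalue = bump()\n"

results = run_all({"script_a": SCRIPT_A, "script_b": SCRIPT_B})
print(results["script_a"]["value"], results["script_b"]["value"])
```

    Both scripts still share sys.modules, builtins and every imported module,
    so this sits at the "inadvertent interaction" end of the scale.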

    The closest analogy for understanding would be browser plugins: Scripts
    from multiple authors who for the most part aren't looking to create
    deliberate incompatibilities or interference between plugins. The
    isolation is basic, and some effort is made to make sure that one plugin
    can't cripple another trivially, but the protection is not exhaustive.

    Strangely enough, the GIL restriction isn't a big one in this case. For
    the application, the common case is actually one script running at a
    time, with other scripts waiting or not running at that time. They do
    sometimes overlap, but this isn't the common case. If it turned out that
    only one script could be progressing at a time, it's an annoyance but
    not a deal-breaker. If it's suboptimal (as seems to be the case), then
    it's actually not a major issue.

    With the single interpreter and multiple thread approach suggested, do
    you know if this will work with threads created externally to Python,
    ie. if I can create a thread in my application as normal, and then call
    something like PyGILState_Ensure() to make sure that Python has the
    internals it needs to work with it, and then use the GIL (or similar) to
    ensure that accesses to it remain thread-safe? If the answer is yes I
    can integrate such a thing more easily as an experiment. If it requires
    calling a dedicated "control" script that feeds out threads then it
    would need a fair bit more mucking about to integrate- I'd like to avoid
    this if possible.

    Cheers,
    Garth
     
    Garthy, Dec 6, 2013
    #6
  7. Garthy

    Garthy Guest

    Hi Chris,

    On 06/12/13 19:57, Chris Angelico wrote:
    > On Fri, Dec 6, 2013 at 7:21 PM, Garthy wrote:
    >> PS. Apologies if any of these messages come through more than once. Most
    >> lists that I've posted to set reply-to meaning a normal reply can be used,
    >> but python-list does not seem to. The replies I have sent manually to
    >> instead don't seem to have appeared. I'm not quite
    >> sure what is happening- apologies for any blundering around on my part
    >> trying to figure it out.

    >
    > They are coming through more than once. If you're subscribed to the
    > list, sending to should be all you need to do -
    > where else are they going?


    I think I've got myself sorted out now. The mailing list settings are a
    bit different from what I am used to and I just need to reply to
    messages differently than I normally do.

    First attempt for three emails each went to the wrong place, second
    attempt for each appeared to have disappeared into the ether and I
    assumed non-delivery, but I was incorrect and they all actually arrived
    along with my third attempt at each.

    Apologies to all for the inadvertent noise.

    Cheers,
    Garth
     
    Garthy, Dec 6, 2013
    #7
  8. Tim Golden

    Tim Golden Guest

    On 06/12/2013 09:27, Chris Angelico wrote:
    > On Fri, Dec 6, 2013 at 7:21 PM, Garthy wrote:
    >> PS. Apologies if any of these messages come through more than once. Most
    >> lists that I've posted to set reply-to meaning a normal reply can be used,
    >> but python-list does not seem to. The replies I have sent manually to
    >> instead don't seem to have appeared. I'm not quite
    >> sure what is happening- apologies for any blundering around on my part
    >> trying to figure it out.

    >
    > They are coming through more than once. If you're subscribed to the
    > list, sending to should be all you need to do -
    > where else are they going?



    I released a batch from the moderation queue from Garthy first thing
    this [my] morning -- ie about 1.5 hours ago. I'm afraid I didn't check
    first as to whether they'd already got through to the list some other way.

    TJG
     
    Tim Golden, Dec 6, 2013
    #8
  9. On Fri, Dec 6, 2013 at 8:35 PM, Garthy wrote:
    > I think the ideal is completely sandboxed, but it's something that I
    > understand I may need to make compromises on. The bare minimum would be
    > protection against inadvertent interaction. Better yet would be a setup that
    > made such interaction annoyingly difficult, and the ideal would be where it
    > was impossible to interfere.


    In Python, "impossible to interfere" is a pipe dream. There's no way
    to stop Python from fiddling around with the file system, and if
    ctypes is available, with memory in the running program. The only way
    to engineer that kind of protection is to prevent _the whole process_
    from doing those things (using OS features, not Python features),
    hence the need to split the code out into another process (which might
    be chrooted, might be running as a user with no privileges, etc).

    A setup that makes such interaction "annoyingly difficult" is possible
    as long as your users don't think Ruby. For instance:

    # script1.py
    import sys
    sys.stdout = open("logfile", "w")
    while True: print("Blah blah")

    # script2.py
    import sys
    sys.stdout = open("otherlogfile", "w")
    while True: print("Bleh bleh")


    These two scripts won't play nicely together, because each has
    modified global state in a different module. So you'd have to set that
    as a rule. (For this specific example, you probably want to capture
    stdout/stderr to some sort of global log file anyway, and/or use the
    logging module, but it makes a simple example.) Most Python scripts
    aren't going to do this sort of thing, or if they do, will do very
    little of it. Monkey-patching other people's code is a VERY rare thing
    in Python.
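
    A small sketch of why the two scripts above collide when they share one
    interpreter, and of one way to sidestep it. sys.stdout is a single
    attribute on a shared module, so rebinding it affects all code in that
    interpreter. Passing an explicit stream avoids touching it. The function
    names here are invented for the example.

```python
import io
import sys


def demo_shared_stdout():
    """Rebind sys.stdout the way script1.py does and see who else is affected."""
    original = sys.stdout
    captured = io.StringIO()
    try:
        sys.stdout = captured            # what script1.py does at start-up
        print("from another script")     # unrelated code now lands in that stream
    finally:
        sys.stdout = original            # always restore the shared state
    return captured.getvalue()


def script_one(out):
    # Writes to the stream it was handed; sys.stdout stays untouched.
    print("Blah blah", file=out)


def script_two(out):
    print("Bleh bleh", file=out)


leaked = demo_shared_stdout()

log_one, log_two = io.StringIO(), io.StringIO()
script_one(log_one)
script_two(log_two)
```

    Here leaked holds the text printed by the "other" code, while log_one and
    log_two each hold only their own script's output.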

    > The closest analogy for understanding would be browser plugins: Scripts from
    > multiple authors who for the most part aren't looking to create deliberate
    > incompatibilities or interference between plugins. The isolation is basic,
    > and some effort is made to make sure that one plugin can't cripple another
    > trivially, but the protection is not exhaustive.


    Browser plugins probably need a lot more protection - maybe it's not
    exhaustive, but any time someone finds a way for one plugin to affect
    another, the plugin / browser authors are going to treat it as a bug.
    If I understand you, though, this is more akin to having two forms on
    one page and having JS validation code for each. It's trivially easy
    for one to check the other's form objects, but quite simple to avoid
    too, so for the sake of encapsulation you simply stay safe.

    > With the single interpreter and multiple thread approach suggested, do you
    > know if this will work with threads created externally to Python, ie. if I
    > can create a thread in my application as normal, and then call something
    > like PyGILState_Ensure() to make sure that Python has the internals it needs
    > to work with it, and then use the GIL (or similar) to ensure that accesses
    > to it remain thread-safe?


    Now that's something I can't help with. The only time I embedded
    Python seriously was a one-Python-per-process system (arbitrary number
    of processes fork()ed from one master, but each process had exactly
    one Python environment and exactly one database connection, etc), and
    I ended up being unable to make it secure, so I had to switch to
    embedding ECMAScript (V8, specifically, as it happens... I'm morbidly
    curious what my boss plans to do, now that he's fired me; he hinted at
    rewriting the C++ engine in PHP, and I'd love to be a fly on the wall
    as he tries to test a PHP extension for V8 and figure out whether or
    not he can trust arbitrary third-party compiled code). But there'll be
    someone on this list who's done threads and embedded Python.

    ChrisA
     
    Chris Angelico, Dec 6, 2013
    #9
  10. Garthy

    Garthy Guest

    Hi Chris,

    On 06/12/13 22:27, Chris Angelico wrote:
    > On Fri, Dec 6, 2013 at 8:35 PM, Garthy wrote:
    >> I think the ideal is completely sandboxed, but it's something that I
    >> understand I may need to make compromises on. The bare minimum would be
    >> protection against inadvertent interaction. Better yet would be a setup that
    >> made such interaction annoyingly difficult, and the ideal would be where it
    >> was impossible to interfere.

    >
    > In Python, "impossible to interfere" is a pipe dream. There's no way
    > to stop Python from fiddling around with the file system, and if
    > ctypes is available, with memory in the running program. The only way
    > to engineer that kind of protection is to prevent _the whole process_
    > from doing those things (using OS features, not Python features),
    > hence the need to split the code out into another process (which might
    > be chrooted, might be running as a user with no privileges, etc).


    Absolutely- it would be an impractical ideal. If it was my highest and
    only priority, CPython might not be the best place to start. But there
    are plenty of other factors that make Python very desirable to use
    regardless. :) Re file and ctypes-style functionality, that is something
    I'm going to have to find a way to limit somewhat. But first things
    first: I need to see what I can accomplish re initial embedding with a
    reasonable amount of work.

    > A setup that makes such interaction "annoyingly difficult" is possible
    > as long as your users don't think Ruby. For instance:
    >
    > # script1.py
    > import sys
    > sys.stdout = open("logfile", "w")
    > while True: print("Blah blah")
    >
    > # script2.py
    > import sys
    > sys.stdout = open("otherlogfile", "w")
    > while True: print("Bleh bleh")
    >
    >
    > These two scripts won't play nicely together, because each has
    > modified global state in a different module. So you'd have to set that
    > as a rule. (For this specific example, you probably want to capture
    > stdout/stderr to some sort of global log file anyway, and/or use the
    > logging module, but it makes a simple example.)


    Thanks for the example. Hopefully I can minimise the cases where this
    would potentially be a problem. Modifying the basic environment and the
    source is something I can do readily if needed.

    Re stdout/stderr, on that subject I actually wrote a replacement log
    catcher for embedded Python a few years back. I can't remember how on
    earth I did it now, but I've still got the code that did it somewhere.

    > Most Python scripts
    > aren't going to do this sort of thing, or if they do, will do very
    > little of it. Monkey-patching other people's code is a VERY rare thing
    > in Python.


    That's good to hear. :)

    >> The closest analogy for understanding would be browser plugins: Scripts from
    >> multiple authors who for the most part aren't looking to create deliberate
    >> incompatibilities or interference between plugins. The isolation is basic,
    >> and some effort is made to make sure that one plugin can't cripple another
    >> trivially, but the protection is not exhaustive.

    >
    > Browser plugins probably need a lot more protection - maybe it's not
    > exhaustive, but any time someone finds a way for one plugin to affect
    > another, the plugin / browser authors are going to treat it as a bug.
    > If I understand you, though, this is more akin to having two forms on
    > one page and having JS validation code for each. It's trivially easy
    > for one to check the other's form objects, but quite simple to avoid
    > too, so for the sake of encapsulation you simply stay safe.


    There have been cases where browser plugins have played funny games to
    mess with the behaviour of other plugins (eg. one plugin removing
    entries from the configuration of another). It's certainly not ideal,
    but it comes from the environment being not entirely locked down, and
    one plugin author being inclined enough to make destructive changes that
    impact another. I think the right effort/reward ratio will mean I end up
    in a similar place.

    I know it's not the best analogy, but it was one that readily came to
    mind. :)

    >> With the single interpreter and multiple thread approach suggested, do you
    >> know if this will work with threads created externally to Python, ie. if I
    >> can create a thread in my application as normal, and then call something
    >> like PyGILState_Ensure() to make sure that Python has the internals it needs
    >> to work with it, and then use the GIL (or similar) to ensure that accesses
    >> to it remain thread-safe?

    >
    > Now that's something I can't help with. The only time I embedded
    > Python seriously was a one-Python-per-process system (arbitrary number
    > of processes fork()ed from one master, but each process had exactly
    > one Python environment and exactly one database connection, etc), and
    > I ended up being unable to make it secure, so I had to switch to
    > embedding ECMAScript (V8, specifically, as it happens... I'm morbidly
    > curious what my boss plans to do, now that he's fired me; he hinted at
    > rewriting the C++ engine in PHP, and I'd love to be a fly on the wall
    > as he tries to test a PHP extension for V8 and figure out whether or
    > not he can trust arbitrary third-party compiled code). But there'll be
    > someone on this list who's done threads and embedded Python.


    Thanks in any case. I'm guessing someone with the right inclination and
    experience might see the question and jump in with their thoughts.

    Many thanks for your continued thoughts by the way. :)

    Cheers,
    Garth

    PS. As a dev with a heavy C++ background, I also wonder at the type of
    C++ engine that could be improved with a PHP rewrite. ;)
     
    Garthy, Dec 6, 2013
    #10
  11. Garthy

    Garthy Guest

    Hi Tim,

    On 06/12/13 20:47, Tim Golden wrote:
    > On 06/12/2013 09:27, Chris Angelico wrote:
    >> On Fri, Dec 6, 2013 at 7:21 PM, Garthy wrote:
    >>> PS. Apologies if any of these messages come through more than once. Most
    >>> lists that I've posted to set reply-to meaning a normal reply can be used,
    >>> but python-list does not seem to. The replies I have sent manually to
    >>> instead don't seem to have appeared. I'm not quite
    >>> sure what is happening- apologies for any blundering around on my part
    >>> trying to figure it out.

    >>
    >> They are coming through more than once. If you're subscribed to the
    >> list, sending to should be all you need to do -
    >> where else are they going?

    >
    >
    > I released a batch from the moderation queue from Garthy first thing
    > this [my] morning -- ie about 1.5 hours ago. I'm afraid I didn't check
    > first as to whether they'd already got through to the list some other way.


    I had to make a call between re-sending posts that might have gone
    missing, or seemingly not responding promptly when people had taken the
    time to answer my complex query. I made a call to re-send, and it was
    the wrong one. The fault for the double-posting is entirely mine.

    Cheers,
    Garth
     
    Garthy, Dec 6, 2013
    #11
  12. Garthy wrote:
    > To allow each script to run in its own environment, with minimal chance
    > of inadvertent interaction between the environments, whilst allowing
    > each script the ability to stall on conditions that will be later met by
    > another thread supplying the information, and to fit in with existing
    > infrastructure.


    The last time I remember this being discussed was in the context
    of allowing free threading. Multiple interpreters don't solve
    that problem, because there's still only one GIL and some
    objects are shared.

    But if all you want is for each plugin to have its own version
    of sys.modules, etc., and you're not concerned about malicious
    code, then it may be good enough.

    It seems to be good enough for mod_wsgi, because presumably
    all the people with the ability to install code on a given
    web server trust each other.

    --
    Greg
     
    Gregory Ewing, Dec 6, 2013
    #12
  13. Garthy wrote:

    > The bare minimum would be
    > protection against inadvertent interaction. Better yet would be a setup
    > that made such interaction annoyingly difficult, and the ideal would be
    > where it was impossible to interfere.


    To give you an idea of the kind of interference that's
    possible, consider:

    1) You can find all the subclasses of a given class
    object using its __subclasses__() method.

    2) Every class ultimately derives from class object.

    3) All built-in class objects are shared between
    interpreters.

    So, starting from object.__subclasses__(), code in any
    interpreter could find any class defined by any other
    interpreter and mutate it.

    This is not something that is likely to happen by
    accident. Whether it's "annoyingly difficult" enough
    is something you'll have to decide.
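
    The three steps above can be sketched as follows. This runs inside a
    single interpreter only. That the same walk reaches classes from other
    sub-interpreters is the claim made in this post, not something the snippet
    demonstrates. PluginState and find_direct_subclass are made-up names.

```python
class PluginState:
    """Stands in for a class defined by some other plugin."""
    limit = 10


def find_direct_subclass(name):
    """Locate a class by name among the direct subclasses of object."""
    # Step 1 and 2: every class derives from object, and object can list
    # its direct subclasses.
    for cls in object.__subclasses__():
        if cls.__name__ == name:
            return cls
    return None


# Code that was never handed a reference to PluginState can still find it...
found = find_direct_subclass("PluginState")

# ...and mutate it, which every user of the class will then observe.
found.limit = 9999
print(PluginState.limit)
```

    Reaching indirect subclasses would need a recursive walk, but the idea is
    the same.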

    Also keep in mind that it's fairly easy for Python
    code to chew up large amounts of memory and/or CPU
    time in an uninterruptible way, e.g. by
    evaluating 5**100000000. So even a thread that's
    keeping its hands entirely to itself can still
    cause trouble.
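
    For contrast, a hedged sketch of the separate-process fallback mentioned
    earlier in the thread: when a script runs in a child process, the host can
    at least kill it after a deadline, which it cannot do to a thread stuck in
    one long computation. run_isolated is a hypothetical helper, and a tight
    loop stands in for the runaway expression so the example stays cheap.

```python
import subprocess
import sys


def run_isolated(source, timeout_seconds):
    """Run `source` in a child Python process.

    Returns the child's stdout, or None if it had to be killed on timeout.
    """
    try:
        completed = subprocess.run(
            [sys.executable, "-c", source],
            capture_output=True,
            text=True,
            timeout=timeout_seconds,   # child is killed when this expires
        )
    except subprocess.TimeoutExpired:
        return None
    return completed.stdout


# A well-behaved script finishes and returns its output.
quick = run_isolated("print(5**3)", timeout_seconds=10)

# A runaway script is terminated by the host instead of stalling it.
stuck = run_isolated("while True: pass", timeout_seconds=1)
```

    This only bounds wall-clock time. Memory and CPU limits would need
    OS-level facilities on top.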

    --
    Greg
     
    Gregory Ewing, Dec 6, 2013
    #13
  14. Garthy

    Garthy Guest

    Hi Gregory,

    On 07/12/13 08:53, Gregory Ewing wrote:
    > Garthy wrote:
    >> The bare minimum would be protection against inadvertent interaction.
    >> Better yet would be a setup that made such interaction annoyingly
    >> difficult, and the ideal would be where it was impossible to interfere.

    >
    > To give you an idea of the kind of interference that's
    > possible, consider:
    >
    > 1) You can find all the subclasses of a given class
    > object using its __subclasses__() method.
    >
    > 2) Every class ultimately derives from class object.
    >
    > 3) All built-in class objects are shared between
    > interpreters.
    >
    > So, starting from object.__subclasses__(), code in any
    > interpreter could find any class defined by any other
    > interpreter and mutate it.


    Many thanks for the excellent example. It was not clear to me how
    readily such a small and critical bit of shared state could potentially
    be abused across interpreter boundaries. I am guessing this would be the
    first in a chain of potential problems I may run into.

    > This is not something that is likely to happen by
    > accident. Whether it's "annoyingly difficult" enough
    > is something you'll have to decide.


    I think it'd fall under "protection against inadvertent modification",
    somewhat down the scale. Such interference doesn't sound like it would
    be too difficult to achieve if the author was so inclined.

    > Also keep in mind that it's fairly easy for Python
    > code to chew up large amounts of memory and/or CPU
    > time in an uninterruptible way, e.g. by
    > evaluating 5**100000000. So even a thread that's
    > keeping its hands entirely to itself can still
    > cause trouble.


    Thanks for the tip. The potential for deliberate resource exhaustion is
    unfortunately something that I am likely going to have to put up with in
    order to keep things in the same process.

    Cheers,
    Garth
     
    Garthy, Dec 7, 2013
    #14
  15. Garthy

    Garthy Guest

    Hi Gregory,

    On 07/12/13 08:39, Gregory Ewing wrote:
    > Garthy wrote:
    >> To allow each script to run in its own environment, with minimal
    >> chance of inadvertent interaction between the environments, whilst
    >> allowing each script the ability to stall on conditions that will be
    >> later met by another thread supplying the information, and to fit in
    >> with existing infrastructure.

    >
    > The last time I remember this being discussed was in the context
    > of allowing free threading. Multiple interpreters don't solve
    > that problem, because there's still only one GIL and some
    > objects are shared.


    I am fortunate in my case, as the normal impact of the GIL would be much
    reduced. The common case is only one script actively progressing at a
    time, with the others either not running or waiting for external input
    to continue.

    But as you point out in your other reply, there are still potential
    concerns that arise from the smaller set of shared objects even across
    interpreters.

    > But if all you want is for each plugin to have its own version
    > of sys.modules, etc., and you're not concerned about malicious
    > code, then it may be good enough.


    I wouldn't say that I wasn't concerned about it entirely, but on the
    other hand it is not a hard requirement to which all other concerns are
    secondary.

    Cheers,
    Garth
     
    Garthy, Dec 7, 2013
    #15