SoC project: Python-Haskell bridge - request for feedback

Discussion in 'Python' started by Michał Janeczek, Mar 24, 2008.

  1. Hi,

    I am a student interested in participating in this year's SoC.
    At http://tsk.ch.uj.edu.pl/~janeczek/socapp.html (and also below
    in this email) you can find a draft of my project proposal.

    I'd like to ask you to comment on it, especially the deliverables
    part. Are you interested in such a project, and if yes, what features
    would be most important to you? Is anything missing, or should
    something get more priority or attention?

    Regards,
    Michal


    Python-Haskell bridge
    =====================

    Description
    -----------

    This project will seek to provide a comprehensive, high level (and thus
    easy to use) binding between the Haskell and Python programming languages.
    This will allow each language to use libraries written in the other.


    Benefits for Python
    -------------------

    * Robust, high assurance components

    It might be beneficial to implement safety-critical components
    in a strongly, statically typed language, using Python to tie
    them together. Cryptography and authentication modules are
    examples.

    * Performance improvements for speed-critical code

    Haskell compiled to native code is typically an order of magnitude
    faster than Python. Aside from that, advanced language features
    (such as a multicore parallel runtime, very lightweight threads
    and software transactional memory) can further improve
    performance. Haskell could become a safe, high level alternative
    to commonly used C extensions.

    * Access to sophisticated libraries

    While its set of libraries is not as comprehensive as that of
    Python, Haskell can still offer some well tested, efficient
    libraries. Examples might be rich parser combinator libraries
    (like Parsec) and persistent, functional data structures.
    The QuickCheck testing library could also be used to drive analysis
    of Python code.


    Benefits for Haskell
    --------------------

    The project would benefit Haskell by providing it with access to
    an impressive suite of libraries. It also has the potential to help
    Haskell adoption by mitigating the risk of using Haskell in a project.


    Deliverables
    ------------

    * A low level library to access Python objects from Haskell

    * A set of low level functions to convert built-in data types
    between Haskell and Python (strings, numbers, lists,
    dictionaries, functions, generators etc.)

    * A higher level library allowing easy (transparent) access to
    Python functions from Haskell, and wrapping Haskell functions
    for Python to access

    * A way to easily derive conversion functions for user-defined
    data types/objects. Functions derived in such a way should
    work well with both low level and high level access libraries

    * Documentation and a set of examples for all of the above


    Optional goals
    --------------

    These are of lower priority, and might require a fair amount of work.
    I would like to implement most of them, if technically feasible. If
    they don't fit into the Summer of Code timeframe, I plan to finish
    them afterwards.

    * A Python module for accessing functions from Haskell modules without
    manual wrapping (such wrapping should already be easy thanks to the
    high level library). It would be accomplished through the GHC API, if
    it allows this. The Haskell side of the high level library will already
    support such a mode of operation.

    * Extend and refactor the code to make it support other similar
    dynamic languages. This is a lot of work, and definitely out of
    the scope of a Summer of Code project, but some design decisions
    may be influenced by this.


    Related projects
    ----------------

    They (and quite possibly some others) will be referenced for ideas.

    * MissingPy

    Provides a one way, low level binding to Python. Some of the code
    can possibly be reused, especially data conversion functions. It
    doesn't seem to export all features; in particular, function
    callbacks are not supported.

    * HaXR

    XML-RPC binding for Haskell. It could provide inspiration for
    reconciling Haskell and Python type systems, resulting in a
    friendly interface

    * rocaml

    A binding between Ruby and OCaml
    Michał Janeczek, Mar 24, 2008
    #1

  2. Paul Rubin

    A few thoughts. The envisioned Python-Haskell bridge would have two
    directions: 1) calling Haskell code from Python; 2) calling Python
    code from Haskell. The proposal spends more space on #1 but I think
    #1 is both more difficult and less interesting. By "Haskell" I
    presume you mean GHC. I think that the GHC runtime doesn't embed very
    well, despite the example on the Python wiki
    (http://wiki.python.org/moin/PythonVsHaskell near the bottom). This
    is especially true if you want to use the concurrency stuff. The GHC
    runtime wants to trap the timer interrupt and do select based i/o
    among other things. And I'm not sure that wanting to call large
    Haskell components under a Python top-level is that compelling: why
    not write the top level in Haskell too? The idea of making the
    critical components statically typed for safety is less convincing if
    the top level is untyped.

    There is something to be said for porting some functional data
    structures to Python, but I think that's mostly limited to the simpler
    ones like Data.Map (which I've wanted several times). And I think
    this porting is most easily done by simply reimplementing those
    structures in a Python-friendly style using Python's C API. The type
    signatures (among other things) on the Haskell libraries for this
    stuff tend to be a little too weird for Python; for example,
    Data.Map.lookup runs in an arbitrary monad which controls the error
    handling for a missing key. The Python version should be like a dict,
    where you give it a key and a default value to return if the key is
    not found. Plus, do you really want Python pointers into Haskell data
    structures to be enrolled with both systems' garbage collectors?
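
    A minimal sketch of the dict-like interface described above, assuming a
    plain unbalanced binary search tree rather than the balanced tree that
    Data.Map uses. The names Node, insert and get are made up for this
    illustration and do not come from any existing library. Lookup takes a
    default value, as dict.get does, instead of signalling a missing key
    through a monad.

```python
from typing import Any, NamedTuple, Optional


class Node(NamedTuple):
    """One immutable node of a (deliberately unbalanced) binary search tree."""
    key: Any
    value: Any
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def insert(node, key, value):
    """Return a new tree containing key -> value; the old tree is unchanged."""
    if node is None:
        return Node(key, value)
    if key < node.key:
        return node._replace(left=insert(node.left, key, value))
    if key > node.key:
        return node._replace(right=insert(node.right, key, value))
    return node._replace(value=value)


def get(node, key, default=None):
    """Dict-style lookup: return the default instead of failing on a missing key."""
    while node is not None:
        if key == node.key:
            return node.value
        node = node.left if key < node.key else node.right
    return default


empty = None  # the empty map is simply "no node"
m1 = insert(insert(empty, "b", 2), "a", 1)
m2 = insert(m1, "b", 20)  # m1 is untouched; m2 shares the "a" subtree with it
```

    Here get(m1, "b") is still 2 after m2 is built, get(m2, "b") is 20, and
    get(m1, "z", "missing") returns the default. A C implementation would
    follow the same shape, with rebalancing added.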

    (Actually (sidetrack that I just thought of), a Cyclone API would be
    pretty useful for writing safe Python extensions. Cyclone is a
    type-safe C dialect, see cyclone.thelanguage.org ).

    The Haskell to Python direction sounds more useful, given Haskell's
    weirdness and difficulty. Python is easy to learn and well-packaged
    for embedding, so it's a natural extension language for Haskell
    applications. If you wrote a database in Haskell, you could let
    people write stored procedures in Python if they didn't want to deal
    with Haskell's learning curve. Haskell would call Python through its
    "safe" FFI (which runs the extension in a separate OS thread) and not
    have to worry much about the Python side doing IO or whatever. Of
    course this would also let Python call back into the Haskell system,
    perhaps passing Python values as Data.Dynamic, or else using something
    like COM interface specifications.

    Anyway I'm babbling now, I may think about this more later.
    Paul Rubin, Mar 26, 2008
    #2

  3. Thanks for finding time to reply!

    On 26 Mar 2008 01:46:38 -0700, Paul Rubin
    <"http://phr.cx"@nospam.invalid> wrote:
    > A few thoughts. The envisioned Python-Haskell bridge would have two
    > directions: 1) calling Haskell code from Python; 2) calling Python
    > code from Haskell. The proposal spends more space on #1 but I think
    > #1 is both more difficult and less interesting. By "Haskell" I
    > presume you mean GHC. I think that the GHC runtime doesn't embed very
    > well, despite the example on the Python wiki
    > (http://wiki.python.org/moin/PythonVsHaskell near the bottom). This
    > is especially true if you want to use the concurrency stuff. The GHC
    > runtime wants to trap the timer interrupt and do select based i/o
    > among other things. And I'm not sure that wanting to call large
    > Haskell components under a Python top-level is that compelling: why
    > not write the top level in Haskell too? The idea of making the
    > critical components statically typed for safety is less convincing if
    > the top level is untyped.


    I wasn't aware of the runtime issues, these can be things to watch out
    for. However, the type of embedding that I imagined would be mostly
    pure functions, since Python can deal with IO rather well. It'd also
    be applicable in situations where we want to add some functionality to
    an existing, large Python project, where a complete rewrite
    would be infeasible.

    > There is something to be said for porting some functional data
    > structures to Python, but I think that's mostly limited to the simpler
    > ones like Data.Map (which I've wanted several times). And I think
    > this porting is most easily done by simply reimplementing those
    > structures in a Python-friendly style using Python's C API. The type
    > signatures (among other things) on the Haskell libraries for this
    > stuff tend to be a little too weird for Python; for example,
    > Data.Map.lookup runs in an arbitrary monad which controls the error
    > handling for a missing key. The Python version should be like a dict,
    > where you give it a key and a default value to return if the key is
    > not found. Plus, do you really want Python pointers into Haskell data
    > structures to be enrolled with both systems' garbage collectors?


    I didn't mention this in this first draft, but I don't know (yet)
    how to support those "fancy" types. The plan for now is to export
    monomorphic functions only. As for GC, I think having the two systems
    involved is unavoidable if I want to have first class functions on
    both sides.

    > The Haskell to Python direction sounds more useful, given Haskell's
    > weirdness and difficulty. Python is easy to learn and well-packaged
    > for embedding, so it's a natural extension language for Haskell
    > applications. If you wrote a database in Haskell, you could let
    > people write stored procedures in Python if they didn't want to deal
    > with Haskell's learning curve. Haskell would call Python through its
    > "safe" FFI (which runs the extension in a separate OS thread) and not
    > have to worry much about the Python side doing IO or whatever. Of
    > course this would also let Python call back into the Haskell system,
    > perhaps passing Python values as Data.Dynamic, or else using something
    > like COM interface specifications.


    That is one of the use cases I have missed in the first draft.
    Thanks for the idea!

    > Anyway I'm babbling now, I may think about this more later.

    By all means, please do go on :) This has helped a lot :)

    Regards,
    Michal
    Michał Janeczek, Mar 27, 2008
    #3
  4. malkarouri

    On 26 Mar, 08:46, Paul Rubin wrote:
    > A few thoughts. The envisioned Python-Haskell bridge would have two
    > directions: 1) calling Haskell code from Python; 2) calling Python
    > code from Haskell. The proposal spends more space on #1 but I think
    > #1 is both more difficult and less interesting.


    FWIW, I find #1 more interesting for me personally.
    As a monad-challenged person, I find it much easier to develop
    components using pure functional programming in a language like
    Haskell and do all my I/O in Python than having it the other way
    round.
    Of course, if it is more difficult then I wouldn't expect it from a
    SoC project, but that's that.

    Muhammad Alkarouri
    malkarouri, Mar 27, 2008
    #4
  5. Paul Rubin

    malkarouri writes:
    > FWIW, I find #1 more interesting for me personally.
    > As a monad-challenged person, I find it much easier to develop
    > components using pure functional programming in a language like
    > Haskell and do all my I/O in Python than having it the other way
    > round.


    Haskell i/o is not that complicated, and monads are used in pure
    computation as well as for i/o. Without getting technical, monads are
    the bicycles that Haskell uses to get values from one place to another
    in the right order. They are scary at first, but once you get a
    little practice riding them, the understanding stays with you and
    makes perfect sense.
    Paul Rubin, Mar 28, 2008
    #5
  6. Paul Rubin

    "Michał Janeczek" writes:
    > I wasn't aware of the runtime issues, these can be things to watch out
    > for. However, the type of embedding that I imagined would be mostly
    > pure functions, since Python can deal with IO rather well. It'd also
    > be applicable in situations where we want to add some functionality to
    > an existing, large Python project, where a complete rewrite
    > would be infeasible.


    Of course I can't say what functions someone else would want to use,
    but I'm not seeing very convincing applications of this myself. There
    aren't that many prewritten Haskell libraries (especially monomorphic
    ones) that I could see using that way. And if I'm skilled enough with
    Haskell to write the functions myself, I'd probably rather embed
    Python in a Haskell app than the other way around. Haskell's i/o has
    gotten a lot better recently (Data.ByteString) though there is
    important stuff still in progress (bytestring unicode). For other
    pure functions (crypto, math, etc.) there are generally libraries
    written in C already interfaced to Python (numarray, etc.), maybe
    through Swig. The case for Haskell isn't that compelling.

    > I didn't mention this in this first draft, but I don't know (yet)
    > how to support those "fancy" types. The plan for now is to export
    > monomorphic functions only.


    This probably loses most of the interesting stuff: parser combinators,
    functional data structures like zippers, etc. Unless you mean to
    use templates to make specialized versions?

    > As for GC, I think having the two systems involved is unavoidable if
    > I want to have first class functions on both sides.


    This just seems worse and worse the more I think about it. Remember
    that GHC uses a copying gc so there is no finalization and therefore
    no way to notify python that a reference has been freed. And you'd
    probably have to put Haskell pointers into Python's heap objects so
    that the Haskell gc wouldn't have to scan the whole Python heap.
    Also, any low level GHC gc stuff (not sure if there would be any)
    might have to be redone for GHC 6.10(?) which is getting a new
    (parallel) gc. Maybe I'm not thinking of this the right way though, I
    haven't looked at the low level ghc code.
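
    To make the Python half of this concrete, here is a rough sketch of a
    Python heap object that owns a handle to a value kept alive by another
    runtime, and that releases the handle when Python frees the object. It
    says nothing about how GHC's collector behaves. Everything here is an
    assumption for illustration: the dict _foreign_table stands in for
    whatever table the Haskell side would keep, and foreign_retain,
    foreign_release and ForeignRef are made-up names.

```python
import gc
import itertools
import weakref

# Stand-in for the foreign runtime's table of values it must keep alive.
# In a real bridge this would live on the Haskell side; here it is a dict.
_foreign_table = {}
_next_handle = itertools.count(1)


def foreign_retain(value):
    """Pretend to register a value with the foreign runtime; return a handle."""
    handle = next(_next_handle)
    _foreign_table[handle] = value
    return handle


def foreign_release(handle):
    """Pretend to tell the foreign runtime that the handle is no longer used."""
    _foreign_table.pop(handle, None)


class ForeignRef:
    """Python heap object that owns exactly one foreign handle."""

    def __init__(self, value):
        self.handle = foreign_retain(value)
        # The callback receives only the handle, never self, so it does not
        # keep the proxy alive.
        self._finalizer = weakref.finalize(self, foreign_release, self.handle)

    def get(self):
        """Fetch the value the handle refers to."""
        return _foreign_table[self.handle]


ref = ForeignRef("a value owned by the other runtime")
handle = ref.handle
alive_before = handle in _foreign_table
del ref
gc.collect()  # not needed with CPython's reference counting, but harmless
alive_after = handle in _foreign_table
```

    With this layout the other runtime only has to consult its own table to
    know what Python still holds, rather than scanning Python's heap. The
    reverse notification, from the Haskell collector back to Python, is the
    part the paragraph above is worried about and is not addressed here.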

    Keep in mind also that Python style tends to not use complex data
    structures and fancy sharing of Haskell structures may not be in the
    Python style. Python uses extensible lists and mutable dictionaries
    for just about everything, relying on the speed of the underlying C
    functions to do list operations very fast (C-coded O(n) operation
    faster than interpreted O(log n) operation for realistic n). So maybe
    this type of sharing won't be so useful.

    It may be simplest to just marshal data structures across a message
    passing interface rather than really try to share values between the
    two systems. For fancy functional structures, from a Python
    programmer's point of view, it is probably most useful to just pick a
    few important ones and code them in C from scratch for direct use in
    Python. Hedgehog Lisp (google for it) has a nice C implementation of
    functional maps that could probably port easily to the Python C API,
    and I've been sort of wanting to do that. It would be great if
    you beat me to it.
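
    The message-passing approach can be sketched as follows, assuming
    line-delimited JSON over a pipe as the wire format (the thread does not
    name one). The worker below is a second Python process standing in for
    a separate Haskell executable; WORKER_SOURCE and call_worker are
    made-up names for this illustration.

```python
import json
import subprocess
import sys

# Stand-in worker: in a real setup this would be a separate Haskell executable
# that reads one JSON request per line and writes one JSON reply per line.
WORKER_SOURCE = r"""
import json, sys
for line in sys.stdin:
    request = json.loads(line)
    reply = {"id": request["id"], "result": sorted(set(request["args"]))}
    sys.stdout.write(json.dumps(reply) + "\n")
    sys.stdout.flush()
"""


def call_worker(proc, request_id, args):
    """Send one request and block until the matching reply line arrives."""
    proc.stdin.write(json.dumps({"id": request_id, "args": args}) + "\n")
    proc.stdin.flush()
    reply = json.loads(proc.stdout.readline())
    assert reply["id"] == request_id
    return reply["result"]


proc = subprocess.Popen(
    [sys.executable, "-c", WORKER_SOURCE],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    text=True,
)
try:
    result = call_worker(proc, 1, [3, 1, 2, 3])
finally:
    # Closing stdin makes the worker's read loop end, so it exits cleanly.
    proc.stdin.close()
    proc.wait()
    proc.stdout.close()
```

    Only plain data crosses the boundary, so neither garbage collector has
    to know about the other. The cost is a copy per call, and first class
    functions cannot be passed this way.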

    > > Anyway I'm babbling now, I may think about this more later.

    > By all means, please do go on :) This has helped a lot :)


    One thing I highly recommend is that you join the #haskell channel on
    irc.freenode.net. There are a lot of real experts there (I'm just a
    newbie) who can advise you better than I can, and you can talk to them
    in real time.
    Paul Rubin, Mar 28, 2008
    #6