Wrapping a C library in Python

Roy Smith

I've got a C library with about 50 calls in it that I want to wrap in
Python. I know I could use some tool like SWIG, but that will give me a
too-literal translation; I want to make some modifications along the way
to make the interface more Pythonic.

For example, all of these functions return an error code (typically just
errno passed along, but not always). They all accept as one of their
arguments a pointer to someplace to store their result. I want to
change all that to returning the result directly and throwing exceptions.

I also want to mutate some of the return types. A common way these
functions return a set of values is a pair of arrays of strings, forming
key-value pairs. In Python, it would make sense to return this as a
dictionary.

I know what I'm describing is kind of vague, but are there tools around
which might help automate (at least in part) this translation process?
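
A minimal sketch of the dictionary idea, assuming the C call hands back two parallel arrays of NUL-terminated strings plus a count. The arrays are built in Python with ctypes here so the example runs without any library; `pairs_to_dict` is a hypothetical helper name, not part of any existing API.

```python
import ctypes


def pairs_to_dict(keys, values, count):
    """Convert two parallel C string arrays into a Python dict.

    keys and values can be anything indexable that yields bytes,
    such as a ctypes array of c_char_p filled in by a C function.
    """
    return {keys[i].decode(): values[i].decode() for i in range(count)}


# Stand-ins for the arrays a real library call would fill in.
StringArray = ctypes.c_char_p * 2
c_keys = StringArray(b"host", b"port")
c_values = StringArray(b"localhost", b"8080")

result = pairs_to_dict(c_keys, c_values, 2)
print(result)
```

The same helper would work on pointers returned by a real library, as long as the count is known and the strings stay alive until they are copied.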
 
François Pinard

[Roy Smith]
I've got a C library with about 50 calls in it that I want to wrap in
Python. I know I could use some tool like SWIG, but that will give me a
too-literal translation; I want to make some modifications along the way
to make the interface more Pythonic.

I used Pyrex with both pleasure and success for wrapping C libraries
while giving the API a more Pythonic flavour. I found Pyrex to be a
wonderful tool for easily doing such things.
 
Benji York

Roy said:
I've got a C library with about 50 calls in it that I want to wrap in
Python.

I'd recommend ctypes (http://starship.python.net/crew/theller/ctypes/).
It is very easy to use and multi-platform.

I have a small project that provides a DB API 2.0 interface to the ODBTP
(http://odbtp.sf.net) library. See http://benjiyork.com/odbtp.html
(beware programmer-web-design ahead) for the code (LGPL). It might
serve as an example to get you on your way.

BTW, if anyone is interested, I hope to have a much improved version of
the wrapper ready in a few weeks to be included in the official ODBTP
distribution.
 
Michael Loritsch

Roy Smith said:
I've got a C library with about 50 calls in it that I want to wrap in
Python. I know I could use some tool like SWIG, but that will give me a
too-literal translation; I want to make some modifications along the way
to make the interface more Pythonic.

For example, all of these functions return an error code (typically just
errno passed along, but not always). They all accept as one of their
arguments a pointer to someplace to store their result. I want to
change all that to returning the result directly and throwing exceptions.

I also want to mutate some of the return types. A common way these
functions return a set of values is a pair of arrays of strings, forming
key-value pairs. In Python, it would make sense to return this as a
dictionary.

I know what I'm describing is kind of vague, but are there tools around
which might help automate (at least in part) this translation process?

While SWIG is a strong tool for wrapping C code into many different
languages, there are definitely better tools out there for producing
python modules from C code.

I recommend both boost::python
(http://www.boost.org/libs/python/doc/index.html) and ctypes
(http://starship.python.net/crew/theller/ctypes/), but for very
different reasons.

If wrapped code speed and production of a binary is a goal in creating
your python extensions, use boost::python. The price you pay for
creating a fast binary python extension is coding your translation
from python to C/C++ in C/C++ (I suppose you could get around this by
creating a boost::python module, and then wrapping it with a python
module to do the type translation). And in boost::python, C++ to
python exception translation is supported.

If speed of development is your main goal, then I'd use ctypes.
ctypes will dynamically load the library in python code. At the
python code level, you should then do the translation to python types.
No C/C++ coding required.

Hope this helps!

Michael Loritsch
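
A rough sketch of the ctypes approach for the "error code plus output pointer" pattern described in the question. It assumes a POSIX system and uses libc's posix_memalign as a stand-in, since it returns an errno-style code and stores its result through a pointer; `aligned_alloc` is a hypothetical wrapper name chosen for illustration.

```python
import ctypes
import os

# Assumption: a POSIX system, where CDLL(None) exposes the C library
# already loaded into the Python process.
libc = ctypes.CDLL(None)

# int posix_memalign(void **memptr, size_t alignment, size_t size);
libc.posix_memalign.argtypes = [
    ctypes.POINTER(ctypes.c_void_p),
    ctypes.c_size_t,
    ctypes.c_size_t,
]
libc.posix_memalign.restype = ctypes.c_int
libc.free.argtypes = [ctypes.c_void_p]
libc.free.restype = None


def aligned_alloc(alignment, size):
    """Return the address of a new block, raising OSError on failure."""
    out = ctypes.c_void_p()  # storage for the C function's result
    rc = libc.posix_memalign(ctypes.byref(out), alignment, size)
    if rc != 0:
        # posix_memalign returns an errno value instead of setting errno.
        raise OSError(rc, os.strerror(rc))
    return out.value


addr = aligned_alloc(64, 256)
print(hex(addr))
libc.free(addr)
```

The wrapper hides both the output pointer and the return code, so Python callers see only a return value or an exception.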
 
John Hunter

Roy> For example, all of these functions return an error code
Roy> (typically just errno passed along, but not always). They
Roy> all accept as one of their arguments a pointer to someplace
Roy> to store their result. I want to change all that to
Roy> returning the result directly and throwing exceptions.

Roy> I also want to mutate some of the return types. A common way
Roy> these functions return a set of values is a pair of arrays of
Roy> strings, forming key-value pairs. In Python, it would make
Roy> sense to return this as a dictionary.

SWIG can do all this - see the section on typemaps and call policies
in the SWIG manual.

For example, it is easy to tell SWIG that pointers are used for output:

%include "typemaps.i"
void somefunc(double *OUTPUT, double *OUTPUT);

will be called from python like

x, y = o.somefunc()

And you need to do no more than pull in typemaps.i and add the one
declaration line. INPUT, OUTPUT and INOUT are special tokens that SWIG
recognizes and applies translation rules to. You can define your own
such tokens to do more complicated things (like turning a double *array
into a python list, etc.).

Michael> I recommend both boost::python
Michael> (http://www.boost.org/libs/python/doc/index.html) and
Michael> ctypes (http://starship.python.net/crew/theller/ctypes/),
Michael> but for very different reasons.

Michael> If wrapped code speed and production of a binary is a
Michael> goal in creating your python extensions, use
Michael> boost::python. The price you pay for creating a fast
Michael> binary python extension is coding your translation from
Michael> python to C/C++ in C/C++ (I suppose you could get around
Michael> this by creating a boost::python module, and then
Michael> wrapping it with a python module to do the type
Michael> translation). And in boost::python, C++ to python
Michael> exception translation is supported.

I don't fully agree with this. I've been working on a wrapper for
antigrain, a C++ library that makes heavy use of templates. I started
off using pyste (a boost::python generator) and boost. It worked
reasonably well - pyste is not being actively maintained right now, but you
can usually work around the limitations by writing boost code where
you need to. I was reasonably happy, until I saw my *.so files
ballooning. After wrapping a small fraction of the library, and
having instantiated only a few of the many templates I ultimately
wanted, my extension files were at 20MB, which is *much larger* than
the agg library or moderately sophisticated applications built around
it. And I still had *a lot* left to expose!

I started over in SWIG. First, SWIG has excellent support for C++ and
templates, and in many cases could auto-wrap an entire header with

%include "someheader.h"

Second, by the time I had the SWIG wrapping to a comparable point that
the boost wrapping was at when I started over, I had only a 500K
extension module. Same functionality, 40x smaller. Since ultimately
I may want to distribute my software in compiled form (eg a windows
installer) this is an important difference. Third, the compile times
were much shorter in SWIG. Fourth, SWIG produces c and cxx files as
its output, which you can distribute with your app (user doesn't need
to have SWIG to compile). This is not true for boost.

In a nutshell, for wrapping a large C++ library (not the original
poster's question, I know), I found SWIG more suitable for the reasons
above. I don't want to slam boost - I think it is an awesome package
-- but you should be aware of these potential problems. The ease of
wrapping agg was comparable in boost and SWIG.

Where boost (and pycxx) really shine above SWIG is when you want to
write functions and methods yourself that interact with python objects
-- for example if you were writing a python extension largely from
scratch rather than wrapping an existing library. The ability to use
friendly boost C++ classes like dict and list that manage memory for
you and have a pythonic feel is great.

JDH
 
David M. Cooke

John Hunter said:
Michael> I recommend both boost::python
Michael> (http://www.boost.org/libs/python/doc/index.html) and
Michael> ctypes (http://starship.python.net/crew/theller/ctypes/),
Michael> but for very different reasons.

Michael> If wrapped code speed and production of a binary is a
Michael> goal in creating your python extensions, use
Michael> boost::python. The price you pay for creating a fast
Michael> binary python extension is coding your translation from
Michael> python to C/C++ in C/C++ (I suppose you could get around
Michael> this by creating a boost::python module, and then
Michael> wrapping it with a python module to do the type
Michael> translation). And in boost::python, C++ to python
Michael> exception translation is supported.

I don't fully agree with this. I've been working on a wrapper for
antigrain, a C++ library that makes heavy use of templates. I started
off using pyste (a boost::python generator) and boost. It worked
reasonably well - pyste is not being actively maintained right now, but you
can usually work around the limitations by writing boost code where
you need to. I was reasonably happy, until I saw my *.so files
ballooning. After wrapping a small fraction of the library, and
having instantiated only a few of the many templates I ultimately
wanted, my extension files were at 20MB, which is *much larger* than
the agg library or moderately sophisticated applications built around
it. And I still had *a lot* left to expose!

Did you strip the extension modules (run 'strip' on the .so file)? I
know, there's nothing in distutils that will do that automatically. I
just did this on an extension module of mine using boost::python.
Before stripping, it was about 1.3 MB, afterwards, 50 kB.

The problem is the symbol table is kept by default, and it seems with
GNU C++ (at least) that can get *huge* when templates are involved.

Compile times are still a pain.
 
Roy Smith

If speed of development is your main goal, then I'd use ctypes.
ctypes will dynamically load the library in python code. At the
python code level, you should then do the translation to python types.
No C/C++ coding required.

OK, I decided to give ctypes a try. After a few false steps (mostly
related to sorting out LD_LIBRARY_PATH issues), I got it to work for
simple functions. Once you get your head around how it works, it's
pretty neat.

The problem is, I'm at a loss what to do for a slightly more complex
case. The API I'm working with requires that you create a dm_handle
which you then pass into all the other calls to establish a context.
You start with (approximately):

/*
* Create a handle. The caller must have allocated memory for
* the object pointed to by dmh.
*/
create_dm_handle (struct dm_handle *dmh);

I don't see how I can do the required memory allocation. Calling
malloc() directly seems kind of scary. Even if I could do that, doing
sizeof (struct dm_handle) won't work in the Python environment.

Am I missing something obvious here?
 
Thomas Heller

Roy Smith said:
OK, I decided to give ctypes a try. After a few false steps (mostly
related to sorting out LD_LIBRARY_PATH issues), I got it to work for
simple functions. Once you get your head around how it works, it's
pretty neat.

The problem is, I'm at a loss what to do for a slightly more complex
case. The API I'm working with requires that you create a dm_handle
which you then pass into all the other calls to establish a context.
You start with (approximately):

/*
* Create a handle. The caller must have allocated memory for
* the object pointed to by dmh.
*/
create_dm_handle (struct dm_handle *dmh);

I don't see how I can do the required memory allocation. Calling
malloc() directly seems kind of scary. Even if I could do that, doing
sizeof (struct dm_handle) won't work in the Python environment.

Am I missing something obvious here?

It's simple.
First, you define the structure:

import ctypes

class dm_handle(ctypes.Structure):
    _fields_ = [....]  # whatever it is

Then, create an instance (this will allocate memory internally):

my_handle = dm_handle()

and finally call the function, passing it a pointer to the structure:

mydll.create_dm_handle(ctypes.byref(my_handle))

Thomas
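
For a version of these three steps that runs without the dm library, here is a sketch using libc's clock_gettime, which also fills in a caller-allocated structure. It assumes a 64-bit POSIX system where struct timespec is two longs and CLOCK_REALTIME is 0; the layout below is that assumption, not something taken from this thread.

```python
import ctypes
import os
import time

# Assumption: 64-bit POSIX, where struct timespec is two longs and
# CLOCK_REALTIME has the value 0.
CLOCK_REALTIME = 0


class timespec(ctypes.Structure):
    _fields_ = [("tv_sec", ctypes.c_long), ("tv_nsec", ctypes.c_long)]


libc = ctypes.CDLL(None, use_errno=True)
# int clock_gettime(clockid_t clk_id, struct timespec *tp);
libc.clock_gettime.argtypes = [ctypes.c_int, ctypes.POINTER(timespec)]
libc.clock_gettime.restype = ctypes.c_int

ts = timespec()  # ctypes allocates the memory for the struct here
if libc.clock_gettime(CLOCK_REALTIME, ctypes.byref(ts)) != 0:
    err = ctypes.get_errno()
    raise OSError(err, os.strerror(err))

print(ts.tv_sec, ts.tv_nsec)
```

ctypes.sizeof(ts) gives the size that sizeof would report in C, provided the field list matches the real declaration.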
 
Roy Smith

Am I missing something obvious here?

It's simple.
First, you define the structure:

import ctypes

class dm_handle(ctypes.Structure):
    _fields_ = [....]  # whatever it is

Then, create an instance (this will allocate memory internally):

my_handle = dm_handle()

and finally call the function, passing it a pointer to the structure:

mydll.create_dm_handle(ctypes.byref(my_handle))

I guess that makes sense. In my case, however, the structure has about
20 elements, many of which are user-defined types. Building the correct
ctypes.Structure description would be a bit of work, and it would
certainly violate the rule of "once and only once". If the underlying C
structure changed, I'd have to update my Python code to match.

Since the handle is opaque, I don't need to know about the innards of
the structure at all. What I ended up doing was writing a little C
routine something like this:

struct dm_handle *allocate_dm_handle (void)
{
    return malloc (sizeof (struct dm_handle));
}

I then built a .so containing just that one routine, and used ctypes to
call it from Python to get my buffer.
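
One caveat with a helper that returns a pointer: ctypes assumes a C int return type unless told otherwise, which can truncate a 64-bit address. The sketch below shows the Python side under the assumption of a POSIX system, with malloc and free standing in for helper routines such as allocate_dm_handle(); `OpaqueHandle` and the size of 64 bytes are hypothetical.

```python
import ctypes

# Assumption: a POSIX system; malloc/free stand in for helper routines
# such as allocate_dm_handle(), which are not available here.
libc = ctypes.CDLL(None)

libc.malloc.argtypes = [ctypes.c_size_t]
libc.malloc.restype = ctypes.c_void_p  # without this, ctypes assumes int
libc.free.argtypes = [ctypes.c_void_p]
libc.free.restype = None


class OpaqueHandle:
    """Own a C-allocated buffer whose layout Python never inspects."""

    def __init__(self, size):
        self.address = libc.malloc(size)
        if self.address is None:  # a NULL c_void_p result comes back as None
            raise MemoryError("malloc failed")

    def close(self):
        # Free the buffer once; later calls do nothing.
        if self.address is not None:
            libc.free(self.address)
            self.address = None

    def __enter__(self):
        return self

    def __exit__(self, *exc_info):
        self.close()


with OpaqueHandle(64) as handle:
    # handle.address is what would be passed to the other library calls.
    print(hex(handle.address))
```

Wrapping the address in a small class keeps the matching free call in one place, so the buffer is released even if a later call raises.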
 
Roger Binns

Roy Smith said:
Since the handle is opaque, I don't need to know about the innards of
the structure at all. What I ended up doing was writing a little C
routine something like this:

struct dm_handle *allocate_dm_handle (void)
{
    return malloc (sizeof (struct dm_handle));
}

I then built a .so containing just that one routine, and used ctypes to
call it from Python to get my buffer.

Congratulations on just manually re-inventing Swig :) Unless your
library is trivial (which this code indicates it is not), I would
recommend also spending a little time with Swig.

In many cases Swig can parse your header files and build the
correct wrapper. The huge advantage of Swig is that you can
use it to generate wrappers for a large number of languages.
That comes in really helpful if you will also need wrappers for
Java, TCL etc.

Roger
 
Craig Ringer

In many cases Swig can parse your header files and build the
correct wrapper. The huge advantage of Swig is that you can
use it to generate wrappers for a large number of languages.
That comes in really helpful if you will also need wrappers for
Java, TCL etc.

Just out of interest, do you know if SWIG can be used to generate an
extension module to provide interfaces to the API of an application that
embeds a Python interpreter?

(Still trying to get Py_NewInterpreter / Py_EndInterpreter to work in a
single-threaded app)
 
Mark Asbach

Hi Craig,
Just out of interest, do you know if SWIG can be used to generate an
extension module to provide interfaces to the API of an application that
embeds a Python interpreter?

Why not?

We do that for the next generation of an open source
finite-element-toolkit driven by an embedded Python interpreter.

And it was really nice to see how easy it is to retro-fit new SWIG
wrapped code to already existing hand written wrappers and externally
developed Python wrappers (vtk). So in fact we were able to dramatically
reduce the time spent for writing wrappers and adaptor code just by
using SWIG.

What we needed wouldn't have been possible with Boost.Python, and it
would also have meant longer compile times.

Mark
 
