generic way to access C++ libs?

  • Thread starter Gabriel Zachmann
Gabriel Zachmann

Is there any generic way to use C++ libraries from within Python?

I seem to recall that there are tools to generate wrappers for C-libraries
semi-automatically.

But those were still way too cumbersome, IMHO.

What I would like to have is some module (or whatever), with which I can
say "load this C++ library", and then, "create that C++ object" or "call
method x of C++ object y".

Without doing anything else (such as recompiling the library or generating
wrappers).

I agree that templates could pose a major problem, so I would be happy if
it worked with pre-instantiated templates.

Is there anything?
Gab.

--
/-------------------------------------------------------------------------\
| There are works which wait, |
| and which one does not understand for a long time; [...] |
| for the question often arrives a terribly long time after the answer. |
| (Oscar Wilde) |
+-------------------------------------------------------------------------+
| (e-mail address removed)-bonn.de __@/' www.gabrielzachmann.org |
\-------------------------------------------------------------------------/
 

Diez B. Roggisch

> What I would like to have is some module (or whatever), with which I can
> say "load this C++ library", and then, "create that C++ object" or "call
> method x of C++ object y".

C++ has no concept of runtime type information (except for saying "this is
of class X"). This sort of information is known as "reflection" in Java or
Python, and it allows for late binding to those languages. In other
words: you can create wrappers for them on the fly.
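
To make the reflection point concrete, here is a minimal Python sketch of what run-time introspection gives a wrapper generator for free. The `Point` class and the `describe` helper are made up for illustration; the point is only that method names and parameter lists can be discovered from the live class, with no header file or separate description.

```python
import inspect


class Point:
    """Hypothetical class standing in for something we want to wrap."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def norm1(self):
        return abs(self.x) + abs(self.y)

    def scaled(self, factor):
        return Point(self.x * factor, self.y * factor)


def describe(cls):
    """Map each public method name to its parameter names, found at run time."""
    methods = {}
    for name, func in inspect.getmembers(cls, inspect.isfunction):
        if name.startswith("_"):
            continue  # skip __init__ and other private/special methods
        params = list(inspect.signature(func).parameters)
        methods[name] = params[1:]  # drop the leading 'self'
    return methods


print(describe(Point))
```

A compiled C++ library carries no comparable description of its classes, which is why the information has to come from somewhere else.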

But C++ lacking this means that you need to feed the specification of the
objects - the header files - to some generator, which is exactly what
happens in wrapper generators like SWIG and SIP. I never toyed around
with Boost, but I don't believe they made the impossible possible.

So to answer your question: it's not possible. There are other reasons as
well: C++ defines no binary layout of objects, so a lib
compiled with different compilers (gcc, Intel, MSVC) results in
incompatible binaries. So even if one shipped the lib with the header
files, the appropriate compiler would have to be used to actually generate
the wrappers. And as you rarely find compilers of different kinds on end-user
machines, you'll have to do it on the developer's machine.

> Without doing anything else (such as recompiling the library or generating
> wrappers).

As I just said: not doable.

The third thing is that differences in language design make it necessary to
invest developer time in creating wrappers: in C++ you can pass arguments
by pointer, which allows the callee to modify them in the caller's stack
frame (or wherever they live). As this is not possible in Python, you need
to work around it, usually by returning a tuple of modified values in
addition to the result of the method/function itself.
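
A rough sketch of that workaround, using ctypes. To keep it runnable without any external library, the "C function" here is a stand-in implemented in Python and exposed through `ctypes.CFUNCTYPE`; the names `c_divide` and `divide` are invented for this example. It behaves like `int divide(int a, int b, int *remainder)`, and the Python-facing wrapper turns the out-parameter into a second return value.

```python
import ctypes

# C-style signature: int divide(int a, int b, int *remainder)
DIVIDE_PROTO = ctypes.CFUNCTYPE(
    ctypes.c_int, ctypes.c_int, ctypes.c_int, ctypes.POINTER(ctypes.c_int)
)


def _divide_impl(a, b, remainder_out):
    # Writes through the pointer, as a C or C++ callee would.
    remainder_out[0] = a % b
    return a // b


# Stand-in for a function that would normally come from a shared library.
c_divide = DIVIDE_PROTO(_divide_impl)


def divide(a, b):
    """Pythonic wrapper: the out-parameter becomes part of the returned tuple."""
    remainder = ctypes.c_int(0)
    quotient = c_divide(a, b, ctypes.byref(remainder))
    return quotient, remainder.value


print(divide(17, 5))
```

Deciding which pointer arguments are outputs, inputs, or arrays is exactly the kind of information a header file does not state, so somebody has to supply it by hand.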

So there are strong limits on automatically generating wrappers - a
minimum of work has to be done.
 

Gabriel Zachmann

> C++ has no concept of runtime type information (except for saying "this is
> of class X"). This sort of information is known as "reflection" in Java or

let's assume we have a well-populated symbol table in the lib
(which is usually the case, or, at least, not too hard a restriction).

> So to answer your question: it's not possible. There are other reasons as
> well: C++ defines no binary layout of objects, the result is that a lib
> compiled with different compilers (gcc, intel, msvc) results in
> incompatible binaries. So even if one would ship the lib with the header

that's not quite true.
actually, each platform (wintel, linux, ...) has a pretty well-defined
object file format (ELF under unix/linux, for instance).
current icc/linux and gcc/g++ work pretty well together in most cases, and
icc/windows and cl, too.

It is well understood that the envisioned python module would have to be
platform-specific.


Best regards,
gabriel.



Diez B. Roggisch

> let's assume we have a well-populated symbol table in the lib
> (which is usually the case, or, at least, not too hard a restriction).

So what? The names are mangled - each according to its compiler's own
rules. And even if the symbols themselves are defined, the symbol table
stores no layout information for the structs and classes themselves.

> that's not quite true.
> actually, each platform (wintel, linux, ...) has a pretty well-defined
> object file format (ELF under unix/linux, for instance).

I did not talk about binary executable formats, but about the memory layout
of C++ objects. The C++ standard doesn't define where e.g. the vtable for an
object's virtual methods resides - or even whether virtual methods have to be
implemented via a vtable in the first place.

A c++ object created by g++ is total garbage passed to a VC lib that appears
to

> current icc/linux and gcc/g++ work pretty well together in most cases, and
> icc/windows and cl, too.

No, they don't - not for C++ code. Google for name mangling and the reasons
why every compiler uses its own scheme. Intel claims that there is binary
compatibility between their compiler and gcc, but that's only true for
certain compiler versions - and you have no control over which versions are
used in your planned scenario.

I suggest you first delve somewhat deeper into the subject of C++ code
generation and the difficulties faced by those trying to develop libraries
for C++ (libs that are shipped to customers/users, not for their own
projects, of course). That's one major reason why there are only a few C++
libs out there - the pains e.g. Trolltech has to go through to not break
binary compatibility between different versions can be observed here:

http://developer.kde.org/documentation/library/kdeqt/kde3arch/devel-binarycompatibility.html
 

Diez B. Roggisch

> A c++ object created by g++ is total garbage passed to a VC lib that
> appears to

That sentence should be


A c++ object created by g++ is total garbage passed to a VC lib that
appears to use the same objects denoted by some headerfile.
 

Skip Montanaro

Diez> A c++ object created by g++ is total garbage passed to a VC lib
Diez> that appears to use the same objects denoted by some headerfile.

Sure, but the name mangling schemes are certainly well-defined. The GNU
c++filt program on my Mac understands the following formats according to its
--help output:

none,auto,gnu,lucid,arm,hp,edg,gnu-v3,java,gnat

I don't know what most of them are, but I guess c++filt does. I imagine
something like ctypes could be trained to know how to decipher the
signatures as well. There's still the problem of templates.
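
As a rough illustration of what "deciphering the signatures" could look like, here is a toy demangler for a very small subset of one scheme: the Itanium C++ ABI mangling used by current g++ (the one c++filt lists as gnu-v3). It is a sketch, not a replacement for c++filt: the `demangle` function and its type table are written for this example and handle only plain or nested names with a handful of built-in parameter types. No templates, pointers, references, const methods or substitutions.

```python
# Single-letter codes for a few built-in types in the Itanium C++ ABI.
BUILTIN_TYPES = {
    "v": "void", "b": "bool", "c": "char", "i": "int",
    "j": "unsigned int", "l": "long", "f": "float", "d": "double",
}


def _read_source_name(symbol, pos):
    """Read one <length><identifier> pair starting at pos."""
    start = pos
    while pos < len(symbol) and symbol[pos].isdigit():
        pos += 1
    if pos == start:
        raise ValueError("expected a length-prefixed name at offset %d" % start)
    length = int(symbol[start:pos])
    name = symbol[pos:pos + length]
    if len(name) != length:
        raise ValueError("truncated identifier")
    return name, pos + length


def demangle(symbol):
    """Demangle a function symbol from the tiny subset described above."""
    if not symbol.startswith("_Z"):
        raise ValueError("not an Itanium-ABI mangled name")
    pos = 2
    parts = []
    if symbol[pos:pos + 1] == "N":  # nested name: N <name> <name> ... E
        pos += 1
        while symbol[pos:pos + 1] != "E":
            name, pos = _read_source_name(symbol, pos)
            parts.append(name)
        pos += 1  # skip the closing 'E'
    else:
        name, pos = _read_source_name(symbol, pos)
        parts.append(name)
    params = []
    for code in symbol[pos:]:
        if code not in BUILTIN_TYPES:
            raise ValueError("unsupported type code %r" % code)
        params.append(BUILTIN_TYPES[code])
    if params == ["void"]:  # a lone 'v' means an empty parameter list
        params = []
    return "%s(%s)" % ("::".join(parts), ", ".join(params))


print(demangle("_Z3fooid"))
print(demangle("_ZN3Foo3barEi"))
```

Note that the mangled name encodes the parameter types but, for ordinary functions, not the return type, and nothing at all about the layout of `Foo`.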

Skip
 

Diez B. Roggisch

> Sure, but the name mangling schemes are certainly well-defined. The GNU
> c++filt program on my Mac understands the following formats according to
> its --help output:
>
> none,auto,gnu,lucid,arm,hp,edg,gnu-v3,java,gnat
>
> I don't know what most of them are, but I guess c++filt does. I imagine
> something like ctypes could be trained to know how to decipher the
> signatures as well. There's still the problem of templates.

The name mangling is not the important part - the memory layout of the
objects is. Since members are accessed via offsets from the object's address
in memory, one has to know exactly in which order declared and possibly
inherited members are laid out. And as this is not part of the C++
standard, every compiler does it as it suits it.
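
A small sketch of why the offsets matter, using `ctypes.Structure`, which follows the platform's C struct rules. The two structures below are invented for the example and hold the same three fields in a different order; C++ classes add further complications (vtable pointers, base-class subobjects) that ctypes does not model at all.

```python
import ctypes


class Declared(ctypes.Structure):
    # char, int, char: the int has to be aligned, so padding is inserted.
    _fields_ = [
        ("flag", ctypes.c_char),
        ("count", ctypes.c_int),
        ("tag", ctypes.c_char),
    ]


class Reordered(ctypes.Structure):
    # Same members, different order: different offsets, possibly different size.
    _fields_ = [
        ("count", ctypes.c_int),
        ("flag", ctypes.c_char),
        ("tag", ctypes.c_char),
    ]


for cls in (Declared, Reordered):
    offsets = {name: getattr(cls, name).offset for name, _ in cls._fields_}
    print(cls.__name__, offsets, ctypes.sizeof(cls))
```

Code that reads `count` at the wrong offset gets garbage without any error, which is what makes guessing the layout so dangerous.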
 

Neil Hodgson

Diez B. Roggisch:
> The name mangling is not the important part - the memory layout of the
> objects is. Since members are accessed via offsets from the object's address
> in memory, one has to know exactly in which order declared and possibly
> inherited members are laid out. And as this is not part of the C++
> standard, every compiler does it as it suits it.

The layout is also modified by various compiler options. I believe that,
with symbolic debugging information, this feature could be implemented but
it would be a large amount of work.
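
As an analogy for how an option can change layout, Python's `struct` module can compute the size of the same field list under two rule sets: native alignment (`@`) and no alignment (`=`). The latter resembles what packing options such as GCC's `-fpack-struct` or `#pragma pack(1)` produce. This is only an illustration of the effect, not of any particular compiler's behaviour.

```python
import struct

FIELDS = "ci"  # one char followed by one int

# '@' uses the platform's native sizes and alignment (padding before the int).
native_size = struct.calcsize("@" + FIELDS)

# '=' uses standard sizes with no alignment padding at all.
packed_size = struct.calcsize("=" + FIELDS)

print("native:", native_size, "packed:", packed_size)
```

A wrapper that assumed one rule set while the library was built with the other would read every member after the first from the wrong address.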

Neil
 

Jacek Generowicz

Gabriel Zachmann said:
> Is there any generic way to use C++ libraries from within Python?
> Without doing anything else (such as recompiling the library or generating
> wrappers).

Bit of a tall order, don't you think?

What would be so cumbersome about invoking a single program which
requires the location of the library, the location of its headers, and
which gives you a Python module wrapping the library in return ?
 

Alex Martelli

Jacek Generowicz said:
> Bit of a tall order, don't you think?

Well, ctypes does that for C libraries (as long as they're
DLL/so/dynlib/...), it's not immediately obvious that using C++
libraries is an order of magnitude harder (though probably true).
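
For comparison, this is roughly what the C case looks like with ctypes. The sketch assumes a POSIX system where the C runtime can be located with `ctypes.util.find_library("c")`; on Windows the library name and loading call would differ. Note that the argument and return types still have to be declared by hand, because the shared library itself does not record them.

```python
import ctypes
import ctypes.util

# Locate and load the C runtime (assumes a POSIX system).
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the signature of: size_t strlen(const char *s)
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

print(libc.strlen(b"hello"))
```

This works because C symbol names are unmangled and the calling convention is stable per platform; neither holds across C++ compilers.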

> What would be so cumbersome about invoking a single program which
> requires the location of the library, the location of its headers, and
> which gives you a Python module wrapping the library in return ?

Without a C/C++ compiler around, you mean? Most Python users these days
don't have one (as they use Python on Windows)...


Alex
 

Jacek Generowicz

> Well, ctypes does that for C libraries (as long as they're
> DLL/so/dynlib/...), it's not immediately obvious that using C++
> libraries is an order of magnitude harder (though probably true).

Maybe not _immediately_ obvious, but obvious after a few minutes'
thought :)

> Without a C/C++ compiler around, you mean? Most Python users these days
> don't have one (as they use Python on Windows)...

Good point. I hadn't thought of this one. In my environment the users
are expected to have at least one C++ compiler, and are even expected
to use it on a regular basis.
 

Alex Martelli

Jacek Generowicz said:
> Maybe not _immediately_ obvious, but obvious after a few minutes'
> thought :)

To somebody with a good grasp of the current state of C++ technology,
maybe. Somebody who might just like to use existing dynlib/&c which
happen to be oriented to C++ rather than C might quite reasonably not
find the distinction obvious, IMHO.

Indeed, I suspect ctypes could be extended to do some of the requested
task, if one focused on a single, specific C++ compiler.

> Good point. I hadn't thought of this one. In my environment the users
> are expected to have at least one C++ compiler, and are even expected
> to use it on a regular basis.

Ah, yes, a definitely atypical environment. Anyway, if my guess is
correct that the demand for such a 'c++types' is really burning only on
Windows, then maybe it could be made for MS VC++7.1 specifically. But,
it IS just a guess. ctypes does require some understanding of some C
concepts, for example, even though it does not require access to a C
compiler; it also requires specific coding to a certain dynlib's
interfaces. I think Boost Python, if all needed tools were present,
might be able to do a more automatic job of producing the wrapper, maybe
requiring from the Python-level user even less C++ knowledge than that
hypothetical c++types might...


Alex
 

Gabriel Zachmann

> The name mangling is not the important part - the memory layout of the
> objects is. Since members are accessed via offsets from the object's address
> in memory, one has to know exactly in which order declared and possibly

given the header file of the c++ lib, it should be possible to determine
that at run-time.

Or, one could give the header file to the generic python wrapper module and
tell the module also, with which compiler the c++ lib was compiled.

Gab.


Gabriel Zachmann

> The layout is also modified by various compiler options.

really? that would mean that c++ libs themselves are not binary compatible
among each other?

(You are not talking of that insane issue with M$'s libs when compiled with
or without debug info, are you?)

> I believe that,
> with symbolic debugging information, this feature could be implemented but
> it would be a large amount of work.

i can believe that. but it would tremendously foster Python's spreading.

cheers,
gab.


Diez B. Roggisch

> given the header file of the c++ lib, it should be possible to determine
> that at run-time.
> Or, one could give the header file to the generic python wrapper module
> and tell the module also, with which compiler the c++ lib was compiled.

For a certain compiler, that might work. But usually, a shipped C++ library
doesn't come with its header files.

And as I said before: not everything in C++ allows for direct translation.
Go take a look at SIP or SWIG, and how to write wrappers with them - in
theory, it's only copying the header file. In practice, some amount of extra
work has to be done.

And while it might be possible to make the dyn-wrapper know the internals of
certain compilers, keeping track of these is a tedious and error-prone task
- so why not let the compilers do that work instead? And voila, you've got
your average wrapper generator.
 

Gabriel Zachmann

> Looked at boost::python?

Thanks a lot!

That's a very neat tool (like everything from Boost ;-) ),
and pretty close to what I was envisioning,
except that one still has to sort of manually transform header files into
BOOST_PYTHON_MODULE declarations ...

Cheers,
Gab.



Jacek Generowicz

> To somebody with a good grasp of the current state of C++ technology,
> maybe. Somebody who might just like to use existing dynlib/&c which
> happen to be oriented to C++ rather than C might quite reasonably not
> find the distinction obvious, IMHO.

I absolutely agree ... which is why I put the smiley there.

> Anyway, if my guess is correct that the demand for such a 'c++types'
> is really burning only on Windows,

What's behind this guess?

> I think Boost Python, if all needed tools were present, might be
> able to do a more automatic job of producing the wrapper,

Hmmm. Boost does refuse to make any assumptions, and therefore
requires human intervention in quite a few cases. Maybe default
assumptions could be built into Pyste. Maybe they already have been; it's a
while since I looked at Pyste.

And I also suspect that, in practice, there will be a fairly low limit
on the size of library that Boost can wrap, because of compile-time
memory consumption. ISTRT wrapping about 20-30 methods taken from
about 5 classes used up half a GB of RAM. So you probably don't want
to throw Boost at a whole library indiscriminately.

Having said that, I recall that there was some utility being developed to
split the compilation into small submodules, as an attempt to manage
the memory explosion.
 

Alex Martelli

Jacek Generowicz said:
> What's behind this guess?

The fact that Windows is the only widespread OS where getting a compiler
isn't as easy as installing it, for free, off the OS media. Admittedly
many Mac users don't even bother to install the compilers and IDEs that
Apple packs on the MacOSX media, but then these are people not at all
interested in developing programs -- they wouldn't program using the
hypothetical c++types either.

> Hmmm. Boost does refuse to make any assumptions, and therefore
> requires human intervention in quite a few cases. Maybe default
> assumptions could be built into Pyste. Maybe they already have been; it's a
> while since I looked at Pyste.

Yeah, pyste is what I had in mind, I had just forgotten the name (it had
been quite a while for me, too).

> And I also suspect that, in practice, there will be a fairly low limit
> on the size of library that Boost can wrap, because of compile-time
> memory consumption. ISTRT wrapping about 20-30 methods taken from
> about 5 classes used up half a GB of RAM. So you probably don't want
> to throw Boost at a whole library indiscriminately.

Wow -- THAT bad?! Eeek.

> Having said that, I recall that there was some utility being developed to
> split the compilation into small submodules, as an attempt to manage
> the memory explosion.

Makes sense, I guess.


Alex
 

Jacek Generowicz

> The fact that Windows is the only widespread OS where getting a compiler
> isn't as easy as installing it, for free, off the OS media.

Ah, indeed. Once again my unusual environment stops me from seeing the
blatantly obvious :)

> Wow -- THAT bad?! Eeek.

I went back and checked.

38 methods (4 classes)
2 non-member functions.
4 instantiations of std::vector, with just one method each.

Compiling this with 256 MB of RAM made the machine unusable for about 10
mins.

Upgrading to 768 MB of RAM meant it no longer needed to swap ... but it
still wasn't a breeze.

Sorry, can't be bothered to check exactly how much memory it does use.
 
