Memory leak when using a C++ module for Python

Jaume Bonet

Hi,

I'm pretty new to Python programming.

I've been developing a C++ module for a Python application. It simply
gets the information from Python and does the final processing, which is
very time-consuming (that's why I do it in C++).

When I test the code from C++, the consumed memory decreases each time
I delete a vector, but this does not happen when the module is called
from Python: the memory is kept. Once the data comes from Python,
the PyObjects are read and their information is passed to C objects (that
is done just once), and all the rest of the processing is done with
those objects.

I've read that even when you delete the content of the vectors, the
memory is not freed when you are working with Python. Is that so?

Is there any way to really free that memory?

Thanks,
J
 
Ulrich Eckhardt

Jaume said:
When I test the code from C++ each time I delete a vector the consumed
memory decreases, but it does not happen when the module is called
from python.

What is a "vector" for you? Do you mean std::vector? A vector allocated
using malloc()? A vector allocated using new? Just provide a simple piece
of C++ and Python example code that demonstrates the problem and you will
probably get help immediately.

Jaume said:
I've read that even when you delete the content of the vectors, the
memory is not freed when you are working with Python. Is that so?

There are things like that, but without context it's pretty hard to tell
what's going on.

Uli
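
One of the "things like that" can be checked without Python at all. This is only a minimal sketch, not the poster's code: it assumes a plain std::vector<int>, and the names Capacities and measureCapacities are made up for the illustration. It records capacity() after filling, after clear(), and after swapping with an empty temporary. Note that even when the vector gives its storage back, the C runtime allocator may keep the freed block for reuse instead of returning it to the operating system, so the process size reported by the OS need not drop.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Capacities observed at each step of the experiment.
struct Capacities {
    std::size_t afterFill;
    std::size_t afterClear;
    std::size_t afterSwap;
};

// Fill a vector, clear() it, then swap it with an empty temporary,
// recording capacity() after each step.
Capacities measureCapacities(std::size_t n) {
    std::vector<int> v(n, 42);
    Capacities c{};
    c.afterFill = v.capacity();

    v.clear();                   // size() becomes 0, the storage is kept
    c.afterClear = v.capacity();

    std::vector<int>().swap(v);  // storage moves to the temporary and is freed
    c.afterSwap = v.capacity();  // 0 on the common standard library implementations
    assert(v.empty());
    return c;
}
```

Calling measureCapacities(1000) and printing the three fields shows whether the vector itself still holds memory. If afterSwap is 0 and the process size stays high, the memory is being held somewhere else (the allocator, or objects that were never released).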
 
Jaume Bonet

Sure, sorry...

This is the function that is visible from python and the one that the
python code calls:

static PyObject * IMFind (PyObject *self, PyObject *args, PyObject *kwargs) {

    //Array for the detection of the parameters coming from Python
    static char *kwlist[] = {"shareInt","root","prefix","lowerLimit",NULL};
    //Root protein code
    char *root;
    //Prefix for the output file
    char *prefix;
    //Incoming Python object
    PyObject *shareIntPy;
    //Number of proteins at level two from the root, that will appear in
    //the iMotifs' proteins
    int *pSize = new int;
    //Maximum number of common interactors
    int *max = new int;
    //Set relating each protein (integer) with their interactors...
    set<unsigned int>** shareInt;
    //Lower limit for iMotif search
    int lowLim;
    //Vector for the seed iMotifs from a specific threshold
    vector<iMotif> IMseed;
    //Vector of all the working iMotifs during the identification process
    vector<iMotif> IMarray;

    //Receiving data from python
    if (!PyArg_ParseTupleAndKeywords(args, kwargs, "Ossi", kwlist,
                                     &shareIntPy, &root, &prefix, &lowLim)) {
        cerr << "Error in parameter transference from python to iMotif C++ module" << endl;
        return Py_BuildValue("");
    }

    //The position of the name in the vector corresponds to the number by
    //which it is represented
    //Here we take the info coming from python and transform it into a
    //vector (will allow us to work with numbers instead of strings) and
    //shareInt which is an array of sets (from std::set)
    vector<string> translator = string2int(shareIntPy,root,shareInt,pSize,max);

    //Loop for iMotif search at threshold thr till the specified lower limit
    for (int thr = *max; thr >= lowLim; thr--) {
        cout << "Checking threshold " << thr << endl;

        //Specifying the output file name
        stringstream filename;
        filename << prefix << "." << thr;

        //Getting the seed iMotifs for this threshold
        IMseed = getSeedIM(shareInt,*pSize,thr);

        //Adding the new seeds (if any) to those obtained from previous
        //rounds...
        if (!IMseed.size()) {
            IMseed.clear();
            //This is how I try to free them now, size & capacity gets
            //to 0, but the memory is not freed...
            vector<iMotif>().swap(IMseed);
        } else {
            IMarray.insert(IMarray.end(),IMseed.begin(),IMseed.end());
            IMseed.clear();
            vector<iMotif>().swap(IMseed);
        }

        //Fuse those iMotifs sharing thr interactors
        //It also deletes the IMarray before giving it the new data in
        //the same way as I use here for IMseed
        processIMVector(IMarray,thr);
        writeIMVector(IMarray,translator,filename.str());

    }

    return Py_BuildValue("");
}

The iMotif object used here is composed of 2 sets and 3 strings, just
like this (access is done through setters and getters):

class iMotif {
private:
    //Set of the proteins that shares the iMotif with the root
    set<unsigned int> proteins;
    //Set of the proteins to which the iMotifs interacts
    set<unsigned int> interactors;
    //iMotifs interactors signature
    string MD5Interactor;
    //iMotifs proteins signature
    string MD5Proteins;
    //Real MD5 according to the sequences names
    string signature;
    ....
};

and I specified the destructor as:

iMotif::~iMotif() {

    this->proteins.clear();
    set<unsigned int>().swap(this->proteins);

    this->interactors.clear();
    set<unsigned int>().swap(this->interactors);

}

The call to the function from python goes like this:

iMotifs.IMFind(shareInt=intersections_dict,
root=options.protein_problem, prefix=prefixOutFile, lowerLimit=1);

where intersections_dict is a dictionary and options.protein_problem
and prefixOutFile are strings. There is nothing to return from C++ to
python, the result is directly printed into several files.

Thanks,
J
 
Gabriel Genellina

This is the function that is visible from python and the one that the
python code calls:

static PyObject * IMFind (PyObject *self, PyObject *args, PyObject
*kwargs) {

Your function does not call any Python function except
PyArg_ParseTupleAndKeywords (which does not modify reference counts).
So it's unlikely that this could cause any memory leak. I'd review how
memory is allocated and deallocated on the C++ side.
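
One place to look on the C++ side: in the posted IMFind, pSize and max are created with new and never deleted, and nothing in the shown code releases shareInt. The following is only a sketch under assumptions. buildInteractors is a hypothetical stand-in for string2int (whose real signature and allocation strategy were not posted), and the data it fills in is invented. It shows the same bookkeeping done with automatic storage, so everything is released when the function returns, with no explicit delete.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical stand-in for string2int: fills the interactor sets and reports
// the sizes through references instead of through heap-allocated ints.
void buildInteractors(std::vector<std::set<unsigned int> >& shareInt,
                      int& pSize, int& maxShared) {
    shareInt.assign(3, std::set<unsigned int>());  // invented sample data
    shareInt[0].insert(1);
    shareInt[0].insert(2);
    shareInt[1].insert(2);

    pSize = static_cast<int>(shareInt.size());
    maxShared = 0;
    for (std::size_t i = 0; i < shareInt.size(); ++i) {
        int n = static_cast<int>(shareInt[i].size());
        if (n > maxShared) maxShared = n;
    }
}

// All locals have automatic storage: the ints and the vector of sets are
// destroyed on return, on every exit path.
int largestSetSize() {
    int pSize = 0;
    int maxShared = 0;
    std::vector<std::set<unsigned int> > shareInt;
    buildInteractors(shareInt, pSize, maxShared);
    assert(pSize == static_cast<int>(shareInt.size()));
    return maxShared;
}
```

Whether this applies depends on how string2int really allocates shareInt. If it uses new[] for the array and new for each set, a matching delete is needed somewhere before IMFind returns.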
 
Jaume Bonet

When I tried the C++ function with a C++ main() (skipping the Python
part) it didn't show any memory problem, but I'll re-check it anyway,
thanks...
 
Ivan Illarionov

        //Here we take the info coming from python and transform it into a
        //vector (will allow us to work with numbers instead of strings) and
        //shareInt which is an array of sets (from std::set)
        vector<string> translator = string2int(shareIntPy,root,shareInt,pSize,max);

I guess that if there are any problems with the Python/C API in your
example, they are in the string2int function. How do you retrieve the
Python data from the 'shareIntPy' object? If you have memory leaks, you
probably forgot to Py_DECREF something retrieved with Python/C API
functions that return new references (like PyObject_GetAttrString).

Ivan
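
To make the new-reference point concrete without pulling in Python.h, here is a toy model only. ToyObject, getNewReference and RefGuard are invented names, not part of the Python/C API. In real extension code the object would be a PyObject*, the count would be changed by Py_INCREF/Py_DECREF, and the guard's destructor would call Py_DECREF (or Py_XDECREF). The sketch compares a call site that forgets to release the reference with one that hands it to a scope guard.

```cpp
#include <cassert>

// Toy stand-in for a reference-counted object; NOT the real PyObject.
struct ToyObject {
    int refcount;
};

// Mimics an API call that returns a new reference: the caller now owns
// one extra count and is responsible for giving it back.
ToyObject* getNewReference(ToyObject& o) {
    ++o.refcount;
    return &o;
}

// Gives back one reference when it goes out of scope, which is the job
// Py_DECREF would do in real extension code.
class RefGuard {
public:
    explicit RefGuard(ToyObject* o) : obj_(o) {}
    ~RefGuard() { if (obj_) --obj_->refcount; }
    RefGuard(const RefGuard&) = delete;
    RefGuard& operator=(const RefGuard&) = delete;
    ToyObject* get() const { return obj_; }
private:
    ToyObject* obj_;
};

// Forgets to release the new reference: the count stays one too high.
int refcountAfterLeakyUse() {
    ToyObject o = {1};
    getNewReference(o);
    return o.refcount;
}

// Hands the new reference to a guard: the count is back to 1 afterwards.
int refcountAfterGuardedUse() {
    ToyObject o = {1};
    {
        RefGuard g(getNewReference(o));
        assert(g.get()->refcount == 2);
    }
    return o.refcount;
}
```

In the leaky version the count never returns to its starting value, so the object could never be freed. That is the kind of leak to look for in string2int if it calls functions that return new references.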
 
