list2str and performance


spr

Hi,

I'm trying to learn Python and I'd appreciate any comments about my
small snippets of code. I read an old anecdote about performance here:

http://www.python.org/doc/essays/list2str/

First, I tried to figure out what would be the most pythonic approach by
today's standards:


def ListToStr(l):
    """Convert a list of integers into a string.
    """
    return "".join([chr(n) for n in l])

def StrToList(s):
    """Convert a string into a list of integers.
    """
    return [ord(c) for c in s]
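For comparison, here is a minimal sketch of the same round trip on Python 3, assuming every integer is in range(256). The names list_to_bytes and bytes_to_list are made up for illustration and are not from the essay; the point is only that the built-in bytes type does the looping in C.

```python
# Sketch only: assumes Python 3 and that every value is in range(256).
# list_to_bytes and bytes_to_list are hypothetical names used for illustration.

def list_to_bytes(values):
    """Pack a list of integers (0..255) into a bytes object."""
    # bytes() raises ValueError if any value falls outside range(256).
    return bytes(values)

def bytes_to_list(data):
    """Unpack a bytes object into a list of integers."""
    # Iterating over bytes yields integers on Python 3.
    return list(data)

sample = [72, 101, 108, 108, 111]
packed = list_to_bytes(sample)
assert packed == b"Hello"
assert bytes_to_list(packed) == sample
```

Note that this produces a bytes object rather than a text string, so it is closer in spirit to the 8-bit strings of the original essay than to "".join(chr(n) ...) on Python 3.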


By the way, using a generator expression in this case seems a bit slower
than a list comprehension, and I'm not sure why.
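One way to check that observation is a small timeit comparison. The sketch below assumes Python 3; the helper names (join_listcomp, join_genexp, compare) are hypothetical. A commonly cited explanation, which I have not verified against every CPython version, is that str.join first materializes a non-list iterable into a sequence, so the generator pays its own overhead on top of building a list anyway.

```python
import timeit

# Hypothetical helper names; both functions build the same string.
def join_listcomp(values):
    return "".join([chr(n) for n in values])

def join_genexp(values):
    return "".join(chr(n) for n in values)

def compare(values, repeat=5, number=200):
    """Return the best (minimum) total time in seconds for each variant."""
    results = {}
    for func in (join_listcomp, join_genexp):
        timings = timeit.repeat(lambda: func(values), repeat=repeat, number=number)
        results[func.__name__] = min(timings)
    return results

if __name__ == "__main__":
    data = list(range(256)) * 40
    for name, seconds in compare(data).items():
        print(f"{name}: {seconds:.4f} s")
```

The absolute numbers depend on the machine and interpreter version, so only the relative difference between the two lines is meaningful.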

I tried to improve the performance with Psyco and it became about 6
times as fast, essentially for free. Then I quickly tried Pyrex, but I
didn't feel comfortable using a dialect. So I tried Boost.Python, and as
I optimized the code it became more and more C-ish. The result is this:


// Convert a list of integers into a string.
PyObject* ListToStr(const boost::python::list& l)
{
    PyObject* s;
    Py_BEGIN_ALLOW_THREADS

    const size_t length = PyList_GET_SIZE(l.ptr());

    // Couldn't find a function for allocating a PyString from scratch.
    s = (PyObject*)_PyObject_NewVar(&PyString_Type, length);
    ((PyStringObject*)s)->ob_shash = -1;
    ((PyStringObject*)s)->ob_sstate = SSTATE_NOT_INTERNED;
    ((PyStringObject*)s)->ob_sval[((PyStringObject*)s)->ob_size] = '\0';

    char* s_items = PyString_AS_STRING(s);
    char* ps = s_items, *ps_end = ps + length;
    PyIntObject** pl = (PyIntObject**)((PyListObject*)l.ptr())->ob_item;

    while (ps < ps_end)
    {
        *ps++ = (char)(*pl++)->ob_ival;
    }

    Py_END_ALLOW_THREADS
    return s;
}

// Convert a string into a list of integers.
PyObject* StrToList(const boost::python::str& s)
{
    PyObject* l;
    Py_BEGIN_ALLOW_THREADS

    const size_t length = PyString_GET_SIZE(s.ptr());
    l = PyList_New(length);

    PyObject** pl = ((PyListObject*)l)->ob_item, **pl_end = pl + length;
    unsigned char* ps = (unsigned char*)PyString_AS_STRING(s.ptr());

    while (pl < pl_end)
    {
        *pl++ = PyInt_FromLong(*ps++);
    }

    Py_END_ALLOW_THREADS
    return l;
}

Is it safe here to use Py_BEGIN_ALLOW_THREADS and Py_END_ALLOW_THREADS?
On my machine this is about 50 times as fast as plain Python, but is
this as fast as it can get?

When performance matters and you have to develop a CPU-bound
application, do you think it is possible to eventually achieve nearly
the best performance by extending Python with C or C++ modules, or is it
better to take the embedding approach, that is, use a C or C++ core that
calls Python scripts?
 

Fredrik Lundh

spr said:
> When performance matters and you have to develop a CPU-bound
> application, do you think it is possible to eventually achieve nearly
> the best performance by extending Python with C or C++ modules, or is it
> better to take the embedding approach, that is, use a C or C++ core that
> calls Python scripts?

the embedding approach only makes sense if you 1) have an existing application
that you want to extend with Python, or 2) are trying to sneak Python into an
existing project (it's just a library, you know ;-).

for any other case, putting "Python at the top" is a lot more practical.

</F>
 
