Can you make this faster?

K

Kamilche

I have a routine that I really need, but it slows down processing
significantly. Can you spot any ineffeciencies in the code?

This code makes a critical function of mine run about 7x slower than
using a prebuilt format string. For maximum flexibility, it would be
best to calculate the format string using this method, so I'd dearly
love to keep it.

def fmtstring(args):
delim = '\0'
fmt = []
fmt.append('<')
for arg in args:
t = type(arg)
if t == types.StringType:
l = len(arg)
fmt.append(str(l) + 's')
elif t == types.IntType:
fmt.append('i')
elif t == types.LongType:
fmt.append('q')
elif t == types.BooleanType:
fmt.append('c')
elif t == types.FloatType:
fmt.append('d')
else:
raise Exception("Can't pack argument of type %s!" % t)
s = ''.join(fmt)
s = s + '\0'
return s
 
R

Roy Smith

I have a routine that I really need, but it slows down processing
significantly. Can you spot any ineffeciencies in the code?

This code makes a critical function of mine run about 7x slower than
using a prebuilt format string. For maximum flexibility, it would be
best to calculate the format string using this method, so I'd dearly
love to keep it.

def fmtstring(args):
delim = '\0'
fmt = []
fmt.append('<')
for arg in args:
t = type(arg)
if t == types.StringType:
l = len(arg)
fmt.append(str(l) + 's')
elif t == types.IntType:
fmt.append('i')
elif t == types.LongType:
fmt.append('q')
elif t == types.BooleanType:
fmt.append('c')
elif t == types.FloatType:
fmt.append('d')
else:
raise Exception("Can't pack argument of type %s!" % t)
s = ''.join(fmt)
s = s + '\0'
return s

The first thing is to profile your code and make sure this really is
significant in the overall scheme of things. From your description, it
sounds like you've already done that, so I'll go with that assumption.

What's the most likely value for type(arg)? If 90% of your args are
floats, for example, most of the time you work your way all the way down
the if/else tree before getting to the right spot. Putting the tests in
order of probability of occurance will help you a little.

Doing "from types import *" and then being able to just say "IntType"
instead of "types.IntType" will be a little faster since there will be
one less name lookup.

Another thing you can try is doing dictionary lookups instead of an
if/else tree. Something like:

map = {IntType: 'i',
LongType: 'q',
BooleanType: 'c',
FloatType: 'd'}
[...]
try:
fmt.append (map[t])
except KeyError:
raise Exception ("Can't pack %s" % t)

You still need to special-case StringType, however.

Lastly, is an idea which is really quite gross. If you're desperate for
speed, the majority of your args are strings, and you have no pride, it
might be worth trying, however. Instead of saying:

t = type(arg)
if t == types.StringType:
l = len(arg)

you could do something like:

try:
l = len (arg)
fmt.append(str(l) + 's')
except TypeError:
process all the non-string types

The idea here is to not waste time getting and testing the type, just
assume it's a string and deal with the consequences later if your
assumption was wrong. In real life, this plays out as, "It's easier to
ask forgiveness than permission".

Of the types you mentioned, string is the only one which which has a
length; all the others will raise TypeError when passed to len().
Exception processing is relatively slow, but if it only happens a small
percentage of the time, you might be ahead of the game this way.

Some people would call this elegant, others would call it a gross hack.
I guess it depends on your perspective, and how desperate you are to
make your code run faster :)

Of course, you could put several of these ideas together. Try the
exception-catching idea for strings first, and then sort out the other 4
types using a dictionary lookup.

My gut feeling is that even with all these tricks, the best you'll be
seeing is a factor of 2 or 3 speedup. I'd be amazed if you could get
anywhere near the 7x you're looking for.
 
?

=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=

Kamilche said:
I have a routine that I really need, but it slows down processing
significantly. Can you spot any ineffeciencies in the code?

I would use a dictionary, and I would pre-allocate the result list:

_code = {types.StringType: 's',
types.IntType: 'i',
types.LongType: 'q',
types.BooleanType: 'c',
types.FloatType: 'd'
}
def fmtstring(args):
fmt = ['\0']*(len(args)+1)
fmt.append('<')
for i, arg in enumerate(args):
t = type(arg)
try:
c = _code[t]
except KeyError:
raise Exception("Can't pack argument of type %s!" % t)
if c == 's':
fmt = str(len(arg))+'s'
else:
fmt = c
return ''.join(fmt)

Regards,
Martin
 
?

=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=

Kamilche said:
I have a routine that I really need, but it slows down processing
significantly. Can you spot any ineffeciencies in the code?

I would use a dictionary, and I would pre-allocate the result list:

_code = {types.StringType: 's',
types.IntType: 'i',
types.LongType: 'q',
types.BooleanType: 'c',
types.FloatType: 'd'
}
def fmtstring(args):
fmt = ['\0']*(len(args)+1)
fmt.append('<')
for i, arg in enumerate(args):
t = type(arg)
try:
c = _code[t]
except KeyError:
raise Exception("Can't pack argument of type %s!" % t)
if c == 's':
fmt = str(len(arg))+'s'
else:
fmt = c
return ''.join(fmt)

Regards,
Martin
 
A

alejandro david weil

I have a routine that I really need, but it slows down processing
significantly. Can you spot any ineffeciencies in the code?

This code makes a critical function of mine run about 7x slower than
using a prebuilt format string. For maximum flexibility, it would be
best to calculate the format string using this method, so I'd dearly
love to keep it.

def fmtstring(args):
delim = '\0'
fmt = []
fmt.append('<')
for arg in args:
t = type(arg)
if t == types.StringType:
l = len(arg)
fmt.append(str(l) + 's')
elif t == types.IntType:
fmt.append('i')
elif t == types.LongType:
fmt.append('q')
elif t == types.BooleanType:
fmt.append('c')
elif t == types.FloatType:
fmt.append('d')
else:
raise Exception("Can't pack argument of type %s!" % t)
s = ''.join(fmt)
s = s + '\0'
return s

I've tried this:
import types

def fmtstring(args):
delim = '\0'
fmt = []
fmt.append('<')
for arg in args:
t = type(arg)
if t == types.StringType:
l = len(arg)
fmt.append(str(l) + 's')
elif t == types.IntType:
fmt.append('i')
elif t == types.LongType:
fmt.append('q')
elif t == types.BooleanType:
fmt.append('c')
elif t == types.FloatType:
fmt.append('d')
else:
raise Exception("Can't pack argument of type %s!" % t)
s = ''.join(fmt)
s = s + '\0'
return s

typedic = {
types.StringType : 's',
types.IntType: 'i',
types.LongType: 'q',
types.BooleanType: 'c',
types.FloatType:'d'}

def myfmtstring(args):
global typedic
f2 = '<'
t = None
for arg in args:
t = type(arg)
if t == types.StringType:
f2 += str(len(arg))
try:
f2 += typedic[t]
except: #check the exception here!
raise Exception("Can't pack argument of type %s!" % t)
return f2+'\0'

if __name__=='__main__':
import unittest
TIMES = 10000

class TestFmtFuncBase(unittest.TestCase):
def setUp(self):
self.what = ['hola','que','tal',324,454,False]

def runFuncOnce(self):
#shouldnt be here!
assert(False)

def runFuncTimes(self, times=TIMES):
for i in xrange(times):
self.runFuncOnce()


class TestFunction1(TestFmtFuncBase):
def runFuncOnce(self):
return fmtstring(self.what)
def testSpeed(self):
"""Run function a lot of times to check its time"""
self.runFuncTimes()
def testValue(self):
"""Check return value"""
print self.runFuncOnce()

class TestFunction2(TestFmtFuncBase):
def runFuncOnce(self):
return myfmtstring(self.what)
def testSpeed(self):
"""Run function a lot of times to check its time"""
self.runFuncTimes()
def testValue(self):
"""Check return value"""
print self.runFuncOnce()


suiteOriginal = unittest.TestSuite()
suiteOriginal.addTest(unittest.makeSuite(TestFunction1))
suiteNew = unittest.TestSuite()
suiteNew.addTest(unittest.makeSuite(TestFunction2))

unittest.TextTestRunner(verbosity=2).run(suiteOriginal)
unittest.TextTestRunner(verbosity=2).run(suiteNew)

And seems to be not faster:

----------------------------------------------------------------------
Run function a lot of times to check its time ... ok
Check return value ... <4s3s3siic
ok

----------------------------------------------------------------------
Ran 2 tests in 0.477s

OK
Run function a lot of times to check its time ... ok
Check return value ... <4s3s3siic
ok

----------------------------------------------------------------------
Ran 2 tests in 0.396s

OK


But, anyway, the problem seems to be the:
f2 += str(len(arg))
line.

If you take it off, you'll see that time reduces to half!
Why? How to fix?
Don't know! :-(

Sorry,
dave
 
A

alejandro david weil

But, anyway, the problem seems to be the:
f2 += str(len(arg))
line.

If you take it off, you'll see that time reduces to half!
Why? How to fix?
Don't know! :-(

For example, this seems to be "better":

typedic = {
types.StringType : 's',
types.IntType: 'i',
types.LongType: 'q',
types.BooleanType: 'c',
types.FloatType:'d'}



slen = {}
for i in xrange(256):
slen=str(i)


def myfmtstring(args):
global typedic, slen
f2 = '<'
t = None
for arg in args:
t = type(arg)
if t == types.StringType:
try:
f2+=slen[len(arg)]
except:
print 'aca'
f2 += str(len(arg))
try:
f2 += typedic[t]
except: #check the exception here!
raise Exception("Can't pack argument of type %s!" % t)
return f2+'\0'
 
A

alejandro david weil

For example, this seems to be "better":

Well, that was not optimized and some debug print remains, this,
uses half time:

typedic = {
types.IntType: 'i',
types.LongType: 'q',
types.BooleanType: 'c',
types.FloatType:'d'}

slen = {}
for i in xrange(256):
slen=str(i)+'s'

def myfmtstring(args):
global typedic, slen
f2 = '<'
t = None
for arg in args:
t = type(arg)
if t == types.StringType:
try:
f2+=slen[len(arg)]
except:
f2 += str(len(arg))+'s'
else:
try:
f2 += typedic[t]
except: #check the exception here!
raise Exception("Can't pack argument of type %s!" % t)
return f2+'\0'

-> Ran 2 tests in 0.253s



Also I tried the except-catch method proposed, in this way, and got ugly results:
(removing StringType from typedic)

def my_slowest_fmtstring(args):
global typedic, slen
f2 = '<'
t = None
for arg in args:
t = type(arg)
try:
f2 += typedic[t]
except: #check the exception here!
if t == types.StringType:
try:
f2+=slen[len(arg)]
except:
f2 += str(len(arg))+'s'
else:
raise Exception("Can't pack argument of type %s!" % t)
return f2+'\0'

-> Ran 2 tests in 0.632s (worst than the first version! that gives: Ran 2 tests in 0.465s)
 
C

Christopher T King

What's the context you're using this in? Consider the following
possibilities:

Do you actually need to keep the format strings, or do are you just using
them to dump binary data to a file? If it's the latter, then just have the
function write the data directly, piece by piece, rather than building a
format string first.

Are you storing data that needs to be readable for other applications, or
are you just storing data for your own application's (temporary) needs? If
it's the latter, then consider using pickle or cPickle -- these
(especially the latter) will be much faster than doing it in Python. Don't
use them for long-term data storage, though.

If neither of the above apply to you, or give you the speedup you need,
try using the Psyco JIT compiler to compile the function. After installing
Psyco, all you need to do is to 'import psyco' and 'psyco.bind(fmtstring)'
at the top of the file your function resides in.

If that still doesn't help, you could try re-writing the code in pure C (I
doubt Pyrex would provide much more of a speedup than Psyco in this case).
Then you can use mutable, fixed-length strings to store the format
characters in.
 
W

William Park

Kamilche said:
I have a routine that I really need, but it slows down processing
significantly. Can you spot any ineffeciencies in the code?

This code makes a critical function of mine run about 7x slower than
using a prebuilt format string. For maximum flexibility, it would be
best to calculate the format string using this method, so I'd dearly
love to keep it.

def fmtstring(args):
delim = '\0'
fmt = []
fmt.append('<')
for arg in args:
t = type(arg)
if t == types.StringType:
l = len(arg)
fmt.append(str(l) + 's')
elif t == types.IntType:
fmt.append('i')
elif t == types.LongType:
fmt.append('q')
elif t == types.BooleanType:
fmt.append('c')
elif t == types.FloatType:
fmt.append('d')
else:
raise Exception("Can't pack argument of type %s!" % t)
s = ''.join(fmt)
s = s + '\0'
return s

String concatenation ('+') is main culprit. Avoid it.

Before After
------ -----
str(l) + 's' fmt.append (str(l))
fmt.append ('s')

s = ''.join(fmt) fmt.append ('\0')
s = s + '\0' ''.join(fmt)
 
K

Kamilche

Roy Smith said:
The first thing is to profile your code and make sure this really is
significant in the overall scheme of things. From your description, it
sounds like you've already done that, so I'll go with that assumption.

Yeah, it makes a difference. I did run profile on it, plus custom
timing tests. Sorry I didn't mention the typical arguments!

This routine gets passed anywhere from 5 to 20 smallish arguments,
less than 20 bytes in size. The most likely is an int, then a string,
then so on. I put them in order of likelihood already.

I tried a dictionary lookup for the argument types already, and it
slowed it down, so I took it out.

The best I was able to do, was change the 'list append' to a simple
string += . That speeded it up by 50%... only because the format
strings are relatively short, I imagine.

It's the need to calculate the strings separately that is slowing me
up! If they had a struct pack/unpack that didn't require you to pass
in a different format string just for strings of varying lengths, I'd
have my optimization - I'd put these function signatures in a hash
lookup table once I calculated it, so I'd never have to calculate it
again!

Too bad the struct pack/unpack routines require custom massaging for
strings. :-( I wish it had an option to specify a 'string delimiter'
for those of us that aren't passing binary data in the string, so it
could 'automagically' unpack the record, even with strings.
 
W

William Park

William Park said:
Kamilche said:
def fmtstring(args):
delim = '\0'
fmt = []
fmt.append('<')
for arg in args:
t = type(arg)
if t == types.StringType:
l = len(arg)
fmt.append(str(l) + 's')
elif t == types.IntType:
fmt.append('i')
elif t == types.LongType:
fmt.append('q')
elif t == types.BooleanType:
fmt.append('c')
elif t == types.FloatType:
fmt.append('d')
else:
raise Exception("Can't pack argument of type %s!" % t)
s = ''.join(fmt)
s = s + '\0'
return s

String concatenation ('+') is main culprit. Avoid it.

Before After
------ -----
str(l) + 's' fmt.append (str(l))
fmt.append ('s')

s = ''.join(fmt) fmt.append ('\0')
s = s + '\0' ''.join(fmt)

Also, change 'fmt.append' and 'types.*' with a version without '.', ie.
fmtappend = fmt.append
types_s = types.StringType
stringjoin = string.join
 
A

alejandro david weil

If neither of the above apply to you, or give you the speedup you need,
try using the Psyco JIT compiler to compile the function. After installing
Psyco, all you need to do is to 'import psyco' and 'psyco.bind(fmtstring)'
at the top of the file your function resides in.

I had never tried psyco, so I did it:

Some more stats (arguments given: [ 'hola','que','tal',324,454,False] )

no psyco | psyco
original code 4.664s | 2.132s
modified ver 2.222s | 0.840s

print 4.66 / 0.84
-> 5.54..

Not yet 7x faster :-( but 5.5 is very good i think.

modified version:
slen = []
for i in xrange(256):
slen.append(str(i)+'s'
typedic = {
types.IntType: 'i',
types.LongType: 'q',
types.BooleanType: 'c',
types.FloatType:'d'}

def myfmtstring(args):
global typedic, slen
flen = slen
ftypedic = typedic
f2 = '<'
st = types.StringType
for arg in args:
t = type(arg)
if t == st:
l = len(arg)
if l<256:
f2+=flen[l]
else:
f2+=str(l)+'s'
else:
try:
f2+=ftypedic[t]
except:
raise Exception("Can't pack argument of type %s!" % t)
return f2+'\0'

Saludos,
alejandro
 
Y

Ype Kingma

alejandro said:
....


But, anyway, the problem seems to be the:
f2 += str(len(arg))
line.

You can try this (untested):

def myfmtstring(args):
        global typedic
        f2l = ['<']
        t = None
        for arg in args:
                t = type(arg)
                if t == types.StringType:
                        f2l.append(str(len(arg)))
                try:
                        f2l.append(typedic[t])
                except: #check the exception here!
                        raise Exception("Can't pack argument of type %s!" % t)
        fl2.append('\0')
return ''.join(f2l)
If you take it off, you'll see that time reduces to half!
Why? How to fix?

Appending to a string copies the original completely.

Good luck,
Ype
 
A

Andrea Griffini

I have a routine that I really need, but it slows down processing
significantly. Can you spot any ineffeciencies in the code?

This code makes a critical function of mine run about 7x slower than
using a prebuilt format string. For maximum flexibility, it would be
best to calculate the format string using this method, so I'd dearly
love to keep it.

After making a few experiments I found this faster...

def fmtstring5(args,
_type = type,
_len = len,
_str = str,
_int = int,
_long = long,
_bool = bool,
_float = float):
fmt = "<"
for arg in args:
t = _type(arg)
if t is _str:
l = _len(arg)
fmt = fmt + _str(_len(arg)) + 's'
elif t is _int:
fmt = fmt + 'i'
elif t is _long:
fmt = fmt + 'q'
elif t is _bool:
fmt = fmt + 'c'
elif t is _float:
fmt = fmt + 'd'
else:
raise Exception("Can't pack argument of type %s!" % t)
return fmt+'\0'

In other words I tried to remove dynamic lookups for predefined
functions and used "is int" instead of the "types." stuff.
s = s + x seems faster than s += x, and s = s + x seems faster
that append & "".join also (at least for small formats).

I tried to remove all setup cost (using a generator acting on
a global variable), but gained nothing (and indeed that was a
bit slower). I tried also to remove dynamic lookups for the
addition (concatenation), but in no case I found results better
than the plain vanilla addition syntax.

HTH
Andrea

PS: I'm new to python, and found its speed interesting and
more than adequate for many uses... It's however kind
of depressing seeing that so much dinamicity is "wasted"
in places where it's not needed (e.g. when calling "len"
or "str"). This version was running on the examples I
tested about twice the speed of the original version.

So I suppose that the same is happening all over my
python code :-(
 
A

alejandro david weil

After making a few experiments I found this faster...

def fmtstring5(args, [..]
t = _type(arg)
if t is _str:
l = _len(arg)
fmt = fmt + _str(_len(arg)) + 's'
elif t is _int: [..]
else:
raise Exception("Can't pack argument of type %s!" % t)
return fmt+'\0'
PS: I'm new to python, and found its speed interesting and
more than adequate for many uses... It's however kind
of depressing seeing that so much dinamicity is "wasted"
in places where it's not needed (e.g. when calling "len"
or "str"). This version was running on the examples I
tested about twice the speed of the original version.

It seems that no one has read my last post :)

You can replace str() with an array access for many of the strings as you want..
let's say..

slen = []
for i in xrange(256):
slen.append(str(i))

And then, in your function:

def mystrbla..():
...
try:
fmt += slen[_len(arg)] + 's'
except:
fmt += str(_len(arg)) + 's'
Please, try it! :)

------------------------------------------------------------------

And, another improvment is add the 's' to the
slen array :) and..:

slen = []
for i in xrange(256):
slen.append(str(i)+'s')

def mystrbla..():
...
l = _len(arg)
if l <256:
fmt += slen[l] + 's'
except:
fmt += str(l) + 's'


Unreaded greetings!
alejandro
 
A

Andrea Griffini

It seems that no one has read my last post :)

Newsfeeding isn't synchronous by itself. News reading is
even less deterministic :)

....
And, another improvment is add the 's' to the
slen array :) and..:

Original version 1.640
My version 0.782
Improved with slen[] 0.687

What however I find depressing is thinking that in so
many places the python runtime is wasting time looking
for something I'd never do (like replacing "len" or "str").
And for a function like the one the OP posted looks like
we're talking about more than 50% of the time being spent
doing those pointless searches.

It's IMO true that python is "fast enough" for most
situations... but ... what if I get sued from the
association for the rights of CPUs ? :)

I didn't check python sources, is there a timestamping
for namespaces making it possible to cache lookups ?
Was this tried and didn't pay off ?

Andrea

PS: Sorry if all this is hyper-naive...
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,767
Messages
2,569,572
Members
45,045
Latest member
DRCM

Latest Threads

Top