Slow network reading?

Ivan Voras · May 11, 2006

I have a simple network protocol client (it's a part of this:
http://sqlcached.sourceforge.net) implemented in Python, PHP and C.
Everything's fine, except that the Python implementation is the slowest
- up to 30% slower than the PHP version (which implements exactly the
same logic, in a class).

In typical usage (also in the benchmark), an object is created and
.query is called repeatedly. Typical numbers for the benchmark are:

For Python version:

Timing 100000 INSERTs...
5964.4 qps
Timing 100000 SELECTs...
7491.0 qps

For PHP version:

Timing 100000 inserts...
7820.2 qps
Timing 100000 selects...
9926.2 qps
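
For reference, the relative gap implied by the figures above can be worked out with a few lines of arithmetic. This is only a sketch in modern Python; the helper name percent_slower is illustrative and not part of the client. With these particular numbers the gap comes out at roughly 24% for both INSERTs and SELECTs.

```python
# Throughput figures (queries per second) copied from the benchmark output above.
python_qps = {"insert": 5964.4, "select": 7491.0}
php_qps = {"insert": 7820.2, "select": 9926.2}


def percent_slower(slow, fast):
    """Return how much lower `slow` is than `fast`, as a percentage of `fast`."""
    return (1.0 - slow / fast) * 100.0


slowdown = {op: percent_slower(python_qps[op], php_qps[op]) for op in python_qps}

for op, pct in sorted(slowdown.items()):
    print("%s: Python is %.1f%% slower than PHP" % (op, pct))
```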

The main part of the client class is:

----

import os, socket, re

class SQLCacheD_Exception(Exception):
    pass

class SQLCacheD:

    DEFAULT_UNIX_SOCKET = '/tmp/sqlcached.sock'
    SC_VER_SIG = 'sqlcached-1'
    SOCK_UNIX = 'unix'
    SOCK_TCP = 'tcp'

    re_rec = re.compile(r"\+REC (\d+), (\d+)")
    re_ok = re.compile(r"\+OK (.+)")
    re_ver = re.compile(r"\+VER (.+)")

    def __init__(self, host = '/tmp/sqlcached.sock', type = 'unix'):
        if type != SQLCacheD.SOCK_UNIX:
            # only Unix sockets are handled here
            raise SQLCacheD_Exception("Unsupported socket type: %s" % type)

        self.sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        self.sock.connect(host)
        self.sf = self.sock.makefile('U', 4000)

        # handshake: send our version signature and expect it echoed back
        self.sf.write("VER %s\r\n" % SQLCacheD.SC_VER_SIG)
        self.sf.flush()
        if self.sf.readline().rstrip() != '+VER %s' % SQLCacheD.SC_VER_SIG:
            raise SQLCacheD_Exception("Handshake failure (invalid version signature?)")

    def query(self, sql):
        self.sf.write("SQL %s\r\n" % sql)
        self.sf.flush()
        resp = self.sf.readline().rstrip()
        m = SQLCacheD.re_rec.match(resp)
        if m != None: # only if some rows are returned (SELECT)
            n_rows = int(m.group(1))
            n_cols = int(m.group(2))
            cols = []
            for c in xrange(n_cols):
                cols.append(self.sf.readline().rstrip())
            rs = []
            for r in xrange(n_rows):
                row = {}
                for c in cols:
                    row[c] = self.sf.readline().rstrip()
                rs.append(row)
            return rs
        m = SQLCacheD.re_ok.match(resp)
        if m != None: # no rows returned (e.g. INSERT/UPDATE/DELETE)
            return True
        raise SQLCacheD_Exception(resp)

----

My question is: am I missing something obvious? The C implementation is
(as expected) the fastest, with a result of 10000:15000 (INSERT:SELECT
qps), but somehow I expected the Python one to be closer to, or even
faster than, PHP.

I tried using 'r' mode for .makefile() but it had no significant effect.

Andrew MacIntyre · May 11, 2006

Ivan said:
    def query(self, sql):
        self.sf.write("SQL %s\r\n" % sql)
        self.sf.flush()
        resp = self.sf.readline().rstrip()
        m = SQLCacheD.re_rec.match(resp)
        if m != None: # only if some rows are returned (SELECT)
            n_rows = int(m.group(1))
            n_cols = int(m.group(2))
            cols = []
            for c in xrange(n_cols):
                cols.append(self.sf.readline().rstrip())
            rs = []
            for r in xrange(n_rows):
                row = {}
                for c in cols:
                    row[c] = self.sf.readline().rstrip()
                rs.append(row)
            return rs
        m = SQLCacheD.re_ok.match(resp)
        if m != None: # no rows returned (e.g. INSERT/UPDATE/DELETE)
            return True
        raise SQLCacheD_Exception(resp)

Comparative CPU & memory utilisation statistics, not to mention platform
and version of Python, would be useful hints...

Note that the file-like object returned by makefile() has significant
portions of heavy lifting code in Python rather than C which can be a
drag on ultimate performance... If on a Unix platform, it may be worth
experimenting with os.fdopen() on the socket's fileno() to see whether
the core Python file object (implemented in C) can be used in place of
the lookalike returned from the makefile method.
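
As a rough illustration of the os.fdopen() suggestion, the sketch below wraps a socket's descriptor in an ordinary file object and reads lines from it. It assumes a Unix platform and modern Python 3 (where os.fdopen() returns an io object rather than the 2.x file type); the helper name open_line_reader is hypothetical, and a local socket pair stands in for the real server connection.

```python
import os
import socket


def open_line_reader(sock, bufsize=8192):
    """Wrap a connected socket's descriptor in a regular buffered file object."""
    # Duplicate the descriptor so closing the file object leaves the socket open.
    fd = os.dup(sock.fileno())
    return os.fdopen(fd, "rb", bufsize)


# A socket pair stands in for the client/server connection (assumption for the demo).
client, server = socket.socketpair()
server.sendall(b"+OK 1\r\n+REC 2, 1\r\n")
server.close()

reader = open_line_reader(client)
first = reader.readline().rstrip()
second = reader.readline().rstrip()
reader.close()
client.close()
```

Duplicating the descriptor with os.dup() is a design choice made here so that the file object and the socket can be closed independently; whether this is faster than makefile() would have to be measured.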

Even without that, you are specifying a buffer size smaller than the
default (8k - see Lib/socket.py). 16k might be even better.

Although they're only micro-optimisations, I'd be interested in the
relative performance of the query method re-written as:

    def query(self, sql):
        self.sf.write("SQL %s\r\n" % sql)
        self.sf.flush()
        sf_readline = self.sf.readline
        resp = sf_readline().rstrip()
        m = self.re_rec.match(resp)
        if m is not None:
            # some rows are returned (SELECT)
            # range() returns a list in Python 2, so these are pre-sized lists
            rows = range(int(m.group(1)))
            cols = range(int(m.group(2)))
            for c in cols:
                cols[c] = sf_readline().rstrip()
            for r in rows:
                row = {}
                for c in cols:
                    row[c] = sf_readline().rstrip()
                rows[r] = row
            return rows
        elif self.re_ok.match(resp) is not None:
            # no rows returned (e.g. INSERT/UPDATE/DELETE)
            return True
        raise SQLCacheD_Exception(resp)

This implementation is based on 2 strategies for better performance:
- minimise name lookups by hoisting references from outside the method
to local references;
- pre-allocate lists when the required sizes are known, to avoid the
costs associated with growing them.

Both strategies can pay fair dividends when the repetition counts are
large enough; whether this is the case for your tests I can't say.
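
The two strategies can be shown in isolation with a small sketch. It is written for modern Python 3 and is not part of the client; the function name read_rows and the in-memory sample data are illustrative assumptions.

```python
import io


def read_rows(stream, n_rows, col_names):
    """Read n_rows records, one line per column, from a file-like object."""
    readline = stream.readline      # hoist the bound method into a local name
    rows = [None] * n_rows          # pre-allocate: the row count is known up front
    for r in range(n_rows):
        row = {}
        for name in col_names:
            row[name] = readline().rstrip()
        rows[r] = row
    return rows


# In-memory stand-in for the server's response lines (assumption for the demo).
sample = io.StringIO("1\r\nalice\r\n2\r\nbob\r\n")
result = read_rows(sample, 2, ["id", "name"])
```

The local name saves an attribute lookup on every call inside the loops, and assigning into a pre-sized list avoids repeated append() calls.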


Ivan Voras · May 12, 2006

Andrew said:
Comparative CPU & memory utilisation statistics, not to mention platform
and version of Python, would be useful hints...

During benchmarking, all versions use all available CPU, but the Python
version gets ~1.5x more CPU time than the PHP one. Python is version 2.4.1.

Andrew said:
Note that the file-like object returned by makefile() has significant
portions of heavy lifting code in Python rather than C which can be a
drag on ultimate performance... If on a Unix platform, it may be worth
experimenting with os.fdopen() on the socket's fileno() to see whether
the core Python file object (implemented in C) can be used in place of
the lookalike returned from the makefile method.

That's only because I need the .readline() function. In C, I'm using
fgets() (with the expectation that stdio will buffer the data).

Andrew said:
Even without that, you are specifying a buffer size smaller than the
default (8k - see Lib/socket.py). 16k might be even better.

The benchmark is such that all of the data is < 200 bytes. I estimate that
in production almost all protocol data will be < 4KB.

Andrew said:
Although they're only micro-optimisations, I'd be interested in the
relative performance of the query method re-written as:

The change (for the better) is minor (3-5%).

Andrew MacIntyre · May 13, 2006

Ivan said:
During benchmarking, all versions use all available CPU, but the Python
version gets ~1.5x more CPU time than the PHP one. Python is version 2.4.1.

A pretty fair indication of the Python interpreter doing a lot more work...

Ivan said:
That's only because I need the .readline() function. In C, I'm using
fgets() (with the expectation that stdio will buffer the data).

The readline method of the file object lookalike returned by makefile
implements all of the line splitting logic in Python code, which is very
likely where the extra process CPU time is going. Note that this code is
in Python for portability reasons, as Windows socket handles cannot be
used as file handles the way socket handles on Unix systems can be.

If you are running on Windows, a fair bit of work will be required to
improve performance, as the line splitting logic needs to be moved to
native code - I wonder whether psyco could do anything with this?
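
One portable way to cut down the interpreted per-line work is to receive data in large chunks and leave the searching and splitting to the built-in bytes methods, which run in native code. The sketch below assumes modern Python 3 and a blocking socket; the LineReader class is hypothetical, not part of the standard library or of the client, and it has not been benchmarked against makefile().

```python
import socket


class LineReader:
    """Minimal buffered line reader on top of socket.recv()."""

    def __init__(self, sock, chunk_size=8192):
        self.sock = sock
        self.chunk_size = chunk_size
        self.buf = b""

    def readline(self):
        """Return the next line without its line ending (b"" at end of stream)."""
        while b"\n" not in self.buf:
            chunk = self.sock.recv(self.chunk_size)
            if not chunk:           # peer closed the connection
                break
            self.buf += chunk
        # partition() does the actual splitting in native code
        line, _, self.buf = self.buf.partition(b"\n")
        return line.rstrip(b"\r")


# A socket pair stands in for the server connection (assumption for the demo).
client, server = socket.socketpair()
server.sendall(b"+REC 1, 2\r\nid\r\nname\r\n")
server.close()

reader = LineReader(client)
lines = [reader.readline() for _ in range(3)]
at_eof = reader.readline()
client.close()
```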

Ivan said:
The benchmark is such that all of the data is < 200 bytes. I estimate that
in production almost all protocol data will be < 4KB.

A matter of taste perhaps, but that seems to me like another reason not
to bother with a non-default buffer size.

Ivan said:
The change (for the better) is minor (3-5%).

Given your comments above about how much data is actually involved, I'm
a bit surprised that the tweaked version actually produced a measurable
gain.

Ivan Voras · May 13, 2006

Andrew said:
The readline method of the file object lookalike returned by makefile
implements all of the line splitting logic in Python code, which is very
likely where the extra process CPU time is going. Note that this code is

Heh, I didn't know that - you're probably right about this being a
possible bottleneck.

Andrew said:
in Python for portability reasons, as Windows socket handles cannot be
used as file handles the way socket handles on Unix systems can be.

I think they actually can in NT and above... but no, I'm doing it on Unix.

Andrew said:
Given your comments above about how much data is actually involved, I'm
a bit surprised that the tweaked version actually produced a measurable
gain.

I didn't do a statistical analysis of the results, so the difference
could actually be negligible in real life.
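
A crude way to check whether a gain of a few percent is more than noise is to repeat each benchmark several times and compare the difference in means with the run-to-run spread. The sketch below assumes modern Python 3; the function names measure_qps and differs_noticeably are illustrative, and the comparison is a rule of thumb rather than a proper significance test.

```python
import statistics
import time


def measure_qps(func, n_calls, repeats=5):
    """Call func n_calls times per trial and return the qps of each trial."""
    results = []
    for _ in range(repeats):
        start = time.perf_counter()
        for _ in range(n_calls):
            func()
        elapsed = time.perf_counter() - start
        results.append(n_calls / elapsed)
    return results


def differs_noticeably(a, b):
    """Rule of thumb: do the means differ by more than the combined spread?"""
    spread = statistics.stdev(a) + statistics.stdev(b)
    return abs(statistics.mean(a) - statistics.mean(b)) > spread
```

For example, each variant of query() could be passed to measure_qps() wrapped in a lambda, and the two result lists handed to differs_noticeably().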

Anyway, thanks for the advice - I'll leave it as it is, as the Python
client is not used currently.
