Jacob Lee
There are a bunch of new tests up at shootout.alioth.debian.org for which
Python does not yet have code. I've taken a crack at one of them, a task
to print the reverse complement of a gene transcription. Since there are a
lot of minds on this newsgroup that are much better at optimization than
I, I'm posting the code I came up with to see if anyone sees any
opportunities for substantial improvement. Without further ado:
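
# Python 2 (print statement, string.maketrans)
import string
import sys

# Map each nucleotide code to its complement; newline maps to itself.
table = string.maketrans('ACBDGHK\nMNSRUTWVY', 'TGVHCDM\nKNSYAAWBR')

def show(s):
    # Write the reverse complement of s, wrapped at 60 characters per line.
    i = 0
    for char in s.upper().translate(table)[::-1]:
        if i == 60:
            print
            i = 0
        sys.stdout.write(char)
        i += 1
    print

def main():
    seq = ''
    for line in sys.stdin:
        if line[0] == '>' or line[0] == ';':
            # Header or comment line: flush the pending sequence, then echo it.
            if seq != '':
                show(seq)
                seq = ''
            print line,
        else:
            # Sequence line: drop the trailing newline and accumulate.
            seq += line[:-1]
    show(seq)

main()
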
Making seq into a list instead of a string (and using .extend instead of
the + operator) didn't give any speed improvements. Neither did using a
dictionary instead of the translate function, or using reversed() instead
of s[::-1]. That last result surprised me, since I would have guessed that an
iterator would be more efficient. Since the shootout also measures memory usage,
should I be using reversed() for that reason? Does anyone have any other
ideas to optimize this code?
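
For anyone who wants to compare, here is a rough sketch of what the variants mentioned above might look like. The names COMPLEMENT, collect, revcomp_reversed and revcomp_slice are made up for illustration, and this is not necessarily the exact code that was timed. It sticks to constructs that should behave the same under Python 2 and Python 3.

```python
# Hypothetical dictionary replacement for the translate() table.
COMPLEMENT = dict(zip('ACBDGHKMNSRUTWVY', 'TGVHCDMKNSYAAWBR'))

def collect(lines):
    """Accumulate the sequence as a list of characters using .extend."""
    chars = []
    for line in lines:
        # Extending a list with a string appends its characters one by one.
        chars.extend(line.rstrip('\n'))
    return chars

def revcomp_reversed(chars):
    """Reverse complement via the reversed() iterator (no reversed copy)."""
    return ''.join(COMPLEMENT[c.upper()] for c in reversed(chars))

def revcomp_slice(chars):
    """Reverse complement via [::-1], which builds a reversed copy first."""
    return ''.join(COMPLEMENT[c.upper()] for c in chars[::-1])
```

Both revcomp functions should give the same result for the same input. The only difference is whether a reversed copy of the sequence is created before the lookup, which is the part that might matter for memory rather than speed.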
By the way, is there a good way to find out the maximum memory a program
used (in the manner of the "time" command), other than downloading and
running the shootout benchmark scripts, of course?
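
One possibility that may be worth trying, assuming a Unix-like system, is the standard library's resource module: after a child process has been waited for, resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss reports the largest resident set size among the finished children. The wrapper name peak_child_memory below is hypothetical, and this is only a sketch, not how the shootout itself measures memory.

```python
import resource
import subprocess
import sys

def peak_child_memory(argv):
    """Run argv to completion; return (exit_status, peak_rss).

    peak_rss is the maximum over all children waited for so far, so the
    number is only meaningful if this is called once per interpreter run.
    The unit is platform dependent (kilobytes on Linux, bytes on macOS).
    """
    status = subprocess.call(argv)
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    return status, usage.ru_maxrss

if __name__ == '__main__':
    # Example: measure a throwaway child that allocates a large string.
    status, peak = peak_child_memory([sys.executable, '-c', 'x = "a" * 10**7'])
    sys.stdout.write('exit status %d, peak RSS %d\n' % (status, peak))
```

This measures resident memory of the whole child process, interpreter included, so it is best used to compare two versions of the same script rather than as an absolute figure.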