I need help speeding up an app that reads football scores and generates rankings

J

jocknerd

About 10 years ago, I wrote a C app that would read scores from
football games and calculate rankings based on the outcome of the
games. In fact, I still use this app. You can view my rankings at
http://members.cox.net/jocknerd/football.

A couple of years ago, I got interested in Python and decided to
rewrite my app in Python. I got it to work but its painfully slow
compared to the C app. I have a file containing scores of over 1500
high school football games for last season. With my Python app, it
takes about 3 minutes to process the rankings. With my C app, it
processes the rankings in less than 15 seconds.

The biggest difference in my two apps is the C app uses linked lists.
I feel my Python app is doing too many lookups which is causing the
bottleneck.

I'd love some feedback regarding how I can improve the app. I'd like
to drop the C app eventually. Its really ugly. My goal is to
eventually get the data stored in PostgreSQL and then have a Django
powered site to process and display my rankings.

You can download the source code from http://members.cox.net/jocknerd/downloads/fbratings.py
and the data file from http://members.cox.net/jocknerd/downloads/vhsf2006.txt

Thanks!
 
M

Marc 'BlackJack' Rintsch

The biggest difference in my two apps is the C app uses linked lists.
I feel my Python app is doing too many lookups which is causing the
bottleneck.

Then replace those linear searches you wrote in Python with a dictionary.

Ciao,
Marc 'BlackJack' Rintsch
 
T

Terry Reedy

| About 10 years ago, I wrote a C app that would read scores from
| football games and calculate rankings based on the outcome of the
| games. In fact, I still use this app. You can view my rankings at
| http://members.cox.net/jocknerd/football.
|
| A couple of years ago, I got interested in Python and decided to
| rewrite my app in Python. I got it to work but its painfully slow
| compared to the C app. I have a file containing scores of over 1500
| high school football games for last season. With my Python app, it
| takes about 3 minutes to process the rankings. With my C app, it
| processes the rankings in less than 15 seconds.

A ratio of 12 to 1 is not bad. However....

| The biggest difference in my two apps is the C app uses linked lists.
| I feel my Python app is doing too many lookups which is causing the
| bottleneck.

You have to do as many lookups as you have to do, but looking up teams by
name in a linear scan of a list is about the slowest way possible. Replace
'teamlist' with a dict 'teams' keyed by team name. Replace
'lookupTeam(team)' by 'if team not in teams: addTeam(team)' and delete the
lookupTeam function. Similarly 'lookupTeamRate(team)' becomes
'teams[team]['grate'] (and delete function). And
'updateTeamRate(team,rate)' becomes teams[team]['rate'] = rate' (and delete
function. And similarly for updateTeamRating and anything else using
teamlist. In many places, multiple lookups in teams could be eliminated.
For instance, 'team1 = teams[g['team1']]. Then use 'team1' to manipulate
its rating and other attributes.

| You can download the source code from
http://members.cox.net/jocknerd/downloads/fbratings.py
| and the data file from
http://members.cox.net/jocknerd/downloads/vhsf2006.txt

Minor point. Multiple functions do 'localvar = <expression>; return
localvar'. The simpler 'return <expression>' will be slightly faster.
Your comments and function name eliminate any documentary need for the
otherwise useless local var.

Function calls are relatively slow in Python. So calling
def totalPtsGame (score1, score2): return score1 + score2
is slower than simply adding the scores 'in place'.

Terry Jan Reedy


You can also, people say, use the profiler to find where time is going.
 
A

Arnaud Delobelle

About 10 years ago, I wrote a C app that would read scores from
football games and calculate rankings based on the outcome of the
games. In fact, I still use this app. You can view my rankings athttp://members.cox.net/jocknerd/football.

A couple of years ago, I got interested in Python and decided to
rewrite my app in Python. I got it to work but its painfully slow
compared to the C app. I have a file containing scores of over 1500
high school football games for last season. With my Python app, it
takes about 3 minutes to process the rankings. With my C app, it
processes the rankings in less than 15 seconds.

The biggest difference in my two apps is the C app uses linked lists.
I feel my Python app is doing too many lookups which is causing the
bottleneck.

I'd love some feedback regarding how I can improve the app. I'd like
to drop the C app eventually. Its really ugly. My goal is to
eventually get the data stored in PostgreSQL and then have a Django
powered site to process and display my rankings.

You can download the source code fromhttp://members.cox.net/jocknerd/downloads/fbratings.py
and the data file fromhttp://members.cox.net/jocknerd/downloads/vhsf2006.txt

Thanks!

A simple improvement is to change your list of teams('teamlist') to a
dictionary of teams (call it say 'teamdict') mapping team names to
teams.

You have lots of
#Some code
for row in teamlist:
if teamname == row['name']:
#Do something with row

These can all be replaced with:
#Some code
row = teamdict[teamname]
#Do something with row

(Although I wouldn't call it 'row' but rather 'team')

That may speed up your code significantly.

Moreover you can make the main loop (in calcTeamRatings) faster by
avoiding looking up a team each time you need some info on it.

Finally I would change your schedule list to a list of tuples rather
than a list of dictionaries: each game in the schedule would be a
tuple (team1, team2, ratio) and wouldn't include the actual team
scores as you don't seem to use them in your calcTeamRatings function
(that means moving the ratio calculation into the loop that creates
the schedule)

Disclaimer: I only looked at your code superficially and I don't claim
to understand it !

HTH
 
G

Gabriel Genellina

En Wed, 02 May 2007 12:16:56 -0300, Marc 'BlackJack' Rintsch
Then replace those linear searches you wrote in Python with a dictionary.

As an example: using a Team object instead of a dictionary, and using
teamlist (not a good name now) as a dictionary of Team objects indexed by
name:

def lookupTeam (teamname):
team = teamlist.get(teamname)
if team is None:
teamlist[teamname] = team = Team(teamname)
return team

def updateTeamStats (tname1, score1, tname2, score2):
team1 = lookupTeam (tname1)
team2 = lookupTeam (tname2)

team1.pf += score1
team1.pa += score2
if (score1 > score2):
team1.won += 1
elif (score1 < score2):
team1.lost += 1
else:
team1.tied += 1

team2.pf += score2
team2.pa += score1
if (score1 < score2):
team2.won += 1
elif (score1 > score2):
team2.lost += 1
else:
team2.tied += 1

Then you should realize that those last two blocks are too similar, and
you can make a function of it. And then you realize that in fact they act
on a Team object, so you should make a Team method...
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
474,045
Messages
2,570,389
Members
47,052
Latest member
ketan

Latest Threads

Top