Peter v.d. Berger
Hello,
I'm working on a script that lines up the results of soccer games from
different seasons, to show the history of each game.
I've gathered a lot of scores from different websites on a FreeBSD
webserver. The scores are stored in one directory per season, named after
the season, with the names of the two teams as the filename.
So for example results of the game 'AC Milan - Ajax' are in different files
for different seasons:
../0405/AC Milan - Ajax.txt
../0304/AC Milan - Ajax.txt
../0203/AC Milan - Ajax.txt
(team names separated by '-')
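To make the layout concrete, here is a minimal sketch of how such a path could be split into season and team names. It is written in Python purely for illustration; the function name `parse_result_path` is made up, and it assumes the separator is always ' - ' with spaces around it and that no team name contains that sequence.

```python
from pathlib import PurePosixPath


def parse_result_path(path):
    """Split '<season>/<home> - <away>.txt' into (season, home, away).

    Hypothetical helper: assumes the season is the name of the directory
    that holds the file and that the teams are separated by ' - '.
    """
    p = PurePosixPath(path)
    season = p.parent.name
    # p.stem is the filename without the '.txt' extension
    home, sep, away = p.stem.partition(" - ")
    if not sep:
        raise ValueError(f"no ' - ' separator in {p.name!r}")
    return season, home.strip(), away.strip()


if __name__ == "__main__":
    print(parse_result_path("../0405/AC Milan - Ajax.txt"))
```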
My script creates an HTML page with an overview of the results of all
seasons.
The problem is that I gathered the team names for the results from
different websites, and they are not consistent: some websites use
'AC Milan', others just 'Milan'. Some websites use the name 'Ajax',
others 'Ajax FC', and others 'Ajax Amsterdam'.
Since I gathered results for hundreds of teams, in tens of thousands of
files, renaming all the files is not an option.
Is there a way to improve the matching of these files, with the knowledge
that:
- two- or three-character strings can be left out (like FC, Utd.)
- a match should be made when, for example, two out of three names in the
filename match
(like: the game 'name1 name2 - name3' matches both 'name1 - name3' and
'name2 - name3')
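To illustrate the kind of matching I mean, here is a rough sketch of the two rules above. It is in Python only as an example, all function names are made up, and it assumes that two names refer to the same team as soon as they share one word longer than three characters. That assumption is too loose for names like 'Manchester United' and 'Manchester City', which would also match.

```python
import re


def name_tokens(team):
    """Lowercased words of a team name, without short ones such as FC or Utd."""
    words = re.findall(r"\w+", team.lower())
    long_words = {w for w in words if len(w) > 3}
    # Keep the short words if nothing else is left (e.g. 'PSV').
    return long_words or set(words)


def same_team(a, b):
    """Assumed rule: two names match if they share at least one word."""
    return bool(name_tokens(a) & name_tokens(b))


def same_game(game_a, game_b):
    """Compare two 'home - away' strings team by team, in the same order."""
    home_a, _, away_a = game_a.partition(" - ")
    home_b, _, away_b = game_b.partition(" - ")
    return same_team(home_a, home_b) and same_team(away_a, away_b)


if __name__ == "__main__":
    print(same_game("AC Milan - Ajax", "Milan - Ajax Amsterdam"))
    print(same_game("name1 name2 - name3", "name2 - name3"))
```

Home and away sides are compared separately here, so 'AC Milan - Ajax' and 'Ajax - AC Milan' count as different games.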
I hope I have made my question clear and that someone can help me.
Thanks!