Qertoip
Could you suggest any improvements to the following code?
I want to make my implementation as simple, as Python-native, and as clean as
possible.
I've written a simple script that reads an input text file and ranks the words
by number of appearances.
Code:
---------------------------------------------------------------------------
import sys

def moreCommonWord( x, y ):
    if x[1] != y[1]:
        return cmp( x[1], y[1] ) * -1
    return cmp( x[0], y[0] )

wordsDic = {}
inFile = open( sys.argv[1] )
for word in inFile.read().split():
    if wordsDic.has_key( word ):
        wordsDic[word] = wordsDic[word] + 1
    else:
        wordsDic[word] = 1
inFile.close()

wordsLst = wordsDic.items()
wordsLst.sort( moreCommonWord )

outFile = open( sys.argv[2], 'w')
for pair in wordsLst:
    outFile.write( str( pair[1] ).rjust( 7 ) + " : " + str( pair[0] ) + "\n" )
outFile.close()
---------------------------------------------------------------------------
In particular, I don't like reading the whole file just to split it.
It is easy to read by lines; can I read by words just as easily?
PS I've been learning Python since this morning, so be understanding :>
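
To make the "read by words" idea concrete, here is a minimal sketch of one way it might look: a generator that walks the input line by line and yields one word at a time, so the whole file never has to be in memory. The names iter_words, count_words and rank are hypothetical helpers invented for this illustration, not part of the original script or of the standard library, and the sample data is made up.

```python
def iter_words(lines):
    """Yield words one at a time from any iterable of lines.

    An open file object is itself an iterable of lines, so
    iter_words(open(path)) would read the file lazily.
    """
    for line in lines:
        for word in line.split():
            yield word


def count_words(words):
    """Count occurrences of each word (hypothetical helper)."""
    counts = {}
    for word in words:
        # dict.get avoids the explicit "is the key present?" branch
        counts[word] = counts.get(word, 0) + 1
    return counts


def rank(counts):
    """Sort by descending count, then alphabetically by word."""
    return sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))


# Made-up sample input standing in for the lines of a text file.
sample_lines = ["b a b\n", "c a b\n"]
ranking = rank(count_words(iter_words(sample_lines)))
for word, count in ranking:
    print(str(count).rjust(7) + " : " + word)
```

The sort key (-count, word) is meant to give the same ordering as the moreCommonWord comparison: most frequent first, ties broken alphabetically.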