Counting how often the same word appears in a txt file... but my code only prints the last line's entry

dgcosgrave · Dec 19, 2012

Hi, I am just starting out with Python... My code below turns each line of the txt file into a list of words, adds the words to an empty dictionary and prints how often each word occurs, but it only seems to recognise and print the last entry of the txt file. Any help would be great.

tm =open('ask.txt', 'r')
dict = {}
for line in tm:
    line = line.strip()
    line = line.translate(None, '!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~')
    line = line.lower()
    list = line.split(' ')
for word in list:
    if word in dict:
        count = dict[word]
        count += 1
        dict[word] = count
else:
    dict[word] = 1
for word, count in dict.iteritems():
    print word + ":" + str(count)

Jussi Piitulainen · Dec 19, 2012

The "else" clause is mis-indented (rather, mis-unindented).

Python's "for" statement does have an optional "else" clause. That's
why you don't get a syntax error. The "else" clause is used after the
loop finishes normally. That's why it catches the last word.
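
To see the effect in isolation, here is a minimal sketch of the same mis-indentation. It assumes Python 3 (so print is a function) and uses a made-up list, sample_lines, in place of the lines of ask.txt; the punctuation handling is left out to keep it short.

```python
# sample_lines is a hypothetical stand-in for the lines of ask.txt.
sample_lines = ["the cat", "the dog the end"]

counts = {}
for line in sample_lines:
    words = line.split()
# Mis-indented: this loop starts only after the loop above has finished,
# so `words` holds just the words of the last line.
for word in words:
    if word in counts:
        counts[word] += 1
else:
    # This "else" belongs to the "for", not to the "if". It runs once,
    # after the loop ends, while `word` still holds the last word.
    counts[word] = 1

print(counts)  # {'end': 1}
```

Nothing is ever added inside the loop, because the dictionary starts empty and the "if" has no "else" of its own. Only the last word of the last line ends up in the dictionary.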

Steven D'Aprano · Dec 19, 2012

Hi I am just starting out with python...My code below changes the txt
file into a list and add them to an empty dictionary and print how often
the word occurs, but it only seems to recognise and print the last entry
of the txt file. Any help would be great.

tm =open('ask.txt', 'r')
dict = {}
for line in tm:
    line = line.strip()
    line = line.translate(None, '!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~')
    line = line.lower()
    list = line.split(' ')

Note: you should use descriptive names. Since this is a list of WORDS, a
much better name would be "words" rather than list. Also, list is the name
of a built-in type, and you may run into trouble when you accidentally
re-use it as a name. Same with using "dict" as you do.

Apart from that, so far so good. For each line, you generate a list of
words. But that's when it goes wrong, because you don't do anything with
the list of words! The next block of code is *outside* the for-loop, so
it only runs once the for-loop is done. So it only sees the last list of
words.

for word in list:

The problem here is that you lost the indentation. You need to indent the
"for word in list" (better: "for word in words") so that it starts level
with the line above it.

    if word in dict:
        count = dict[word]
        count += 1
        dict[word] = count

This bit is fine.

else:
    dict[word] = 1

But this fails for the same reason! You have lost the indentation.

A little-known fact: Python for-loops take an "else" block too! It's a
badly named statement, but sometimes useful. You can write:

for value in values:
    do_something_with(value)
    if condition:
        break  # skip to the end of the for...else
else:
    print "We never reached the break statement"

So by pure accident, you lined up the "else" statement with the for loop,
instead of what you needed:

for line in tm:
    ... blah blah blah
    for word in words:
        if word in word_counts:  # better name than "dict"
            ... blah blah blah
        else:
            ...

for word, count in dict.iteritems():
    print word + ":" + str(count)

And this bit is okay too.
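
Putting the pieces together, a corrected version might look like the sketch below. It is only an illustration, not the original poster's final code, and it makes a few assumptions: it targets Python 3 (where str.translate takes a table built with str.maketrans rather than None), it wraps the logic in a hypothetical function count_words, it reads from an in-memory list sample_lines instead of ask.txt, and it uses split() rather than split(' ') so that empty strings are not counted as words.

```python
import string


def count_words(lines):
    """Count how often each word occurs across an iterable of text lines."""
    # Python 3 equivalent of translate(None, punctuation):
    # a table that deletes every punctuation character.
    strip_punctuation = str.maketrans("", "", string.punctuation)
    word_counts = {}
    for line in lines:
        words = line.strip().translate(strip_punctuation).lower().split()
        # The inner loop is indented so that it runs for every line.
        for word in words:
            if word in word_counts:
                word_counts[word] += 1
            else:
                # This "else" now pairs with the "if", not with a "for".
                word_counts[word] = 1
    return word_counts


# Hypothetical sample data standing in for the contents of ask.txt.
sample_lines = ["The cat sat.", "The dog sat, too!"]
word_counts = count_words(sample_lines)
for word, count in word_counts.items():
    print(word + ":" + str(count))
```

To run it on the real file, something like `with open('ask.txt') as tm: word_counts = count_words(tm)` should work, since a file object yields its lines when iterated.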

Good luck!

Thomas Bach · Dec 19, 2012

Hi,

just as a side-note

for word in list:
    if word in dict:
        count = dict[word]
        count += 1
        dict[word] = count
else:
    dict[word] = 1

Once you have the indentation and names right, you can restate this as

import collections
counter = collections.Counter(words)

in Python 2.7 or as

import collections
counter = collections.defaultdict(int)
for word in words:
    counter[word] += 1

in Python 2.6
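
Since the words arrive one line at a time, a Counter can also be filled incrementally. The following is a small sketch under a few assumptions: Python 3 print syntax, and a made-up list sample_lines in place of the file. Counter.update and Counter.most_common are both part of the standard collections module.

```python
import collections

# Hypothetical sample data standing in for the lines of the file.
sample_lines = ["the cat sat", "the dog sat too"]

counter = collections.Counter()
for line in sample_lines:
    # update() adds the counts of the new words to the existing totals.
    counter.update(line.split())

# most_common(n) returns the n most frequent (word, count) pairs.
for word, count in counter.most_common(2):
    print(word + ":" + str(count))
```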

Regards,
Thomas.

dgcosgrave · Dec 19, 2012

Thanks for the quick reply, Jussi... fixing the indentation solved the problem.

dgcosgrave · Dec 19, 2012

Thanks Steven, I appreciate the great info for future coding. I have changed the names to be more descriptive and corrected the indentation... all works! Cheers

dgcosgrave · Dec 19, 2012

Thanks for your time, Thomas... I'm using 2.7, so great!

Dennis Lee Bieber · Dec 19, 2012

tm =open('ask.txt', 'r')
dict = {}
for line in tm:
    line = line.strip()
    line = line.translate(None, '!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~')
    line = line.lower()
    list = line.split(' ')

You could (though it gets a bit long) combine some of the above...

list = line.strip().translate(
    None,
    "the punctuation set"
    ).lower().split()

# taking advantage of the fact that open ()/[]/{} automatically
# continue onto the next lines
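
One detail worth noting about the combined version: it calls split() with no argument, whereas the original called split(' '). The two behave differently on repeated spaces and on empty lines, as this short sketch shows (written for Python 3; the sample strings are made up for illustration).

```python
line = "the  cat\n".strip()  # two spaces between the words

with_space = line.split(' ')  # splits on every single space
without_arg = line.split()    # splits on runs of whitespace

print(with_space)     # ['the', '', 'cat']
print(without_arg)    # ['the', 'cat']

# An empty line gives one empty "word" with split(' '), none with split().
print("".split(' '))  # ['']
print("".split())     # []
```

With split(' '), the empty string would end up being counted as a word, so split() is probably the better fit for counting words.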

for word in list:

INDENTATION! As coded, you first do the strip/translate/lower/split
on EACH line of the file... THEN you are processing the words in the
LAST line processed in the previous loop.

    if word in dict:
        count = dict[word]
        count += 1
        dict[word] = count
else:
    dict[word] = 1

More indentation -- I suspect you want the else: and the following line
to be indented the same as the if line...

Though the whole block can be simplified to

dict[word] = dict.get(word, 0) + 1
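
As a minimal illustration of that one-liner in context (Python 3 print syntax; the names words and word_counts and the sample data are assumptions, chosen to avoid shadowing the built-in dict):

```python
# Hypothetical sample data; in the real program this comes from the file.
words = ["the", "cat", "the", "end"]

word_counts = {}
for word in words:
    # get() returns 0 for a word not seen yet, so no if/else is needed.
    word_counts[word] = word_counts.get(word, 0) + 1

print(word_counts)  # {'the': 2, 'cat': 1, 'end': 1}
```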
