learning with python question (HtTLaPP)

umpsumps

Hello all,

I've been trying to teach myself Python from "How to Think Like a
Python Programmer" and have been trying to write a script that checks
'words.txt' for the letters given as a parameter. The problem is that I
can only get results for the exact sequence of the parameter 'letters'.
I'll spare you all the different ways I've tried to search for
specific letters, but they are all generally:

for line in fin:
    for linechar in line:
        for ch in letters:

or the "for linechar in line:" and "for ch in letters:" get switched..
I'm getting really frustrated to say the least.

What alternative method could I use that isn't too advanced? Any
tips/suggestions on the code itself would be greatly appreciated, and
tips for learning in general.

Here is the code that returns a certain sequence:

fin = open('words.txt')
words = ""
index = 0
count = 0
for line in fin:
    index += 1
    if letters in line.strip():
        count += 1
        words = line.strip() + '\n' + words

print words
print index, 'lines searched..', count, letters, 'words present'

Thank you in advance.
 
Eric Wertman

> Python Programmer" and have been trying to write a script that checks
> 'words.txt' for parameters (letters) given. The problem is that I
> can only get results for the exact sequence of parameter 'letters'.

The "re" module comes to mind:

import re

text = open('words.txt','r').read()
letters = 'sequence'
results = re.findall(letters,text)

result_count = len(results)

# one word per line:
for result in results :
    print result

# one line
print ' '.join(results)

Of course, you may need to invest a little time in regular expression
syntax to get exactly what you want, but I think you'll find that's
not wasted effort, as regular expressions are pretty standard and are
used in a lot of other places.
 
umpsumps

Eric,

Thank you for helping.

Is the way I wrote the function inherently wrong? What I wrote
returns the sequence, however I'm trying to make the output match for
the letters in the string entered, not necessarily the string
sequence. For example if I search words.txt with my function for
'uzi' I get this:
gauziest
gauzier
fuzing
fuzils
fuzil
frouziest
frouzier
defuzing

113809 lines searched.. 8 uzi words present


Only the sequence 'uzi' shows up. I don't get words like 'unzip' or
'Zurich'. I've only barely started on invocation, and maybe writing
something like I'm describing is above the level I'm currently at.
 
Eric Wertman

> Is the way I wrote the function inherently wrong? What I wrote

I would not say that. I think a lot of people probably start off like
that with python. You'll find in most cases that manually keeping
counters isn't necessary. If you really want to learn python though,
I would suggest using built in functions and libraries as much as
possible, as that's where the real power comes from (IMO).

> returns the sequence, however I'm trying to make the output match for
> the letters in the string entered, not necessarily the string
> sequence.
> Only the sequence 'uzi' shows up. I don't get words like 'unzip' or
> 'Zurich'. I've only barely started on invocation, and maybe writing
> something like I'm describing is above the level I'm currently at.

This would be a more difficult approach.. Where you are doing the
comparison step:

if letters in line.strip():

It's trying to match the exact string "uzi", not any of the individual
letters. You would need to look for each letter independently and
then make sure they were in the right order to match the other words.
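
To make the distinction concrete, here is a small sketch of three different tests: the exact substring, the letters in any order, and the letters in the given order. It uses Python 3 syntax, and the function names and sample words are made up for illustration; they are not from the book or from the code posted so far.

```python
def has_sequence(word, letters):
    """True if letters appears in word as one contiguous substring."""
    return letters in word


def has_all_letters(word, letters):
    """True if every letter occurs somewhere in word, in any order."""
    for ch in letters:
        if ch not in word:
            return False
    return True


def has_letters_in_order(word, letters):
    """True if the letters occur in word in the given order, gaps allowed."""
    pos = 0
    for ch in letters:
        pos = word.find(ch, pos)   # search from just past the previous letter
        if pos == -1:
            return False
        pos += 1                   # the next letter must come later in the word
    return True


for word in ['fuzil', 'unzip', 'zurich', 'apple']:
    print(word,
          has_sequence(word, 'uzi'),
          has_all_letters(word, 'uzi'),
          has_letters_in_order(word, 'uzi'))
```

With these sample words, 'fuzil' passes all three tests, 'unzip' passes the last two, and 'zurich' passes only the any-order test. All of the tests are case sensitive, so a capitalized 'Zurich' would need something like word.lower() first.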
 
umpsumps

This is what I'm stuck on. I keep doing things like:

for line in fin:
    for ch in letters:
        if ch not in line:


I've tried

for ch in letters:
    for line in fin:

too..

Should I use a while statement? What's the best way to compare a
group of letters to a line?
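
One thing that may explain why the second ordering gives nothing useful: a file object can only be read through once. If "for line in fin:" is the inner loop, it uses up the whole file while handling the first letter and has nothing left for the later letters. A minimal sketch in Python 3 syntax, using io.StringIO as a stand-in for the open file (the sample words are made up):

```python
import io

# io.StringIO behaves like an open text file; it stands in for words.txt here.
fin = io.StringIO("fuzil\nunzip\napple\n")

lines_seen = []
for ch in 'uzi':
    n = 0
    for line in fin:    # consumes the whole "file" on the first pass
        n += 1
    lines_seen.append(n)

print(lines_seen)       # prints [3, 0, 0]
```

Keeping the file as the outer loop, or calling fin.seek(0) before reading it again, avoids the problem. A while statement is not needed for this.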
 
umpsumps

ok.. I finally made something that works.. Please let me know what you
think:

fin = open('words.txt')
count = 0
rescount = 0 # count the number of results
results = "" # there are words that contain the letters
for line in fin:
    needs = 0
    x = str(line.strip())
    for ch in letters:
        if ch not in x:
            pass
        else:
            needs = needs + 1
    if needs == len(letters):
        rescount += 1
        results = results + '\n' + x
    count += 1
print count, 'lines searched'
print results, '\n'
print 'result count is: ', rescount
 
Eric Wertman

That's pretty much it.. I'm guessing you are assuming your file has
one word per line? I took a shot at it, without using the regex
module:

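file = open('spyware')  # any text file; a line may hold several words

my_string = 'uzi'
words = []

for line in file :
    chunks = line.strip().split()
    for chunk in chunks :
        x = 0
        for char in my_string :
            # look for the next occurrence of char at or after position x
            x = chunk.find(char,x)
            if x < 0 :
                break       # letter missing or out of order: skip this chunk
            x += 1          # the next letter must come after this one
        else :
            words.append(chunk)

print '\n'.join(words)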


or with the re module:

import re

text = open('words.txt').read()
pattern = r'\S*u\S*z\S*i\S*'  # raw string, so the backslashes reach the regex engine unchanged
stuff = re.findall(pattern,text)
count = len(stuff)

print "Found %d words :" % (count)
print "\n".join(stuff)
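
For anyone who wants to try this kind of pattern without a word file, here is a small sketch in Python 3 syntax that builds the pattern from the letters and runs it on a made-up string. The helper name build_pattern and the sample text are assumptions for illustration only. Note that a pattern like this finds the letters only in the order given.

```python
import re


def build_pattern(letters):
    r"""Build a pattern such as \S*u\S*z\S*i\S* from the string 'uzi'."""
    # re.escape keeps characters like '.' or '*' from being read as regex syntax
    return r'\S*' + r'\S*'.join(re.escape(ch) for ch in letters) + r'\S*'


text = "fuzil unzip zurich\napple gauzier"
found = re.findall(build_pattern('uzi'), text)
print(found)    # prints ['fuzil', 'unzip', 'gauzier']
```

'zurich' is not matched because its 'z' comes before its 'u'. Since \S never matches whitespace, each match stays within a single word.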
 
Gabriel Genellina

That's pretty good. Some improvements:

- The "natural" way to collect the results is using a list, appending words to it. Later you can print it one word per line or in any other format you want. Also, we don't need the "rescount" variable: it's just the length of the list.

> needs = 0
> for ch in letters:
>     if ch not in x:
>         pass
>     else:
>         needs = needs + 1
> if needs == len(letters):

The overall idea is to test whether ALL letters are in the word `x`, ok? So as soon as we find a letter that isn't in the word, we are sure the test failed and we can break out of the loop. And if we get up to the last step, that means that all the letters were in the word (else we would not have got so far). So we don't have to count the letters; instead, we can use the "else" clause of the for loop (it means "the loop was exhausted completely".)

- I don't like the names "x" nor "ch"; I'm using "word" and "letter" instead. This is the revised version:

def lines(letters):
    fin = open('words.txt')
    count = 0
    results = [] # there are words that contain the letters
    for line in fin:
        word = line.strip()
        for letter in letters:
            if letter not in word:
                break
        else:
            results.append(word)
        count += 1
    print count, 'lines searched'
    print '\n'.join(results), '\n'
    print 'result count is: ', len(results)

That "\n".join(...) means "concatenate all the items in the list using \n as a separator between items" and it's a pretty common idiom in Python.
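
For readers who want to experiment with the for/else idea and the join idiom without the word file, here is a small self-contained sketch in Python 3 syntax. The function name words_with_letters and the sample list are made up for illustration; the logic is meant to mirror the revised version above, not replace it.

```python
def words_with_letters(words, letters):
    """Return the words that contain every letter in letters, in any order."""
    results = []
    for word in words:
        for letter in letters:
            if letter not in word:
                break               # one missing letter is enough to reject the word
        else:
            results.append(word)    # runs only if the inner loop never hit break
    return results


sample = ['fuzil', 'unzip', 'zurich', 'apple']
matches = words_with_letters(sample, 'uzi')
print('\n'.join(matches))           # one word per line
print('result count is:', len(matches))
```

Returning the list instead of printing inside the function leaves the choice of output format to the caller.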
 
