Question of Python second loop break and indexes

lilin Yi

final_1 is a list of identifiers. For each identifier I need to find the
corresponding record (four lines) in x (x is the file) and write those
four lines to a new file.

The identifiers appear in the same order in both, so after I find an
identifier in x, I want the next search to start from the following index
in x, which should save time. That is, when the if condition is
satisfied, the code should break out of the second while loop and move on
to the next identifier in final_1, and j should start from the previous
position instead of from the beginning.

When I run the code it takes far too long (more than one hour) and gives
the wrong result. Could you help me improve the code?

i=0
offset_1=0

while i < len(final_1):
    j = offset_1
    while j < len(x1):
        if final_1[i] == x1[j]:
            new_file.write(x1[j])
            new_file.write(x1[j+1])
            new_file.write(x1[j+2])
            new_file.write(x1[j+3])
            offset_1 = j+4
            quit_loop="True"
        if quit_loop == "True":break
        else: j=j +1
    i=i+1
 
Ulrich Eckhardt

On 09.05.2012 10:36, lilin Yi wrote:
final_1 is a list of identifiers. For each identifier I need to find the
corresponding record (four lines) in x (x is the file) and write those
four lines to a new file.

The identifiers appear in the same order in both, so after I find an
identifier in x, I want the next search to start from the following index
in x, which should save time. That is, when the if condition is
satisfied, the code should break out of the second while loop and move on
to the next identifier in final_1, and j should start from the previous
position instead of from the beginning.

When I run the code it takes far too long (more than one hour) and gives
the wrong result. Could you help me improve the code?

If the code takes too much time and gives the wrong results, you must
fix and improve it. In order to do that, the first thing you should do
is get familiar with "test-driven development" and Python's unittest
library. You can start by fixing the code, but chances are that you will
break it again trying to make it fast then. Having tests that tell you
after each step that the code still works correctly is invaluable.
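
As a minimal sketch of what such a test could look like: the function name extract_records and the sample data are hypothetical (they are not from the original post), and the function body is only one possible corrected version of the two loops, returning the lines instead of writing them so that they are easy to check.

```python
import unittest


def extract_records(identifiers, lines):
    """Hypothetical helper: collect each identifier line plus the three lines after it."""
    result = []
    offset = 0  # where the next search starts; relies on both inputs sharing one order
    for ident in identifiers:
        for j in range(offset, len(lines)):
            if lines[j] == ident:
                result.extend(lines[j:j + 4])
                offset = j + 4
                break  # leave the inner loop as soon as the record is found
    return result


class ExtractRecordsTest(unittest.TestCase):
    # Made-up sample data: three records of four lines each.
    LINES = ["@a\n", "1\n", "2\n", "3\n",
             "@b\n", "4\n", "5\n", "6\n",
             "@c\n", "7\n", "8\n", "9\n"]

    def test_selects_only_requested_records(self):
        got = extract_records(["@a\n", "@c\n"], self.LINES)
        self.assertEqual(got, self.LINES[0:4] + self.LINES[8:12])

    def test_missing_identifier_is_skipped(self):
        got = extract_records(["@zzz\n", "@b\n"], self.LINES)
        self.assertEqual(got, self.LINES[4:8])
```

Saved in a file, tests like these can be run with "python -m unittest" after every change, so a speed optimization that breaks the output is noticed immediately.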

Some more comments below...
i=0
offset_1=0

while i < len(final_1):
    j = offset_1
    while j < len(x1):
        if final_1[i] == x1[j]:
            new_file.write(x1[j])
            new_file.write(x1[j+1])
            new_file.write(x1[j+2])
            new_file.write(x1[j+3])
            offset_1 = j+4
            quit_loop="True"
        if quit_loop == "True":break
        else: j=j +1
    i=i+1


Just looking at the code, there are a few things to note:
1. You are iterating "i" from zero to len(final_1)-1. The pythonic way
to code this is using "for i in range(len(final_1)):...". However, since
you only use the index "i" to look up an element inside the "final_1"
sequence, the proper way is "for f in final_1:..."
2. Instead of writing four lines separately, you could write them in a
loop: "for x in x1[j:j+4]: new_file.write(x)".
3. "x1" is a list, right? In that case, there is a member function
"index()" that searches for an element and accepts an optional start
position.
4. The "quit_loop" is useless, and you probably are getting wrong
results because you don't reset this value. If you use "break" at the
place where you assign "True" to it, you will probably get what you
want. Also, Python has real boolean values, "True" and "False", so you
don't have to use strings.
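
Taken together, the four notes above might look roughly like the following sketch. The function name copy_records and the sample data are made up for illustration, and io.StringIO only stands in for the real output file; it assumes that "x1" is a list of lines and that both sequences are in the same order.

```python
import io


def copy_records(identifiers, lines, out):
    """Write each identifier line and the three lines after it to `out`."""
    start = 0  # search position; it only moves forward
    for ident in identifiers:  # note 1: iterate over the elements directly
        try:
            pos = lines.index(ident, start)  # note 3: index() with a start position
        except ValueError:
            continue  # not found: keep the old start, try the next identifier
        out.writelines(lines[pos:pos + 4])  # note 2: one slice instead of four writes
        start = pos + 4  # note 4: no flag variable is needed


# Made-up sample data standing in for the real file contents.
x1 = ["id1\n", "a\n", "b\n", "c\n", "id2\n", "d\n", "e\n", "f\n"]
final_1 = ["id2\n"]
new_file = io.StringIO()  # stand-in for the real output file
copy_records(final_1, x1, new_file)
print(new_file.getvalue(), end="")
```

One caveat of this sketch: index() compares against every line, so it would also match a non-identifier line that happens to equal an identifier.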


Concerning the speed, you can probably improve it by not storing the
lines of the input file in "x1", but rather creating a dictionary that
maps each input value to its corresponding four lines:

content = open(...).readlines()
d = {}
for i in range(0, len(content), 4):
    d[content[i]] = tuple(content[i:i+4])

Then, drop the "offset_1" (at least do that until you have the code
working correctly), as it doesn't work with a dictionary and the
dictionary will probably be faster anyway.

The whole loop above then becomes:

for idf in final_1:
    for l in d.get(idf, ()):  # the () default skips identifiers missing from d
        new_file.write(l)

;)
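
For reference, here is a self-contained sketch of the dictionary approach that can be run without the real files. The name build_index and the sample data are hypothetical, io.StringIO stands in for the output file, and the sketch assumes the input consists of nothing but four-line records whose first line is the identifier.

```python
import io


def build_index(content):
    """Map the first line of each four-line record to a tuple of its four lines."""
    return {content[i]: tuple(content[i:i + 4])
            for i in range(0, len(content), 4)}


# Made-up sample data standing in for open(...).readlines().
content = ["id1\n", "a\n", "b\n", "c\n",
           "id2\n", "d\n", "e\n", "f\n"]
d = build_index(content)

out = io.StringIO()  # stand-in for new_file
for idf in ["id2\n", "missing\n"]:
    # An unknown identifier yields an empty tuple, so nothing is written for it.
    out.writelines(d.get(idf, ()))
```

Each lookup in the dictionary takes roughly constant time on average, so the total work grows with the number of identifiers rather than with identifiers times file length. If the same identifier can occur more than once in the file, only its last record is kept.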

I hope I gave you a few ideas, good luck!


Uli
 
