Here I am again, same old arguments

CJ

Okay, same program, different issue. Thanks to the help that I was
given I was able to complete my program to find variables in a list that
were repeated, and display them once, and how many times they appeared in
the list. And it worked great!

But, being the perfectionist that I am, I wanted to make the proggie
allow any size of list, and not have to be recoded every time. So step
one was to not make the program reliant on the list itself being of X
length all the time.

Well, for some reason, the FOR loop is altering two of my lists. Using
PRINT magic, I was able to narrow down the lines that were causing it.
But the question remains: Why is it doing this? I'm sure there's a simple
answer that I just overlooked in the manual or something.

So without further ado, the code:


#setup variables
grub=[3,25,3,5,3,"a","a","BOB",3,3,45,36,26,25,"a",3,3,3,"bob","BOB",67]
grubrpt=grub
cntro=0
cntrt=0
rpt=0
skipped=0

#set up for variable length of grub
ttllen=len(grub)-1
print "The heck is this for loop doing?"
for point in range(0,ttllen,1):
    print "Here's Grub=",grub
    print "And grubrpt=",grubrpt
    grubrpt[point]="blk"

#Makes sure that there are not multiple prints.
def alrdy_dn(grub,grubrpt):
    if grub[cntro] in grubrpt:
        return grubrpt
    else:
        print grub[cntro],"appears in list",rpt,"times."
        grubrpt[grubrpt.index("blk")]=grub[cntro]
        return grubrpt

#removes display of variables not repeated
def no_rpts(skipped,grubrpt):
    if rpt==0:
        skipped=skipped+1
    else:
        grubrpt=alrdy_dn(grub,grubrpt)
    return skipped

#Main body of code
print "The List is:",grub

while cntro<>len(grub)-1:
    if grub[cntro]==grub[cntrt]:
        rpt=rpt+1
        cntrt=cntrt+1
    else:
        cntrt=cntrt+1
    if cntrt==len(grub):
        skipped=no_rpts(skipped,grubrpt)
        cntro=cntro+1
        cntrt=0
        rpt=-1

print skipped,"list elements are unique."


And the award winning Output:

The heck is this for loop doing?
Here's Grub= [3, 25, 3, 5, 3, 'a', 'a', 'BOB', 3, 3, 45, 36, 26, 25, 'a',
3, 3, 3, 'bob', 'BOB', 67]
And grubrpt= [3, 25, 3, 5, 3, 'a', 'a', 'BOB', 3, 3, 45, 36, 26, 25, 'a',
3, 3, 3, 'bob', 'BOB', 67]
Here's Grub= ['blk', 25, 3, 5, 3, 'a', 'a', 'BOB', 3, 3, 45, 36, 26, 25,
'a', 3, 3, 3, 'bob', 'BOB', 67]
And grubrpt= ['blk', 25, 3, 5, 3, 'a', 'a', 'BOB', 3, 3, 45, 36, 26, 25,
'a', 3, 3, 3, 'bob', 'BOB', 67]
Here's Grub= ['blk', 'blk', 3, 5, 3, 'a', 'a', 'BOB', 3, 3, 45, 36, 26,
25, 'a', 3, 3, 3, 'bob', 'BOB', 67]
And grubrpt= ['blk', 'blk', 3, 5, 3, 'a', 'a', 'BOB', 3, 3, 45, 36, 26,
25, 'a', 3, 3, 3, 'bob', 'BOB', 67]

It goes on like that, I'm not going to put all of it here for obvious
reasons. But, if I take out the whole for loop and the lines relating to
it and statically assign grubrpt as ["blk","blk"...] then the program
runs wonderfully.


From what I understand, you can never have too many functions, so I
tried to make the grub "blk" a function and got the same result. I'm
still nailing them down, so if my functions look a little weird you know
why. Also, I do realize that there is an easier way to do this, I just
created a little project for myself to learn the basics of the language.

Thanks for all the help!
-CJ
 
Fredrik Lundh

CJ said:
Well, for some reason, the FOR loop is altering two of my lists. Using
PRINT magic, I was able to narrow down the lines that were causing it.
But the question remains: Why is it doing this? I'm sure there's a simple
answer that I just overlooked in the manual or something.

here you create a list, and name it "grub":
grub=[3,25,3,5,3,"a","a","BOB",3,3,45,36,26,25,"a",3,3,3,"bob","BOB",67]

here you add a new name for that list:
grubrpt=grub

after this operation, "grubrpt" and "grub" are two names for the
same object, not two distinct objects. in Python, variables are
names, not small "boxes" in memory that contain stuff.

you can add

print id(grub), id(grubrpt)

to see the object identities.
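
A minimal sketch of what that check shows, using a shortened list instead of the full one from the question. It is written with single-argument print calls so that it runs on the Python 2 used in this thread as well as on Python 3.

```python
# Assignment gives an existing object a second name; it never copies.
grub = [3, 25, 3, 5]
grubrpt = grub                    # second name for the same list
print(grubrpt is grub)            # True
print(id(grub) == id(grubrpt))    # True

grubrpt[0] = "blk"                # change made through one name...
print(grub)                       # ...is visible through the other

# A slice copy creates a distinct list object.
grub = [3, 25, 3, 5]
grubcopy = grub[:]
grubcopy[0] = "blk"
print(grub)                       # unchanged
print(grubcopy is grub)           # False
```

Note that after the second assignment to grub, the name grubrpt still refers to the first list, the one that was modified.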

required reading:

http://effbot.org/zone/python-objects.htm
(short version, available in english, french, and czech)

http://starship.python.net/crew/mwh/hacks/objectthink.html
(long version, with ascii art!)

some common patterns:

to copy a list, use

grubrpt = grub[:]

or the copy module. to build a new list from an old one, use

new_list = []
for item in old_list:
    ... do something ...
    new_list.append(item)

for some cases, a list comprehension can be a nicer way to
write that:

new_list = [... item ... for item in old_list ...]

etc.
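
As a hedged illustration of those patterns applied to the list from the question (shortened here), the sketch below fills in the elided parts with made-up expressions; the names by_slice, by_module, blanks and numbers are illustrative only.

```python
import copy

grub = [3, 25, 3, "a", "BOB"]

# Two ways to get a shallow copy: a new outer list holding the same items.
by_slice = grub[:]
by_module = copy.copy(grub)

# Build a new list from an old one: one "blk" placeholder per item,
# which is what the for loop in the question was trying to produce.
blanks = ["blk" for item in grub]

# A comprehension with a condition: keep only the integers.
numbers = [item for item in grub if isinstance(item, int)]

print(blanks)
print(numbers)
print(grub)      # the original list is untouched
```

For lists that contain other lists, copy.deepcopy copies the nested lists as well; the slice and copy.copy only copy the outer list.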

</F>
 
Steven D'Aprano

Okay, same program, different issue. Thanks to the help that I was
given I was able to complete my program to find variables in a list that
were repeated, and display them once, and how many times they appeared in
the list. And it worked great!

But, being the perfectionist that I am, I wanted to make the proggie
allow any size of list, and not have to be recoded every time. So step
one was to not make the program reliant on the list itself being of X
length all the time.

First off -- don't use a for loop with an index as you are doing.
#setup variables
grub=[3,25,3,5,3,"a","a","BOB",3,3,45,36,26,25,"a",3,3,3,"bob","BOB",67]
grubrpt=grub
cntro=0
cntrt=0
rpt=0
skipped=0

You are doing too much manual work! Let Python do the lion's share of the
work for you!
#set up for variable length of grub
ttllen=len(grub)-1

Why are you subtracting one from the length of the list?
print "The heck is this for loop doing?"
for point in range(0,ttllen,1):

Using point as a loop index is generally a bad idea. The result coming
from range is not a point, it is an integer, so why call it a point?

You are also over-specifying the input arguments to range. If the step
size is one, you don't need to specify it -- that's the default. You just
make it harder to read, for no reason. Likewise the initial starting value
of zero. Just use range(ttllen).

This, by the way, will return a list [0, 1, 2, ... , length of list -
TWO] because you already subtracted one from the length.
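
To make the off-by-one concrete, here is a small sketch with a shortened, made-up list. The list() wrapper is only there so the output looks the same on Python 2 and Python 3.

```python
grub = [3, 25, 3, 5, 67]          # five items, valid indexes 0..4
ttllen = len(grub) - 1

short = list(range(0, ttllen, 1))  # stops one index early
full = list(range(len(grub)))      # covers every index

print(short)
print(full)

# The explicit start and step add nothing: both calls give the same result.
print(list(range(0, ttllen, 1)) == list(range(ttllen)))
```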
    print "Here's Grub=",grub
    print "And grubrpt=",grubrpt
    grubrpt[point]="blk"

As others have pointed out, grub and grubrpt are both names for the same
list. Changing one changes the other.

#Makes sure that there are not multiple prints.
def alrdy_dn(grub,grubrpt):
    if grub[cntro] in grubrpt:

Ew!!! Global variables!!!

Bad programmer! No biscuit!!!

*wink*

Global variables are almost always a BAD idea.
        return grubrpt
    else:
        print grub[cntro],"appears in list",rpt,"times."
        grubrpt[grubrpt.index("blk")]=grub[cntro]
        return grubrpt

This is a strange function. What exactly is it meant to do? It
combines user interface (printing the number of times each item appears)
and functionality (counting the number of times each item appears) and
side effects (changing the list), before returning one of the input
arguments again.

At least two of those things (counting the items, and printing the
results) should be separated into different functions for ease of
comprehension.

I'm going to skip the rest of your code, because I don't understand it and
am too lazy, er, I mean busy, to spend the time trying to decipher it.
Especially since the function you are trying to duplicate manually is so
easy to do if you work with Python instead of against it.

def count_item(L, item):
    """Count the number of times item appears in list L."""
    return L.count(item)

Or wait... that's too easy :)

If you want to roll your own, then do it like this:

def count_item(L, item):
    """Count the number of times item appears in list L by reinventing
    the wheel."""
    n = 0
    for obj in L:
        if obj == item:
            n += 1
    return n

Notice that we don't change the list at any time. Why change it? That just
adds complexity to our program and adds extra places to make bugs. Of
which you have many :)

Now you use it like this:

grub=[3,25,3,5,3,"a","a","BOB",3,3,45,36,26,25,"a",3,3,3,"bob","BOB",67]
for item in grub:
    n = count_item(grub, item)
    print item, "appears in list", n, "times."


And you are done.

No, not quite -- my code has a bug in it. You want to print the count for
each *unique* item. Mine prints the count for each item, regardless of
whether it is unique or not. So we need to keep track of which items
have been counted before. Here is one way of doing it:

grub=[3,25,3,5,3,"a","a","BOB",3,3,45,36,26,25,"a",3,3,3,"bob","BOB",67]
already_seen = []
for item in grub:
    if item not in already_seen:
        n = count_item(grub, item)
        print item, "appears in list", n, "times."
        already_seen.append(item)

Notice that rather than *deleting* from a copy of the original list, we
*add* to a new list that started off empty.

Here is another way:

grub=[3,25,3,5,3,"a","a","BOB",3,3,45,36,26,25,"a",3,3,3,"bob","BOB",67]
unique_counts = {} # use a dictionary, not a list
for item in grub:
    n = count_item(grub, item)
    unique_counts[item] = n
for item, n in unique_counts.items():
    print item, "appears in list", n, "times."

The second version has a disadvantage that the objects in your list can't
be lists themselves, because lists can't be keys of dictionaries.
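
A short sketch of that limitation and one possible workaround. The tuple conversion is an assumption about what you would want, not something from the code above, and the "except ... as" form needs Python 2.6 or later.

```python
counts = {}
failed = False

try:
    counts[[1, 2]] = 1            # a list is mutable, so it is not hashable
except TypeError as err:
    failed = True
    print("list as key failed: %s" % err)

# A tuple with the same contents is immutable and works as a key.
counts[tuple([1, 2])] = 1
print(counts)
```

This only helps when the inner items are themselves hashable; a list nested inside the tuple would fail in the same way.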

It also has what appears to be a disadvantage that it stores the item
count in the dictionary, even if that count has already been stored.
Wasted work! But wait... it might not be wasted. It takes work to test if
your item has already been seen. That work might be more than it would
take to just blindly store the result even if it is there.

Think of this real world equivalent. You tried to send a fax to somebody,
but you aren't sure if it went through correctly. What's the cheapest way
to make sure? You could call them up on the phone and ask, but that costs
money and time. It could be cheaper and quicker to just re-fax the
document.
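
Both versions above call count_item once per element, and each call scans the whole list again. If that ever matters, a dictionary can also do the tallying in a single pass. The sketch below is one way to write it; count_all is a hypothetical helper name, not something from the code above.

```python
def count_all(L):
    """Return a dict mapping each item in L to how often it appears."""
    counts = {}
    for item in L:
        # get() returns 0 the first time an item is seen
        counts[item] = counts.get(item, 0) + 1
    return counts

grub = [3,25,3,5,3,"a","a","BOB",3,3,45,36,26,25,"a",3,3,3,"bob","BOB",67]
counts = count_all(grub)
for item in counts:
    print("%s appears in list %d times." % (item, counts[item]))
```

The same restriction applies as for the dictionary version above: the items must be usable as dictionary keys.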
CJ said:
Also, I do realize that there is an easier way to do this, I just
created a little project for myself to learn the basics of the language.

This is good, but keep in mind that "the basics" of Python include tools
to do things you are trying to do by hand.

For instance, if you find yourself writing a loop like this:

counter = 0
while counter < some_value:
    do_something_with(counter)
    counter = counter + 1

you almost always want to change that to:

for counter in range(some_value):
    do_something_with(counter)

If you find a loop like this:

for indx in range(len(some_list)):
    obj = some_list[indx]
    do_something_with(obj)

you almost always want to write it like this:

for obj in some_list:
    do_something_with(obj)

Don't fight the language -- you aren't programming in C or Java now, this
is Python and there is usually an easier way to do something.
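
For the cases where the position really is needed, for example to replace items in place, the built-in enumerate (available since Python 2.3) hands out the index and the item together. A small sketch with made-up data:

```python
some_list = ["spam", "ping", "parrot"]

# Read-only use: index and item arrive as a pair.
for indx, obj in enumerate(some_list):
    print("%d: %s" % (indx, obj))

# In-place update: the index is needed to assign back into the list.
for indx, obj in enumerate(some_list):
    some_list[indx] = obj.upper()

print(some_list)
```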

*wink*

Hope this is of use to you,
 
CJ

Wow, thanks a lot. I pretty much (due to my own desire to get the program to )(@#T(=!!! work and
be done with it) just turned the list into a function that returns a list that isn't attached to
anything but the function itself, but I've taken the advice to heart.

Most of what you posted makes sense, and is in fact easier than what I was doing, but I have
three questions:

1) Why no global variables? I'm taking your word for it that they're bad. Far be it from me to
argue with you, but why are they bad ideas to begin with? Most of the languages I've used up to
this point have been reliant on globals, so I'm not entirely sure why they shouldn't be used.

2) Why no for loop with an index? Again, far be it from me to argue, but it seemed perfect for
my program.

3) Where do I find a command list, with syntax and all that fun stuff for Python? I've explored
the python site to no end, but I can't seem to find a list.


Again, thanks to everyone who put my crappy noob proggie through the blender :D I really did
learn a lot. Like how my 60 line program got turned into a 15 line code snippet. (I'm not being
sarcastic, really, thanks)




-----
Wait a minute. I can use my PSP to play GAMES?!?





 
Steven D'Aprano

1) Why no global variables? I'm taking your word for it that they're bad. Far be it from me to
argue with you, but why are they bad ideas to begin with? Most of the languages I've used up to
this point have been reliant on globals, so I'm not entirely sure why they shouldn't be used.

Global variables aren't *entirely* bad. I use them myself, sometimes for
constants (well, pseudo-constants -- Python doesn't enforce constants) and
short, quick 'n' dirty throw away code.

But in general, as your code gets bigger and more complicated, using
global variables gets more dangerous, unreliable, harder to debug, and
generally a bad idea. Let me show you some examples.

Suppose you have some code like this:

# set a global variable
gParrot = 1
# do some work with it
get_shrubbery()
eat_after_dinner_mint()
make_machine_go_ping()
pine_for_the_fjords()
print gParrot

(The names come from Monty Python, as is traditional in Python.)

You have four functions in that piece of code. You expect that after
the four functions run, the code should print 2, but instead it prints
3. Why? Somehow, gParrot is getting set to the wrong value.

Which of those four functions is to blame? Which ones use the global
variable gParrot? You can't tell just by looking. Which ones change the
variable? Again, you can't tell. The only way to find out is to read the
code. In a big program, there might be thousands of functions, split over
dozens of modules. Just going backwards and forwards looking up the
functions is hard work.
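
Here is a runnable sketch of that situation. The function bodies are invented for illustration (the code above only shows the calls), and the sketch assumes the bug is a hidden increment in make_machine_go_ping.

```python
# set a global variable
gParrot = 1

def get_shrubbery():
    return "shrubbery"            # does not touch gParrot

def eat_after_dinner_mint():
    global gParrot
    gParrot = gParrot + 1         # the one change we meant to make

def make_machine_go_ping():
    global gParrot
    gParrot = gParrot + 1         # hypothetical bug: a second, hidden change
    return True

def pine_for_the_fjords():
    return gParrot                # reads the global, does not change it

get_shrubbery()
eat_after_dinner_mint()
make_machine_go_ping()
pine_for_the_fjords()
print(gParrot)                    # prints 3, not the expected 2
```

Nothing in the four call lines hints at which function is responsible; you have to open every function body to find the stray "global gParrot".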

Now let's re-write that code properly:

# set a variable
parrot = 1
# do some work with it
get_shrubbery(parrot)
parrot = eat_after_dinner_mint()
make_machine_go_ping()
parrot = pine_for_the_fjords()
print parrot

Now it is easier to see what is going on. get_shrubbery uses the value of
parrot, but can't change it. Or rather, if it changes parrot, that change
is local to the function, and doesn't affect anything outside of that
function. So it isn't responsible for the bug.

The machine that goes ping doesn't even use the value of parrot, and it
certainly doesn't change it. So it can't be responsible for the bug.

The value of parrot gets changed in only two places, once by
eat_after_dinner_mint, and the other by pine_for_the_fjords. So you have
halved the number of places that you need to look for the bug.
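
The rewritten calls can be tried out with the sketch below. Again the function bodies and return values are made up for illustration; only the call pattern comes from the code above.

```python
def get_shrubbery(parrot):
    parrot = parrot + 10          # rebinds the local name only
    return "shrubbery"

def eat_after_dinner_mint():
    return 2                      # hypothetical new value for parrot

def make_machine_go_ping():
    return True                   # never sees parrot at all

def pine_for_the_fjords():
    return 3                      # hypothetical source of the unexpected 3

# set a variable
parrot = 1
get_shrubbery(parrot)
print(parrot)                     # still 1: the function could not rebind it
parrot = eat_after_dinner_mint()
make_machine_go_ping()
print(parrot)                     # 2
parrot = pine_for_the_fjords()
print(parrot)                     # 3
```

One caveat, which ties back to the start of this thread: this protection applies to rebinding a name. If parrot were a list and the function changed it in place (for example with parrot[0] = "blk"), the caller would see the change, because both names refer to the same object.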

What else is wrong with globals?

(1) You can't be sure what a function does without reading and
understanding every bit of code. Suppose you have a function that is
supposed to make the machine go ping, and return True if the machine
pinged and False if it didn't. But you don't know if it changes any global
variables -- that is what we call a "side-effect". Like side-effects in
medicine, it makes it very hard to tell what a function will do.

(2) Global variables allow indiscriminate access. You can't prevent other
functions from messing them up. If you are writing a small program on your
own, maybe you can trust yourself not to accidentally mangle the global
variable. But if you are working on a big project with ten other
programmers, and they all are writing code that reads and writes to your
global variables, can you really trust that none of them will put the
wrong value there?

(3) Globals make it really hard to change your code. You have a global
variable and you want to change what it stands for. But you can't, because
all these dozens of other functions rely on it. So you end up leaving the
old global there, and inventing a new one, and now you have twice as many
places that bugs can be hiding because every function can potentially mess
up two globals instead of one.

(4) It is easy to shadow globals accidentally, which is a source of bugs. You
have a global called "parrot", but somewhere in a function you create a
local variable also called "parrot". Now you have lost access to the
global. This can be a source of many puzzling bugs.

(5) Dependency: global variables mean that different functions and modules
depend on each other in ways that are very hard to control. This means
that any time you modify a function that uses globals, you could
potentially be breaking *any other function*, even if the change you made
to the first function is not a bug.

(6) Error propagation. A bug in one function will propagate to other
functions. This can mean that you detect the bug in a piece of code a
long, long way away from where the bug was introduced.
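
Point (4) is the easiest of these to demonstrate. The sketch below uses hypothetical function names and shows a local name hiding a global one, plus the error Python raises when a function reads a name before its own local assignment to it.

```python
parrot = "alive"

def shadowing():
    parrot = "pining"             # creates a new local; the global is untouched
    return parrot

def broken():
    value = parrot                # fails: the assignment below makes parrot local
    parrot = "stunned"
    return value

print(shadowing())                # the local value
print(parrot)                     # the global value, unchanged

try:
    broken()
except UnboundLocalError:
    print("parrot was read before the local assignment")
```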


2) Why no for loop with an index? Again, far be it from me to argue, but it seemed perfect for
my program.

Why do work that you don't have to?

You are writing code like this:

myList = ["spam", "ping", "parrot", "fjords"]
for indx in range(len(myList)):
    print myList[indx],

It prints:
spam ping parrot fjords

But look at how much work you have to do: first you count the length of
the list with len(myList), then you create a list of integers between 0
and that length with range, then for each of those integers, you have to
look up the item in myList. To understand the code, you have to work
through those three things in your head.

Here is a simpler way of doing it:

myList = ["spam", "ping", "parrot", "fjords"]
for item in myList:
    print item,

Translate that into English: For each item in myList, print the item. The
code practically explains itself, you don't have to do all these
intermediate calculations, and there are fewer places for bugs to hide.


3) Where do I find a command list, with syntax and all that fun stuff
for Python? I've explored the python site to no end, but I can't seem to
find a list.

Have you looked here?

http://docs.python.org/index.html
 
Tim Roberts

Steven D'Aprano said:
Global variables aren't *entirely* bad. I use them myself, sometimes for
constants (well, pseudo-constants -- Python doesn't enforce constants) and
short, quick 'n' dirty throw away code.

But in general, as your code gets bigger and more complicated, using
global variables gets more dangerous, unreliable, harder to debug, and
generally a bad idea. Let me show you some examples.

This gets my vote for post-of-the-week. If there were a bulletin board
that was required reading for beginning programmers, this would definitely
need to be tacked up to it.

Nice work.
 
