Passing by reference

MartinRinehart

Is the following correct?

x = "some string"

x is a reference to "some string"

foo(x)

Reference is passed to function.

In foo:
x += " change"

Strings are immutable, so x in foo() now points to a different string
than x outside foo().
Right?

Back outside foo.

x = ["some string"]

x is a reference to a list whose first element is a reference to a
string.

foo(x)

Within foo:

x[0] += " other"

Another string is created, the first element of x is modified to point
to the new string and back outside foo(), x[0] will point to the new
string.

Right?
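
Both cases in the question can be checked by running them. Below is a minimal sketch, written for Python 3; the function names rebind_string and mutate_list are illustrative and do not come from the thread.

```python
def rebind_string(s):
    # += on an immutable str builds a new string and rebinds only the
    # local name s; the caller's name is untouched
    s += " change"
    return s


def mutate_list(items):
    # Item assignment changes the list object itself, and the caller
    # holds a reference to that same list
    items[0] += " other"


x = "some string"
changed = rebind_string(x)
print(x)        # some string
print(changed)  # some string change

y = ["some string"]
mutate_list(y)
print(y[0])     # some string other
```

The caller's string is unchanged, while the caller's list shows the new first element.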
 
Ben Finney

| Is the following correct?

[lots of references to "references"]

All good so far.

| x[0] += " other"
|
| Another string is created, the first element of x is modified to point
| to the new string and back outside foo(), x[0] will point to the new
| string.

Change these to talk about "references" again and it'll be true also:

"Another string is created, the first element of x now refers to
the new string and back outside foo(), x is still a reference to
the same list (so its first element is a reference to the same
string)."

Right. In Python, all names, and all elements of container objects,
are references to the corresponding objects. Python has no concept of
"pointers" in the style of C-like languages.
 
John Machin

| Is the following correct?
|
| x = "some string"
|
| x is a reference to "some string"
|
| foo(x)
|
| Reference is passed to function.
|
| In foo:
| x += " change"
|
| Strings are immutable, so x in foo() now points to a different string
| than x outside foo().
| Right?
|
| Back outside foo.
|
| x = ["some string"]
|
| x is a reference to a list whose first element is a reference to a
| string.
|
| foo(x)
|
| Within foo:
|
| x[0] += " other"
|
| Another string is created, the first element of x is modified to point

Somewhat colloquial/abbreviated. x is a reference. It does not have
elements. You mean "... the first element of the list to which x
refers is modified ...".

| to the new string and back outside foo(), x[0] will point to the new
| string.
|
| Right?

Close enough.
 
MartinRinehart

.... the first element of the list to which x refers is a reference to
the new string and back outside foo, the first element of the list to
which x refers will be a reference to the new string.

Right?
 
Michael Sparks

| ... the first element of the list to which x refers is a reference to
| the new string and back outside foo, the first element of the list to
| which x refers will be a reference to the new string.

I'd rephrase that as:
* Both the global context and the inside of foo see the same list
* They can therefore both update the list
* If a new string is put in the first element of the list, they can
both see the same new string.

You know you can get Python to answer your question - yes? Might be slightly
more illuminating than twisting round English... :)

OK, you're passing in a string in a list. You have 2 obvious ways of doing
that - either as an argument:

def foo(y):
    y[0] += " other"
    print id(y[0])

.... or as a global: (which of course you wouldn't do :)

def bar():
    global x
    x[0] += " another"
    print id(x[0])

So let's see what happens.
>>> x = ["some string"]  # create container with string
>>> x[0]  # Check that looks good & it does
'some string'
>>> id(x[0])  # What's the id of that string??
3082578144L
>>> foo(x)  # OK, foo thinks the new string has the following id
3082534160
>>> x[0]  # Yep, our x[0] has updated, as expected.
'some string other'
>>> id(x[0])  # Not only that the string has the same id.
3082534160L
>>> bar()  # Update the global var, next line is new id
3082543416
>>> x[0]  # Check the value's updated as expected
'some string other another'
>>> id(x[0])  # Note that the id is the same as the output from bar
3082543416L

Does that perhaps answer your question more precisely ?
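
The same experiment can be repeated without comparing printed ids by eye. The sketch below is a Python 3 variation, not Michael's exact code: here foo returns the object it sees instead of printing its id, and the name original is added only to keep the old string alive for the comparison.

```python
def foo(y):
    y[0] += " other"
    return y[0]          # hand back the object foo sees in slot 0


x = ["some string"]
original = x[0]          # keep a reference to the old string
seen_inside = foo(x)

# The caller and foo see the very same new string object
same_after_call = x[0] is seen_inside
# and that object is not the string the list started with
replaced = x[0] is not original

print(same_after_call, replaced)  # True True
print(original)                   # some string
```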


Michael.
 
Terry Reedy

| Is the following correct?
|
| x = "some string"
|
| x is a reference to "some string"

x is a name bound to a string object with value 'some string'.
Some people find it useful to call that a 'reference', as you seem to have.
Others get confused by that viewpoint. It depends on exactly what one means
by 'reference'.

| foo(x)
|
| Reference is passed to function.

The first parameter name of foo gets bound to the object referred to by
'x'.
Calling that 'passing by reference' sometimes misleads people as to how
Python behaves.

| In foo:
| x += " change"
|
| Strings are immutable, so x in foo() now points to a different string
| than x outside foo().
| Right?

A function local name x has no particular relationship to a global name
spelled the same, except to confuse things. Best to avoid when possible.

The effect of that statement would be the same outside of the function as
well, pretty much for the reason given. In general, 'y op= x' is the same
as 'y = y op x' except for any side-effects of expression y. Lists are an
exception.
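
To illustrate the list exception: for a list, += extends the existing object in place, whereas y = y + x builds a new list and rebinds the name. A short Python 3 sketch follows; the variable names are made up for the illustration.

```python
numbers = [1, 2]
alias = numbers
numbers += [3]             # in place: the one shared list grows
print(alias)               # [1, 2, 3]
print(alias is numbers)    # True

other = [1, 2]
other_alias = other
other = other + [3]        # new list; only the name other is rebound
print(other_alias)         # [1, 2]
print(other_alias is other)  # False

text = "ab"
text_alias = text
text += "c"                # str is immutable: text now names a new string
print(text_alias)          # ab
```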

tjr
 
Sion Arrowsmith

Michael Sparks said:
| def bar():
|     global x
|     x[0] += " another"
|     print id(x[0])

.... and for bonus marks, explain why the "global x" in this function
is not required.
 
MartinRinehart

Sion said:
| Michael Sparks said:
| | def bar():
| |     global x
| |     x[0] += " another"
| |     print id(x[0])
|
| ... and for bonus marks, explain why the "global x" in this function
| is not required.

Because x does not appear as an LHS in bar(), just about the first
thing I learned here.
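
Put a little more precisely: x[0] does sit on the left of +=, but the bare name x is never rebound inside bar(), only looked up. A rough Python 3 sketch of the difference, run at module level; the names count, mutate_item, rebind_without_global and rebind_with_global are hypothetical.

```python
x = ["some string"]
count = 0


def mutate_item():
    # No "global x" needed: x is only looked up, then the list it
    # refers to is modified
    x[0] += " another"


def rebind_without_global():
    # Assigning to count makes it local to this function, so reading
    # it first fails with UnboundLocalError
    try:
        count += 1
    except UnboundLocalError:
        return False
    return True


def rebind_with_global():
    global count           # required, because the name itself is rebound
    count += 1


mutate_item()
worked = rebind_without_global()
rebind_with_global()
print(x[0])    # some string another
print(worked)  # False
print(count)   # 1
```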

More seriously, I can and do use lots of globals. In the tokenizer I'm
writing, for example, all the token types(COMMENT_EOL = 0,
CONSTANT_INTEGER = 1, ...) are global constants. The text to be
tokenized is a global variable. (Actually, the text is unchanging once
the Tok object is created, so this "variable" may be another
constant.) Passing global constants to functions is a case of CPU
abuse.

Structured purists gave globals a bad rap, years ago. Time to stick up
for them. They're good citizens. Don't blame them if some dumb coder
abuses them. It's not their fault.
 
Bruno Desthuilliers

(e-mail address removed) wrote:
| Sion said:
| | Michael Sparks said:
| | | def bar():
| | |     global x
| | |     x[0] += " another"
| | |     print id(x[0])
| |
| | ... and for bonus marks, explain why the "global x" in this function
| | is not required.
|
| Because x does not appear as an LHS in bar(), just about the first
| thing I learned here.
|
| More seriously, I can and do use lots of globals.

We all do, FWIW - since everything is name/object binding, all the
classes, functions, modules etc defined or imported in a module are,
technically, globals (for the Python definition of 'global').

| In the tokenizer I'm
| writing, for example, all the token types(COMMENT_EOL = 0,
| CONSTANT_INTEGER = 1, ...) are global constants.

Technically, they are not even constants !-)

| The text to be
| tokenized is a global variable.

Now *this* is bad. Really bad.

| (Actually, the text is unchanging once
| the Tok object is created, so this "variable" may be another
| constant.)

It isn't.

| Passing global constants to functions is a case of CPU
| abuse.

Remember that Python doesn't copy objects when passing them as function
params, and that function-local names are faster to lookup than global
ones.
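
The no-copy point is easy to confirm. In this Python 3 sketch the tokenize function and the received list are hypothetical stand-ins; the check is simply that the object seen inside the function is the caller's object, however large the text is. The lookup-speed claim is not measured here.

```python
received = []


def tokenize(text):
    # Hypothetical stand-in for a real tokenizer: remember the argument
    # object and split on whitespace
    received.append(text)
    return text.split()


big_text = "token " * 100000
tokens = tokenize(big_text)

# The function received the caller's object itself, not a copy
same_object = received[0] is big_text
print(same_object)   # True
print(len(tokens))   # 100000
```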

There are indeed reasons not to pass module constants to the module's
functions, but they have nothing to do with CPU. And in your case, the
text to be tokenised is definitely not a constant.

| Structured purists gave globals a bad rap, years ago. Time to stick up
| for them. They're good citizens. Don't blame them if some dumb coder
| abuses them.

Once you've learned why you should not do something - and how to avoid
doing it - chances are you also know when it's OK to break the rule.
 
MartinRinehart

Hi, Bruno. Merry Christmas!

By "constant" I meant that it did not change during the lifetime of
the Toker.
 
Bruno Desthuilliers

(e-mail address removed) wrote:
| Hi, Bruno. Merry Christmas!
|
| By "constant" I meant that it did not change during the lifetime of
| the Toker.

That's still a variable to me. It's even the essence of the variable,
since it's the main input of your program. And that's definitely not
something I'd store in global.
 
Hendrik van Rooyen

MartinRinehart wrote:

| More seriously, I can and do use lots of globals. In the tokenizer I'm
| writing, for example, all the token types(COMMENT_EOL = 0,
| CONSTANT_INTEGER = 1, ...) are global constants. The text to be
| tokenized is a global variable. (Actually, the text is unchanging once
| the Tok object is created, so this "variable" may be another
| constant.) Passing global constants to functions is a case of CPU
| abuse.
|
| Structured purists gave globals a bad rap, years ago. Time to stick up
| for them. They're good citizens. Don't blame them if some dumb coder
| abuses them. It's not their fault.

*grin*

It is good to see that I am not the only person in the squad who hears
the beat of this drum.

I wonder if you have some COBOL data divisions under your belt?

- Hendrik
 
Steven D'Aprano

| So where would you put it?

Context is all gone, so I'm not sure that I remember what "it" is. I
think it is the text that you're parsing.

I believe you are currently doing something like this:

TEXT = "placeholder"

def parse():
    while True:
        token = get_next_token() # looks at global TEXT
        yield token

# And finally actually run your parser:
TEXT = open("filename", "r").read()
for token in parse():
    print token



If I were doing this, I would do something like this:

def parse(text):
    while True:
        token = get_next_token(text) # looks at local text
        yield token

# Run as many independent parsers as I need:
parser1 = parse(open("filename", "r").read())
parser2 = parse(open("filename2", "r").read())
parser3 = parse("some text")

for token in parser1:
    print token
# etc.



Unless the text you are parsing is truly enormous (multiple hundreds of
megabytes) you are unlikely to run into memory problems. And you gain the
ability to run multiple parsers at once.
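
A runnable Python 3 sketch of the argument-passing version follows. Splitting on whitespace is only a placeholder for real tokenizing (get_next_token is not defined anywhere in the thread), and the sample texts are made up.

```python
def parse(text):
    # Placeholder tokenizing: yield whitespace-separated words from the
    # text passed in, with no global state involved
    for word in text.split():
        yield word


# Two independent parsers, each holding its own text
parser1 = parse("let a = 1")
parser2 = parse("print a")

first = next(parser1)    # advance parser1 by one token
other = next(parser2)    # parser2 is unaffected by parser1
rest = list(parser1)     # parser1 carries on where it left off

print(first)  # let
print(other)  # print
print(rest)   # ['a', '=', '1']
```

Because each generator keeps its own text and position, the two can be advanced in any order.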
 
MartinRinehart

Steven said:
| Context is all gone, so I'm not sure that I remember what "it" is. I
| think it is the text that you're parsing.

Yes. I'm tokenizing today. Parsing comes after Christmas.

| TEXT = "placeholder"
|
| def parse():
|     while True:
|         token = get_next_token() # looks at global TEXT
|         yield token

Classic, but I'm not going to go there (at least until I fail
otherwise).

My tokenizer returns an array of Token objects. Each Token includes
the text from which it is created, locations in the original text and,
for something like CONSTANT_INTEGER, it has an intValue data member.
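
A rough Python 3 sketch of what such a Token might look like. Everything here is an assumption apart from CONSTANT_INTEGER = 1: the field names, the WORD token type and the whitespace-based tokenize function are hypothetical, and the post's intValue is spelled int_value.

```python
import re
from dataclasses import dataclass
from typing import Optional

CONSTANT_INTEGER = 1   # value taken from the post
WORD = 2               # hypothetical catch-all token type


@dataclass
class Token:
    kind: int
    text: str                        # the source text of the token
    start: int                       # offset of the first character
    end: int                         # offset one past the last character
    int_value: Optional[int] = None  # set only for CONSTANT_INTEGER


def tokenize(text):
    # Takes the text as an argument and returns a list of Token objects
    tokens = []
    for match in re.finditer(r"\S+", text):
        lexeme = match.group()
        if re.fullmatch(r"[0-9]+", lexeme):
            tokens.append(Token(CONSTANT_INTEGER, lexeme,
                                match.start(), match.end(), int(lexeme)))
        else:
            tokens.append(Token(WORD, lexeme, match.start(), match.end()))
    return tokens


tokens = tokenize("x = 42")
print([t.text for t in tokens])  # ['x', '=', '42']
print(tokens[2].int_value)       # 42
```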
| # Run as many independent parsers as I need:
| parser1 = parse(open("filename", "r").read())
| parser2 = parse(open("filename2", "r").read())
| parser3 = parse("some text")

Interesting approach, that. Could have a separate parser for each
statement. Hmmm. Maybe my tokenizer should return a list of arrays of
Tokens, one array per statement. Hmmm.

I'm thinking about an OO language construction that would be very easy
to extend. Tentatively, I'll have Production objects, Statement
objects, etc. I've already got Tokens.

Goal is a really simple language for beginners. Decaf will be to Java
as BASIC was to Fortran, I hope.
 
Bruno Desthuilliers

(e-mail address removed) wrote:
| So where would you put it?

You don't have to "put" function arguments anywhere - they're already
local vars.

def tokenize(text):
    do some work
    returns or (yields) a list of tokens or whatever

If you want the tokenizer module to work as a self-contained application
*too*, then:

if __name__ == '__main__':
    text = reads the text from a file or stdin
    for token in tokenize(text):
        do something with token
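
One way the outline above might be filled in, as a Python 3 sketch only. The whitespace split stands in for real tokenizing, and the main function, the file-path argument and the script name tokenizer.py are assumptions, not part of the original post.

```python
import sys


def tokenize(text):
    # Placeholder tokenizer: whitespace-separated words
    return text.split()


def main(argv):
    # argv[1] is expected to be the path of the file to tokenize
    if len(argv) != 2:
        print("usage: tokenizer.py FILE", file=sys.stderr)
        return 2
    with open(argv[1]) as source:
        for token in tokenize(source.read()):
            print(token)
    return 0


if __name__ == "__main__":
    # Runs only when the module is executed as a script, not on import
    main(sys.argv)
```

Imported from another module, only tokenize and main are defined and nothing is read; run as a script, it prints one token per line.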


HTH
 
Dennis Lee Bieber

| Goal is a really simple language for beginners. Decaf will be to Java
| as BASIC was to Fortran, I hope.

WRT the latter simile... a mistake? <G>

BASIC: a language that mandated line numbers for every line, not just
those that were targets of branch/jump statements. Sure, it meant one
could perform edits by just "retyping" the line (with line number)
replacing the former line... Great if one is using a teletype as editor
-- but meaningless once one begins using even a text-based terminal with
an editing program that can clear/move/print at any position.

Two character variable names, with a suffix to differentiate strings
from numerics... And arrays "DIM a(10)" have 11 elements? (0 was a valid
subscript, but most users never learned that and only used 1..10)

--
Wulfraed Dennis Lee Bieber KD6MOG
(e-mail address removed) (e-mail address removed)
HTTP://wlfraed.home.netcom.com/
(Bestiaria Support Staff: (e-mail address removed))
HTTP://www.bestiaria.com/
 
MartinRinehart

Dennis said:
| Great if one is using a teletype as editor

The original Dartmouth computer room was a basement that featured 8
teletypes.

The original BASIC, Dennis, was implemented on a time-shared
"mainframe" with a gigantic 8k words (20-bit words, if I remember) of
core memory. Designing a language for such a machine, I'd bet you,
too, would choose single-letter names. ('A' was a numeric. 'A$' a
string.)

If you compare the teletype to a tube it was lame. But that's not the
right comparison. The Fortran technology was cards, punched on a card
punch, carried to the operator. Wait your turn (hours more commonly
than minutes). Get a report off the line printer. Repunch the
offending cards.

Indeed, the teletype with line numbers was a giant step forward. No
operator. No waiting. Compiler complains. Retype the offending line. A
miracle in its day. You didn't even have to start your statements in
column 7!
 
