Proposal for adding symbols within Python

  • Thread starter Pierre Barbier de Reuille

Pierre Barbier de Reuille

Please note that I am entirely open to discussion on every point of this
proposal (which I do not yet dare to call a PEP).

Abstract
========

This proposal suggests adding symbols to Python.

Symbols are objects whose representation within the code is more
important than their actual value. Two symbols need only be comparable
for equality. Symbols also need to be hashable so they can be used as
dictionary keys (symbols are immutable objects).

Motivations
===========

Currently, there is no obvious way to define constants, states, or
whatever else would be best represented by symbols. Discussions on
comp.lang.python show at least half a dozen ways to replace symbols.

Some use cases for symbols are: the state of an object (e.g. for a file:
opened/closed/error) and unique objects (e.g. attribute names could be
represented as symbols).

Many languages offer symbols or obvious ways to define symbol-like
values. For example, in common languages:

In C/C++ : Symbols are emulated using Enums.
In Haskell/OCaml : Symbols are defined by union types with empty
constructors. Symbols are local to modules.
In Prolog : Symbols are called atoms ... they are local to modules (at
least in swi-prolog)
In Ruby : Symbols are introduced by the ":" notation. ":open" is a
symbol. Symbols are global.
In LISP : Symbols are introduced by "'". "'open" is a symbol. Symbols
are local to modules.

Proposal
========

First, I think it would be best to have a syntax to represent symbols.
Adding some special char before the name is probably a good way to
achieve that : $open, $close, ... are $ymbols.

As for the scope of symbols, I think they should be local to a namespace
(this point should be discussed, as I see advantages and drawbacks for
both local and global symbols). For example, for the state of the file
object I would write :

Then, given some other objects (say some other device which also may be
opened) :

would always hold if both objects use locally-defined symbols. The only
way for these states to be equal would be, for example, for the device
object to explicitly assign the file symbols :

By default, symbols should be local to the current module. Then, being
in the module "device_manager", this would hold:

There should be a way to go from strings to symbols and the other way
around. For that purpose, I propose:

Implementation
==============

One possible way to implement symbols is simply with integers resolved
as much as possible at compile time.

The End
=======

Thanks to those who read this proposal in full. I hope it
will gather enough interest to become a PEP and someday be implemented,
maybe (probably?) in a completely different way ;)

Pierre
 

Ben Finney

Pierre Barbier de Reuille said:
This proposal suggests to add symbols into Python.

I still don't think "symbol" is particularly descriptive as a name;
there are too many other things already in the language that might
also be called a "symbol".
Symbols are objects whose representation within the code is more
important than their actual value.

An interesting
Two symbols needs only to be equally-comparable.

I believe it would be more useful to have enumerated types in Python,
which would also allow values from the same type to be cmp() compared.
Currently, there is no obvious way to define constants or states or
whatever would be best represented by symbols.

"constants" isn't a good equivalent here, since constants in most
other languages are all about the name-to-value mapping, which you
said was unimportant for this concept.
Discussions on comp.lang.python shows at least half a dozen way to
replace symbols.
s/replace/implement/

Some use cases for symbols are : state of an object (i.e. for a file
opened/closed/error) and unique objects (i.e. attributes names could
be represented as symbols).

That pretty much covers the common use cases. Nicely done.
First, I think it would be best to have a syntax to represent
symbols.

I disagree. Namespaces would be fine, and would also make clear which
values were related to each other; e.g. for your "state of an object"
use case, it's useful to have all the states in one namespace,
separate from unrelated states of other classes of objects.
Adding some special char before the name is probably a good way to
achieve that : $open, $close, ... are $ymbols.

Counterproposal:

FileState = SomeTypeDefiningStates( 'open', 'closed' )

thefile.state = FileState.open
if thefile.state == FileState.closed:
    print("File is closed")

So all that's needed here is the type SomeTypeDefiningStates, not a
new syntax.
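
For readers on later Python versions: the standard library's enum module (added well after this thread) is one possible stand-in for SomeTypeDefiningStates. The sketch below assumes that substitution; TheFile and DeviceState are hypothetical names introduced only for illustration.

```python
from enum import Enum

# Functional API: build an enumerated type from a type name and member names.
FileState = Enum("FileState", "open closed")
DeviceState = Enum("DeviceState", "open closed")  # hypothetical second type


class TheFile(object):
    """Hypothetical stand-in for an object that carries a state."""
    state = FileState.open


thefile = TheFile()
if thefile.state == FileState.closed:
    print("File is closed")

# Members of different enumerations compare unequal instead of raising.
print(FileState.open == DeviceState.open)  # False
```

All states of one kind live in a single namespace, and members are hashable, so they can also serve as dictionary keys.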
One possible way to implement symbols is simply with integers
resolved as much as possible at compile time.

I believe all your requirements and motivations could be met with an
Enum type in the language. Here's an implementation using a sequence
of integers for the underlying values:

"First Class Enums in Python"
<URL:http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/413486>

An enumerated type would also allow values from that type to be
compared with cmp() if their sequence was considered important. e.g.
for object state, the "normal" sequence of states could be represented
in the enumeration, and individual states compared to see if they are
"later" than each other. If sequence was not considered important, of
course, this feature would not get in the way.
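
As a rough illustration of the ordering idea, here is a sketch that assumes the later standard-library IntEnum type in place of the cookbook recipe. cmp() no longer exists in Python 3, so the ordinary comparison operators are used; the state names are made up for the example.

```python
from enum import IntEnum

# Members are numbered 1, 2, 3 in declaration order, so the declaration
# order doubles as the "normal" sequence of states.
FileState = IntEnum("FileState", "created opened closed")

current = FileState.opened
print(current > FileState.created)   # True: "opened" comes later
print(current < FileState.closed)    # True: "closed" comes later still
```

Code that only tests equality never has to care about this ordering.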
 

Mike Meyer

Pierre Barbier de Reuille said:
Please, note that I am entirely open for every points on this proposal
(which I do not dare yet to call PEP).

Abstract
========

This proposal suggests to add symbols into Python.

You're also proposing adding a syntax to generate symbols. If so, it's
an important distinction, as simply adding symbols is a lot more
straightforward than adding new syntax.
Symbols are objects whose representation within the code is more
important than their actual value. Two symbols needs only to be
equally-comparable. Also, symbols need to be hashable to use as keys of
dictionary (symbols are immutable objects).

The values returned by object() meet these criteria. You could write
LISP's gensym as:

gensym = object

As you've indicated, there are a number of ways to get such
objects. If all you want is symbols, all that really needs to happen
is that one of those ways be blessed by including an implementation in
the distribution.
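
A small sketch of how object() instances behave as anonymous symbols. The names opened, closed and states are illustrative only.

```python
gensym = object  # every call returns a fresh, distinct object

opened = gensym()
closed = gensym()

# Each instance is equal only to itself.
print(opened == opened)   # True
print(opened == closed)   # False

# Instances are hashable, so they work as dictionary keys.
states = {opened: "file is open", closed: "file is closed"}
print(states[opened])     # file is open

# The printed form carries no meaningful name, only the type and an address.
print(str(opened))
```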
In LISP : Symbols are introduced by "'". "'open" is a symbol.

No, they're not. "'(a b c)" is *not* a symbol, it's a list. Symbols in
LISP are just names. "open" is a symbol, but it's normally evaluated.
The "'" is syntax that keeps the next expression from being evaluated,
so that "'open" gets you the symbol rather than its value. Since
you're trying to introduce syntax, I think it's important to get
existing practice in other languages right.
Proposal
========

First, I think it would be best to have a syntax to represent symbols.

That's half the proposal.
Adding some special char before the name is probably a good way to
achieve that : $open, $close, ... are $ymbols.

$ has bad associations for me - and for others that came from an
earlier P-language. Also, I feel that using a magic character to
introduce type information doesn't feel very Pythonic.

While you don't make it clear, it seems obvious that you intend that
if $open occurs twice in the same scope, it should refer to the same
symbol. So you're using the syntax for a dual purpose. $name checks to
see if the symbol name exists, and references that if so. If not, it
creates a new symbol with that name. Having something that looks
like a variable but is instantiated upon reference instead of raising
an exception seems like a bad idea.
On the range of symbols, I think they should be local to name space
(this point should be discussed as I see advantages and drawbacks for
both local and global symbols).

Agreed. Having one type that has different scoping rules than
everything else is definitely a bad idea.
There should be a way to go from strings to symbols and the other way
around. For that purpose, I propose:

So the heart of your proposal seems to be twofold: The addition of
"symbol" as a type, and the syntax that has the lookup/create behavior
I described above.
Implementation
==============

One possible way to implement symbols is simply with integers resolved
as much as possible at compile time.

What exactly are you proposing be "resolved" at compile time? How is
this better than using object, as illustrated above?

Suggested changes:

Provide a solid definition for the proposed builtin type "symbol".
Something like:

    symbol objects support two operations: is and equality
    comparison. Two symbol objects compare equal if and only if
    they are the same object, and symbol objects never compare
    equal to any other type of object. The result of other
    operations on a symbol object is undefined, and should raise
    a TypeError exception.

    symbol([value]) - creates a symbol object. Two distinct
    calls to symbol will return two different symbol objects
    unless the values passed to them as arguments are equal, in
    which case they return the same symbol object. If symbol is
    called without an argument, it returns a unique symbol.

I left the type of the value argument unspecified on purpose. Strings
are the obvious type, but I think it should be as unrestricted as
possible. The test on value is equality, not identity, because two
strings can be equal without being the same string, and we want that
case to give us the same symbol. I also added gensym-like behavior,
because it seemed useful. You could do without equality comparison,
but it seems like a nice thing to have.
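
One way the definition above could be prototyped in pure Python is sketched below. This is not an existing builtin: the class and its _interned dictionary are assumptions made for illustration, and the sketch requires values to be hashable, which is narrower than the "as unrestricted as possible" wording.

```python
class symbol(object):
    """Sketch of the proposed type; not an existing builtin."""

    _interned = {}  # maps a value to the single symbol created for it

    def __new__(cls, *args):
        if len(args) > 1:
            raise TypeError("symbol() takes at most one argument")
        if not args:
            # No argument: always a fresh, unique symbol (gensym-like).
            return object.__new__(cls)
        value = args[0]
        try:
            # Dictionary lookup matches on equality, not identity.
            return cls._interned[value]
        except KeyError:
            self = object.__new__(cls)
            cls._interned[value] = self
            return self

    # __eq__ and __hash__ are inherited from object, so two symbols are
    # equal only if they are the same object. Ordering and arithmetic
    # are not defined and raise TypeError in Python 3.


print(symbol("open") is symbol("op" + "en"))   # True: equal values
print(symbol("open") is symbol("closed"))      # False
print(symbol() is symbol())                    # False: each one is unique
```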

Now propose a new syntax that "means" symbol, à la {} "meaning" dict
and [] "meaning" list. Don't use "$name" (& and ^ are also probably
bad, but not as bad; pretty much everything else but ? is already in
use). Python does seem to be moving away from this kind of thing,
though.

Personally, I think that the LISP quote mechanism would be a better
addition as a new syntax, as it would handle needs that have caused a
number of different proposals to be raised. It would require that
symbol know about the internals of the implementation so that ?name
and symbol("name") return the same object, and possibly exposing said
object to the programmer. And this is why the distinction about how
LISP acts is important.

<mike
 

Bengt Richter

Personally, I think that the LISP quote mechanism would be a better
addition as a new syntax, as it would handle needs that have caused a
number of different proposals to be raised. It would require that
symbol know about the internals of the implementation so that ?name
and symbol("name") return the same object, and possibly exposing said
object to the programmer. And this is why the distinction about how
LISP acts is important.
I wonder if the backquote could be deprecated and repurposed.
It could typographically serve nicely as a lisp quote then. But in python,
how would 'whatever be different from lambda:whatever ?
(where of course whatever could be any expression parenthesized
as necessary)

Regards,
Bengt Richter
 

Steven D'Aprano

First, I think it would be best to have a syntax to represent symbols.
Adding some special char before the name is probably a good way to
achieve that : $open, $close, ... are $ymbols.

I think your chances of convincing Guido to introduce new syntax are slim
to none. (Not quite zero -- he did accept @ for decorators.)

I think symbols should simply be an immutable object, one with state and
limited or no behaviour, rather than a brand new syntactical element.
Being an object, you can reference them in whatever namespace you define
them in.

Personally, I think rather than adding a new language feature (...slim to
none...) there is more hope of getting something like this added to the
standard library:

http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/413486
 

Pierre Barbier de Reuille

Ben Finney wrote:
I still don't think "symbol" is particularly descriptive as a name;
there are too many other things already in the language that might
also be called a "symbol".

Well, that's the name in many languages. Then, probably all the things
already in the language that might be called "symbol" may be implemented
using the symbols in this proposal ... or maybe I don't see what you
mean here ?
[...]
First, I think it would be best to have a syntax to represent
symbols.


I disagree. Namespaces would be fine, and would also make clear which
values were related to each other; e.g. for your "state of an object"
use case, it's useful to have all the states in one namespace,
separate from unrelated states of other classes of objects.

Adding some special char before the name is probably a good way to
achieve that : $open, $close, ... are $ymbols.


Counterproposal:

FileState = SomeTypeDefiningStates( 'open', 'closed' )

thefile.state = FileState.open
if thefile.state == FileState.closed:
    print("File is closed")

So all that's needed here is the type SomeTypeDefiningStates, not a
new syntax.

The problem, IMHO, is that this way you need to declare "symbols"
beforehand; that is what I was trying to avoid by requiring a new syntax.
I believe all your requirements and motivations could be met with an
Enum type in the language. Here's an implementation using a sequence
of integers for the underlying values:

"First Class Enums in Python"
<URL:http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/413486>

An enumerated type would also allow values from that type to be
compared with cmp() if their sequence was considered important. e.g.
for object state, the "normal" sequence of states could be represented
in the enumeration, and individual states compared to see if they are
"later" than each other. If sequence was not considered important, of
course, this feature would not get in the way.

Well, I don't think enumerated objects ARE symbols. I can see two
"problems" :
1 - in the implementation, trying to compare values from different
groups raises an error instead of simply returning "False" (easy to fix ...)
2 - You have to declare these enumerable variables, which is not
pythonic IMO (impossible to fix ... needs complete redesign)

In the end, I really think symbols and enums serve different purposes, one
benefit of symbols being that the compiler is free to do what it wants
(i.e. probably whatever is most efficient).

Thanks for your reply,

Pierre
 

Pierre Barbier de Reuille

Mike Meyer wrote:
Pierre Barbier de Reuille said:
Please, note that I am entirely open for every points on this proposal
(which I do not dare yet to call PEP).

Abstract
========
[...]
Symbols are objects whose representation within the code is more
important than their actual value. Two symbols needs only to be
equally-comparable. Also, symbols need to be hashable to use as keys of
dictionary (symbols are immutable objects).


The values returned by object() meet this criteria. You could write
LISPs gensym as:

gensym = object

As you've indicated, there are a number of ways to get such
objects. If all you want is symbols, all that really needs to happen
is that one of those ways be blessed by including an implementation in
the distribution.

Well, I may rewrite the proposal, but one good thing to have is the
ability to go from symbol to string and back (as written below),
and that is not really possible with this implementation of symbols.
No, they're not. "'(a b c)" is *not* a symbol, it's a list. Symbols in
LISP are just names. "open" is a symbol, but it's normally evaluated.
The "'" is syntax that keeps the next expression from being evaluated,
so that "'open" gets you the symbol rather than its value. Since
you're trying to introduce syntax, I think it's important to get
existing practice in other languages right.

You're right ! I was a bit quick here ... "'" is a way to stop
evaluation and you may also write "(quote open)" for "'open".
That's half the proposal.




$ has bad associations for me - and for others that came from an
earlier P-language. Also, I feel that using a magic character to
introduce type information doesn't feel very Pythonic.

While you don't make it clear, it seems obvious that you intend that
if $open occurs twice in the same scope, it should refer to the same
symbol. So you're using the syntax for a dual purpose. $name checks to
see if the symbol name exists, and references that if so. If not, it
creates a new symbol and with that name. Having something that looks
like a variables that instantiates upon reference instead of raising
an exception seems like a bad idea.

Well, that's why symbols are absolutely not variables. One good model
(IMO) is LISP symbols. Symbols are *values*, and equality does not
depend on the way you obtained the symbol :

(eq (quote opened) 'opened)
Agreed. Having one type that has different scoping rules than
everything else is definitely a bad idea.




So the heart of your proposal seems to be twofold: The addition of
"symbol" as a type, and the syntax that has the lookup/create behavior
I described above.

Indeed !
Implementation
==============

One possible way to implement symbols is simply with integers resolved
as much as possible at compile time.


What exactly are you proposing be "resolved" at compile time? How is
this better than using object, as illustratd above?

Suggested changes:

Provide a solid definition for the proposed builtin type "symbol".
Something like:

symbol objects support two operations: is and equality
comparison. Two symbol objects compare equal if and only if
they are the same object, and symbol objects never compare
equal to any other type of object. The result of other
operations on a symbol object is undefined, and should raise
a TypeError exception.

symbol([value]) - creates a symbol object. Two distinct
calls to symbol will return two different symbol objects
unless the values passed to them as arguments are equal, in
which case they return the same symbol object. If symbol is
called without an argument, it returns a unique symbol.

Good definition to me !
I left the type of the value argument unspecified on purpose. Strings
are the obvious type, but I think it should be as unrestricted as
possible. The test on value is equality, not identity, because two
strings can be equal without being the same string, and we want that
case to give us the same symbol. I also added gensym-like behavior,
because it seemed useful. You could do without equality comparison,
but it seems like a nice thing to have.
Now propose a new syntax that "means" symbol, ala {} "meaning" dict
and [] "meaning" list. Don't use "$name" (& and ^ are also probably
bad, but not as; pretty much everything else but ? is already in
use). Python does seem to be moving away from this kind of thing,
though.

Well, maybe we should find some other way to express symbols. The only
thing I wanted was a way that is easy to write, avoids the need to declare
symbols, and allows specifying the scope of the symbol. My
preferred syntax would be something like :

'opened, `opened or `opened`

However, none are usable in current Python.
Personally, I think that the LISP quote mechanism would be a better
addition as a new syntax, as it would handle needs that have caused a
number of different proposals to be raised. It would require that
symbol know about the internals of the implementation so that ?name
and symbol("name") return the same object, and possibly exposing said
object to the programmer. And this is why the distinction about how
LISP acts is important.

<mike

Maybe, although I must say I cannot clearly see how the LISP quote mechanism
translates into Python.
 

Mike Meyer

Pierre Barbier de Reuille said:
You're right ! I was a bit quick here ... "'" is a way to stop
evaluation and you may also write "(quote open)" for "'open".

Yup. Also notice that if you eval the symbol, you get any value that
happens to be bound to it. This is irrelevant for your purposes. But
the properties you're looking for are - in LISP, anyway -
implementation details of how it handles names.
Well, that's why symbols are absolutely not variables.

If they aren't variables, they probably shouldn't *look* like
variables.
Provide a solid definition for the proposed builtin type "symbol".
Something like:

symbol objects support two operations: is and equality
comparison. Two symbol objects compare equal if and only if
they are the same object, and symbol objects never compare
equal to any other type of object. The result of other
operations on a symbol object is undefined, and should raise
a TypeError exception.

symbol([value]) - creates a symbol object. Two distinct
calls to symbol will return two different symbol objects
unless the values passed to them as arguments are equal, in
which case they return the same symbol object. If symbol is
called without an argument, it returns a unique symbol.

Good definition to me !

Note that this definition doesn't capture the name-space semantics you
asked for - symbol(value) is defined to return the same symbol
everywhere it's called, so long as value is equal. This is probably a
good thing. Using the ability to have non-strings for value means you
can get this behavior by passing in something that's unique to the
namespace as part of value. Said something probably depends on the
flavor of the namespace in question. This allows you to tailor the
namespace choice to your needs.

Also, since I'm allowing non-strings for value, just invoking str on
the symbol isn't really sufficient. Let's add an attribute 'value',
such that symbol(stuff).value is identical to stuff. If you want,
define symbol.__str__ as str(symbol.value) so that str(symbol("foo"))
returns "foo".
Well, maybe we should find some other way to express symbols. The only
thing I wanted was a way easy to write, avoiding the need to declare
symbols, and allowing the specification of the scope of the symbol. My
prefered syntax would be something like :
'opened, `opened or `opened`
However, none are usable in current Python.

Well, symbol('opened') solves the declaration issue, but it's not as
easy as you'd like.
Maybe, although I may say I cannot see clearly how LISP quote mechanism
translates into Python.

It compiles the quoted expression and returns a code object. I'd love
to recycle backquotes so that `expr` means
compile(expr, 'quoted-expr', 'eval'), but that won't happen anytime soon.

Hmm. You know, $symbol$ doesn't seem nearly as bad as $symbol. It
tickles TeX, not P***. I could live with that.

Like I said, the tricky part of doing this is getting `symbol` to have
the semantics you want. If you compile the same string twice, you get
two different code objects, though they compare equal, and the
variable names in co_names are the same strings. Maybe equality is
sufficient, and you don't need identity.
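
The behaviour of compile() described here can be checked with a short sketch. The quote() helper is hypothetical and merely stands in for the backquote syntax under discussion; the 'quoted-expr' filename is the one used above.

```python
def quote(expr):
    # Hypothetical helper standing in for the proposed `expr` syntax.
    return compile(expr, "quoted-expr", "eval")


a = quote("opened")
b = quote("opened")

print(a is b)                     # False: two separate code objects
print(a.co_names == b.co_names)   # True: both refer to the name 'opened'
print(a.co_names)                 # ('opened',)

# Evaluating the quoted expression looks the name up in a namespace.
print(eval(a, {"opened": 42}))    # 42
```

So a quoted name built this way supports comparison by its contents, but it cannot be relied on for identity ("is") tests.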

<mike
 

Pierre Barbier de Reuille

Mike Meyer wrote:
If they aren't variables, they probably shouldn't *look* like
variables.

Yes, that's why we should find some way to express that.
Provide a solid definition for the proposed builtin type "symbol".
Something like:

symbol objects support two operations: is and equality
comparison. Two symbol objects compare equal if and only if
they are the same object, and symbol objects never compare
equal to any other type of object. The result of other
operations on a symbol object is undefined, and should raise
a TypeError exception.

symbol([value]) - creates a symbol object. Two distinct
calls to symbol will return two different symbol objects
unless the values passed to them as arguments are equal, in
which case they return the same symbol object. If symbol is
called without an argument, it returns a unique symbol.

Good definition to me !


Note that this definition doesn't capture the name-space semantics you
asked for - symbol(value) is defined to return the same symbol
everywhere it's called, so long as value is equal. This is probably a
good thing. Using the ability to have non-strings for value means you
can get this behavior by passing in something that's unique to the
namespace as part of value. Said something probably depends on the the
flavor of the namespace in question. This allows you to tailor the
namespace choice to your needs.

Very interesting ... that way we could get global AND local symbols ...
I like it !
Also, since I'm allowing non-strings for value, just invoking str on
the symbol isn't really sufficient. Let's add an attribute 'value',
such that symbol(stuff).value is identical to stuff. I you want,
define symbol.__str__ as str(symbol.value) so that str(symbol("foo"))
returns "foo".




Well, symbol('opened') solves the declaration issue, but it's not as
easy as you'd like.




It compiles the quoted expression and returns a code object. I'd love
to recycle backquotes so that `expr` means
compile(expr, 'quoted-expr', 'eval'), but that won't happen anytime soon.

Hmm. You know, $symbol$ doesn't seem nearly as bad as $symbol. It
tickles TeX, not P***. I could live with that.

Yep, I like this $symbol$ notation ! It could be equivalent to :

symbol( "symbol" )

And $object.symbol$ could translate into :

symbol( (object, "symbol") )
Like I said, the tricky part of doing this is getting `symbol` to have
the semantics you want. If you compile the same string twice, you get
two different code objects, though they compare equal, and the
variable names in co_names are the same strings. Maybe equality is
sufficient, and you don't need identity.

<mike

Yep, that's something I always found strange, but I think it is for
optimization reasons. However, with symbols the problem is quite
different, and we can take some time to ensure there are never two equal
objects with different ids ... then, we can also guarantee only the use of
"==" and not of "is" ...

Pierre
 

Steven D'Aprano

The problem, IMHO, is that way you need to declare "symbols"
beforehands, that's what I was trying to avoid by requiring a new syntax.

???

If you don't declare your symbols, how will you get the ones that you want?

I don't understand why it is a problem to declare them first, and if it is
a problem, what your solution would be.

[snip]
Well, I don't think enumarated objects ARE symbols. I can see two
"problems" :
1 - in the implementation, trying to compare values from different
groups raises an error instead of simply returning "False" (easy to fix ...)

As you say, that's easy to fix.
2 - You have to declare these enumerable variables, which is not
pythonic IMO (impossible to fix ... needs complete redesign)

Are you suggesting that the Python language designers should somehow
predict every possible symbol that anyone in the world might ever need,
and build them into the language as predefined things?

If that is not what you mean, can you explain please, because I'm confused.
 

Pierre Barbier de Reuille

Steven D'Aprano wrote:
???

If you don't declare your symbols, how will you get the ones that you want?

I don't understand why it is a problem to declare them first, and if it is
a problem, what your solution would be.

Well, just as Python does not need variable declarations, you can just
*use* them ... in dynamic languages that have symbols, they just get created
when used (e.g. have a look at LISP or Ruby).
[snip]

Well, I don't think enumarated objects ARE symbols. I can see two
"problems" :
1 - in the implementation, trying to compare values from different
groups raises an error instead of simply returning "False" (easy to fix ...)


As you say, that's easy to fix.

2 - You have to declare these enumerable variables, which is not
pythonic IMO (impossible to fix ... needs complete redesign)


Are you suggesting that the Python language designers should somehow
predict every possible symbol that anyone in the world might ever need,
and build them into the language as predefined things?

If that is not what you mean, can you explain please, because I'm confused.

Well, the best I can propose is for you to read the discussion with Mike
Meyer.
He pointed out the flaws in my proposal and we're trying to clarify things.

Pierre
 

Ben Finney

Steven D'Aprano said:
The problem, IMHO, is that way you need to declare "symbols"
beforehands, that's what I was trying to avoid by requiring a new
syntax.

If you don't declare your symbols, how will you get the ones that
you want?
[...]
Are you suggesting that the Python language designers should somehow
predict every possible symbol that anyone in the world might ever
need, and build them into the language as predefined things?

I believe Pierre is looking for a syntax that will save him from
assigning values to names; that Python will simply assign arbitrary
unique values for these special names. My understanding of the
intended use is that their only purpose is to compare differently to
other objects of the same type, so the actual values don't matter.

What I still don't understand is why this justifies additional syntax
baggage in the language, rather than an explicit assignment earlier in
the code.
 

Ben Finney

Pierre Barbier de Reuille said:
Mike Meyer wrote:
Yep, I like this $symbol$ notation !

Gets a big -1 here.

I've yet to see a convincing argument against simply assigning values
to names, then using those names.
 

Steven D'Aprano

Steven D'Aprano wrote:

Well, just as Python do not need variable declaration, you can just
*use* them ... in dynamic languages using symbols, they just get created
when used (i.e. have a look at LISP or Ruby).

If you want to be technical, Python doesn't have variables. It has names
and objects.

If I want a name x to be bound to an object 1, I have to define it
(actually bind the name to the object):

x = 1

If I want a symbol $x$ (horrible syntax!!!) with a value 1, why shouldn't
I define it using:

$x$ = 1

instead of expecting Python to somehow magically know that I wanted it?
What if somebody else wanted the symbol $x$ to have the value 2 instead?

[snip]

Well, I don't think enumarated objects ARE symbols. I can see two
"problems" :
1 - in the implementation, trying to compare values from different
groups raises an error instead of simply returning "False" (easy to fix ...)


As you say, that's easy to fix.

2 - You have to declare these enumerable variables, which is not
pythonic IMO (impossible to fix ... needs complete redesign)


Are you suggesting that the Python language designers should somehow
predict every possible symbol that anyone in the world might ever need,
and build them into the language as predefined things?

If that is not what you mean, can you explain please, because I'm confused.

Well, the best I can propose is for you to read the discussion with Mike
Meyer.
He pointer out the flaws in my proposal and we're trying to precise things.

I've read the discussion, and I am no wiser.

You haven't explained why enums are not suitable to be used for symbols.
You gave two "problems", one of which was "easy to fix", as you said
yourself, and the other reason was that you don't want to define enums as
symbols.

If you don't want to define something manually, that can only mean that
you expect them to be predefined. Or am I misunderstanding something?
 

Pierre Barbier de Reuille

Ben Finney wrote:
Gets a big -1 here.

I've yet to see a convincing argument against simply assigning values
to names, then using those names.

I can see three benefits :
1 - ensuring values are unique (i.e. a bit like using instances of object)
2 - values are meaningful (i.e. with introspection on the values you get
a human-readable value, unlike with instances of object)
3 - getting *easy* access to those two properties

1 and 2 require a new type, 3 a new syntax (IMO).

Here's a try for the symbol class :

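class symbol(object):
    def __init__(self, value):
        self._value = value
    def _get_value(self):
        return self._value
    value = property(_get_value)  # read-only attribute
    def __eq__(self, other):
        # Only symbols carrying equal values compare equal.
        return isinstance(other, symbol) and self.value == other.value
    def __ne__(self, other):
        return not self == other
    def __hash__(self):
        # Keep symbols usable as dictionary keys.
        return hash(self.value)
    def __str__(self):
        return str(self.value)
    def __repr__(self):
        return "symbol(%s)" % (repr(self.value),)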

One thing to do would be to return the same object for symbols with the
same value (when possible ...).

For example, if we limit symbol to hashable types, we can implement
something which can be tested with "is" instead of "==":

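class symbol(object):
    _cache = {}  # one shared instance per distinct value
    def __new__(cls, value):
        # Reuse the cached instance so symbols can be compared with "is"
        if value in symbol._cache:
            return symbol._cache[value]
        self = object.__new__(cls)
        self._value = value
        symbol._cache[value] = self
        return self
    def _get_value(self):
        return self._value
    value = property(_get_value)  # read-only: symbols are immutable
    def __eq__(self, other):
        # Only another symbol with the same value compares equal
        return isinstance(other, symbol) and self.value == other.value
    def __ne__(self, other):
        return not self == other
    def __hash__(self):
        # Keeps symbols usable as dictionary keys
        return hash(self.value)
    def __str__(self):
        return str(self.value)
    def __repr__(self):
        return "symbol(%s)" % (repr(self.value),)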

Then, as I suggested, you can do something like :

a = symbol((file, "opened"))

But it's less readable than $file.opened$ (or something similar).
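
To make the caching idea concrete, here is a minimal self-contained sketch in current Python. The names `Symbol` and `File` are hypothetical stand-ins (the `file` builtin used above no longer exists in Python 3), and the class relies on the default identity-based equality and hashing instead of defining `__eq__`, which is an assumption of this sketch and not part of the proposal:

```python
class Symbol(object):
    """Hypothetical interned symbol: one instance per distinct hashable value."""

    _cache = {}

    def __new__(cls, value):
        # Return the cached instance when this value has been seen before.
        try:
            return cls._cache[value]
        except KeyError:
            self = object.__new__(cls)
            self._value = value
            cls._cache[value] = self
            return self

    @property
    def value(self):
        return self._value

    def __repr__(self):
        return "Symbol(%r)" % (self._value,)


class File(object):
    """Hypothetical stand-in for the file type mentioned in the post."""


opened = Symbol((File, "opened"))
opened_again = Symbol((File, "opened"))
closed = Symbol((File, "closed"))

# Interned symbols can be compared with "is" and used as dictionary keys.
descriptions = {opened: "file is open", closed: "file is closed"}
print(opened is opened_again, opened is closed, descriptions[opened_again])
```

Because every value maps to exactly one instance, identity comparison is enough and no custom `__eq__` or `__hash__` is required in this variant.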

Pierre
 
Guest

Ben Finney said:
I've yet to see a convincing argument against simply assigning values
to names, then using those names.

The problem with that is that you can't pass around the names of objects
that are used for other things. Obviously they make enums unnecessary,
but only people damaged by non-dynamic languages could think that's the
main point. ;-)

Being able to do that removes the need to convert back and forth
between strings and method names when you need to do things like
keeping a list of function names, even when you need to be able to
change what those function names point to.

Python doesn't really need to introduce a new type to do this. It's
already there, as what we usually just call names. Probably this
discussion would benefit from talking about names rather than symbols,
as that seems to confuse some people.

So, Python already has symbols. What we need is a way to refer to these
symbols explicitly. I would suggest to do it like in Lisp:

quote(spam)

Of course, this would preferably be implemented so that it doesn't just
work on simple names:

quote(spam(eggs))

Syntactic sugar, like ' in Lisp, could be introduced later, but I
don't think that would be strictly necessary.
 
Pierre Barbier de Reuille

Steven D'Aprano wrote:
[...]


If you want to be technical, Python doesn't have variables. It has names
and objects.

If I want a name x to be bound to an object 1, I have to define it
(actually bind the name to the object):

x = 1

If I want a symbol $x$ (horrible syntax!!!) with a value 1, why shouldn't
I define it using:

$x$ = 1

instead of expecting Python to somehow magically know that I wanted it?
What if somebody else wanted the symbol $x$ to have the value 2 instead?

Well, as stated, I don't care about the actual value of symbols. They
*are* values. A trivial implementation of symbols is strings:

$x$ <=> "x"

However, that won't fit because of the scope, because it would be great
to use "is" instead of "==" (even if not necessary), and, as Mike said,
you might want something other than a string. That's why `x` would be a
good way to write that.


I've read the discussion, and I am no wiser.

You haven't explained why enums are not suitable to be used for symbols.
You gave two "problems", one of which was "easy to fix", as you said
yourself, and the other reason was that you don't want to define enums as
symbols.

If you don't want to define something manually, that can only mean that
you expect them to be predefined. Or am I misunderstanding something?

Well, I expect Python will know them, exactly as it knows "without
defining it" that "foo" is the string with chars f, o, o, that 3 is the
number 3, that [1,2] is the list with 1 and 2, ... However, to be
faster, symbols could be created at compile-time when possible (like
variables). The fact is, symbols allow compilation optimisations that
you cannot get with regular types, because the language is completely
free to choose their representation. Then, for the programmer, it is a good
way to have a meaningful value without caring about how to represent it
in the computer. That way, while debugging, if I ask for the value of
file.state I will get something I can read instead of some meaningless
integer or other anonymous object.

So I gain in readability of my code and in debugging capacity.

Pierre
 
Steven D'Aprano

Steven D'Aprano said:
The problem, IMHO, is that that way you need to declare "symbols"
beforehand; that's what I was trying to avoid by requiring a new
syntax.

If you don't declare your symbols, how will you get the ones that
you want?
[...]
Are you suggesting that the Python language designers should somehow
predict every possible symbol that anyone in the world might ever
need, and build them into the language as predefined things?

I believe Pierre is looking for a syntax that will save him from
assigning values to names; that Python will simply assign arbitrary
unique values for these special names. My understanding of the
intended use is that their only purpose is to compare differently to
other objects of the same type, so the actual values don't matter.

Unless I've misunderstood something, it would be easy to modify the recipe
given here to do something like that:

http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/413486

The example code does this:

Days = Enum('Mo', 'Tu', 'We', 'Th', 'Fr', 'Sa', 'Su')
print Days.Mo, Days.Tu
# etc.
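
For comparison, a rough present-day equivalent of that example can be sketched with the standard-library `enum` module, which did not exist when this thread was written. The second group, `Weekend`, is a hypothetical addition used only to show that members of different groups compare unequal without raising an error:

```python
from enum import Enum

# Functional API: class name first, then whitespace-separated member names.
Days = Enum("Days", "Mo Tu We Th Fr Sa Su")
Weekend = Enum("Weekend", "Sa Su")  # hypothetical second group

print(Days.Mo.name, Days.Tu.name)

# Members from different enumerations are simply not equal.
cross_group_equal = Days.Sa == Weekend.Sa
print(cross_group_equal)
```

The members still have to be declared once, which is exactly the point under dispute in this exchange.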

What I still don't understand is why this justifies additional syntax
baggage in the language, rather than an explicit assignment earlier in
the code.

The only advantage would be if you want to do something like this:

MO, TU, WE, TH, FR, SA, SU = Symbols()

and have it magically work. I can see the advantage of that, and you don't
even need new syntax, just a magic function that somehow knows how many
names are on the left hand side of the assignment.

This is a poor substitute:

MO, TU, WE, TH, FR, SA, SU = range(7)

Firstly, if you change the number of symbol names, you have to manually
adjust the argument to range. Secondly, your symbols are actually ints,
and so will compare the same way ints compare.


I don't know enough about the Python internals to tell: is there any
feasible way for there to be a magic function like Symbols above that knew
how many names were waiting for it to be supplied? If there is no way to
do this from Python itself, is it possible to patch the compiler to do so?
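
As a partial answer in code: ordinary iterable unpacking gives the right-hand side no way to learn how many names are waiting, which a short sketch in current Python can show. The names `NamedSymbol`, `endless_symbols` and `symbols` are hypothetical, and the second approach repeats the names in a string, so it is a workaround and not the magic function asked about:

```python
import itertools


class NamedSymbol(object):
    """Hypothetical symbol type: unique per instance, readable when printed."""

    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return "<symbol %s>" % self.name


def endless_symbols():
    """Yield fresh anonymous symbols forever."""
    for number in itertools.count():
        yield NamedSymbol("anon%d" % number)


# Unpacking asks the iterator for one item more than there are names, to
# check that it is exhausted, so an endless supply is rejected.
try:
    MO, TU = endless_symbols()
    unpack_failed = False
except ValueError:
    unpack_failed = True


def symbols(names):
    """Create one NamedSymbol per whitespace-separated name."""
    return [NamedSymbol(name) for name in names.split()]


# A mismatch between the names on the left and in the string raises ValueError.
MO, TU, WE, TH, FR, SA, SU = symbols("MO TU WE TH FR SA SU")
print(unpack_failed, MO, SU)
```

Unlike `range(7)`, these values are not ints, so they do not compare or sort the way ints do, and printing one shows its name.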
 
Pierre Barbier de Reuille

Björn Lindström wrote:
The problem with that is that you can't pass around the names of objects
that are used for other things. Obviously they make enums unnecessary,
but only people damaged by non-dynamic languages could think that's the
main point. ;-)

Being able to do that removes the need to convert back and forth
between strings and method names when you need to do things like
keeping a list of function names, even when you need to be able to
change what those function names point to.

Python doesn't really need to introduce a new type to do this. It's
already there, as what we usually just call names. Probably this
discussion would benefit from talking about names rather than symbols,
as that seems to confuse some people.

So, Python already has symbols. What we need is a way to refer to these
symbols explicitly. I would suggest to do it like in Lisp:

quote(spam)

Of course, this would preferably be implemented so that it doesn't just
work on simple names:

quote(spam(eggs))

Syntactic sugar, like ' in Lisp, could be introduced later, but I
don't think that would be strictly necessary.

Well, if this already exists in Python's internals, then it would be
great just to expose it. Now, just being able to write:
quote(spam)

requires a new syntax so that spam is not resolved *before* calling the
quote function.
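
The evaluation-order point can be illustrated in current Python. In the sketch below, `identity`, `quote` and `unquote` are hypothetical helpers, and `quote` takes the expression as a string, which is precisely the string round trip the proposal wants to avoid; it only shows what an unevaluated expression would buy:

```python
import ast


def identity(value):
    return value


# Arguments are evaluated before the call, so an unbound name fails
# before the function is ever entered.
try:
    identity(spam)
    resolved_before_call = False
except NameError:
    resolved_before_call = True


def quote(source):
    """Hypothetical helper: parse an expression without evaluating it."""
    return ast.parse(source, mode="eval")


def unquote(tree, namespace):
    """Evaluate a quoted expression against the given namespace."""
    return eval(compile(tree, "<quoted>", "eval"), namespace)


quoted = quote("spam(eggs)")

# The same quoted expression follows whatever the names point to later.
namespace = {"spam": len, "eggs": "abc"}
first_result = unquote(quoted, namespace)
namespace["spam"] = str.upper
second_result = unquote(quoted, namespace)
print(resolved_before_call, first_result, second_result)
```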

Pierre
 
Michael

Ben Finney wrote:
....
I've yet to see a convincing argument against simply assigning values
to names, then using those names.

I don't like any syntax I've seen so far, but I can understand the problem.
If you have a name, you can redefine a name, therefore the value a name
refers to is mutable. As a result, if you have 2 symbols represented by
names and values, you may have two symbols with different names but the
same value. Hence the two "symbols" are no longer unique.

Conversely consider "NAME" to be a symbol. I can't modify "NAME". It always
means the same as "NAME" and "NAME", but is never the same as "FRED".
What's tricky is I can't have namespaceOne."NAME" [1] and
namespaceTwo."NAME" as different "NAME"s even though logically there's no
reason I couldn't treat "NAME" differently inside each.

[1] Supposing for a moment that I could have a string as a name in a
namespace. (Rather than a string used as a key in that namespace)

However it might be useful to note that these two values (or symbols) are
actually different, even if you remove their namespaces.

To me, the use of a symbol implies a desire for a constant, and then to only
use that constant rather than the value. In certain situations it's the
fact that constant A is not the same as constant B that's important (eg
modelling state machines).

Often you can use strings for that sort of thing, but unfortunately even
Python's strings can't be used as symbols that are always the same thing
in all ways. For example, we can force the id of identical strings to be
different:

(135049832, 135059864)
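
The interactive session that produced those ids did not survive in this copy of the post. A minimal sketch of the same effect in current CPython, assuming that strings built at run time are not interned automatically (an implementation detail, not a language guarantee), might look like this; the variable names are illustrative only:

```python
import sys

# Build two equal strings at run time so they are not one shared constant.
first = "".join(["hel", "lo"])
second = "".join(["hel", "lo"])

same_value = first == second    # equal contents
same_object = first is second   # usually False in CPython, but not guaranteed

# sys.intern returns one canonical object per distinct string value,
# which is the identity behaviour a symbol would need.
same_after_intern = sys.intern(first) is sys.intern(second)

print(id(first), id(second))
print(same_value, same_object, same_after_intern)
```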

As a result I can see that *IF* you really want this kind of symbol, rather
than the various other kinds people have discussed in the thread, that some
special syntax (like u'hello' for unicode 'hello') could be useful.

However, I'd be more interested in some real world usecases where this would
be beneficial, and then seeing what sort of syntax would be nice/useful
(Especially since I can think of some uses where it would be nice).

On the original syntax proposal, I'm firmly in the -1 camp. To me, having
done lots of Perl in the past, $foo looks very firmly like a mutable, rather
than an immutable.

The reason I'm more interested in seeing use cases is that I'd rather see
where the existing ways people use/define symbols have caused the OP
problems to the extent he feels the language needs to change to fix these
real world problems.


Michael.
 
