# user-defined operators: a very modest proposal

Discussion in 'Python' started by Steve R. Hastings, Nov 22, 2005.

1. ### Steve R. HastingsGuest

I have been studying Python recently, and I read a comment on one
web page that said something like "the people using Python for heavy math
really wish they could define their own operators". The specific
example was to define an "outer product" operator for matrices. (There

and came up with this suggestion:

User-defined operators could be defined like the following: ]+[

I'm not any kind of language design expert, but this seems to me like a
syntax that would be easy for Python to recognize. Because the square
braces are reversed from the usual "[]" order, this should not look like
any currently-valid code. And square braces, IMHO, do not fail the
"low-toner printout" test. (Some earlier proposals included operators like
"~+" and these were deemed too hard to read.)

For improved readability, Python could even enforce a requirement that
there should be white space on either side of a user-defined operator.
I don't really think that's necessary.

It should be possible to define operators using punctuation,
alphanumerics, or both:

]+[
]outer*[

Examples of use:

m = m0 ]*[ m1
m = m0]*[m1

m = m0 ]outer*[ m1
m = m0]outer*[m1

It looks a lot better with the white space, I think, but it's not horrible
without the white space.

Also, there should be a way to declare what kind of precedence the user-defined
operators use. Python already has lots of operators with different precedence,
and I think the best way is just to indicate which Python operator the new
operator's precedence should match:

    class MyExcellentMatrix(object):
        @precedence('*')
        def __op_outer*__(self, right):
            # ...do stuff...

I think a decorator is a good way to set the precedence.
Perhaps the default precedence should be that of '+'.

Augmented forms should be supported:

]+=[
]*=[
]outer*=[

Examples:

m ]*=[ m0
m]*=[m0

m ]outer*=[ m0
m]outer*=[m0

Either I actually have made a sensible suggestion, or else people will now
explain why this idea isn't good (and I'll learn something). Either way,

References:

Elementwise/Objectwise Operators
http://www.python.org/peps/pep-0225.html

Adding A New Outer Product Operator
http://www.python.org/peps/pep-0211.html

--
Steve R. Hastings "Vita est"
http://www.blarg.net/~steveha

Steve R. Hastings, Nov 22, 2005

2. ### Steven D'ApranoGuest

On Tue, 22 Nov 2005 13:48:05 -0800, Steve R. Hastings wrote:

> User-defined operators could be defined like the following: ]+[

[snip]

> Examples of use:
>
> m = m0 ]*[ m1
> m = m0]*[m1

That looks to me like multiplying two lists. I have to look twice to see
that the operands are merely m0 and m1 and not [m0] and [m1].

> m = m0 ]outer*[ m1
> m = m0]outer*[m1

That just looks weird.

Here is a thought: Python already supports an unlimited number of
operators, if you write them in prefix notation:

inner_product(m0, m1)
outer_product(m0, m1)
etc.
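
As a minimal sketch of the prefix-notation approach, assuming vectors are plain Python lists (the function bodies below are illustrative and not taken from any library):

```python
def inner_product(v0, v1):
    """Sum of pairwise products of two equal-length vectors."""
    if len(v0) != len(v1):
        raise ValueError("vectors must have the same length")
    return sum(a * b for a, b in zip(v0, v1))


def outer_product(v0, v1):
    """List of rows whose [i][j] entry is v0[i] * v1[j]."""
    return [[a * b for b in v1] for a in v0]


print(inner_product([1, 2, 3], [4, 5, 6]))  # 32
print(outer_product([1, 2], [3, 4]))        # [[3, 4], [6, 8]]
```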

Here is some syntax that I don't object to, although that's not saying
much. In mathematics, there are operators of a plus sign within a circle,
multiply sign within a circle, etc. The closest we can get in plain ASCII
would be:

m0(+)m1
m0(*)m1
m0(-)m1
etc.

--
Steven.

Steven D'Aprano, Nov 22, 2005

3. ### Guest

If your proposal is implemented, what does this code mean?
if [1,2]+[3,4] != [1,2,3,4]: raise TestFailed, 'list concatenation'
Since it contains ']+[' I assume it must now be parsed as a user-defined
operator, but this code currently has a meaning in Python.

(This code is the first example I found, Python 2.3's test/test_types.py, so it
is actual code)
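
The clash can be checked with the standard `tokenize` module. This is only a sketch of how today's tokenizer splits the expression, written for Python 3; the variable names are illustrative:

```python
import io
import tokenize

source = "[1,2]+[3,4]"

# Collect the operator/delimiter tokens the standard tokenizer produces.
ops = [
    tok.string
    for tok in tokenize.generate_tokens(io.StringIO(source).readline)
    if tok.type == tokenize.OP
]

print(ops)              # ['[', ',', ']', '+', '[', ',', ']']
print("]+[" in source)  # True: the proposed operator spelling occurs here
print([1, 2] + [3, 4] == [1, 2, 3, 4])  # True: plain list concatenation
```

The character sequence `]+[` already appears as three separate tokens in valid code, so a tokenizer that treated it as one operator would change the meaning of existing programs.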

I don't believe that Python needs user-defined operators, but let me share my
terrible proposal anyway: Each unicode character in the class 'Sm' (Symbol,
Math) whose value is greater than 127 may be used as a user-defined operator.
The special method called depends on the ord() of the unicode character, so
that __u2044__ is called when the source code contains u'\N{FRACTION SLASH}'.
Whatever alternate syntax is adopted to allow unicode identifier characters to
be typed in pure ASCII will also apply to typing user-defined operators. "r"
and "i" versions of the operators will of course exist, as in __ru2044__ and
__iu2044__.
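
The naming rule can be sketched with the standard `unicodedata` module. The helper names `is_user_operator` and `method_names` are hypothetical and exist only for illustration; nothing like this is part of Python:

```python
import unicodedata


def is_user_operator(ch):
    """True if ch is a non-ASCII character in category 'Sm' (Symbol, Math)."""
    return ord(ch) > 127 and unicodedata.category(ch) == "Sm"


def method_names(ch):
    """Return the (normal, reflected, in-place) method names for ch."""
    code = "u%04x" % ord(ch)
    return ("__%s__" % code, "__r%s__" % code, "__i%s__" % code)


slash = "\N{FRACTION SLASH}"
print(is_user_operator(slash))  # True
print(is_user_operator("+"))    # False: '+' is in 'Sm' but is ASCII
print(method_names(slash))      # ('__u2044__', '__ru2044__', '__iu2044__')
```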

Also, to accommodate operators such as u'\N{DOUBLE INTEGRAL}', which are not
simple unary or binary operators, the character u'\N{NO BREAK SPACE}' will be
used to separate arguments. When necessary, parentheses will be added to
remove ambiguity. This leads naturally to expressions like
\N{DOUBLE INTEGRAL} (y * x**2) \N{NO BREAK SPACE} dx \N{NO BREAK SPACE} dy
(corresponding to the call (y*x**2).__u222c__(dx, dy)) which are clearly easy
to love, except for the small issue that many inferior editors will not clearly
display the \N{NO BREAK SPACE} characters.

Some items on which I think I'd like to hear the community's ideas are:
* Do we give special meaning to comparison characters like
\N{NEITHER LESS-THAN NOR GREATER-THAN}, or let users define them in new
ways? We could just provide, on object,
def __u2279__(self, other): return not self.__gt__(other) and other.__gt__(self)
which would in effect satisfy all users.

* Do we immediately implement the combination of operators with nonspacing
marks, or defer it? If we implement it, do we allow the combination with
pure ASCII operators, as in
u'\N{COMBINING LEFT RIGHT ARROW ABOVE}+'
or treat it as a syntax error? (BTW the method name for this would be
__u20e1u002b__, even though it might be tempting to support __u20e1x2b__,
__u2oe1add__ and similar method names) How and when do we normalize
operators combined with more than one nonspacing mark?

* Which unicode operator methods should be supported by built-in types?
Implementing __u222a__ and __iu222a__ for sets is a no-brainer,
obviously, but what about __iu2206__ for integers and long?

* Should some of the unicode mathematical symbols be reserved for literals?
It would be greatly preferable to write \u2205 instead of the other proposed
empty-set literal notation, {-}. Perhaps nullary operators could be defined,
so that writing \u2205 alone is the same as __u2205__() i.e., calling the
nullary function, whether it is defined at the local, lexical, module, or
built-in scope.

* Do we support characters from the category 'So' (symbol, other)? Not
doing so means preventing programmers from using operators like
u"\N{HEAVY CONCAVE-POINTED BLACK RIGHTWARDS ARROW}". Who are we to
make those kinds of choices for our users?

Jeff

, Nov 22, 2005
4. ### Dan BishopGuest

Steve R. Hastings wrote:
> I have been studying Python recently, and I read a comment on one
> web page that said something like "the people using Python for heavy math
> really wish they could define their own operators". The specific
> example was to define an "outer product" operator for matrices. (There
>
> and came up with this suggestion:
>
> User-defined operators could be defined like the following: ]+[
>
> I'm not any kind of language design expert, but this seems to me like a
> syntax that would be easy for Python to recognize. Because the square
> braces are reversed from the usual "[]" order, this should not look like
> any currently-valid code.

Is [a,b]+[c] the concatenation of two lists, or a single two-element
list containing a and b ]+[ c?

> And square braces, IMHO, do not fail the "low-toner printout" test.

They do. Just yesterday I printed some code in which some of the
square braces didn't show up.

Dan Bishop, Nov 22, 2005
5. ### Mike MeyerGuest

"Steve R. Hastings" <> writes:

> I have been studying Python recently, and I read a comment on one
> web page that said something like "the people using Python for heavy math
> really wish they could define their own operators". The specific
> example was to define an "outer product" operator for matrices. (There
> and came up with this suggestion:
> User-defined operators could be defined like the following: ]+[

See <URL:
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/384122 > for
some better suggestions, including an implementation in Python.
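
The recipe itself is not reproduced here. The following is a sketch of one well-known way to fake infix operators in unmodified Python by overloading `|`; it may differ from the recipe's code, and the names `Infix` and `outer` are chosen for illustration:

```python
class Infix:
    """Wrap a two-argument function so it can be written as  a |op| b."""

    def __init__(self, func):
        self.func = func

    def __ror__(self, left):
        # Evaluates  left | self : remember the left operand.
        return Infix(lambda right: self.func(left, right))

    def __or__(self, right):
        # Evaluates  (left | self) | right : apply the stored function.
        return self.func(right)


outer = Infix(lambda v0, v1: [[a * b for b in v1] for a in v0])

print([1, 2] |outer| [3, 4])  # [[3, 4], [6, 8]]
```

The fake operator inherits the precedence of `|`, which binds more loosely than the arithmetic operators, so mixed expressions may need parentheses.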

<mike
--
Mike Meyer <> http://www.mired.org/home/mwm/

Mike Meyer, Nov 22, 2005
6. ### Steve R. HastingsGuest

> Here is a thought: Python already supports an unlimited number of
> operators, if you write them in prefix notation:

And indeed, so far Python hasn't added user-defined operators because this

> Here is some syntax that I don't object to, although that's not saying
> much.

> m0(+)m1

That form was discussed previously, as were "[+]", "<+>", etc. The
favorite was "{+}". I believe such forms were considered hard to tell
from code. In particular, m0(+) looks like a function call.

See the PEP:

http://www.python.org/peps/pep-0225.html

possible to use the Google Groups archive of comp.lang.python to read some
of the discussion.
--
Steve R. Hastings "Vita est"
http://www.blarg.net/~steveha

Steve R. Hastings, Nov 22, 2005
7. ### Steve R. HastingsGuest

> if [1,2]+[3,4] != [1,2,3,4]: raise TestFailed, 'list concatenation'
> Since it contains ']+[' I assume it must now be parsed as a user-defined
> operator, but this code currently has a meaning in Python.

Yes. I agree that this is a fatal flaw in my suggestion.

Perhaps there is no syntax that can be done inside the bounds of ASCII
that will please everyone and not break existing code.

Your suggestion of Unicode makes a lot of sense. There are glyphs for
math operators, and if Python can accept Unicode source files, that seems
to me like a much better solution than hacks involving ASCII characters.

I didn't notice it before, but PEP 263 allows Python source files to be
Unicode:

http://www.python.org/peps/pep-0263.html

files!

Could such Unicode sources be exported to ASCII for porting code to
platforms that don't allow Unicode Python files? Yes: just replace the
Unicode character with a symbol like __op__, where op is the operator.

Actually, that's a better syntax than the one I proposed, too:

__+__
__outer*__

--
Steve R. Hastings "Vita est"
http://www.blarg.net/~steveha

Steve R. Hastings, Nov 23, 2005
8. ### Guest

On Tue, Nov 22, 2005 at 04:08:41PM -0800, Steve R. Hastings wrote:
> Actually, that's a better syntax than the one I proposed, too:
>
> __+__
> # __add__ # this one's already in use, so not allowed
> __outer*__

>>> __ = 3
>>> __+__
6
>>> __outer = 'x'
>>> __outer*__
'xxx'
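
The same point can be confirmed from the parser's side. This sketch uses the standard `ast` module under Python 3 to show that `__outer*__` already parses as an ordinary multiplication of two names:

```python
import ast

tree = ast.parse("__outer*__", mode="eval")
node = tree.body

print(type(node).__name__)          # BinOp
print(type(node.op).__name__)       # Mult
print(node.left.id, node.right.id)  # __outer __
```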

Jeff

, Nov 23, 2005
9. ### Tom AndersonGuest

On Tue, 22 Nov 2005, Steve R. Hastings wrote:

> User-defined operators could be defined like the following: ]+[

Eeek. That really doesn't look right.

Could you remind me of the reason we can't say [+]? It seems to me that an
operator can never be a legal filling for an array literal or a subscript,
so there wouldn't be ambiguity.

We could even just say that [?] is an array version of whatever operator ?
is, and let python do the heavy lifting (excuse the pun) of looping it
over the operands. [[?]] would obviously be a doubly-lifted version.
Although that would mean [*] is a componentwise product, rather than an
outer product, which wouldn't really help you very much! Maybe we could
define {?} as the generalised outer/tensor version of the ? operator ...
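
The lifting idea can be approximated today with higher-order functions. This is a sketch only, assuming operands are plain lists; `lift` and `outer` are hypothetical helper names standing in for the proposed `[?]` and `{?}` forms:

```python
import operator


def lift(op):
    """Elementwise version of a binary operator, like the proposed [?]."""
    def lifted(xs, ys):
        return [op(x, y) for x, y in zip(xs, ys)]
    return lifted


def outer(op):
    """Outer version of a binary operator, like the proposed {?}."""
    def outered(xs, ys):
        return [[op(x, y) for y in ys] for x in xs]
    return outered


mul = operator.mul
print(lift(mul)([1, 2], [3, 4]))            # [3, 8]
print(lift(lift(mul))([[1, 2]], [[3, 4]]))  # [[3, 8]]
print(outer(mul)([1, 2], [3, 4]))           # [[3, 4], [6, 8]]
```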

> For improved readability, Python could even enforce a requirement that
> there should be white space on either side of a user-defined operator. I
> don't really think that's necessary.

Indeed, it would be extremely wrong - normal operators don't require that,
and special cases aren't special enough to break the rules.

Reminds me of my idea for using spaces instead of parentheses for grouping
in expressions, so a+b * c+d evaluates as (a+b)*(c+d) - one of my worst
ideas ever, i'd say, up there with gin milkshakes.

> Also, there should be a way to declare what kind of precedence the
> user-defined operators use.

Can't be done - different uses of the same operator symbol on different
classes could have different precedence, right? So python would need to
know what the class of the receiver is before it can work out the
evaluation order of the expression; python does evaluation order at
compile time, but only knows classes at execute time, so no dice.
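
A small sketch with the standard `ast` module (Python 3) illustrates the point: the grouping of `a + b * c` is decided when the source is parsed, before any of the names is bound to an object:

```python
import ast

tree = ast.parse("a + b * c", mode="eval")
top = tree.body

# The multiplication is already nested under the addition, although
# nothing is known yet about what a, b and c will be at run time.
print(type(top.op).__name__)        # Add
print(type(top.right.op).__name__)  # Mult
print(top.left.id)                  # a
```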

Also, i'm pretty sure you could cook up a situation where you could
exploit differing precedences of different definitions of one symbol to
generate ambiguous cases, but i'm not in a twisted enough mood to actually
work out a concrete example!

And now for something completely different.

For Py4k, i think we should allow any sequence of characters that doesn't
mean something else to be an operator, supported with one special method
to rule them all, __oper__(self, ator, and), so:

a + b

Becomes:

a.__oper__("+", b)

And:

a --{--@ b

Becomes:

a.__oper__("--{--@", b) # Euler's 'single rose' operator

Etc. We need to be able to distinguish a + -b from a +- b, but this is
where i can bring my grouping-by-whitespace idea into play, requiring
whitespace separating operands and operators - after all, if it's good
enough for grouping statements (as it evidently is at present), it's good
enough for expressions. The character ']' would be treated as whitespace,
so a[b] would be handled as a.__oper__("[", b). Naturally, the . operator
would also be handled through __oper__.
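
For the operators Python already has, the single-method idea can be emulated with ordinary special methods. This is a sketch under that assumption; `__oper__` is not a real Python hook, and the classes `Oper` and `Recorder` are invented for illustration:

```python
class Oper:
    """Base class routing a few standard operators through __oper__."""

    def __oper__(self, ator, other):
        raise TypeError("unsupported operator %r" % ator)

    def __add__(self, other):
        return self.__oper__("+", other)

    def __mul__(self, other):
        return self.__oper__("*", other)

    def __getitem__(self, other):
        return self.__oper__("[", other)


class Recorder(Oper):
    """Toy subclass that just reports which operator was used."""

    def __oper__(self, ator, other):
        return (ator, other)


a = Recorder()
print(a + 1)  # ('+', 1)
print(a * 2)  # ('*', 2)
print(a[3])   # ('[', 3)
```

Arbitrary spellings such as `--{--@` cannot be emulated this way, because the tokenizer and grammar would have to change first.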

Jeff Epler's proposal to use unicode operators would synergise most
excellently with this, allowing python to finally reach, and even surpass,
the level of expressiveness found in languages such as perl, APL and
INTERCAL.

tom

--
I DO IT WRONG!!!

Tom Anderson, Nov 23, 2005
10. ### Joseph GarvinGuest

Tom Anderson wrote:

>Jeff Epler's proposal to use unicode operators would synergise most
>excellently with this, allowing python to finally reach, and even surpass,
>the level of expressiveness found in languages such as perl, APL and
>INTERCAL.
>
>tom
>
>
>

What do you mean by unicode operators? Link?

Joseph Garvin, Nov 23, 2005
11. ### Tom AndersonGuest

On Tue, 22 Nov 2005 wrote:

> Each unicode character in the class 'Sm' (Symbol,
> Math) whose value is greater than 127 may be used as a user-defined operator.

EXCELLENT idea, Jeff!

> Also, to accommodate operators such as u'\N{DOUBLE INTEGRAL}', which are not
> simple unary or binary operators, the character u'\N{NO BREAK SPACE}' will be
> used to separate arguments. When necessary, parentheses will be added to
> remove ambiguity. This leads naturally to expressions like
> \N{DOUBLE INTEGRAL} (y * x**2) \N{NO BREAK SPACE} dx \N{NO BREAK SPACE} dy
> (corresponding to the call (y*x**2).__u222c__(dx, dy)) which are clearly easy
> to love, except for the small issue that many inferior editors will not clearly
> display the \N{NO BREAK SPACE} characters.

Could we use '\u2202' instead of 'd'? Or, to be more correct, is there a
d-which-is-not-a-d somewhere in the mathematical character sets? It would
be very useful to be able to distinguish d'x', as it were, from 'dx'.

> * Do we immediately implement the combination of operators with nonspacing
> marks, or defer it?

As long as you don't use normalisation form D, i'm happy.

> * Should some of the unicode mathematical symbols be reserved for literals?
> It would be greatly preferable to write \u2205 instead of the other proposed
> empty-set literal notation, {-}. Perhaps nullary operators could be defined,
> so that writing \u2205 alone is the same as __u2205__() i.e., calling the
> nullary function, whether it is defined at the local, lexical, module, or
> built-in scope.

Sounds like a good idea. \u211D and relatives would also be a candidate
for this treatment.

And for those of you out there who are laughing at this, i'd point out
that Perl IS ACTUALLY DOING THIS.

tom

--
I DO IT WRONG!!!

Tom Anderson, Nov 23, 2005
12. ### Fredrik LundhGuest

Joseph Garvin wrote:

> >Jeff Epler's proposal to use unicode operators would synergise most
> >excellently with this, allowing python to finally reach, and even surpass,
> >the level of expressiveness found in languages such as perl, APL and
> >INTERCAL.
> >

> What do you mean by unicode operators? Link?

a few messages earlier in the thread you're posting to. if your mail or news
provider is dropping messages, you can read the group via e.g.

http://news.gmane.org/gmane.comp.python.general

jeff's proposal is here:

http://article.gmane.org/gmane.comp.python.general/433247

</F>

Fredrik Lundh, Nov 23, 2005
13. ### Simon BrunningGuest

Simon Brunning, Nov 23, 2005
14. ### Fredrik LundhGuest

Fredrik Lundh, Nov 23, 2005
15. ### Simon BrunningGuest

On 23/11/05, Fredrik Lundh <> wrote:

PEP 666 should have been left open. There are a number of ideas that
come up here that should be added to it - and i'm sure there'll be
more.

--
Cheers,
Simon B,
,
http://www.brunningonline.net/simon/blog/

Simon Brunning, Nov 23, 2005
16. ### bruno at modulixGuest

Joseph Garvin wrote:
> Tom Anderson wrote:
>
>> Jeff Epler's proposal to use unicode operators would synergise most
>> excellently with this, allowing python to finally reach, and even
>> surpass, the level of expressiveness found in languages such as perl,
>> APL and INTERCAL.

--
bruno desthuilliers
python -c "print '@'.join(['.'.join([w[::-1] for w in p.split('.')]) for
p in ''.split('@')])"

bruno at modulix, Nov 23, 2005
17. ### Kay SchluehrGuest

Steve R. Hastings wrote:

> It should be possible to define operators using punctuation,
> alphanumerics, or both:
>
> ]+[
> ]outer*[

It seems you are looking for advanced source-code editors. Some ideas have
been around for quite a while, e.g. here:

http://en.wikipedia.org/wiki/Intentional_programming

I'm not sure if current computer algebra systems also offer a WYSIWYG
input mode? Of course this is not clutter and line noise but domain
specific standard notation.

There has also been a more Python related ambitious multi-language
project called Logix that enabled user-defined operators but it seems

Kay

Kay Schluehr, Nov 23, 2005
18. ### Antoon PardonGuest

On 2005-11-22, <> wrote:
> * Should some of the unicode mathematical symbols be reserved for literals?
> It would be greatly preferable to write \u2205 instead of the other proposed
> empty-set literal notation, {-}. Perhaps nullary operators could be defined,
> so that writing \u2205 alone is the same as __u2205__() i.e., calling the
> nullary function, whether it is defined at the local, lexical, module, or
> built-in scope.

Isn't this essentially already happening with lists?

And isn't something like this already possible with properties, except
for the scoping?

If python would develop the property idea a bit further and have
variables that would call a function each time they are accessed,
something like this could work.
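
The property half of this already works on attributes today. A minimal sketch, assuming a hypothetical `Symbols` namespace class, shows an attribute that calls a function on every access:

```python
class Symbols:
    """Namespace whose attributes are recomputed on every access."""

    @property
    def empty(self):
        # A fresh set each time, so callers cannot share state by accident.
        return set()


sym = Symbols()
a = sym.empty
a.add(1)
print(a)          # {1}
print(sym.empty)  # set()
```

What it does not give is the scoping: the name must still be reached through an object such as `sym`, and cannot be a bare variable.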

--
Antoon Pardon

Antoon Pardon, Nov 24, 2005