user-defined operators: a very modest proposal

Discussion in 'Python' started by Steve R. Hastings, Nov 22, 2005.

  1. I have been studying Python recently, and I read a comment on one
    web page that said something like "the people using Python for heavy math
    really wish they could define their own operators". The specific
    example was to define an "outer product" operator for matrices. (There
    was even a PEP, number 211, about this.)

    I gave it some thought, and Googled for previous discussions about this,
    and came up with this suggestion:

    User-defined operators could be defined like the following: ]+[

    I'm not any kind of language design expert, but this seems to me like a
    syntax that would be easy for Python to recognize. Because the square
    braces are reversed from the usual "[]" order, this should not look like
    any currently-valid code. And square braces, IMHO, do not fail the
    "low-toner printout" test. (Some earlier proposals included operators like
    "~+" and these were deemed too hard to read.)

    For improved readability, Python could even enforce a requirement that
    there should be white space on either side of a user-defined operator.
    I don't really think that's necessary.

    It should be possible to define operators using punctuation,
    alphanumerics, or both:

    ]+[
    ]add[
    ]outer*[


    Examples of use:

    m = m0 ]*[ m1
    m = m0]*[m1

    m = m0 ]outer*[ m1
    m = m0]outer*[m1



    It looks a lot better with the white space, I think, but it's not horrible
    without the white space.


    Also, there should be a way to declare what kind of precedence the user-defined
    operators use. Python already has lots of operators with different precedence,
    and I think the best way is just to indicate which Python operator the new
    operator's precedence should match:

    class MyExcellentMatrix(object):
        @precedence('*')
        def __op_outer*__(self, right):
            # ...do stuff...

    I think a decorator is a good way to set the precedence.
    Perhaps the default precedence should be that of '+'.
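
    In Python as it exists, a decorator like this could only record the requested precedence as metadata, because the parser never consults it. A minimal sketch under that assumption follows; the names `precedence`, `precedence_like` and `__op_outer__` are hypothetical (the proposed `__op_outer*__` is not a valid identifier, so a stand-in is used):

```python
def precedence(symbol):
    """Hypothetical decorator: tag a method with the operator whose precedence it wants."""
    def decorate(method):
        method.precedence_like = symbol  # metadata only; nothing in Python reads this
        return method
    return decorate


class MyExcellentMatrix(object):
    def __init__(self, values):
        self.values = list(values)

    @precedence('*')
    def __op_outer__(self, right):  # stand-in for the proposed '__op_outer*__'
        # Outer product of two flat vectors, returned as a list of rows.
        return [[a * b for b in right.values] for a in self.values]


m = MyExcellentMatrix([1, 2]).__op_outer__(MyExcellentMatrix([3, 4]))
print(m)
print(MyExcellentMatrix.__op_outer__.precedence_like)
```

    The method has to be called explicitly here; the sketch only shows where the precedence information would live, not how a parser would use it.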

    Augmented forms should be supported:

    ]+=[
    ]*=[
    ]outer*=[


    Examples:

    m ]*=[ m0
    m]*=[m0

    m ]outer*=[ m0
    m]outer*=[m0


    Either I actually have made a sensible suggestion, or else people will now
    explain why this idea isn't good (and I'll learn something). Either way,
    I look forward to your comments.



    References:

    Elementwise/Objectwise Operators
    http://www.python.org/peps/pep-0225.html


    Adding A New Outer Product Operator
    http://www.python.org/peps/pep-0211.html


    --
    Steve R. Hastings "Vita est"
    http://www.blarg.net/~steveha
     
    Steve R. Hastings, Nov 22, 2005
    #1

  2. On Tue, 22 Nov 2005 13:48:05 -0800, Steve R. Hastings wrote:

    > User-defined operators could be defined like the following: ]+[


    [snip]

    > Examples of use:
    >
    > m = m0 ]*[ m1
    > m = m0]*[m1


    That looks to me like multiplying two lists. I have to look twice to see
    that the operands are merely m0 and m1 and not [m0] and [m1].

    > m = m0 ]outer*[ m1
    > m = m0]outer*[m1


    That just looks weird.


    Here is a thought: Python already supports an unlimited number of
    operators, if you write them in prefix notation:

    inner_product(m0, m1)
    outer_product(m0, m1)
    etc.
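
    As an illustration of the prefix-notation approach, plain functions are enough. This is a minimal sketch that assumes the operands are flat Python sequences of numbers; the function bodies are not taken from any library:

```python
def inner_product(u, v):
    """Sum of pairwise products of two equal-length sequences."""
    if len(u) != len(v):
        raise ValueError("operands must have the same length")
    return sum(a * b for a, b in zip(u, v))


def outer_product(u, v):
    """List of rows whose entry [i][j] is u[i] * v[j]."""
    return [[a * b for b in v] for a in u]


m0 = [1, 2, 3]
m1 = [4, 5, 6]
print(inner_product(m0, m1))
print(outer_product(m0, m1))
```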

    Here is some syntax that I don't object to, although that's not saying
    much. In mathematics, there are operators of a plus sign within a circle,
    multiply sign within a circle, etc. The closest we can get in plain ASCII
    would be:

    m0(+)m1
    m0(*)m1
    m0(-)m1
    etc.


    --
    Steven.
     
    Steven D'Aprano, Nov 22, 2005
    #2

  3. If your proposal is implemented, what does this code mean?
    if [1,2]+[3,4] != [1,2,3,4]: raise TestFailed, 'list concatenation'
    Since it contains ']+[' I assume it must now be parsed as a user-defined
    operator, but this code currently has a meaning in Python.

    (This code is the first example I found, Python 2.3's test/test_types.py, so it
    is actual code)
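
    The clash can be checked with the standard `tokenize` module. The sketch below tokenizes the expression part of that line (the `raise` clause is omitted because it is Python 2 syntax) and looks for the token sequence `]`, `+`, `[`; the variable names are illustrative:

```python
import io
import tokenize

source = "[1,2]+[3,4] != [1,2,3,4]\n"

# Collect only the operator/punctuation tokens, in order.
tokens = [
    tok.string
    for tok in tokenize.generate_tokens(io.StringIO(source).readline)
    if tok.type == tokenize.OP
]
print(tokens)

# True if ']', '+', '[' appear as three consecutive operator tokens.
has_sequence = any(
    tokens[i:i + 3] == ["]", "+", "["] for i in range(len(tokens) - 2)
)
print(has_sequence)
```

    Because the existing tokenizer already produces these three separate tokens for valid code, a new `]+[` token would change the meaning of programs that run today.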

    I don't believe that Python needs user-defined operators, but let me share my
    terrible proposal anyway: Each unicode character in the class 'Sm' (Symbol,
    Math) whose value is greater than 127 may be used as a user-defined operator.
    The special method called depends on the ord() of the unicode character, so
    that __u2044__ is called when the source code contains u'\N{FRACTION SLASH}'.
    Whatever alternate syntax is adopted to allow unicode identifier characters to
    be typed in pure ASCII will also apply to typing user-defined operators. "r"
    and "i" versions of the operators will of course exist, as in __ru2044__ and
    __iu2044__.
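
    The naming rule described above can be sketched with the standard `unicodedata` module. The helper `operator_method_name` is hypothetical and only illustrates the mapping from a character to a method name; no Python version looks such methods up:

```python
import unicodedata


def operator_method_name(char, prefix=""):
    """Hypothetical mapping from a math symbol to a special-method name.

    prefix is "" for the plain form, "r" for reflected, "i" for in-place.
    """
    if unicodedata.category(char) != "Sm" or ord(char) <= 127:
        raise ValueError("not a non-ASCII 'Sm' character: %r" % char)
    return "__%su%04x__" % (prefix, ord(char))


print(operator_method_name("\N{FRACTION SLASH}"))
print(operator_method_name("\N{FRACTION SLASH}", "r"))
print(operator_method_name("\N{UNION}", "i"))
```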

    Also, to accommodate operators such as u'\N{DOUBLE INTEGRAL}', which are not
    simple unary or binary operators, the character u'\N{NO BREAK SPACE}' will be
    used to separate arguments. When necessary, parentheses will be added to
    remove ambiguity. This leads naturally to expressions like
    \N{DOUBLE INTEGRAL} (y * x**2) \N{NO BREAK SPACE} dx \N{NO BREAK SPACE} dy
    (corresponding to the call (y*x**2).__u222c__(dx, dy)) which are clearly easy
    to love, except for the small issue that many inferior editors will not clearly
    display the \N{NO BREAK SPACE} characters.

    Some items on which I think I'd like to hear the community's ideas are:
    * Do we give special meaning to comparison characters like
    \N{NEITHER LESS-THAN NOR GREATER-THAN}, or let users define them in new
    ways? We could just provide, on object,
    def __u2279__(self, other): return not self.__gt__(other) and not other.__gt__(self)
    which would in effect satisfy all users.

    * Do we immediately implement the combination of operators with nonspacing
    marks, or defer it? If we implement it, do we allow the combination with
    pure ASCII operators, as in
    u'\N{COMBINING LEFT RIGHT ARROW ABOVE}+'
    or treat it as a syntax error? (BTW the method name for this would be
    __u20e1u002b__, even though it might be tempting to support __u20e1x2b__,
    __u20e1add__ and similar method names.) How and when do we normalize
    operators combined with more than one nonspacing mark?

    * Which unicode operator methods should be supported by built-in types?
    Implementing __u222a__ and __iu222a__ for sets is a no-brainer,
    obviously, but what about __iu2206__ for integers and long?

    * Should some of the unicode mathematical symbols be reserved for literals?
    It would be greatly preferable to write \u2205 instead of the other proposed
    empty-set literal notation, {-}. Perhaps nullary operators could be defined,
    so that writing \u2205 alone is the same as __u2205__() i.e., calling the
    nullary function, whether it is defined at the local, lexical, module, or
    built-in scope.

    * Do we support characters from the category 'So' (symbol, other)? Not
    doing so means preventing programmers from using operators like
    u'\N{HEAVY CONCAVE-POINTED BLACK RIGHTWARDS ARROW}'. Who are we to
    make those kinds of choices for our users?

    Jeff

     
    Jeff Epler, Nov 22, 2005
    #3
  4. Steve R. Hastings wrote:
    > I have been studying Python recently, and I read a comment on one
    > web page that said something like "the people using Python for heavy math
    > really wish they could define their own operators". The specific
    > example was to define an "outer product" operator for matrices. (There
    > was even a PEP, number 211, about this.)
    >
    > I gave it some thought, and Googled for previous discussions about this,
    > and came up with this suggestion:
    >
    > User-defined operators could be defined like the following: ]+[
    >
    > I'm not any kind of language design expert, but this seems to me like a
    > syntax that would be easy for Python to recognize. Because the square
    > braces are reversed from the usual "[]" order, this should not look like
    > any currently-valid code.


    Is [a,b]+[c] the concatenation of two lists, or a single two-element
    list containing a and b ]+[ c?

    > And square braces, IMHO, do not fail the "low-toner printout" test.


    They do. Just yesterday I printed some code in which some of the
    square braces didn't show up.
     
    Dan Bishop, Nov 22, 2005
    #4
  5. "Steve R. Hastings" writes:

    > I have been studying Python recently, and I read a comment on one
    > web page that said something like "the people using Python for heavy math
    > really wish they could define their own operators". The specific
    > example was to define an "outer product" operator for matrices. (There
    > was even a PEP, number 211, about this.)
    > I gave it some thought, and Googled for previous discussions about this,
    > and came up with this suggestion:
    > User-defined operators could be defined like the following: ]+[


    See <URL: http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/384122 >
    for some better suggestions, including an implementation in Python.
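
    The recipe is not reproduced here. One common way to fake infix operators in pure Python, which may differ from the recipe in its details, is to wrap a function in an object that overloads `|`. The class name `Infix` and the `outer` example are assumptions made for this sketch:

```python
class Infix(object):
    """Wrap a two-argument function so it can be written as  left |op| right."""

    def __init__(self, function):
        self.function = function

    def __ror__(self, left):
        # left | op  ->  a new Infix that remembers the left operand
        return Infix(lambda right: self.function(left, right))

    def __or__(self, right):
        # (left | op) | right  ->  call the function with the remembered operand
        return self.function(right)


outer = Infix(lambda u, v: [[a * b for b in v] for a in u])
result = [1, 2] |outer| [3, 4]
print(result)
```

    The fake operator always has the precedence of `|`, so mixing it with arithmetic operators usually needs parentheses.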

    <mike
    --
    Mike Meyer http://www.mired.org/home/mwm/
    Independent WWW/Perforce/FreeBSD/Unix consultant, email for more information.
     
    Mike Meyer, Nov 22, 2005
    #5
  6. > Here is a thought: Python already supports an unlimited number of
    > operators, if you write them in prefix notation:


    And indeed, so far Python hasn't added user-defined operators because this
    has been adequate.


    > Here is some syntax that I don't object to, although that's not saying
    > much.


    > m0(+)m1


    That form was discussed previously, as were "[+]", "<+>", etc. The
    favorite was "{+}". I believe such forms were considered hard to tell
    from code. In particular, m0(+) looks like a function call.

    See the PEP:

    http://www.python.org/peps/pep-0225.html

    Alas, the links to the discussion about this don't work. But it is
    possible to use the Google Groups archive of comp.lang.python to read some
    of the discussion.
    --
    Steve R. Hastings "Vita est"
    http://www.blarg.net/~steveha
     
    Steve R. Hastings, Nov 22, 2005
    #6
  7. > if [1,2]+[3,4] != [1,2,3,4]: raise TestFailed, 'list concatenation'
    > Since it contains ']+[' I assume it must now be parsed as a user-defined
    > operator, but this code currently has a meaning in Python.


    Yes. I agree that this is a fatal flaw in my suggestion.

    Perhaps there is no syntax that can be done inside the bounds of ASCII
    that will please everyone and not break existing code.


    Your suggestion of Unicode makes a lot of sense. There are glyphs for
    math operators, and if Python can accept Unicode source files, that seems
    to me like a much better solution than hacks involving ASCII characters.

    I didn't notice it before, but PEP 263 allows Python source files to be
    Unicode:

    http://www.python.org/peps/pep-0263.html

    So the latest versions of Python already have support for Unicode source
    files!


    Could such Unicode sources be exported to ASCII for porting code to
    platforms that don't allow Unicode Python files? Yes: just replace the
    Unicode character with a symbol like __op__, where op is the operator.

    Actually, that's a better syntax than the one I proposed, too:

    __+__
    # __add__ # this one's already in use, so not allowed
    __outer*__


    --
    Steve R. Hastings "Vita est"
    http://www.blarg.net/~steveha
     
    Steve R. Hastings, Nov 23, 2005
    #7
  8. On Tue, Nov 22, 2005 at 04:08:41PM -0800, Steve R. Hastings wrote:
    > Actually, that's a better syntax than the one I proposed, too:
    >
    > __+__
    > # __add__ # this one's already in use, so not allowed
    > __outer*__


    Again, this means something already.

    >>> __ = 3
    >>> __+__
    6
    >>> __outer = 'x'
    >>> __outer*__
    'xxx'

    Jeff

     
    Jeff Epler, Nov 23, 2005
    #8
  9. On Tue, 22 Nov 2005, Steve R. Hastings wrote:

    > User-defined operators could be defined like the following: ]+[


    Eeek. That really doesn't look right.

    Could you remind me of the reason we can't say [+]? It seems to me that an
    operator can never be a legal filling for an array literal or a subscript,
    so there wouldn't be ambiguity.

    We could even just say that [?] is an array version of whatever operator ?
    is, and let python do the heavy lifting (excuse the pun) of looping it
    over the operands. [[?]] would obviously be a doubly-lifted version.
    Although that would mean [*] is a componentwise product, rather than an
    outer product, which wouldn't really help you very much! Maybe we could
    define {?} as the generalised outer/tensor version of the ? operator ...
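
    The lifting idea can be approximated today with higher-order functions. In this sketch `lift` and `outer` are hypothetical helpers standing in for the proposed `[?]` and `{?}` syntax, and the operands are assumed to be plain lists:

```python
import operator


def lift(op):
    """Elementwise ("lifted") version of a binary operator over two sequences."""
    def lifted(xs, ys):
        if len(xs) != len(ys):
            raise ValueError("operands must have the same length")
        return [op(x, y) for x, y in zip(xs, ys)]
    return lifted


def outer(op):
    """Generalised outer version: apply op to every pair (x, y)."""
    def outered(xs, ys):
        return [[op(x, y) for y in ys] for x in xs]
    return outered


times = lift(operator.mul)           # plays the role of [*]
times2 = lift(lift(operator.mul))    # doubly lifted, like [[*]]
print(times([1, 2, 3], [4, 5, 6]))
print(times2([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
print(outer(operator.mul)([1, 2], [3, 4]))
```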

    > For improved readability, Python could even enforce a requirement that
    > there should be white space on either side of a user-defined operator. I
    > don't really think that's necessary.


    Indeed, it would be extremely wrong - normal operators don't require that,
    and special cases aren't special enough to break the rules.

    Reminds me of my idea for using spaces instead of parentheses for grouping
    in expressions, so a+b * c+d evaluates as (a+b)*(c+d) - one of my worst
    ideas ever, i'd say, up there with gin milkshakes.

    > Also, there should be a way to declare what kind of precedence the
    > user-defined operators use.


    Can't be done - different uses of the same operator symbol on different
    classes could have different precedence, right? So python would need to
    know what the class of the receiver is before it can work out the
    evaluation order of the expression; python does evaluation order at
    compile time, but only knows classes at execute time, so no dice.
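
    The compile-time point can be seen with the standard `ast` module: the expression below is parsed into a fixed tree even though `a`, `b` and `c` are never defined, so their classes cannot have influenced the grouping. The variable names are illustrative:

```python
import ast

# Parse only; the names a, b, c do not exist, so no class is ever consulted.
tree = ast.parse("a + b * c", mode="eval")
top = tree.body

print(type(top.op).__name__)        # operator at the root of the tree
print(type(top.right.op).__name__)  # operator of the right-hand subtree
print(ast.dump(top))
```

    The multiplication ends up nested inside the addition, which is exactly the decision a per-class precedence declaration would need to change after the fact.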

    Also, i'm pretty sure you could cook up a situation where you could
    exploit differing precedences of different definitions of one symbol to
    generate ambiguous cases, but i'm not in a twisted enough mood to actually
    work out a concrete example!

    And now for something completely different.

    For Py4k, i think we should allow any sequence of characters that doesn't
    mean something else to be an operator, supported with one special method
    to rule them all, __oper__(self, ator, and), so:

    a + b

    Becomes:

    a.__oper__("+", b)

    And:

    a --{--@ b

    Becomes:

    a.__oper__("--{--@", b) # Euler's 'single rose' operator

    Etc. We need to be able to distinguish a + -b from a +- b, but this is
    where i can bring my grouping-by-whitespace idea into play, requiring
    whitespace separating operands and operators - after all, if it's good
    enough for grouping statements (as it evidently is at present), it's good
    enough for expressions. The character ']' would be treated as whitespace,
    so a[b] would be handled as a.__oper__("[", b). Naturally, the . operator
    would also be handled through __oper__.
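
    A rough approximation of the single-method idea is possible with the existing special methods. The classes `Oper` and `Recorder` are hypothetical, only a few operators are routed, and the third parameter is called `other` because `and` is a reserved word:

```python
class Oper(object):
    """Hypothetical base class: funnel ordinary operators into one __oper__ method."""

    def __oper__(self, ator, other):
        raise NotImplementedError(ator)

    def __add__(self, other):
        return self.__oper__("+", other)

    def __mul__(self, other):
        return self.__oper__("*", other)

    def __getitem__(self, other):
        return self.__oper__("[", other)


class Recorder(Oper):
    """Toy subclass that reports which operator it was asked to perform."""

    def __init__(self, name):
        self.name = name

    def __oper__(self, ator, other):
        return "%s.__oper__(%r, %r)" % (self.name, ator, other)


a = Recorder("a")
print(a + 1)
print(a[2])
```

    Arbitrary symbol sequences such as `--{--@` still cannot be routed this way, since the tokenizer rejects or splits them before any method is called.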

    Jeff Epler's proposal to use unicode operators would synergise most
    excellently with this, allowing python to finally reach, and even surpass,
    the level of expressiveness found in languages such as perl, APL and
    INTERCAL.

    tom

    --
    I DO IT WRONG!!!
     
    Tom Anderson, Nov 23, 2005
    #9
  10. Tom Anderson wrote:

    >Jeff Epler's proposal to use unicode operators would synergise most
    >excellently with this, allowing python to finally reach, and even surpass,
    >the level of expressiveness found in languages such as perl, APL and
    >INTERCAL.
    >
    >tom
    >
    >
    >

    What do you mean by unicode operators? Link?
     
    Joseph Garvin, Nov 23, 2005
    #10
  11. On Tue, 22 Nov 2005, Jeff Epler wrote:

    > Each unicode character in the class 'Sm' (Symbol,
    > Math) whose value is greater than 127 may be used as a user-defined operator.


    EXCELLENT idea, Jeff!

    > Also, to accommodate operators such as u'\N{DOUBLE INTEGRAL}', which are not
    > simple unary or binary operators, the character u'\N{NO BREAK SPACE}' will be
    > used to separate arguments. When necessary, parentheses will be added to
    > remove ambiguity. This leads naturally to expressions like
    > \N{DOUBLE INTEGRAL} (y * x**2) \N{NO BREAK SPACE} dx \N{NO BREAK SPACE} dy
    > (corresponding to the call (y*x**2).__u222c__(dx, dy)) which are clearly easy
    > to love, except for the small issue that many inferior editors will not clearly
    > display the \N{NO BREAK SPACE} characters.


    Could we use '\u2202' instead of 'd'? Or, to be more correct, is there a
    d-which-is-not-a-d somewhere in the mathematical character sets? It would
    be very useful to be able to distinguish d'x', as it were, from 'dx'.

    > * Do we immediately implement the combination of operators with nonspacing
    > marks, or defer it?


    As long as you don't use normalisation form D, i'm happy.

    > * Should some of the unicode mathematical symbols be reserved for literals?
    > It would be greatly preferable to write \u2205 instead of the other proposed
    > empty-set literal notation, {-}. Perhaps nullary operators could be defined,
    > so that writing \u2205 alone is the same as __u2205__() i.e., calling the
    > nullary function, whether it is defined at the local, lexical, module, or
    > built-in scope.


    Sounds like a good idea. \u211D and relatives would also be a candidate
    for this treatment.

    And for those of you out there who are laughing at this, i'd point out
    that Perl IS ACTUALLY DOING THIS.

    tom

    --
    I DO IT WRONG!!!
     
    Tom Anderson, Nov 23, 2005
    #11
  12. Joseph Garvin wrote:

    > >Jeff Epler's proposal to use unicode operators would synergise most
    > >excellently with this, allowing python to finally reach, and even surpass,
    > >the level of expressiveness found in languages such as perl, APL and
    > >INTERCAL.
    > >

    > What do you mean by unicode operators? Link?


    a few messages earlier in the thread you're posting to. if your mail or news
    provider is dropping messages, you can read the group via e.g.

    http://news.gmane.org/gmane.comp.python.general

    jeff's proposal is here:

    http://article.gmane.org/gmane.comp.python.general/433247

    </F>
     
    Fredrik Lundh, Nov 23, 2005
    #12
  13. Simon Brunning, Nov 23, 2005
    #13
  14. Fredrik Lundh, Nov 23, 2005
    #14
  15. Simon Brunning, Nov 23, 2005
    #15
  16. Joseph Garvin wrote:
    > Tom Anderson wrote:
    >
    >> Jeff Epler's proposal to use unicode operators would synergise most
    >> excellently with this, allowing python to finally reach, and even
    >> surpass, the level of expressiveness found in languages such as perl,
    >> APL and INTERCAL.


    s/expressiveness/unreadability/


    --
    bruno desthuilliers
    python -c "print '@'.join(['.'.join([w[::-1] for w in p.split('.')]) for
    p in ''.split('@')])"
     
    bruno at modulix, Nov 23, 2005
    #16
  17. Steve R. Hastings wrote:

    > It should be possible to define operators using punctuation,
    > alphanumerics, or both:
    >
    > ]+[
    > ]add[
    > ]outer*[


    It seems you are looking for advanced source-code editors. Some ideas
    have been around for quite a while, e.g. here

    http://en.wikipedia.org/wiki/Intentional_programming

    I'm not sure if current computer algebra systems also offer a WYSIWYG
    input mode? Of course this is not clutter and line noise but domain
    specific standard notation.

    There has also been a more Python-related, ambitious multi-language
    project called Logix that enabled user-defined operators, but it seems
    to be dead.

    Kay
     
    Kay Schluehr, Nov 23, 2005
    #17
  18. On 2005-11-22, Jeff Epler wrote:
    > * Should some of the unicode mathematical symbols be reserved for literals?
    > It would be greatly preferable to write \u2205 instead of the other proposed
    > empty-set literal notation, {-}. Perhaps nullary operators could be defined,
    > so that writing \u2205 alone is the same as __u2205__() i.e., calling the
    > nullary function, whether it is defined at the local, lexical, module, or
    > built-in scope.


    Isn't this essentially already happening with lists?

    And isn't something like this already possible with properties, except
    for the scoping?

    If python would develop the property idea a bit further and have
    variables that would call a function each time they are accessed,
    something like this could work.
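
    The property half of this already works at class scope. A minimal sketch, with a hypothetical `Symbols` class, shows an attribute that runs a function on every access, much as writing `[]` builds a new list each time:

```python
class Symbols(object):
    """Toy namespace whose attribute is recomputed on every access."""

    @property
    def empty_set(self):
        # Runs each time the attribute is read, so every access gets a fresh set.
        return set()


s = Symbols()
first = s.empty_set
second = s.empty_set
first.add(1)
print(first, second)
```

    What is missing is the scoping: a bare name at local or module level cannot be backed by a property in this way.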

    --
    Antoon Pardon
     
    Antoon Pardon, Nov 24, 2005
    #18
