Uniform Function Call Syntax (UFCS)

jongiddy

The language D has a feature called Uniform Function Call Syntax, which allows instance methods to be resolved using function calls.

In Python terms, the call:

x.len()

would first check if 'x' has a method 'len', and would then look for a function 'len', passing 'x' as the first argument.
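As a rough sketch of that lookup order in today's Python, the fallback can be imitated with an ordinary helper. The name ufcs_call and its scope parameter are hypothetical, invented here for illustration; the real proposal would change attribute lookup itself rather than add a helper.

```python
import builtins


def ufcs_call(obj, name, *args, scope=None, **kwargs):
    """Imitate the proposed x.name(*args): try the method, then a function.

    ufcs_call and scope are hypothetical names used only for this sketch.
    """
    try:
        method = getattr(obj, name)
    except AttributeError:
        # No such attribute: fall back to a function of the same name,
        # looked up in the supplied scope first and then in the builtins.
        namespace = scope if scope is not None else {}
        func = namespace.get(name, getattr(builtins, name, None))
        if not callable(func):
            raise
        return func(obj, *args, **kwargs)
    return method(*args, **kwargs)


print(ufcs_call([1, 2, 3], "len"))   # list has no .len attribute, so len(x) runs
print(ufcs_call("abc", "upper"))     # str.upper exists, so the method runs
```

Passing the scope explicitly sidesteps the question of where the function should be looked up, which is exactly the part a real implementation would have to decide.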

The big wins are:

- the ability to override functions with more optimal class-specific implementations. (Of course, len() is a bad example, since we already have a way to override it, but there are other functions that do not have a special method).

- the readability of a.b().c().d() vs c(a.b()).d()

Here are a few links discussing the feature in D:
- First, a fairly gentle "this is cool" post: http://www.kr41.net/2013/08/27/uniform_function_call_syntax_in_d.html
- Second, an article from Walter Bright, the creator of D: http://www.drdobbs.com/cpp/uniform-function-call-syntax/232700394

Has this been discussed or proposed before? I found PEPs 443 and 3124, which provide a form of function overloading, but not reordering.
 
Ian Kelly

It's a nice feature in a statically typed language, but I'm not sure
how well it would work in a language as dynamic as Python. There are
some questions that would need to be addressed.

1) Where should the function (or perhaps callable) be looked for? The
most obvious place is the global scope. I think it would be a bit too
far-reaching and inconsistent with other language features to reach
directly inside imported modules (not to mention that it could easily
get to be far too slow in a module with lots of imports). As a result
it would have to be imported using the "from module import function"
syntax, rather than the somewhat cleaner "import module" syntax.
While there's nothing wrong with such imports, I'm not sure I like the
thought of the language encouraging them any more than necessary.

Probably local (and by extension nonlocal) scoping is fine also. This
makes perfect sense to me:

def some_function(x):
    def my_local_extension_method(self): return 42
    print(x.my_local_extension_method())

2) What about getattr and hasattr? If I call hasattr(x,
"some_method"), and x has no such attribute, but there is a function
in the global scope named "some_method", should it return True? I
think the answer is no, because that could mess with duck typing. Say
I have a function that checks the methods of some object that was
passed in, and it then passes that object on to some other function:

def gatekeeper_for_f(x):
    # f behaves badly if passed an x without a key_func,
    # so verify that it has one.
    if not hasattr(x, 'key_func'):
        raise TypeError("x has no key_func")
    else:
        return f(x)

Okay, so suppose we pass in to gatekeeper_for_f a non-conformant
object, but there happens to be a key_func in our global scope, so
hasattr returns True. Great! gatekeeper_for_f can call x.key_func().
But that doesn't mean that *f* can call x.key_func(), if it happened
to be defined in a different global scope.

If we instead have hasattr return False though, and have getattr raise
an exception, then we have this very magical and confusing
circumstance where getattr(x, 'method') raises an exception but
x.method does not. So I don't think that's really a good scenario
either.

Also the idea makes me nervous in the thought that an incorrect
attribute access could accidentally and somewhat randomly pick up some
object from the environment. In statically typed languages this isn't
a huge concern, because the extension method has to take an
appropriately typed object as its first argument (and in C# it even
has to be explicitly marked as an extension method), so if you resolve
an extension method by accident, at least it will be something that
makes sense as a method. Without the static typing you could
mistakenly pick up arbitrary functions that have nothing at all to do
with your object.

But if you want to experiment with the idea, here's a (lightly tested)
mixin that implements the behavior:

import inspect
import types

class ExtensionMethodMixin:
    def __getattr__(self, attr):
        parent_frame = inspect.currentframe().f_back
        if parent_frame:
            try:
                func = parent_frame.f_locals[attr]
            except KeyError:
                func = parent_frame.f_globals.get(attr)
            if callable(func):
                try:
                    __get__ = func.__get__
                except AttributeError:
                    return types.MethodType(func, self)
                else:
                    return __get__(self, type(self))
        return super().__getattr__(attr)
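To see that fallback in action, here is a small self-contained sketch. It restates the mixin's lookup in condensed form (it always wraps with types.MethodType instead of going through __get__, and raises AttributeError directly), and the Celsius class and the to_fahrenheit and describe functions are made up for illustration.

```python
import inspect
import types


class ExtensionMethodMixin:
    """Condensed restatement of the mixin, so this sketch runs on its own."""

    def __getattr__(self, attr):
        # __getattr__ only runs after normal attribute lookup has failed.
        caller = inspect.currentframe().f_back
        func = None
        if caller is not None:
            try:
                func = caller.f_locals[attr]
            except KeyError:
                func = caller.f_globals.get(attr)
        if callable(func):
            return types.MethodType(func, self)
        raise AttributeError(attr)


class Celsius(ExtensionMethodMixin):  # hypothetical example class
    def __init__(self, degrees):
        self.degrees = degrees

    def describe(self):
        return f"{self.degrees} C"


def to_fahrenheit(temp):  # plain function, picked up as an "extension method"
    return temp.degrees * 9 / 5 + 32


def describe(temp):  # shadowed: the real method is found first
    return "function version"


def demo():
    boiling = Celsius(100)
    return boiling.to_fahrenheit(), boiling.describe()


fahrenheit, description = demo()
print(fahrenheit, description)
```

Because the lookup uses the frame that performed the attribute access, the same expression can succeed in one module and fail in another, which is the scoping concern raised above.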
 
Gregory Ewing

Ian said:
It's a nice feature in a statically typed language, but I'm not sure
how well it would work in a language as dynamic as Python.

Also it doesn't sit well with Python's "one obvious
way to do it" guideline, because it means there are
*two* equally obvious ways to call a function.
 
jongiddy

Thanks for the extensive feedback. Here are my thoughts on how to address these issues.

It's a nice feature in a statically typed language, but I'm not sure
how well it would work in a language as dynamic as Python. There are
some questions that would need to be addressed.

1) Where should the function (or perhaps callable) be looked for? The
most obvious place is the global scope. I think it would be a bit too
far-reaching and inconsistent with other language features to reach
directly inside imported modules (not to mention that it could easily
get to be far too slow in a module with lots of imports). As a result
it would have to be imported using the "from module import function"
syntax, rather than the somewhat cleaner "import module" syntax.

While there's nothing wrong with such imports, I'm not sure I like the
thought of the language encouraging them any more than necessary.

It would only work on functions in scope. x.len() would only work if len(x) would work. I actually think this would work better in Python than in D. In D, "import module;" imports all the symbols from the module, so it is easier to invoke a function unexpectedly. In Python, "import module" does not fill the namespace with lots of callable symbols, so UFCS would generally work with built-ins, local functions, or functions explicitly imported with "from module import...". In this case, the need to use the "from module import fname" form can document that something unusual is happening.

2) What about getattr and hasattr? If I call hasattr(x,
"some_method"), and x has no such attribute, but there is a function
in the global scope named "some_method", should it return True?
If we instead have hasattr return False though, and have getattr raise
an exception, then we have this very magical and confusing
circumstance where getattr(x, 'method') raises an exception but
x.method does not. So I don't think that's really a good scenario
either.

As you suggest, the preferable route is that hasattr should return False. The object clearly does not have that attribute. It is a property of the current module that the object can use "instance.fname". The behaviour where hasattr(instance, "fname") returns False but instance.fname works is an exception, but a function could be added to test for this quickly, so new code that cares could use:
if hasattr(instance, "fname") or inscopecallable('fname'):
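No function called inscopecallable exists in Python today. A minimal sketch of what it might look like, assuming it inspects the caller's local and global namespaces and then the builtins, could be:

```python
import builtins
import inspect


def inscopecallable(name):
    """Return True if `name` refers to a callable visible from the caller.

    Hypothetical helper: checks the caller's locals, then its globals,
    then the builtins, and reports whether the first match is callable.
    """
    caller = inspect.currentframe().f_back
    namespaces = []
    if caller is not None:
        namespaces += [caller.f_locals, caller.f_globals]
    namespaces.append(vars(builtins))
    for namespace in namespaces:
        if name in namespace:
            return callable(namespace[name])
    return False


def squared(x):
    return x * x


class Plain:
    pass


instance = Plain()
usable = hasattr(instance, "squared") or inscopecallable("squared")
print(usable)
```

Like the check it accompanies, the answer depends on where the call is made, not only on the object.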

The bigger problem I find is reading other code that uses UFCS and not realising that a "method" is not actually a method of the class, but requires importing a module. That can cause confusion when trying to use it in your own code. However, the need to use "from module import fname" would at least link the method name and the module.

Also the idea makes me nervous in the thought that an incorrect
attribute access could accidentally and somewhat randomly pick up some
object from the environment.

As before, I think the limited number of strange callable objects in most modules in Python protects against this. Of course, "from module import *" might cause problems, but that is already true. You need to be extra careful doing this, and should only do it for modules when you have a reasonable understanding of their exported names.

But if you want to experiment with the idea, here's a (lightly tested)
mixin that implements the behavior:

Thanks for the head start! I'll need to read up on descriptors to understand that last bit fully (when a function has a __get__ method).

One problem with your untested code: the superclasses would need to be checked before using UFCS, so the structure is:

try:
    return super().__getattr__(attr)
except AttributeError:
    # resolve using UFCS
 
jongiddy

Also it doesn't sit well with Python's "one obvious
way to do it" guideline, because it means there are
*two* equally obvious ways to call a function.

This provides a way to do something new (add class-optimized implementations for existing general-purpose functions). It also adds significant readability improvements by putting function-call chains in order.
 
jongiddy

Also it doesn't sit well with Python's "one obvious
way to do it" guideline, because it means there are
*two* equally obvious ways to call a function.

Actually, one of the best arguments against introducing UFCS is that Python currently provides two equivalent ways to check if an instance has an attribute: ask-permission using hasattr and ask-forgiveness using AttributeError.

On the negative side, these currently equivalent (aside from performance) techniques could give different results using UFCS, potentially breaking some code.

On the positive side, that means the proposal would add one "two ways to do something" and eliminate another "two ways to do something", giving a net Zen of Python effect of zero.
 
Paul Sokolovsky

Hello,

Thanks for the extensive feedback. Here's my thoughts on how to
address these issues.



It would only work on functions in scope. x.len() would only work if
len(x) would work.

In other words, you propose adding yet another check to each function
call. But what many people have to say about Python is that it's "slow".
We should be looking for ways to make it faster, not slower still.


[]
The bigger problem I find is reading other code that uses UFCS and
not realising that a "method" is not actually a method of the class,
but requires importing a module. That can cause confusion when
trying to use it in your own code.

Indeed, this UFCS idea adds inefficiency and confusion, but doesn't
appear to solve any reasonable problem or add any firm benefit.
 
Paul Sokolovsky

Hello,

This provides a way to do something new (add class-optimized
implementations for existing general-purpose functions).

Python already has that - like, len(x) calls x.__len__() if it's
defined (for objects where it makes sense for it to be defined). Many
builtin functions have such behavior. For your custom functions, you
can add similar conventions and functionality very easily (if you'll
want to apply it to "not your" types, you'll need to subclass them,
as expected).
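A minimal sketch of such a convention for a custom function follows. The hook name __squared__ and the UnitVector class are invented for illustration; Python itself defines no such special method.

```python
def squared(x):
    """Generic entry point: use a type-specific hook if present, else x * x."""
    # Look the hook up on the type, the way len() finds __len__ on the type.
    hook = getattr(type(x), "__squared__", None)
    if hook is not None:
        return hook(x)
    return x * x


class UnitVector:
    """Hypothetical type whose square (dot product with itself) is always 1."""

    def __squared__(self):
        # Class-specific shortcut: no multiplication needed.
        return 1


print(squared(3))             # generic path
print(squared(UnitVector()))  # specialised path
```

Callers keep writing squared(x) in every case, so the interface stays in one place and only the implementation varies by type.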

Getting x.foo() to call foo(x) is the bigger problem, which has
serious performance and scoping confusion implications, as discussed in
other mails.

It also adds
significant readability improvements by putting function-call chains
in order.


Not sure what exactly you mean, but the order is usually pretty obvious
- Python follows mathematical notation for function calls, and OO
standard notation for method calls, the one known from primary school,
the other from secondary (hopefully). They can be reordered with
parentheses, which is also a well-known basic math technique.
 
Roy Smith

jongiddy said:
Actually, one of the best arguments against introducing UFCS is that Python
currently provides two equivalent ways to check if an instance has an
attribute: ask-permission using hasattr and ask-forgiveness using
AttributeError.

On the negative side, these currently equivalent (aside from performance)
techniques could give different results using UFCS, potentially breaking some
code.

Why? I assume a language which promoted the global namespace to be in
the attribute search path (which, as far as I can tell, is what we're
talking about here) would implement hasattr and raising AttributeError
in a consistent way.
 
jongiddy

Why? I assume a language which promoted the global namespace to be in
the attribute search path (which, as far as I can tell, is what we're
talking about here) would implement hasattr and raising AttributeError
in a consistent way.

It's slightly different. Although I used len() as an example, the idea is to allow any function to be used in this way, including local symbols.

e.g. I could define:

def squared(x):
    return x * x

i = 3
i.squared()  # => 9

j = AClassThatImplements__mul__()
j.squared()  # => whatever j * j returns

but also:

class AnotherClass:
    def __mul__(self, other):
        ...
    def squared(self):
        return specialised_method_for_calculating_squares()

k = AnotherClass()
k.squared()  # => calls method, not function

In this case, there is a problem with letting hasattr('squared') return True for these first two instances. See Ian's post for a description of the problem.
 
Marko Rauhamaa

Paul Sokolovsky said:
Python already has that - like, len(x) calls x.__len__() if it's
defined

In fact, what's the point of having the duality?

len(x) <==> x.__len__()

x < y <==> x.__lt__(y)

str(x) <==> x.__str__()

etc.

I suppose the principal reason is that people don't like UFCS. Plus some
legacy from Python1 days.

Lisp & co. rigorously follow its UFCS. I think it works great, but that
is what people most ridicule Lisp for.

What do you think? Would you rather write/read:

if size + len(data) >= limit:

or UFCS-ly:

if size.__add__(data.__len__()).__ge__(limit):


Marko
 
jongiddy

Getting x.foo() to call foo(x) is what's bigger problem, which has
serious performance and scoping confusion implications, as discussed in
other mails.

The performance hit will only occur when the attribute access is about to throw an AttributeError. Successful attribute accesses would be just as fast as before. And the cost of a symbol lookup is usually considered cheap compared to a thrown exception, so I don't believe there is a serious performance implication.

As to the scoping confusion, I repeat that Python benefits from the fact that most modules will only have the builtins and local functions to worry about. This is a small enough space for users to manage. There are no surprises waiting to occur when the user adds or removes normal imports (a problem that can occur in D).

Not sure what exactly you mean, but the order is usually pretty obvious
- Python follows mathematical notation for function calls, and OO
standard notation for method calls, one known from primary school,
another from secondary (hopefully). They can be reordered with
parentheses, which is also well-known basic math technique.

A contrived example - which of these is easier to understand?

from base64 import b64encode

# works now
print(b64encode(str(min(map(int, f.readlines()), key=lambda n: n % 10)), b'?-'))

# would work with UFCS
f.readlines().map(int).min(key=lambda n: n % 10).str().b64encode(b'?-').print()

You can read the second form left to right, and arguments like b64encode's b'?-' are near the function call, making it a lot more obvious with which function this obscure argument is used.

Note, I'm not suggesting either of these examples is good programming, but the same problem does occur in more reasonable scenarios - I just made this example a little extreme to emphasise the readability benefits.
 
jongiddy

# would work with UFCS
f.readlines().map(int).min(key=lambda n: n % 10).str().b64encode(b'?-').print()

Oops - map is the wrong way round to support UFCS in this case. However, with UFCS, I could fix this by changing it to smap, and defining:

def smap(seq, func):
    return map(func, seq)
 
Ian Kelly

In fact, what's the point of having the duality?

len(x) <==> x.__len__()

x < y <==> x.__lt__(y)

str(x) <==> x.__str__()

Python prefers having functions for operations that are common to a
lot of types rather than methods. This allows for consistency of
interface -- think of len() as the interface and .__len__() as the
implementation. If .len() were the interface then it would be easy
(and probably all too common) for Python programmers to change those
interfaces in subclasses. It also means that if you want to pass the
len function itself around, you just pass around len and know that it
will work generally -- instead of passing around list.len and hoping
that whatever it gets applied to is a list.
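A small sketch of that difference: Python has no list.len, so list.__len__ stands in here for a type-specific implementation, and apply_to_all is a made-up helper for illustration.

```python
def apply_to_all(func, values):
    """Hypothetical helper: call func on each value and collect the results."""
    return [func(value) for value in values]


items = [[1, 2, 3], "ab", (1,), {"k": "v"}]

# Passing the interface works for every sized object.
sizes = apply_to_all(len, items)

# Passing one type's implementation fails as soon as a non-list shows up.
try:
    apply_to_all(list.__len__, items)
except TypeError as exc:
    failure = exc
else:
    failure = None

print(sizes)
print(type(failure).__name__)
```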

This is a fair point against UFCS -- if x.len() comes to mean len(x)
then it both makes it easy to change that interface (at least for the
x.len() spelling) and makes it easier to pass around the function's
implementation rather than its interface.

What do you think? Would you rather write/read:

if size + len(data) >= limit:

or UFCS-ly:

if size.__add__(data.__len__()).__ge__(limit):

You may be misunderstanding the proposal. The UFCS style of that would be:

if size + data.len() >= limit:
 
Paul Sokolovsky

Hello,

In fact, what's the point of having the duality?

len(x) <==> x.__len__()

x < y <==> x.__lt__(y)

str(x) <==> x.__str__()

etc.

I suppose the principal reason is that people don't like UFCS. Plus
some legacy from Python1 days.

I personally don't see it as "duality". There are a few generic operators -
the fact that they are really generic (apply to widely different classes
of objects) is exactly the reason why they're defined in the global
namespace, and not as methods. And yep, I see things like "len" as
essentially an operator, even though its name consists of letters, and
it has function call syntax.

Then, there's just a way to overload these operators for user types,
that's it. You *can* use x.__len__() but that's not how Python intends
it.

And like with any idea, one should not forget the implementation side and
efficiency - these operators are really core and expected to be used in
performance-tight contexts, so they are implemented specially
(optimized). Extending that handling to any function would cost either
high memory usage, or high runtime cost.

Lisp & co. rigorously follow its UFCS. I think it works great, but
that is what people most ridicule Lisp for.

Exactly my thinking - there's a bunch of languages which follow that
UFCS-like idea, likely most homoiconic (or -like) ones do. Or you can use
plain old C ;-). So, I don't see why people want to stuff this into
Python - there are a lot of ready alternatives. And Python provides a very
intuitive and obvious separation between generic functions and object
methods IMHO, so there's nothing to "fix".

What do you think? Would you rather write/read:

if size + len(data) >= limit:

"How else could it be?"
 
Chris Angelico

e.g. I could define:

def squared(x):
    return x * x

i = 3
i.squared()  # => 9

j = AClassThatImplements__mul__()
j.squared()  # => whatever j * j returns

but also:

class AnotherClass:
    def __mul__(self, other):
        ...
    def squared(self):
        return specialised_method_for_calculating_squares()

k = AnotherClass()
k.squared()  # => calls method, not function

In this case, there is a problem with letting hasattr('squared') return True for these first two instances. See Ian's post for a description of the problem.

class Circle:
    def squared(self):
        raise NotImplementedError("Proven impossible in 1882")

The trouble is that logically Circle does have a 'squared' attribute,
while 3 doesn't; and yet Python guarantees this:

foo.squared()
# is equivalent [1] to
func = foo.squared
func()

Which means that for (3).squared() to be 9, it has to be possible to
evaluate (3).squared, which means that hasattr (which is defined by
attempting to get the attribute and seeing if an exception is thrown)
has to return True.
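Both facts can be checked in current Python with a short sketch. The Lazy class is a made-up stand-in for an object whose attribute is produced on demand, and has_attribute is a hand-written imitation of hasattr's behaviour, not its actual implementation.

```python
class Lazy:
    """Hypothetical class that synthesises a 'squared' attribute on demand."""

    def __getattr__(self, name):
        if name == "squared":
            return lambda: 9
        raise AttributeError(name)


def has_attribute(obj, name):
    """Imitation of hasattr: try to get the attribute, watch for the error."""
    try:
        getattr(obj, name)
    except AttributeError:
        return False
    return True


foo = Lazy()

direct = foo.squared()   # one-step call
func = foo.squared       # the same call split into lookup...
indirect = func()        # ...and invocation

print(direct == indirect)
print(hasattr(foo, "squared"), has_attribute(foo, "squared"))
print(hasattr(foo, "cubed"), has_attribute(foo, "cubed"))
```

Anything that makes foo.squared evaluate successfully therefore also makes hasattr(foo, "squared") true, unless hasattr is given different rules from ordinary attribute access.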

Except that it's even more complicated than that, because hasattr
wasn't defined in your module, so it has a different set of globals.
In fact, this would mean that hasattr would become quite useless.
(Hmm, PEP 463 might become a prerequisite of your proposal...) It also
means that attribute lookup becomes extremely surprising any time the
globals change; currently, "x.y" means exactly the same thing for any
given object x and attribute y, no matter where you do it.

The only way I can think of for all this to make sense is actually
doing it the other way around. Instead of having x.y() fall back on
y(x), have y(x) attempt x.y() first. To pull this off, you'd need a
special bouncer around every global or builtin... which may be tricky.

class MagicDict(dict):
    def __getitem__(self, item):
        # If this throws, let the exception propagate
        obj = super().__getitem__(item)
        if not callable(obj): return obj
        def bouncer(*a, **kw):
            if len(a)==1 and not kw:
                try: return getattr(a[0], item)()
                except AttributeError: pass
            return obj(*a, **kw)
        return bouncer
import __main__
# Except that this bit doesn't work.
__main__.__dict__ = MagicDict(__main__.__dict__)

It's theoretically possible, along these lines, I think. Whether it's
actually any good or not is another question, though!

ChrisA

[1] Modulo performance. CPython, AFAIK, does this exactly as written,
but other Pythons may and do optimize the actual "foo.squared()" form
to reduce heap usage. But in terms of visible effects, equivalent.
 
Ian Kelly

A contrived example - which of these is easier to understand?

from base64 import b64encode

# works now
print(b64encode(str(min(map(int, f.readlines()), key=lambda n: n % 10)), b'?-'))

# would work with UFCS
f.readlines().map(int).min(key=lambda n: n % 10).str().b64encode(b'?-').print()

I prefer not making it a one-liner:

data = map(int, f.readlines())
min_data = min(data, key=lambda n: n % 10)
print(b64encode(str(min_data), b'?-'))

Python's standard of having in-place methods return None also forces
this to an extent. Whenever you want to tack on something like
.append(), that's the end of your chain and it's time to start a new
line anyway. Of course, you could always define something like:

def appended(iterable, x):
    result = list(iterable)
    result.append(x)
    return result

and use that in your chain.
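A short sketch of the contrast, using ordinary nested calls since UFCS chaining is not available; the sample values are arbitrary.

```python
def appended(iterable, x):
    """Return a new list with x added, leaving the input untouched."""
    result = list(iterable)
    result.append(x)
    return result


numbers = [3, 1, 2]

# The in-place method mutates the list and returns None,
# so nothing further can be chained onto it.
in_place = numbers.append(4)

# The copying helper returns the new list, so it can feed another call.
chained = sorted(appended((3, 1, 2), 4))

print(in_place, numbers, chained)
```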
 
Ian Kelly

Except that it's even more complicated than that, because hasattr
wasn't defined in your module, so it has a different set of globals.
In fact, this would mean that hasattr would become quite useless.

hasattr is a builtin, so it has no globals at all. It would have to
use the calling scope for UFCS resolution as in my example
implementation.
 
Chris Angelico

A contrived example - which of these is easier to understand?

from base64 import b64encode

# works now
print(b64encode(str(min(map(int, f.readlines()), key=lambda n: n % 10)), b'?-'))

# would work with UFCS
f.readlines().map(int).min(key=lambda n: n % 10).str().b64encode(b'?-').print()

You can read the second form left to right

Actually, this is something that I've run into sometimes. I can't
think of any Python examples, partly because Python tends to avoid
unnecessary method chaining, but the notion of "data flow" is a very
clean one - look at shell piping, for instance. Only slightly
contrived example:

cat foo*.txt | gzip | ssh other_server 'gunzip | foo_analyze'

The data flows from left to right, even though part of the data flow
is on a different computer.

A programming example might come from Pike's image library [1]. This
definitely isn't what you'd normally call good code, but sometimes I'm
working at the interactive prompt and I do something as a one-liner.
It might look like this:

Stdio.write_file("foo.png",Image.PNG.encode(Image.JPEG.decode(Stdio.read_file("foo.jpg")).autocrop().rotate(0.5).grey()));

With UFCS, that could become perfect data flow:

read_file("foo.jpg").JPEG_decode().autocrop().rotate(0.5).grey().PNG_encode().write_file("foo.png");

I had to solve the syntactic ambiguity here by importing all the
appropriate names, which does damage readability a bit. But you should
be able to figure out what this is doing, with only minimal glancing
at the docs (eg to find out that rotate(0.5) is rotating by half a
degree).

So the proposal does have some merit, in terms of final syntactic
readability gain. The problem is the internal ambiguity along the way.

ChrisA

[1] http://pike.lysator.liu.se/generated/manual/modref/ex/predef_3A_3A/Image/Image.html
 
Chris Angelico

hasattr is a builtin, so it has no globals at all. It would have to
use the calling scope for UFCS resolution as in my example
implementation.

Same difference. It can't simply look for the name in globals(), it
has to figure out based on the caller's globals.

ChrisA
 
