Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

performance shortcut in functools.partial behaves differently in C and in Python version #100242

Open
cfbolz opened this issue Dec 14, 2022 · 1 comment
Assignees
Labels
stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@cfbolz
Copy link
Contributor

cfbolz commented Dec 14, 2022

Bug

functools.partial is implemented in functools.py and in _functoolsmodule.c. The former is almost never used, so libraries come to depend on the quirks and corner cases of the C implementation. This is a problem for PyPy, where the Python implementation is the only one as of the most recent PyPy version. Here's one such difference, which was uncovered by the lxml library. The following code leads to a RecursionError:

import sys
sys.modules['_functools'] = None # force use of pure python version, if this is commented out it works

from functools import partial

class Builder:
    def __call__(self, tag, *children, **attrib):
        return (tag, children, attrib)

    def __getattr__(self, tag):
        return partial(self, tag)

B = Builder()
m = B.m

this is the traceback:

Traceback (most recent call last):
  File "/home/cfbolz/projects/cpython/bug.py", line 14, in <module>
    m = B.m
        ^^^
  File "/home/cfbolz/projects/cpython/bug.py", line 11, in __getattr__
    return partial(self, tag)
           ^^^^^^^^^^^^^^^^^^
  File "/home/cfbolz/projects/cpython/Lib/functools.py", line 287, in __new__
    if hasattr(func, "func"):
       ^^^^^^^^^^^^^^^^^^^^^
  File "/home/cfbolz/projects/cpython/bug.py", line 11, in __getattr__
    return partial(self, tag)
           ^^^^^^^^^^^^^^^^^^
  File "/home/cfbolz/projects/cpython/Lib/functools.py", line 287, in __new__
    if hasattr(func, "func"):
       ^^^^^^^^^^^^^^^^^^^^^
... and repeated

The problem is the following performance shortcut in partial.__new__:

class partial:
    ...
    def __new__(cls, func, /, *args, **keywords):
        if not callable(func):
            raise TypeError("the first argument must be callable")

        if hasattr(func, "func"): # <------------------- problem
            args = func.args + args
            keywords = {**func.keywords, **keywords}
            func = func.func

Basically in this case func is an object where calling hasattr(func, "func") is not safe. The equivalent C code does this check:

    if (Py_TYPE(func)->tp_call == (ternaryfunc)partial_call) {
        // The type of "func" might not be exactly the same type object
        // as "type", but if it is called using partial_call, it must have the
        // same memory layout (fn, args and kw members).
        // We can use its underlying function directly and merge the arguments.
        partialobject *part = (partialobject *)func;

In particular, it does not simply call hasattr on func.

Real World Version

This is not an artificial problem, we discovered this via the class lxml.builder.ElementMaker. It has a __call__ method implemented. It also has __getattr__ that looks like this:

    def __getattr__(self, tag):
        return partial(self, tag)

Which yields the above RecursionError on PyPy.

Solution ideas

One approach would be to file a bug with lxml, but it is likely that more libraries depend on this behaviour. So I would suggest to change the __new__ Python code to add an isinstance check, to bring its behaviour closer to that of the C code:

    def __new__(cls, func, /, *args, **keywords):
        if not callable(func):
            raise TypeError("the first argument must be callable")

        if isinstance(func, partial) and hasattr(func, "func"):
            args = func.args + args
        ...

I'll open a PR with this approach soon. /cc @mgorny

Linked PRs

@mattip
Copy link
Contributor

mattip commented Dec 14, 2022

Note that I needed to modify the tests which assume the c-import worked:

# HG changeset patch
# Branch py3.9
# Node ID 4016ae64db38512857aea9987146e897f6b25ebd
# Parent  f5d3e9e0fa5cb1ee62ccd2867bcf5fe094d4eae8
fix tests for no c_functools

diff -r f5d3e9e0fa5c -r 4016ae64db38 lib-python/3/test/test_functools.py
--- a/lib-python/3/test/test_functools.py	Fri Dec 02 16:15:06 2022 +0200
+++ b/lib-python/3/test/test_functools.py	Sat Dec 03 10:42:01 2022 +0200
@@ -199,7 +199,9 @@
         kwargs = {'a': object(), 'b': object()}
         kwargs_reprs = ['a={a!r}, b={b!r}'.format_map(kwargs),
                         'b={b!r}, a={a!r}'.format_map(kwargs)]
-        if self.partial in (c_functools.partial, py_functools.partial):
+        if c_functools and self.partial is c_functools.partial:
+            name = 'functools.partial'
+        elif self.partial is py_functools.partial:
             name = 'functools.partial'
         else:
             name = self.partial.__name__
@@ -221,7 +223,9 @@
                        for kwargs_repr in kwargs_reprs])
 
     def test_recursive_repr(self):
-        if self.partial in (c_functools.partial, py_functools.partial):
+        if c_functools and self.partial is c_functools.partial:
+            name = 'functools.partial'
+        elif self.partial is py_functools.partial:
             name = 'functools.partial'
         else:
             name = self.partial.__name__
@@ -1753,9 +1757,10 @@
 def py_cached_func(x, y):
     return 3 * x + y
 
-@c_functools.lru_cache()
-def c_cached_func(x, y):
-    return 3 * x + y
+if c_functools:
+    @c_functools.lru_cache()
+    def c_cached_func(x, y):
+        return 3 * x + y
 
 
 class TestLRUPy(TestLRU, unittest.TestCase):
@@ -1772,18 +1777,20 @@
         return 3 * x + y
 
 
-class TestLRUC(TestLRU, unittest.TestCase):
-    module = c_functools
-    cached_func = c_cached_func,
-
-    @module.lru_cache()
-    def cached_meth(self, x, y):
-        return 3 * x + y
-
-    @staticmethod
-    @module.lru_cache()
-    def cached_staticmeth(x, y):
-        return 3 * x + y
+if c_functools:
+    @unittest.skipUnless(c_functools, 'requires the C _functools module')
+    class TestLRUC(TestLRU, unittest.TestCase):
+        module = c_functools
+        cached_func = c_cached_func,
+
+        @module.lru_cache()
+        def cached_meth(self, x, y):
+            return 3 * x + y
+
+        @staticmethod
+        @module.lru_cache()
+        def cached_staticmeth(x, y):
+            return 3 * x + y
 
 
 class TestSingleDispatch(unittest.TestCase):

@iritkatriel iritkatriel added the stdlib Python modules in the Lib dir label Nov 26, 2023
ambv pushed a commit that referenced this issue Apr 17, 2024
…h C code (GH-100244)

in partial.__new__, before checking for the existence of the attribute
'func', first check whether the argument is an instance of partial.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error
Projects
None yet
Development

No branches or pull requests

4 participants