Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't type check most function bodies if ignoring errors #14150

Open
wants to merge 24 commits into
base: master
Choose a base branch
from

Conversation

JukkaL
Copy link
Collaborator

@JukkaL JukkaL commented Nov 19, 2022

If errors are ignored, type checking function bodies often can have no effect. Remove function bodies after parsing to speed up type checking.

Methods that define attributes have an externally visible effect even if errors are ignored. The body of any method that assigns to any attribute is preserved to deal with this (even if it doesn't actually define a new attribute). Most methods don't assign to an attribute, so stripping bodies is still effective for methods.

There are a couple of additional interesting things in the implementation:

  1. We need to know whether an abstract method has a trivial body (e.g. just ...) to check super() method calls. The approach here is to preserve such trivial bodies and treat them differently from no body at all.
  2. Stubgen analyzes the bodies of functions to e.g. infer some return types. As a workaround, explicitly preserve full ASTs when using stubgen.

The main benefit is faster type checking when using installed packages with inline type information (PEP 561). Errors are ignored in this case, and it's common to have a large number of third-party code to type check. For example, a self check (code under mypy/) is now about 20% faster, with a compiled mypy on Python 3.11.

Another, more subtle benefit is improved reliability. A third-party library may have some code that triggers a mypy crash or an invalid blocking error. If bodies are stripped, often the error will no longer be triggered, since the amount code to type check is much lower.

@github-actions

This comment has been minimized.

Copy link
Member

@ilevkivskyi ilevkivskyi left a comment

Nice! I like the idea, left couple comments (not a full review).

self.lvalue = False
self.found = False

def visit_assignment_stmt(self, s: AssignmentStmt) -> None:
Copy link
Member

@ilevkivskyi ilevkivskyi Nov 19, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you also need to check assignment expression? (i.e. walrus a.k.a :=)

Copy link
Collaborator Author

@JukkaL JukkaL Nov 19, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't support assigning to an attribute.

):
# We only strip method bodies if they don't assign to an attribute, as
# this may define an attribute which has an externally visible effect.
visitor = FindAttributeAssign()
Copy link
Member

@ilevkivskyi ilevkivskyi Nov 19, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be possible to strip statements after last assignment to attribute? This could make a bit more perf gain.

Copy link
Collaborator Author

@JukkaL JukkaL Nov 19, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may be possible! I'd rather not do it in this PR to keep it simple (and the impact is probably pretty minor), but it's seems like a promising follow-up improvement to investigate.

Copy link
Contributor

@cdce8p cdce8p left a comment

Got a chance to test the PR on a full run with Home Assistant (with all dependencies installed). The performance improvement is noticeable. Both with the uncompiled version on Python 3.9 - perviously: ~11min - now: ~7:30min 🚀

Unfortunately, I also got a few new error messages which the primer didn't pick up on.

homeassistant/components/nibe_heatpump/__init__.py:267: error: "Coroutine[Any, Any, AsyncIterator[Coil]]" has no attribute "__aiter__" (not async iterable)  [attr-defined]
homeassistant/components/nibe_heatpump/__init__.py:267: note: Maybe you forgot to use "await"?
homeassistant/components/google/calendar.py:350: error: "Coroutine[Any, Any, AsyncIterator[ListEventsResponse]]" has no attribute "__anext__"  [attr-defined]
homeassistant/components/google/calendar.py:350: note: Maybe you forgot to use "await"?
homeassistant/components/homekit_controller/config_flow.py:155: error: "Coroutine[Any, Any, AsyncIterable[AbstractDiscovery]]" has no attribute "__aiter__" (not async iterable)  [attr-defined]
homeassistant/components/homekit_controller/config_flow.py:155: note: Maybe you forgot to use "await"?
homeassistant/components/amcrest/__init__.py:209: error: Return type "_AsyncGeneratorContextManager[Response]" of "async_stream_command" incompatible with return type "_AsyncGeneratorContextManager[<nothing>]" in supertype "Http"  [override]
homeassistant/components/amcrest/__init__.py:214: error: Need type annotation for "ret"  [var-annotated]
Found 5 errors in 4 files (checked 5433 source files)

One of the edge cases seem to be async iterators with yield. To reproduce it, create two files

# a.py
from typing import AsyncIterator


class L:
    async def some_func(self, i: int) -> str:
        return str(i)

    async def get_iterator(self) -> AsyncIterator[str]:
        for i in range(5):
            yield await self.some_func(i)
# b.py
from a import L

async def func(l: L) -> None:
    reveal_type(l.get_iterator)
    async for i in l.get_iterator():
        print(i)

And run mypy b.py.

b.py:5: note: Revealed type is "def () -> typing.Coroutine[Any, Any, typing.AsyncIterator[builtins.str]]"
b.py:6: error: "Coroutine[Any, Any, AsyncIterator[str]]" has no attribute "__aiter__" (not async iterable)  [attr-defined]
b.py:6: note: Maybe you forgot to use "await"?

@cdce8p
Copy link
Contributor

cdce8p commented Nov 20, 2022

The second edge case is fairly similar, just with the addition of @asynccontextmanager.

# a.py
from contextlib import asynccontextmanager
from typing import AsyncIterator

class Parent:
    @asynccontextmanager
    async def async_func(self) -> AsyncIterator[str]:
        yield ''
# b.py
from contextlib import asynccontextmanager
from typing import AsyncIterator

from test8 import Parent

class Child(Parent):
    @asynccontextmanager
    async def async_func(self) -> AsyncIterator[str]:
        yield ''

Running mypy b.py

b.py:9: error: Return type "_AsyncGeneratorContextManager[str]" of "async_func" incompatible with return type "_AsyncGeneratorContextManager[<nothing>]" in supertype "Parent"  [override]

@JukkaL
Copy link
Collaborator Author

JukkaL commented Nov 21, 2022

@cdce8p Thanks for the detailed reports about regressions! I clearly need to investigate those cases.

JukkaL added 3 commits Dec 4, 2022
These can have an externally visible effect within an async function.
The existence of yield affects the inferred return type, and a yield
from generates a blocking error.
@github-actions
Copy link
Contributor

github-actions bot commented Dec 4, 2022

According to mypy_primer, this change has no effect on the checked open source code. 🤖🎉

@cdce8p
Copy link
Contributor

cdce8p commented Dec 4, 2022

@cdce8p Thanks for the detailed reports about regressions! I clearly need to investigate those cases.

Just tested the PR with Home Assistant again. The last changes resolve all issues 🎉

jhance
jhance approved these changes Dec 5, 2022
@JukkaL
Copy link
Collaborator Author

JukkaL commented Dec 8, 2022

There are some performance regressions on master that I'd like to address first before merging this. Otherwise the performance impact estimate might be inflated.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants