Initial tensorflow stubs #8974

hmc-cs-mdrissi · 2022-10-24T04:26:05Z

Summary

Initial tensorflow stubs to start with. These are very incomplete given that tensorflow's api is very large. I'm reviving this now as supporting non-types dependencies has made progress and is a requirement for these stubs as they strongly rely on numpy. Similar to opencv stubs this should not merge until non type dependencies work in CI.

These are all manually made based on tensorflow documentation, testing in repl, and usage in large codebase reliant on tensorflow. mypy stubgen last I tried, crashes on tensorflow's codebase and pyright works but produces a large number of false positives. Both mypy and pyright share a weakness in that tensorflow relies on a lot of dynamic module magic with imports. It uses a decorator(tf_export) to re-export class in a different module sometimes even though module may not exist by itself. A secondary issue is most tensorflow classes are defined in internal locations without any private prefixing, and re-exported elsewhere that's documented. And some core tensorflow classes rely on dynamic patching of methods. As an example tf.Tensor class one of the most commonly used types in tensorflow many of it's basic methods like __add__ and __mul__ are dynamically patched on.

I consider this ready for discussion as there are a couple questions I have before moving forward.

What would be best approach for review? I currently have roughly ~3k lines of incomplete stubs for tensorflow and this pr adds some of the basic pieces. Would you prefer a large pr with all I currently have or chop it up into dozen smaller prs of a few hundred lines each? I'm using getattr(name: str) -> Incomplete pattern to minimize false positives users would experience. If I do make a pr with full stubs I have currently I'll need to be careful of that as my current local stubs often missed that because at work I followed a philosophy of false positives are better then missing an error and fix stub as needed. A full version of what I have would look roughly like this pr.
Is it fine to leave default values for functions in stubs? My current stubs include default values and they are mostly int/str/None. I find them useful for IDE usage, but I'm ok dropping them if that's preferred for consistency with most other typeshed stubs. If including defaults is ok, where should I adjust mypy/pyright/etc settings to not produce an error of Default values in stub files should be specified as "..."
Tensor related classes probably should be generic over data type and shape. I'm avoiding that as both would add a lot more complexity to stubs. Shapes especially as typevartuple support is in progress and a lot of shape types may require further peps. DTypes are mostly possible now just may add a lot more overloads/verbosity to track. I could make class generic today but mostly use Any for the type arguments? Not sure on recommended practice here.

The files/classes added here are a couple of the most commonly used ones. If small pr approach is good I'll likely continue by filling in more of tensorflow/__init__.pyi and tensorflow/math.pyi followed by adding in other commonly used files.

hauntsaninja · 2022-10-24T04:46:45Z

You're missing a layer of nesting. It should be stubs/distribution_name/package_name. This explains many of the mypy_primer errors you're seeing.

For the questions you ask, my non-authoritative opinions are:

I'm mostly agnostic. If there's stuff that you expect to be controversial, I would separate it from stuff that is non-controversial. Note that while typeshed's philosophy is that false negatives are better than false positives, it's not the biggest deal if we have some false positives as we work through your PRs. As long as we do this quickly, it won't hurt users, since there are currently zero users of types-tensorflow.
I would prefer omitting default values from stubs.
Let's stick with the simple non-generic stuff for now.

github-actions · 2022-10-24T17:48:00Z

Diff from mypy_primer, showing the effect of this PR on open source code:

jax (https://github.com/google/jax)
- jax/collect_profile.py:25: error: Cannot find implementation or library stub for module named "tensorflow.python.profiler"  [import]
- jax/experimental/jax2tf/tests/model_harness.py:35: error: Cannot find implementation or library stub for module named "tensorflow"  [import]
- jax/experimental/jax2tf/examples/tf_js/quickdraw/quickdraw.py:30: error: Cannot find implementation or library stub for module named "tensorflow"  [import]
- jax/tools/jax_to_ir.py:86: error: Cannot find implementation or library stub for module named "tensorflow"  [import]
- jax/experimental/jax2tf/tests/converters.py:20: error: Cannot find implementation or library stub for module named "tensorflow"  [import]
+ jax/experimental/jax2tf/tests/call_tf_test.py:36: error: Incompatible types in assignment (expression has type "None", variable has type Module)  [assignment]

streamlit (https://github.com/streamlit/streamlit)
+ lib/tests/streamlit/runtime/caching/hashing_test.py:48: note: ... from here:
+ lib/tests/streamlit/runtime/caching/hashing_test.py:48: note: ... from here:
+ lib/tests/streamlit/runtime/caching/hashing_test.py:48: note: In module imported here:

gramster · 2022-10-25T00:34:43Z

I would prefer omitting default values from stubs.

@hauntsaninja, an advantage of including default values is that tools like pyright or pylance will use stubs (if available) for the signatures they show to users when hovering over a function call, and users want to see the default values. I'd like to understand your point of view though as to why you would prefer them removed?

hmc-cs-mdrissi · 2022-10-25T01:34:18Z

I took a look at primer hits.

The jax hit mypy's redefinition rule and they have code like,

try:
  import tensorflow
except:
  tensorflow = None

I'm less sure about what streamlit hit means about note ... here. The code they have is,

try:
    import tensorflow as tf
except ImportError:
    pass

The remaining errors look to be expected from lack of numpy in environment.

hauntsaninja · 2022-10-26T01:45:32Z

@gramster I'm personally okay in theory with default values in stubs if we restrict to simple literals, but it should be its own discussion — my comment here mostly just reflected the status quo, changing of which would need discussion. I also figured we're not really losing that much work on this PR: simple literal default values should be easy to automatically reintroduce to all non-overload stubs and we'd need to do so for all existing stubs anyway.

hauntsaninja · 2022-10-26T01:48:56Z

Anyway, opened #8988 to discuss default values

Initial tensorflow stubs

fc7a2c0

This comment has been minimized.

Sign in to view

Mehdi Drissi added 2 commits Oct 24, 2022

fix flake8 lints besides default values

29017d2

fix mypy primer

93d1db9

This comment has been minimized.

Sign in to view

Remove default values for consistency with typeshed.

46369ed

Initial tensorflow stubs #8974

Initial tensorflow stubs #8974

hmc-cs-mdrissi commented Oct 24, 2022 •

edited

This comment has been minimized.

hauntsaninja commented Oct 24, 2022

This comment has been minimized.

github-actions bot commented Oct 24, 2022

gramster commented Oct 25, 2022 •

edited

hmc-cs-mdrissi commented Oct 25, 2022

hauntsaninja commented Oct 26, 2022

hauntsaninja commented Oct 26, 2022

Initial tensorflow stubs #8974

Are you sure you want to change the base?

Initial tensorflow stubs #8974

Conversation

hmc-cs-mdrissi commented Oct 24, 2022 • edited

Summary

This comment has been minimized.

hauntsaninja commented Oct 24, 2022

This comment has been minimized.

github-actions bot commented Oct 24, 2022

gramster commented Oct 25, 2022 • edited

hmc-cs-mdrissi commented Oct 25, 2022

hauntsaninja commented Oct 26, 2022

hauntsaninja commented Oct 26, 2022

hmc-cs-mdrissi commented Oct 24, 2022 •

edited

gramster commented Oct 25, 2022 •

edited