ptype propagation on numpy functions #22247

Closed
wants to merge 3 commits into from

Conversation

@dylanbespalko (Contributor) commented Jun 26, 2019

Fixes issue #17249

Added support for Python subclass (ptype) propagation when calling numpy functions on Tensors (Parameters are unaffected).

I have addressed and responded to the comments on the previous PR, #18610. Please see that PR for the issues that were resolved. As suggested, the changes have been broken into two PRs:

  1. ptype propagation on torch functions (#22235)
  2. ptype propagation on numpy functions (this PR)

A simple python script demonstrating ptype propagation is as follows:

from collections import OrderedDict

import numpy as np

import torch


def _rebuild_subclass(type_, data, requires_grad, backward_hooks):
    param = type_(data, requires_grad)
    # NB: This line exists only for backwards compatibility; the
    # general expectation is that backward_hooks is an empty
    # OrderedDict.  See Note [Don't serialize hooks]
    param._backward_hooks = backward_hooks

    return param


class TensorSubclass(torch.Tensor):

    def __new__(cls, data=None, requires_grad=False):
        if data is None:
            data = torch.Tensor()
        self = torch.Tensor._make_subclass(cls, data, requires_grad)
        return self

    def __deepcopy__(self, memo):
        if id(self) in memo:
            return memo[id(self)]
        else:
            result = type(self)(self.data.clone(), self.requires_grad)
            memo[id(self)] = result
            return result

    def __reduce_ex__(self, proto):
        # See Note [Don't serialize hooks]
        return _rebuild_subclass, (self.__class__, self.data, self.requires_grad, OrderedDict())


class A(TensorSubclass):
    pass


class B(TensorSubclass):
    pass


class C(A, B):
    pass


if __name__ == '__main__':
    a_tensor = torch.tensor([1.0, 2.0, 3.0], dtype=torch.double, requires_grad=True)
    b_tensor = torch.tensor([4.0, 5.0, 6.0], dtype=torch.double, requires_grad=True)
    c_tensor = torch.tensor([7.0, 8.0, 9.0], dtype=torch.double, requires_grad=True)
    d_tensor = torch.ones((4, 3, 2), dtype=torch.double, requires_grad=True)
    a = A(a_tensor, requires_grad=True)
    b = B(b_tensor, requires_grad=True)
    c = C(c_tensor, requires_grad=True)
    d = C(d_tensor, requires_grad=True)

    print("Numpy ptype propagation")
    print(type(np.add(a, b)))
    print(np.add(a, b).requires_grad)
    print(type(np.add(a, c)))
    print(np.add(a, c).requires_grad)

    print("Numpy ptype propagation, out_args")
    print(type(np.add(a, b, out=a)))
    print(np.add(a, b, out=a).requires_grad)
    print(type(np.add(a, c, out=a)))
    print(np.add(a, c, out=a).requires_grad)

@pytorchbot added labels module: autograd, module: internals, module: nn, module: numpy, module: operators, module: pybind on Jun 26, 2019
@yf225 requested a review from gchanan June 26, 2019 02:15
@yf225 added the triaged label Jun 26, 2019
@yf225 requested a review from ezyang June 26, 2019 02:18
@dylanbespalko (Contributor, Author)

Looks like some builds failed (likely Python 2 compatibility issues). I'll get this fixed and submit another PR.

@colesbury (Member)

I am not convinced that this is an API behavior that we should commit to supporting. I'm also concerned about the performance implications of THPVariable_result_ptype.

@ezyang changed the title from "Added support for python subclass propagation numpy functions to fix #17249" to "ptype propagation on numpy functions" Jun 26, 2019
@ezyang (Contributor) commented Jun 26, 2019

> I am not convinced that this is an API behavior that we should commit to supporting

The actual diff here is a little confused, because it's mixed up with changes from #22235 which I definitely agree need some discussion about what exactly we want to support.

However, the goal of this patch is just to make it so that you can pass a PyTorch tensor straight to numpy operations, and get out (non-backwards-supporting) PyTorch tensors at the end. This seems like a pretty harmless affordance that end users might like. Do you have an objection to this bit?

@ezyang (Contributor) commented Jun 26, 2019

@dylanbespalko I'm not sure I can properly review this PR if it also contains changes from #22235. Can these two PRs be set up so that they don't have any dependency on each other?

@rgommers (Collaborator) commented Jun 27, 2019

Hi, NumPy maintainer here. Interesting PR, thanks for the ping @ezyang. I'll just make some high-level comments for now:

> the goal of this patch is just to make it so that you can pass a PyTorch tensor straight to numpy operations, and get out (non-backwards-supporting) PyTorch tensors at the end. This seems like a pretty harmless affordance that end users might like.

A variation of this works for ufuncs with __array_ufunc__ (new in numpy 1.13), and for other numpy functions with __array_function__ (new in numpy 1.17). CuPy and Dask already support both; it would indeed be nice for PyTorch to support these protocols too. However, I don't think this PR provides the full answer. In general there are behavior differences and API mismatches between NumPy and PyTorch, and this PR doesn't really deal with those, so for now things will only work where function names and parameters/keywords match. It would be useful to document explicitly what is expected to work and what isn't.

More importantly, the logic in this PR seems exactly reversed from what it should be. This PR does something like:

  1. Create ndarray instance out of Tensor instance.
  2. Call numpy ufunc with that ndarray
  3. Transform the result back into Tensor (or a Tensor subclass)

It should be:

  1. Pass Tensor instance to numpy ufunc (e.g. numpy.abs)
  2. numpy detects the presence of Tensor.__array_ufunc__ and delegates execution to it.
  3. The Tensor.__array_ufunc__ implementation then should forward that function call to the right implementation (torch.abs or Tensor.abs).

The whole idea of the NumPy array protocols is to allow using the NumPy API with minimal overhead, so users can write generic functions with only numpy calls in them, and those then work for ndarrays as well as for other array-like objects like torch.Tensor. There is not supposed to be any coercion to/from ndarrays (that's very expensive).
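
For reference, a minimal sketch (not part of this PR) of the dispatch direction described above, where numpy hands the ufunc call back to torch via __array_ufunc__. The HANDLED_UFUNCS table and DispatchTensor class are made-up names, and the numpy-to-torch equivalences are assumed:

import numpy as np
import torch

# Map a few numpy ufuncs onto torch functions with (assumed) matching semantics.
HANDLED_UFUNCS = {
    np.add: torch.add,
    np.multiply: torch.mul,
    np.absolute: torch.abs,
}

class DispatchTensor(torch.Tensor):
    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        if method != "__call__" or ufunc not in HANDLED_UFUNCS or kwargs:
            return NotImplemented  # defer anything this sketch doesn't handle
        # Forward straight to torch; no coercion to ndarray happens on the way.
        return HANDLED_UFUNCS[ufunc](*inputs)

t = torch.Tensor._make_subclass(DispatchTensor, torch.tensor([1.0, 2.0, 3.0]))
print(np.add(t, t))       # dispatched through __array_ufunc__ to torch.add
print(np.multiply(t, t))  # dispatched through __array_ufunc__ to torch.mul

Because numpy detects the protocol before coercing its arguments, the Tensor never leaves torch; the table only needs to cover the ufuncs whose torch counterparts already agree on arguments.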

@dylanbespalko (Contributor, Author)

@rgommers, @ezyang,

Thanks for your much needed perspective on Numpy architecture. I think __numpy_ufunc__ was renamed to __array_ufunc__ when released in v1.13. As for __array_function__, I had not heard of this one and I've been reading NEP 18. It looks like a proposal for super-universal functions that perform cross-compatible operations on the following types:

  • GPU arrays (CuPy)
  • sparse arrays (scipy.sparse, pydata/sparse)
  • parallel arrays (Dask array)
  • numpy ndarrays
  • tensors (TensorFlow and PyTorch)

I'm someone who is more likely to apply this stuff, rather than design it. My original goal here was to create two functions:

  1. __tensor_ufunc__ to perform torch universal functions on tensors.
  2. __array_ufunc__ to perform numpy universal functions on tensors (by converting inputs to ndarray and outputs back to tensor).

Since these types are not related by inheritance, neither solution is indeed "universal" in that:

  1. torch.Tensor + np.NDArray will call torch.add
  2. np.NDArray + torch.Tensor will call numpy.add

__array_function__ looks like it would handle this case by maintaining a mapping between torch functions and numpy functions so that you could "redirect" all function calls to torch or numpy. This is well beyond my abilities, as it requires:

  1. A mapping between numpy and torch function names.
  2. A mapping between numpy and torch function arguments.

The latter problem was the bottleneck for me. I decided that I needed to maintain a 90/10 torch/numpy mix in my own code, because it is impossible to deal with compatibility issues on a daily basis. I'm sure that the age of distributed hardware will make these decisions very tough, so any compatibility between these frameworks would be highly appreciated.
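
For illustration, a rough sketch (not something this PR or PyTorch implements) of the kind of mapping described above, using NEP 18's __array_function__ protocol (numpy >= 1.17). TORCH_IMPLEMENTATIONS and FunctionTensor are made-up names, and only calls whose arguments already line up are covered:

import numpy as np
import torch

# numpy function -> torch implementation, for the easy cases where the
# argument conventions happen to match; the general mapping is the hard part.
TORCH_IMPLEMENTATIONS = {
    np.sum: torch.sum,
    np.concatenate: lambda arrays, axis=0: torch.cat(list(arrays), dim=axis),
}

class FunctionTensor(torch.Tensor):
    def __array_function__(self, func, types, args, kwargs):
        if func not in TORCH_IMPLEMENTATIONS:
            return NotImplemented
        if not all(issubclass(tp, torch.Tensor) for tp in types):
            return NotImplemented
        return TORCH_IMPLEMENTATIONS[func](*args, **kwargs)

t = torch.Tensor._make_subclass(FunctionTensor, torch.ones(2, 3))
print(np.sum(t))               # redirected to torch.sum
print(np.concatenate([t, t]))  # redirected to torch.cat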

@rgommers (Collaborator) commented Jun 27, 2019

> Thanks for your much needed perspective on Numpy architecture. I think __numpy_ufunc__ was renamed to __array_ufunc__ when released in v1.13.

Yes indeed, that was a typo; I've edited my comment.

> __tensor_ufunc__ to perform torch universal functions on tensors.

The question this raises is why you need a protocol for that. __array_ufunc__ is needed for things that are not ndarrays (we call them "duck arrays" for now, as in duck-typing arrays). So I'd expect __tensor_ufunc__ to be for things that are not Tensors or Tensor subclasses, i.e. using the PyTorch API but dispatching to outside torch (is there a need for that?).

> __array_ufunc__ to perform numpy universal functions on tensors (by converting inputs to ndarray and outputs back to tensor).

Okay then I understood your PR right. I think it's just not something that you should want - there's likely a better solution.

> Since these types are not related by inheritance, neither solution is indeed "universal" in that:
>
>   1. torch.Tensor + np.NDArray will call torch.add
>   2. np.NDArray + torch.Tensor will call numpy.add

Mixing types of arrays/tensors like that is going to be inherently messy and inefficient. You're probably better off converting explicitly.

@ezyang (Contributor) commented Jun 27, 2019

> There is not supposed to be any coercion to/from ndarrays (that's very expensive).

Just a quick clarification: an ndarray can be converted into a Tensor without necessitating a data copy. So it's not as expensive as you seem to imply here.
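
For CPU data the round trip is indeed copy-free with the existing public API. A small illustration (not code from this PR), using torch.from_numpy and Tensor.numpy:

import numpy as np
import torch

arr = np.arange(6, dtype=np.float64)
t = torch.from_numpy(arr)           # wraps the same buffer, no copy
t[0] = 42.0
print(arr[0])                       # 42.0 -- the ndarray sees the change
back = t.numpy()                    # zero-copy for CPU tensors as well
print(np.shares_memory(back, arr))  # True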

@rgommers (Collaborator)

> Just a quick clarification: an ndarray can be converted into a Tensor without necessitating a data copy. So it's not as expensive as you seem to imply here.

I was thinking mainly about tensors that live on the GPU. The __array_ufunc__ protocol as intended should work for those too without data copies, and they're probably the more interesting use case (cause the performance benefits are larger).

But yes, point taken: I should have said "potentially expensive".

@t-vi (Collaborator) commented Jul 7, 2019

I'm late to the party, but isn't the method @perone explores in http://blog.christianperone.com/2019/07/numpy-dispatcher-when-numpy-becomes-a-protocol-for-an-ecosystem/ a good way to implement many of the functions without needing much from PyTorch?

@rgommers (Collaborator) commented Jul 7, 2019

@t-vi indeed, and that's discussed in gh-22402 as you already discovered.

@gchanan removed their request for review June 3, 2020 15:53
@facebook-github-bot (Contributor)

Hi @dylanbespalko!

Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at cla@fb.com. Thanks!

@github-actions (Contributor)

Looks like this PR hasn't been updated in a while so we're going to go ahead and mark this as Stale.
Feel free to remove the Stale label if you feel this was a mistake.
If you are unable to remove the Stale label please contact a maintainer in order to do so.
If you want the bot to never mark this PR stale again, add the no-stale label.
Stale pull requests will automatically be closed after 30 days of inactivity.

@github-actions bot added the Stale label May 28, 2022
@github-actions bot closed this Jun 27, 2022