Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpo-41756: Introduce _PyGen_SendNoStopIteration to provide alternative way to return results from coroutines #22196

Merged
merged 8 commits into from Sep 19, 2020

Conversation

vladima
Copy link
Contributor

@vladima vladima commented Sep 11, 2020

Introduce _PyGen_SendNoStopIteration to allow returning values from async functions without raising StopIteration exception

https://bugs.python.org/issue41756

Copy link
Member

@1st1 1st1 left a comment

This is a great start. I copied a few other core devs to kick start a (hopefully) quick discussion on the API.

}
else {
/* Async generators cannot return anything but None */
assert(!PyAsyncGen_CheckExact(gen));
_PyGen_SetStopIterationValue(result);
if (is_return_value) {
Copy link
Member

@1st1 1st1 Sep 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (is_return_value) {
if (is_return_value != NULL) {

{
PyThreadState *tstate = _PyThreadState_GET();
PyFrameObject *f = gen->gi_frame;
PyObject *result;

Copy link
Member

@1st1 1st1 Sep 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change is unnecessary.

@@ -225,7 +224,7 @@ gen_send_ex(PyGenObject *gen, PyObject *arg, int exc, int closing)

/* If the generator just returned (as opposed to yielding), signal
* that the generator is exhausted. */
if (result && _PyFrameHasCompleted(f)) {
Copy link
Member

@1st1 1st1 Sep 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd keep the unrelated style fixes out of this PR.

@@ -235,13 +234,19 @@ gen_send_ex(PyGenObject *gen, PyObject *arg, int exc, int closing)
/* Set exception if not called by gen_iternext() */
PyErr_SetNone(PyExc_StopIteration);
}
Py_CLEAR(result);
Copy link
Member

@1st1 1st1 Sep 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If is_return_value is non-NULL, I'd expect us to never use StopIteration, even if it's a bare return or return None.

}

PyObject *
_PyGen_SendNoStopIteration(PyGenObject *gen, PyObject *arg, int *is_return_value)
Copy link
Member

@1st1 1st1 Sep 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd design this differently:

PyGenState PyGen_Send(PyGenObject *gen, PyObject *arg, PyObject **result);

where PyGetState is defined as follows:

typedef enum {PYGEN_RETURN, PYGEN_ERROR, PYGEN_NEXT} PyGetState;

I think we can make this a public API (remove the underscore) and declare the current _PyGen_Send deprecated. I can certainly see Cython and third-party async C modules benefiting from this new API.

cc @scoder @vstinner @asvetlov @ambv @markshannon

Copy link
Contributor

@scoder scoder Sep 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah… I think this is unprecedented in the C-API, but it feels better than the normal "return a PyObject*" behaviour for non-binary return states. The return state is the first thing you have to look at in most cases, even in some "pass-through delegate" use case. So reserving the function return value for it seems the most helpful mechanism.

Remember that PyIter_Next() is pretty much the same, though, and it has the standard "return PyObject*" interface. These two would diverge then.

Probably something to discuss on the bug tracker, more than this PR, right?

Copy link
Member

@markshannon markshannon Sep 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with Yuri. The status code should be the primary return value, and it should be an enum.

But I don't think it should be public, as a possible further improvement is to stop passing exceptions through a side channel, but in result. Maybe we don't want to do that, but lets' not add to the (already rather large) C-API.

This is for "coroutines" right, not generators? Please name it accordingly. I'd like to avoid further mixing of generator and coroutine code. It makes improvements to either harder.

_PyCoroSendResult  _PyCoro_Send(PyGenObject *gen, PyObject *value, PyObject **returned_value);

Try to avoid "state", as this is a transient value, not a state.

Copy link
Contributor

@scoder scoder Sep 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is for "coroutines" right, not generators?

Isn't coroutine the term for async coroutines? This is for sending data into generators, and coroutines use that mechanism, too. So PyGen_… seems appropriate. (Edit: I guess 'pure' generators don't accept input this way, so … what's the right term? :) )

Try to avoid "state", as this is a transient value, not a state.

Yury and I probably both meant "status" rather than "state".

I don't think it should be public

Public or not, Cython will definitely want to use this, and so would any native async library out there. So why force them to use something non-public?

Copy link
Member

@1st1 1st1 Sep 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yury and I probably both meant "status" rather than "state".

Agree, let's use "status".

Continuing the discussion in https://bugs.python.org/issue41756.

@markshannon
Copy link
Member

markshannon commented Sep 11, 2020

Please don't add a new C-API function.
The original issue was about the performance of pure Python code.
The C-API should be designed as an API for third-party code.
How the implementation works internally should be separate from that.

@vladima vladima closed this Sep 14, 2020
@vladima vladima reopened this Sep 14, 2020
@@ -2695,10 +2696,13 @@ task_step_impl(TaskObj *task, PyObject *exc)
}
}

if (result == NULL) {
if ((gen_status == PYGEN_RETURN) || result == NULL) {
Copy link
Member

@serhiy-storchaka serhiy-storchaka Sep 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it could be just gen_status != PYGEN_NEXT (or gen_status <= 0 if accept my proposition for returned values) if _PyGen_Send_NameTBD() was used.

if (PyGen_CheckExact(coro) || PyCoro_CheckExact(coro)) {
result = _PyGen_Send((PyGenObject*)coro, Py_None);
gen_status = _PyGen_Send_NameTBD((PyGenObject*)coro, Py_None, &result);
Copy link
Member

@serhiy-storchaka serhiy-storchaka Sep 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would not be better if _PyGen_Send_NameTBD() would call method send() internally for non-generator and non-coroutines?

if (result != NULL) {
o = result;
}
if (o || (_PyGen_FetchStopIterationValue(&o) == 0)) {
Copy link
Member

@serhiy-storchaka serhiy-storchaka Sep 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should not _PyGen_Send_NameTBD() handle this internally and automatically fetch the StopIteration value?

Copy link
Contributor Author

@vladima vladima Sep 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

StopIteration could still be raised by send case

*is_return_value = 1;
}
else {
/* Set exception if not called by gen_iternext() */
Copy link
Member

@serhiy-storchaka serhiy-storchaka Sep 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is not a caller responsible for raising StopIteration?

Copy link
Contributor Author

@vladima vladima Sep 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is used in multiple places and some of them currently assume exception will be raised on their behalf - updating them is a possibility but will cause more churn in the code

Copy link
Member

@1st1 1st1 Sep 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is used in multiple places and some of them currently assume exception will be raised on their behalf

Yes, let's not change that.

#define PYGEN_ERROR -1
#define PYGEN_NEXT 1

PyAPI_FUNC(int) PyGen_Send(PyGenObject *, PyObject *, PyObject **);
Copy link
Member

@1st1 1st1 Sep 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like you to add some docs here (as a comment) and to the actual C API docs. You'll also need to update refcounts.dat

Python/ceval.c Outdated
}
else {
_Py_IDENTIFIER(send);
if (v == Py_None)
Copy link
Member

@1st1 1st1 Sep 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add braces (that's what pep7 suggests)

@1st1
Copy link
Member

1st1 commented Sep 16, 2020

@vladima Did you try to run tests in refleak hunting mode?

1st1
1st1 approved these changes Sep 16, 2020
Copy link
Member

@1st1 1st1 left a comment

I'd let this sit for a couple of days to let other devs review but looks good overall. Pretty excited about this one.

@1st1
Copy link
Member

1st1 commented Sep 16, 2020

@markshannon
Copy link
Member

markshannon commented Sep 17, 2020

I recommend that you stop modifying this PR until the discussion on https://bugs.python.org/issue41756 has reached a conclusion. You'll save yourself a lot of rewriting.

@vladima vladima closed this Sep 18, 2020
@vladima vladima reopened this Sep 18, 2020
1st1
1st1 approved these changes Sep 18, 2020

.. c:function:: PySendResult PyGen_Send(PyGenObject *gen, PyObject *arg, PyObject **presult)

Sends the *arg* value into the generator *gen*. Returns:
Copy link
Member

@1st1 1st1 Sep 18, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please mention that it's possible to use this on native coroutine objects, with a cast.

@@ -15,6 +15,11 @@ than explicitly calling :c:func:`PyGen_New` or :c:func:`PyGen_NewWithQualName`.
The C structure used for generator objects.


.. c:type:: PySendResult

The enum value used to represent different results of :c:func:`PyGen_Send`
Copy link
Member

@1st1 1st1 Sep 18, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
The enum value used to represent different results of :c:func:`PyGen_Send`
The enum value used to represent different results of :c:func:`PyGen_Send`.

PYGEN_NEXT = 1,
} PySendResult;

/* Sends the value into the generator. Returns:
Copy link
Member

@1st1 1st1 Sep 18, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/* Sends the value into the generator. Returns:
/* Sends the value into the generator or the coroutine. Returns:

@@ -0,0 +1,2 @@
Add PyGen_Send function to allow sending value to generator without
Copy link
Member

@1st1 1st1 Sep 18, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Add PyGen_Send function to allow sending value to generator without
Add PyGen_Send function to allow sending value into generator/coroutine without

if (_PyGen_FetchStopIterationValue(result) == 0) {
return PYGEN_RETURN;
}
return PYGEN_ERROR;
Copy link
Member

@1st1 1st1 Sep 18, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd add an assert(PyErr_Occurred()); before the final return.

@1st1 1st1 merged commit 2b05361 into python:master Sep 19, 2020
3 checks passed
@1st1
Copy link
Member

1st1 commented Sep 19, 2020

Thanks, Vladimir! Also huge thanks to Mark, Serhiy, and Stefan. If there are any nits to fix let's do that in follow up PRs.

xzy3 pushed a commit to xzy3/cpython that referenced this pull request Oct 18, 2020
The new API allows to efficiently send values into native generators
and coroutines avoiding use of StopIteration exceptions to signal 
returns.

ceval loop now uses this method instead of the old "private"
_PyGen_Send C API. This translates to 1.6x increased performance
of 'await' calls in micro-benchmarks.

Aside from CPython core improvements, this new API will also allow 
Cython to generate more efficient code, benefiting high-performance
IO libraries like uvloop.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants