Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PyErr_SetObject() behavior is strange and not as documented. #101578

Open
markshannon opened this issue Feb 5, 2023 · 25 comments · Fixed by #101607
Open

PyErr_SetObject() behavior is strange and not as documented. #101578

markshannon opened this issue Feb 5, 2023 · 25 comments · Fixed by #101607
Labels
3.7 (EOL) end of life 3.8 only security fixes 3.9 only security fixes 3.10 only security fixes 3.11 bug and security fixes 3.12 bugs and security fixes topic-C-API type-bug An unexpected behavior, bug, or error

Comments

@markshannon
Copy link
Member

markshannon commented Feb 5, 2023

Briefly:
PyErr_SetObject(exc_type, exc_val) does not create a new exception iff isinstance(exc_val, BaseException), but uses exc_val instead.

Callers of PyErr_SetObject() need various workarounds to handle this.

The long version:

Internally CPython handles exceptions as a triple (type, value, traceback), but the language treats exceptions as a single value.

This a legacy of the olden days before proper exceptions.
To handle adding proper exceptions to Python, various error handling functions, specifically _PyErr_SetObject still treat exceptions as triples, with the convention that if the value is an exception, then the exception is already normalized.

One other oddity is that if exc_val is a tuple, it is treated as the * arguments to exc_type when calling it. So, if isinstance(exc_val, BaseException) the desired behavior can be achieved by wrapping exc_val in a one-tuple.

As a consequence, both _PyErr_SetKeyError and _PyGen_SetStopIterationValue are a lot more complex than they should be to workaround this behavior.

We could make PyErr_SetObject act as documented, but that is likely to break C extensions, given how old this behavior is, and that it is relied on throughout CPython.

Code that does the following is common:

    exc = new_foo_exception();
    PyErr_SetObject(&PyFooException_Type, exc);

We could just document the current behavior, but the current behavior is strange.
What I suggest is this:

  • Create a new API function, PyErr_SetException(exc)` that takes a single exception object.
  • Document PyErr_SetObject() accurately
  • Deprecate the old function

This is an old bug going back to the 2 series.

Linked PRs

Also relevant:

@markshannon markshannon added type-bug An unexpected behavior, bug, or error 3.11 bug and security fixes 3.10 only security fixes 3.9 only security fixes 3.8 only security fixes 3.7 (EOL) end of life topic-C-API 3.12 bugs and security fixes labels Feb 5, 2023
@markshannon
Copy link
Member Author

The only case I can find of strange behavior leaking out is this:

import sys

def wrap_in_sys_exit(ex):
    try:
        try:
            raise ex(-1)
        except ex as err:
            sys.exit(err)
    except BaseException as err:
        return err

print(repr(wrap_in_sys_exit(ValueError)))
print(repr(wrap_in_sys_exit(SystemExit)))

Which prints

SystemExit(ValueError(-1))
SystemExit(-1)

instead of the expected

SystemExit(ValueError(-1))
SystemExit(SystemExit(-1))

@rhettinger
Copy link
Contributor

rhettinger commented Feb 5, 2023

This API goes by to 1997. It is remarkable how simple and clear the code used to be:

void
PyErr_Restore(PyObject *type, PyObject *value, PyObject *traceback)
{
	PyThreadState *tstate = PyThreadState_GET();
	PyObject *oldtype, *oldvalue, *oldtraceback;

	if (traceback != NULL && !PyTraceBack_Check(traceback)) {
		/* XXX Should never happen -- fatal error instead? */
		Py_DECREF(traceback);
		traceback = NULL;
	}

	/* Save these in locals to safeguard against recursive
	   invocation through Py_XDECREF */
	oldtype = tstate->curexc_type;
	oldvalue = tstate->curexc_value;
	oldtraceback = tstate->curexc_traceback;

	tstate->curexc_type = type;
	tstate->curexc_value = value;
	tstate->curexc_traceback = traceback;

	Py_XDECREF(oldtype);
	Py_XDECREF(oldvalue);
	Py_XDECREF(oldtraceback);
}

void
PyErr_SetObject(PyObject *exception, PyObject *value)
{
	Py_XINCREF(exception);
	Py_XINCREF(value);
	PyErr_Restore(exception, value, (PyObject *)NULL);
}

@markshannon
Copy link
Member Author

The code may have been simple, but it can leave the VM in a precarious state if the value is not a legal argument for calling the type.

markshannon added a commit that referenced this issue Feb 8, 2023
* Make sure that the current exception is always normalized.

* Remove redundant type and traceback fields for the current exception.

* Add new API functions: PyErr_GetRaisedException, PyErr_SetRaisedException

* Add new API functions: PyException_GetArgs, PyException_SetArgs
carljm added a commit to carljm/cpython that referenced this issue Feb 9, 2023
* main: (82 commits)
  pythongh-101670: typo fix in PyImport_ExtendInittab() (python#101723)
  pythonGH-99293: Document that `Py_TPFLAGS_VALID_VERSION_TAG` shouldn't be used. (#pythonGH-101736)
  no-issue: Add Dong-hee Na as the cjkcodecs codeowner (pythongh-101731)
  pythongh-101678: Merge math_1_to_whatever() and math_1() (python#101730)
  pythongh-101678: refactor the math module to use special functions from c11 (pythonGH-101679)
  pythongh-85984: Remove legacy Lib/pty.py code. (python#92365)
  pythongh-98831: Use opcode metadata for stack_effect() (python#101704)
  pythongh-101283: Version was just released, so should be changed in 3.11.3 (pythonGH-101719)
  pythongh-101283: Fix use of unbound variable (pythonGH-101712)
  pythongh-101283: Improved fallback logic for subprocess with shell=True on Windows (pythonGH-101286)
  pythongh-101277: Port more itertools static types to heap types (python#101304)
  pythongh-98831: Modernize CALL and family (python#101508)
  pythonGH-101696: invalidate type version tag in `_PyStaticType_Dealloc` (python#101697)
  pythongh-100221: Fix creating dirs in `make sharedinstall` (pythonGH-100329)
  pythongh-101670: typo fix in PyImport_AppendInittab() (pythonGH-101672)
  pythongh-101196: Make isdir/isfile/exists faster on Windows (pythonGH-101324)
  pythongh-101614: Don't treat python3_d.dll as a Python DLL when checking extension modules for incompatibility (pythonGH-101615)
  pythongh-100933: Improve `check_element` helper in `test_xml_etree` (python#100934)
  pythonGH-101578: Normalize the current exception (pythonGH-101607)
  pythongh-47937: Note that Popen attributes are read-only (python#93070)
  ...
@erlend-aasland
Copy link
Contributor

The new C APIs PyErr_GetRaisedException and PyErr_SetRaisedException are IMO problematic:

  • the former returns a borrowed reference
  • the latter steals a reference

This goes against our recommendations for adding new public C APIs.

I also find the added documentation inadequate. It is not documented that the former function returns a borrowed reference. The documentation for the latter function is not exactly friendly written, and I'm specifically thinking of the last parenthesised sentence:

exc must be a valid exception. (Violating this rules will cause subtle problems later.) This call consumes a reference to the exc object: you must own a reference to that object before the call and after the call you no longer own that reference. (If you don’t understand this, don’t use this function. I warned you.)

Suggesting to considering reopening this issue until at least the docs have been improved. IMO, we should also fix the newly added C APIs to be consistent in their reference handling, which means:

  • PyErr_GetRaisedException should return a new reference
  • PyErr_SetRaisedException should not steal a reference

@markshannon
Copy link
Member Author

The remark "(If you don’t understand this, don’t use this function. I warned you.)" comes directly from the documentation for PyErr_Restore and PyErr_Fetch which this API is meant to replace, as does the reference count behavior.

PyErr_GetRaisedException should return a new reference

How does that differ from the current behavior?

@encukou
Copy link
Member

encukou commented Feb 14, 2023

AFAICS, PyErr_GetRaisedException doesn't return a borrowed reference. The caller owns the return value.
It's just PyErr_SetRaisedException stealing its argument.

@erlend-aasland
Copy link
Contributor

AFAICS, PyErr_GetRaisedException doesn't return a borrowed reference. The caller owns the return value.
It's just PyErr_SetRaisedException stealing its argument.

Huh? In that case, I'm reading this completely wrong:

cpython/Python/errors.c

Lines 448 to 460 in 81e3aa8

PyObject *
_PyErr_GetRaisedException(PyThreadState *tstate) {
PyObject *exc = tstate->current_exception;
tstate->current_exception = NULL;
return exc;
}
PyObject *
PyErr_GetRaisedException(void)
{
PyThreadState *tstate = _PyThreadState_GET();
return _PyErr_GetRaisedException(tstate);
}

Anyway, I still think it smells to introduce a new API that steals its argument. There should be no performance hit in making the caller instead of the callee having to decref that argument.

@erlend-aasland
Copy link
Contributor

The remark "(If you don’t understand this, don’t use this function. I warned you.)" comes directly from the documentation for PyErr_Restore and PyErr_Fetch which this API is meant to replace, as does the reference count behavior.

If we introduce a new API, why copy the IMO inadequate docs and the smelly ref count behaviour? It's a new API; we can choose to follow the convention; we can choose to document it properly.

@markshannon
Copy link
Member Author

I'm not saying we shouldn't document it better, just don't blame me for the docs 🙂

@erlend-aasland
Copy link
Contributor

I'm not saying we shouldn't document it better, just don't blame me for the docs 🙂

Sorry, if it came out that way; that was not my intent! Let's just amend the docs; I can propose a patch 🙂

(Regarding the reference counting, I'd still like to see the public APIs following the established conventions.)

@erlend-aasland
Copy link
Contributor

(Re-opening till we've patched the docs)

erlend-aasland added a commit to erlend-aasland/cpython that referenced this issue Feb 16, 2023
erlend-aasland added a commit that referenced this issue Feb 19, 2023
Co-authored-by: C.A.M. Gerlach <CAM.Gerlach@Gerlach.CAM>
erlend-aasland added a commit to erlend-aasland/cpython that referenced this issue Feb 19, 2023
erlend-aasland added a commit that referenced this issue Feb 21, 2023
Co-authored-by: C.A.M. Gerlach <CAM.Gerlach@Gerlach.CAM>
@erlend-aasland
Copy link
Contributor

We've fixed the docs; I'm fine with closing this now.

@erlend-aasland
Copy link
Contributor

OOI, why are all the version labels applied?

@markshannon
Copy link
Member Author

OOI, why are all the version labels applied?

because the behavior of PyErr_SetObject() doesn't match the documentation in all those versions.
I suspect it has been wrong forever.

The fix for 3.12+ is a better API. For older versions, the docs need fixing.

@erlend-aasland
Copy link
Contributor

erlend-aasland commented Feb 21, 2023

So let's keep this issue open until we've patched the docs even more.

The fix for 3.12+ is a better API.

Furthermore, I still consider that the set-function should not steal a reference to its argument. A better API is a consistent API.

@erlend-aasland
Copy link
Contributor

erlend-aasland commented Feb 21, 2023

Before we can close this, we need...:

  • What's New entry
  • Tests for the get/set args C APIs
  • Amend PyErr_SetObject docs for 3.12 through 3.10. (I don't think we can justify this as important enough to backport to 3.9 and 3.8. Backporting to 3.7 is definitely out of the question.)

erlend-aasland added a commit to erlend-aasland/cpython that referenced this issue Feb 22, 2023
carljm added a commit to carljm/cpython that referenced this issue Feb 22, 2023
* main: (225 commits)
  pythongh-102056: Fix a few bugs in error handling of exception printing code (python#102078)
  pythongh-102011: use sys.exception() instead of sys.exc_info() in docs where possible (python#102012)
  pythongh-101566: Sync with zipp 3.14. (pythonGH-102018)
  pythonGH-99818: improve the documentation for zipfile.Path and Traversable (pythonGH-101589)
  pythongh-88233: zipfile: handle extras after a zip64 extra (pythonGH-96161)
  pythongh-101981: Apply HOMEBREW related environment variables (pythongh-102074)
  pythongh-101907: Stop using `_Py_OPCODE` and `_Py_OPARG` macros (pythonGH-101912)
  pythongh-101819: Adapt _io types to heap types, batch 1 (pythonGH-101949)
  pythongh-101981: Build macOS as recommended by the devguide (pythonGH-102070)
  pythongh-97786: Fix compiler warnings in pytime.c (python#101826)
  pythongh-101578: Amend PyErr_{Set,Get}RaisedException docs (python#101962)
  Misc improvements to the float tutorial (pythonGH-102052)
  pythongh-85417: Clarify behaviour on branch cuts in cmath module (python#102046)
  pythongh-100425: Update tutorial docs related to sum() accuracy (FH-101854)
  Add missing 'is' to `cmath.log()` docstring (python#102049)
  pythongh-100210: Correct the comment link for unescaping HTML (python#100212)
  pythongh-97930: Also include subdirectory in makefile. (python#102030)
  pythongh-99735: Use required=True in argparse subparsers example (python#100927)
  Fix incorrectly documented attribute in csv docs (python#101250)
  pythonGH-84783: Make the slice object hashable (pythonGH-101264)
  ...
carljm added a commit to carljm/cpython that referenced this issue Feb 23, 2023
* main: (76 commits)
  Fix syntax error in struct doc example (python#102160)
  pythongh-99108: Import MD5 and SHA1 from HACL* (python#102089)
  pythonGH-101777: `queue.rst`: use 2 spaces after a period to be consistent. (python#102143)
  Few coverage nitpicks for the cmath module (python#102067)
  pythonGH-100982: Restrict `FOR_ITER_RANGE` to a single instruction to allow instrumentation. (pythonGH-101985)
  pythongh-102135: Update turtle docs to rename wikipedia demo to rosette (python#102137)
  pythongh-99942: python.pc on android/cygwin should link to libpython per configure.ac (pythonGH-100356)
  pythongh-95672 fix typo SkitTest to SkipTest (pythongh-102119)
  pythongh-101936: Update the default value of fp from io.StringIO to io.BytesIO (pythongh-102100)
  pythongh-102008: simplify test_except_star by using sys.exception() instead of sys.exc_info() (python#102009)
  pythongh-101903: Remove obsolete undefs for previously removed macros Py_EnterRecursiveCall and Py_LeaveRecursiveCall (python#101923)
  pythongh-100556: Improve clarity of `or` docs (python#100589)
  pythongh-101777: Make `PriorityQueue` docs slightly clearer (python#102026)
  pythongh-101965: Fix usage of Py_EnterRecursiveCall return value in _bisectmodule.c (pythonGH-101966)
  pythongh-101578: Amend exception docs (python#102057)
  pythongh-101961 fileinput.hookcompressed should not set the encoding value for the binary mode (pythongh-102068)
  pythongh-102056: Fix a few bugs in error handling of exception printing code (python#102078)
  pythongh-102011: use sys.exception() instead of sys.exc_info() in docs where possible (python#102012)
  pythongh-101566: Sync with zipp 3.14. (pythonGH-102018)
  pythonGH-99818: improve the documentation for zipfile.Path and Traversable (pythonGH-101589)
  ...
@erlend-aasland
Copy link
Contributor

@markshannon, after looking through the implementation again, I agree that this is indeed a better API. I'm very sorry for the noise I've created in trying to push for a change in semantics for PyErr_SetRaisedException.

Regarding #101578 (comment): I'll add some more tests soon, and I'll also draft a proposal for the PyErr_SetObject doc changes.

@erlend-aasland
Copy link
Contributor

One more thing: did you deliberately not apply the Py_DEPRECATED macro to the deprecated APIs, @markshannon?

@encukou
Copy link
Member

encukou commented Feb 27, 2023

One more thing: did you deliberately not apply the Py_DEPRECATED macro to the deprecated APIs

Let's be careful with loud deprecations. The macros aren't particularly dangerous, so pushing users to rewrite their code wouldn't help much: this edge case most likely doesn't affect them, and if it does they “just” get the wrong exception. And they aren't much of a burden to keep in CPython.
IMO, a docs note is appropriate in this case. There's no plan to remove them.

@erlend-aasland
Copy link
Contributor

There's no plan to remove them.

Sure; I don't think anyone is planning to do that.

@gvanrossum
Copy link
Member

So if the plan is to deprecate PyErr_SetObject, why change its semantics? The change in semantics may still trip folks up, which is not so nice -- they may need three versions of their calling code, one for the old PyErr_SetObject semantics (3.11 and before), one for the 3.12 semantics, and one using the new API. (PyErr_SetObject is in the stable ABI so they may want to use that in 3.12 in some configurations.)

@markshannon
Copy link
Member Author

markshannon commented Mar 13, 2023

There is no plan to change the semantics of PyErr_SetObject(). We plan to deprecate it.

We do seem to have to changed its semantics accidentally, as described in #102594, but the semantics in that case made no sense and was setting up a trap for whatever code tried to normalize the exception.

@gvanrossum
Copy link
Member

From offline conversation with Irit it looks like it's better to change the behavior of PyErr_SetObject (but in an acceptable way) without deprecating it. @markshannon We should discuss this at the Wednesday meeting.

@markshannon
Copy link
Member Author

Why not deprecate it?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.7 (EOL) end of life 3.8 only security fixes 3.9 only security fixes 3.10 only security fixes 3.11 bug and security fixes 3.12 bugs and security fixes topic-C-API type-bug An unexpected behavior, bug, or error
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants