Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PyErr_SetObject() behavior is strange and not as documented. #101578

Open
markshannon opened this issue Feb 5, 2023 · 12 comments · Fixed by #101607
Open

PyErr_SetObject() behavior is strange and not as documented. #101578

markshannon opened this issue Feb 5, 2023 · 12 comments · Fixed by #101607
Labels

Comments

@markshannon
Copy link
Member

markshannon commented Feb 5, 2023

Briefly:
PyErr_SetObject(exc_type, exc_val) does not create a new exception iff isinstance(exc_val, BaseException), but uses exc_val instead.

Callers of PyErr_SetObject() need various workarounds to handle this.

The long version:

Internally CPython handles exceptions as a triple (type, value, traceback), but the language treats exceptions as a single value.

This a legacy of the olden days before proper exceptions.
To handle adding proper exceptions to Python, various error handling functions, specifically _PyErr_SetObject still treat exceptions as triples, with the convention that if the value is an exception, then the exception is already normalized.

One other oddity is that if exc_val is a tuple, it is treated as the * arguments to exc_type when calling it. So, if isinstance(exc_val, BaseException) the desired behavior can be achieved by wrapping exc_val in a one-tuple.

As a consequence, both _PyErr_SetKeyError and _PyGen_SetStopIterationValue are a lot more complex than they should be to workaround this behavior.

We could make PyErr_SetObject act as documented, but that is likely to break C extensions, given how old this behavior is, and that it is relied on throughout CPython.

Code that does the following is common:

    exc = new_foo_exception();
    PyErr_SetObject(&PyFooException_Type, exc);

We could just document the current behavior, but the current behavior is strange.
What I suggest is this:

  • Create a new API function, PyErr_SetException(exc)` that takes a single exception object.
  • Document PyErr_SetObject() accurately
  • Deprecate the old function

This is an old bug going back to the 2 series.

Linked PRs

@markshannon
Copy link
Member Author

The only case I can find of strange behavior leaking out is this:

import sys

def wrap_in_sys_exit(ex):
    try:
        try:
            raise ex(-1)
        except ex as err:
            sys.exit(err)
    except BaseException as err:
        return err

print(repr(wrap_in_sys_exit(ValueError)))
print(repr(wrap_in_sys_exit(SystemExit)))

Which prints

SystemExit(ValueError(-1))
SystemExit(-1)

instead of the expected

SystemExit(ValueError(-1))
SystemExit(SystemExit(-1))

@rhettinger
Copy link
Contributor

rhettinger commented Feb 5, 2023

This API goes by to 1997. It is remarkable how simple and clear the code used to be:

void
PyErr_Restore(PyObject *type, PyObject *value, PyObject *traceback)
{
	PyThreadState *tstate = PyThreadState_GET();
	PyObject *oldtype, *oldvalue, *oldtraceback;

	if (traceback != NULL && !PyTraceBack_Check(traceback)) {
		/* XXX Should never happen -- fatal error instead? */
		Py_DECREF(traceback);
		traceback = NULL;
	}

	/* Save these in locals to safeguard against recursive
	   invocation through Py_XDECREF */
	oldtype = tstate->curexc_type;
	oldvalue = tstate->curexc_value;
	oldtraceback = tstate->curexc_traceback;

	tstate->curexc_type = type;
	tstate->curexc_value = value;
	tstate->curexc_traceback = traceback;

	Py_XDECREF(oldtype);
	Py_XDECREF(oldvalue);
	Py_XDECREF(oldtraceback);
}

void
PyErr_SetObject(PyObject *exception, PyObject *value)
{
	Py_XINCREF(exception);
	Py_XINCREF(value);
	PyErr_Restore(exception, value, (PyObject *)NULL);
}

@markshannon
Copy link
Member Author

The code may have been simple, but it can leave the VM in a precarious state if the value is not a legal argument for calling the type.

markshannon added a commit that referenced this issue Feb 8, 2023
* Make sure that the current exception is always normalized.

* Remove redundant type and traceback fields for the current exception.

* Add new API functions: PyErr_GetRaisedException, PyErr_SetRaisedException

* Add new API functions: PyException_GetArgs, PyException_SetArgs
carljm added a commit to carljm/cpython that referenced this issue Feb 9, 2023
* main: (82 commits)
  pythongh-101670: typo fix in PyImport_ExtendInittab() (python#101723)
  pythonGH-99293: Document that `Py_TPFLAGS_VALID_VERSION_TAG` shouldn't be used. (#pythonGH-101736)
  no-issue: Add Dong-hee Na as the cjkcodecs codeowner (pythongh-101731)
  pythongh-101678: Merge math_1_to_whatever() and math_1() (python#101730)
  pythongh-101678: refactor the math module to use special functions from c11 (pythonGH-101679)
  pythongh-85984: Remove legacy Lib/pty.py code. (python#92365)
  pythongh-98831: Use opcode metadata for stack_effect() (python#101704)
  pythongh-101283: Version was just released, so should be changed in 3.11.3 (pythonGH-101719)
  pythongh-101283: Fix use of unbound variable (pythonGH-101712)
  pythongh-101283: Improved fallback logic for subprocess with shell=True on Windows (pythonGH-101286)
  pythongh-101277: Port more itertools static types to heap types (python#101304)
  pythongh-98831: Modernize CALL and family (python#101508)
  pythonGH-101696: invalidate type version tag in `_PyStaticType_Dealloc` (python#101697)
  pythongh-100221: Fix creating dirs in `make sharedinstall` (pythonGH-100329)
  pythongh-101670: typo fix in PyImport_AppendInittab() (pythonGH-101672)
  pythongh-101196: Make isdir/isfile/exists faster on Windows (pythonGH-101324)
  pythongh-101614: Don't treat python3_d.dll as a Python DLL when checking extension modules for incompatibility (pythonGH-101615)
  pythongh-100933: Improve `check_element` helper in `test_xml_etree` (python#100934)
  pythonGH-101578: Normalize the current exception (pythonGH-101607)
  pythongh-47937: Note that Popen attributes are read-only (python#93070)
  ...
@erlend-aasland
Copy link
Contributor

The new C APIs PyErr_GetRaisedException and PyErr_SetRaisedException are IMO problematic:

  • the former returns a borrowed reference
  • the latter steals a reference

This goes against our recommendations for adding new public C APIs.

I also find the added documentation inadequate. It is not documented that the former function returns a borrowed reference. The documentation for the latter function is not exactly friendly written, and I'm specifically thinking of the last parenthesised sentence:

exc must be a valid exception. (Violating this rules will cause subtle problems later.) This call consumes a reference to the exc object: you must own a reference to that object before the call and after the call you no longer own that reference. (If you don’t understand this, don’t use this function. I warned you.)

Suggesting to considering reopening this issue until at least the docs have been improved. IMO, we should also fix the newly added C APIs to be consistent in their reference handling, which means:

  • PyErr_GetRaisedException should return a new reference
  • PyErr_SetRaisedException should not steal a reference

@markshannon
Copy link
Member Author

The remark "(If you don’t understand this, don’t use this function. I warned you.)" comes directly from the documentation for PyErr_Restore and PyErr_Fetch which this API is meant to replace, as does the reference count behavior.

PyErr_GetRaisedException should return a new reference

How does that differ from the current behavior?

@encukou
Copy link
Member

encukou commented Feb 14, 2023

AFAICS, PyErr_GetRaisedException doesn't return a borrowed reference. The caller owns the return value.
It's just PyErr_SetRaisedException stealing its argument.

@erlend-aasland
Copy link
Contributor

AFAICS, PyErr_GetRaisedException doesn't return a borrowed reference. The caller owns the return value.
It's just PyErr_SetRaisedException stealing its argument.

Huh? In that case, I'm reading this completely wrong:

cpython/Python/errors.c

Lines 448 to 460 in 81e3aa8

PyObject *
_PyErr_GetRaisedException(PyThreadState *tstate) {
PyObject *exc = tstate->current_exception;
tstate->current_exception = NULL;
return exc;
}
PyObject *
PyErr_GetRaisedException(void)
{
PyThreadState *tstate = _PyThreadState_GET();
return _PyErr_GetRaisedException(tstate);
}

Anyway, I still think it smells to introduce a new API that steals its argument. There should be no performance hit in making the caller instead of the callee having to decref that argument.

@erlend-aasland
Copy link
Contributor

The remark "(If you don’t understand this, don’t use this function. I warned you.)" comes directly from the documentation for PyErr_Restore and PyErr_Fetch which this API is meant to replace, as does the reference count behavior.

If we introduce a new API, why copy the IMO inadequate docs and the smelly ref count behaviour? It's a new API; we can choose to follow the convention; we can choose to document it properly.

@markshannon
Copy link
Member Author

I'm not saying we shouldn't document it better, just don't blame me for the docs 🙂

@erlend-aasland
Copy link
Contributor

I'm not saying we shouldn't document it better, just don't blame me for the docs 🙂

Sorry, if it came out that way; that was not my intent! Let's just amend the docs; I can propose a patch 🙂

(Regarding the reference counting, I'd still like to see the public APIs following the established conventions.)

@erlend-aasland
Copy link
Contributor

(Re-opening till we've patched the docs)

erlend-aasland added a commit to erlend-aasland/cpython that referenced this issue Feb 16, 2023
erlend-aasland added a commit that referenced this issue Feb 19, 2023
Co-authored-by: C.A.M. Gerlach <CAM.Gerlach@Gerlach.CAM>
erlend-aasland added a commit to erlend-aasland/cpython that referenced this issue Feb 19, 2023
@erlend-aasland
Copy link
Contributor

OOI, is there plans for a single-arg variant of _PyErr_ChainExceptions? Or is that not needed anymore?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants