Skip to content

[WIP] bpo-42671: Make Python finalization deterministic #23826

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from
Closed

[WIP] bpo-42671: Make Python finalization deterministic #23826

wants to merge 1 commit into from

Conversation

vstinner
Copy link
Member

@vstinner vstinner commented Dec 17, 2020

Make the Python finalization more deterministic:

  • First, clear the main module.
  • Then, clear modules from the most recently imported to the least
    recently imported: reversed(sys.modules).
  • builtins and sys modumes are always cleared last.
  • Module attributes are set to None from the most recently defined to the
    least recently defined: reversed(module.dict).

Changes:

  • finalize_modules() no longer uses a list of weak references to
    modules while clearing sys.modules dict.
  • When -vv command line option is used, the module name is now also
    logged, not only the attribute name.
  • test_module.test_module_finalization_at_shutdown(): final_a.x is
    now None when final_a.c is cleared.
  • test_sys.test_sys_ignores_cleaning_up_user_data(): the exception is
    no longer silently ignored. Rename the test to
    test_sys_cleaning_up_user_data().
  • test_threading.test_main_thread_during_shutdown() keeps a reference
    to threading functions since threading module variables are cleared
    before RefCycle object is deleted by the garbage collector.

https://bugs.python.org/issue42671

Make the Python finalization more deterministic:

* First, clear the __main__ module.
* Then, clear modules from the most recently imported to the least
  recently imported: reversed(sys.modules).
* builtins and sys modumes are always cleared last.
* Module attributes are set to None from the most recently defined to the
  least recently defined: reversed(module.__dict__).

Changes:

* finalize_modules() no longer uses a list of weak references to
  modules while clearing sys.modules dict.
* When -vv command line option is used, the module name is now also
  logged, not only the attribute name.
* test_module.test_module_finalization_at_shutdown(): final_a.x is
  now None when final_a.c is cleared.
* test_sys.test_sys_ignores_cleaning_up_user_data(): the exception is
  no longer silently ignored. Rename the test to
  test_sys_cleaning_up_user_data().
* test_threading.test_main_thread_during_shutdown() keeps a reference
  to threading functions since threading module variables are cleared
  before RefCycle object is deleted by the garbage collector.

int verbose = _Py_GetConfig()->verbose;
for (int step=1; step <= 2; step++) {
PyObject *reversed = PyObject_CallOneArg((PyObject*)&PyReversed_Type, dict);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PyDict_Keys() can be used.

In future we can also add _PyDict_Prev().

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PyDict_Keys() returns keys in their creation order, no? I would like to get them in the reverse order. Do you suggest to iterate on the list backwards?

PyObject *key, *value;

int verbose = _Py_GetConfig()->verbose;
for (int step=1; step <= 2; step++) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that it may be better now to clear all objects in the same order, without prioritizing underscored names.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh. I didn't know, so I chose the conservative option: keep the existing "heuristic". I'm fine with making the cleanup simpler :-)

PyErr_WriteUnraisable(NULL);
return;
}
PyObject *iter = PyObject_GetIter(keys);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is easier to iterate list by index.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, you're right. My first attempt avoided the creation of a list, but then I get annoying RuntimeError when the dict was modified while we iterate on it. Deleting a variable can have many side effects, it's way safer to copy the list of keys.

PyErr_Clear();
}

PyObject *reversed = PyObject_CallOneArg((PyObject*)&PyReversed_Type, modules);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PyMapping_Keys() + iterating by index.

@vstinner
Copy link
Member Author

I will update my PR later to take Serhiy's comment in account.

@github-actions
Copy link

This PR is stale because it has been open for 30 days with no activity.

@github-actions github-actions bot added the stale Stale PR or inactive for long period of time. label Jan 18, 2021
@vstinner vstinner closed this Sep 21, 2021
@vstinner vstinner deleted the finalize_modules branch September 21, 2021 22:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
awaiting core review stale Stale PR or inactive for long period of time.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants