New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bpo-27144: concurrent.futures as_complete and map iterators do not keep reference to returned object #1560
Conversation
@grzgrzgrz3, thanks for your PR! By analyzing the history of the files in this pull request, we identified @brianquinlan, @asvetlov and @ezio-melotti to be potential reviewers. |
Lib/concurrent/futures/_base.py
Outdated
@@ -170,6 +170,17 @@ def _create_and_install_waiters(fs, return_when): | |||
|
|||
return waiter | |||
|
|||
|
|||
def _yield_future(fs, waiter, ref_collect=()): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you give this function a more descriptive name?
Lib/concurrent/futures/_base.py
Outdated
|
||
def _yield_future(fs, waiter, ref_collect=()): | ||
while fs: | ||
with fs[0]._condition: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain why you need to do this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If your question is about line 176. Based on issue20319, changes on future waiters list should be locked.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess I don't understand why you need to remove the waiter. Previous code didn't do this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we do not remove waiter, second result set on future will trigger waiter and cause KeyError, future is already returned and reference cleared.
For example:
>>> from concurrent.futures import Future, as_completed
>>> fs_finished = Future()
>>> fs = Future()
>>> fs_finished.set_result("")
>>> for x in as_completed([fs_finished, fs, Future()]):
... fs.set_result(None)
...
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/grzegorz/cpython/Lib/concurrent/futures/_base.py", line 235, in as_completed
pending.remove(future)
KeyError: <Future at 0x7f566ff6fc20 state=finished returned NoneType>
This error occurs in both version.
However docstring for Future.set_result
and Future.set_exception
says:
Should only be used by Executor implementations and unit tests.
So maybe we should ignore this case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I'm understanding correctly then, yes, I would certainly consider it an error to call set_result
twice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Anyway, this part is not relate to origin issue, so i ll remove discussed code.
Maybe in future someone encounters it and create new issue, so it can be discussed there.
Lib/concurrent/futures/_base.py
Outdated
fs[0]._waiters.remove(waiter) | ||
|
||
for future_list in ref_collect: | ||
future_list.remove(fs[0]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If future_list
is a list, then this is O(n)...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this use case future_list will be always a set.
set().remove
is O(1).
I will rename future_list
to futures_collection
Lib/concurrent/futures/_base.py
Outdated
else: | ||
yield future.result(end_time - time.time()) | ||
yield fs.pop(0).result(end_time - time.time()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If fs
is a list, then this is O(n).
Lib/concurrent/futures/_base.py
Outdated
|
||
for future_list in ref_collect: | ||
future_list.remove(fs[0]) | ||
yield fs.pop(0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this case fs is always a list, however I can modify this function to yield last element. I think order here does not matter. If order matter, reversing list is the option. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Either reverse the list or make it a deque, as you prefer.
Lib/concurrent/futures/process.py
Outdated
@@ -357,6 +357,12 @@ def _check_system_limits(): | |||
raise NotImplementedError(_system_limited) | |||
|
|||
|
|||
def _chain_from_iterable(iterable): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you need this? Please add a comment and/or docstring.
Lib/test/test_concurrent_futures.py
Outdated
@@ -59,6 +59,10 @@ def my_method(self): | |||
pass | |||
|
|||
|
|||
def _map_fn(_): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please give this function a more descriptive name, new_object
perhaps?
Thanks for caring about this issue. I think the proposed fix needs improvements, see comments. |
Thank you for review Please review new version. I have pushed with force, delete branch before pulling. |
Lib/concurrent/futures/_base.py
Outdated
@@ -170,6 +170,17 @@ def _create_and_install_waiters(fs, return_when): | |||
|
|||
return waiter | |||
|
|||
|
|||
def _yield_and_decref(fs, waiter, ref_collect=()): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is my best, i could not came up with anything better. Maybe you can suggest name more descriptive.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps add a comment explaining why this function exists, then?
Lib/test/test_concurrent_futures.py
Outdated
# We don't particularly care what the default name is, just that | ||
# it has a default name implying that it is a ThreadPoolExecutor | ||
# followed by what looks like a thread number. | ||
self.assertRegex(t.name, r'^.*ThreadPoolExecutor.*_[0-4]$') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Out of curiosity, why does your PR affect this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My mistake, when rebasing.
Lib/test/test_concurrent_futures.py
Outdated
for result_object in self.executor.map(_dummy_object_fn, range(10)): | ||
self.assertEqual(sys.getrefcount(result_object), 2) | ||
|
||
def test_map_result_order(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this already tested by test_map
? Or perhaps you just want to augment test_map
with a chunksize
-using test.
Any update on review. Please have a look at last revision. |
Sorry for the delay @grzgrzgrz3. The PR now has conflicts, could you please resolve them? |
reference to returned object
Done. |
I'm trying to push some small changes to your branch. Crossing fingers. |
…ying on sys.getrefcount() in tests.
I'm merging this. Thank you very much! |
…not keep reference to returned object (pythonGH-1560) * bpo-27144: concurrent.futures as_complie and map iterators do not keep reference to returned object * Some nits. Improve wordings in docstrings and comments, and avoid relying on sys.getrefcount() in tests. (cherry picked from commit 97e1b1c)
* 'master' of https://github.com/python/cpython: (601 commits) remove check for bug last seem in Solaris 9 (python#3285) Change code owners for hashlib and ssl to the crypto team (python#3284) bpo-31281: Fix pathlib.Path incompatibility in fileinput (pythongh-3208) remove autoconf check for select() (python#3283) remove configure check for 'volatile' (python#3281) Add missing _sha3 module to Setup.dist (python#2395) bpo-12383: Also ignore __PYVENV_LAUNCHER__ (python#3278) bpo-9146: add the missing NEWS entry. (python#3275) Fix a c.f.as_completed() refleak previously introduced in bpo-27144 (python#3270) bpo-31185: Fixed miscellaneous errors in asyncio speedup module. (python#3076) remove a redundant lower in urllib.parse.urlsplit (python#3008) bpo-31323: Fix reference leak in test_ssl (python#3263) bpo-31250, test_asyncio: fix EventLoopTestsMixin.tearDown() (python#3264) bpo-31326: ProcessPoolExecutor waits for the call queue thread (python#3265) bpo-27144: concurrent.futures as_complete and map iterators do not keep reference to returned object (python#1560) bpo-31250, test_asyncio: fix dangling threads (python#3252) bpo-31217: Fix regrtest -R for small integer (python#3260) bpo-30096: Use ABC in abc reference examples (python#1220) bpo-30737: Update DevGuide links to new URL (pythonGH-3228) [Trivial] Remove now redundant assert (python#3245) ...
…ep reference to returned object (python#1560) * bpo-27144: concurrent.futures as_complie and map iterators do not keep reference to returned object * Some nits. Improve wordings in docstrings and comments, and avoid relying on sys.getrefcount() in tests.
@@ -191,16 +205,18 @@ def as_completed(fs, timeout=None): | |||
if timeout is not None: | |||
end_time = timeout + time.time() | |||
|
|||
total_futures = len(fs) | |||
|
|||
fs = set(fs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a regression. The ordering of instructions here is wrong. Now fs
must be a sequence to support the len(fs)
.
…ted()` This was possible before. pythonGH-1560 introduced a regression after 3.6.2 got released where only sequences were accepted now. This commit addresses this problem.
…completed()` (pythonGH-3830) This was possible before. pythonGH-1560 introduced a regression after 3.6.2 got released where only sequences were accepted now. This commit addresses this problem. (cherry picked from commit 574562c)
…completed()` (pythonGH-3830) (python#3831) This was possible before. pythonGH-1560 introduced a regression after 3.6.2 got released where only sequences were accepted now. This commit addresses this problem. (cherry picked from commit 574562c)
https://bugs.python.org/issue27144