Pool never closes if acquire with timeout is cancelled #547
Closes MagicStack#547. When `wait_for` is cancelled, there is a chance that the waited task has already completed, leaving the connection looking like it is in use. This fix ensures that the connection is returned to the pool in this situation. For context, see https://bugs.python.org/issue37658 and MagicStack#467.
`asyncio.wait_for()` currently has a bug where it raises a `CancelledError` even when the wrapped awaitable has completed. The upstream fix is in python/cpython#21894. This adds a workaround until the aforementioned PR is merged, backported and released.
Co-authored-by: Adam Liddell <git@aliddell.com>
Fixes: #467
Fixes: #547
Related: #468
Supersedes: #548
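The workaround idea described above can be sketched roughly as follows. This is a minimal sketch, not the actual patch: the helper name `wait_for_compat` is hypothetical, and it assumes the intended behaviour is to hand back the wrapped awaitable's result when a cancellation arrives after that awaitable has already completed.

```python
import asyncio


async def wait_for_compat(awaitable, timeout):
    """Like asyncio.wait_for(), but keep an already-available result on cancel.

    Hypothetical helper for illustration only.
    """
    if timeout is None:
        # No timeout requested: nothing to work around.
        return await awaitable

    # Wrap in a Task/Future so it can still be inspected after a cancellation.
    fut = asyncio.ensure_future(awaitable)
    try:
        return await asyncio.wait_for(fut, timeout)
    except asyncio.CancelledError:
        if fut.done() and not fut.cancelled():
            # The wrapped awaitable finished before the cancellation landed:
            # return its result (or re-raise its exception) instead of
            # silently dropping it.
            return fut.result()
        raise
```

With a helper like this, the caller receives the acquired object even in the race, so it can release it in the usual way instead of leaving it marked as in use.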
uvloop?: Yes
Here's a demo:
This code creates a pool, then starts a task that tries to acquire a connection with a timeout. This task is allowed to start, but is then cancelled. Following this, the pool is closed. The expected behaviour is that the pool should close cleanly, as no connections are in use.
When running this code, I see 'Closing pool' printed, then the script hangs indefinitely. It appears the pool is waiting for the release of a connection for which the cancellation was not handled correctly. When observing this in actual code, printing the pending tasks shows one stuck at `PoolConnectionHolder.wait_until_released()` (line 229). This suggests that the `_in_use` future is not being resolved correctly during cancellation.
However, if I remove the `timeout=200` on the acquire, the bug goes away. Additionally, the `0.0000000001` second sleep is somewhat important:
Effectively, there is a very short 'critical' window (~10 µs on this machine) during which a cancellation arriving will lead to the pool being unclosable. Hence this bug has been a total pain to hunt for.
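For anyone hunting a similar hang, one way to print the pending tasks mentioned above is sketched here. It uses only the standard library; the `wait_until_released` coroutine and the `in_use` future are hypothetical stand-ins for the pool internals, not asyncpg code.

```python
import asyncio


def pending_task_summary():
    """Return (coroutine name, task repr) for every unfinished task but the caller."""
    current = asyncio.current_task()
    summary = []
    for task in asyncio.all_tasks():  # only tasks that are not yet done
        if task is current:
            continue
        coro = task.get_coro()
        summary.append((coro.__qualname__, repr(task)))
    return summary


async def wait_until_released(in_use):
    # Stand-in for a holder waiting on a future that nothing resolves.
    await in_use


async def demo():
    loop = asyncio.get_running_loop()
    in_use = loop.create_future()  # never resolved, like a leaked connection
    stuck = asyncio.ensure_future(wait_until_released(in_use))
    await asyncio.sleep(0)  # let the task start and block on the future

    summary = pending_task_summary()
    for name, task_repr in summary:
        print(name, task_repr)

    # Clean up so the event loop can shut down normally.
    stuck.cancel()
    try:
        await stuck
    except asyncio.CancelledError:
        pass
    return summary
```

The task repr includes the file and line where the coroutine is suspended, which is how a stuck waiter can be located.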