Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-81536: For nonblocking sockets, add SSLSocket.eager_recv to call SSL_read in a loop #31492

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

Safihre
Copy link

@Safihre Safihre commented Feb 22, 2022

This PR is a continuation of #25478, that was no longer updated by the author @hashbrowncipher.
I implemented all the requested changes.

However, the optimization changes behavior of an SSLSocket.recv. Therefore I introduced a parameter to enable this new behavior only when desired.

The change in behavior if we don't add an opt-in:

Right now callers can rely on fact that either SSLSocket.read returns bytes OR it reads a TLS close_notify message, producing an exception; it never does both. With this change, it becomes possible to read application data and also read a TLS close_notify message, which will be processed by OpenSSL. select/poll/epoll will all now indicate that the socket has no bytes available (which is true), but OpenSSL knows that no additional bytes will ever arrive on the socket. I validated my theory by patching the test_ftplib code to check for EOF after successful reads. With this change, the tests all passed.

That said, I don't imagine it's acceptable to change behavior like this. I think that to roll out this optimization would require calling code to opt in by passing a flag indicating "Please read eagerly from the socket; I know that I am responsible for explicitly checking for EOF before calling select() on the underlying OS socket."

It was requested in the review to also implement this for write(), however, that is not needed as OpenSSL's SSL_write_ex already writes the whole buffer at once. Only reading OpenSSL does in the 16k segments.


Original PR-text:

Continue looping until data is exhausted, and only then reacquire the GIL. This makes it possible to perform multi-threaded TLS downloads without saturating the GIL. On a test workload performing HTTPS download with 32 threads pinned to 16 cores, this produces a 4x speedup.

                          before     after
wall clock time (s)  :    29.637     7.116
user time (s)        :     8.793    12.584
system time (s)      :   105.118    30.010
voluntary switches   : 1,653,065   248,484
speed (MB/s)         :      4733     19712

https://bugs.python.org/issue37355

Closes #25478.
Closes #81536

https://bugs.python.org/issue37355

@Safihre
Copy link
Author

@Safihre Safihre commented Mar 3, 2022

Added the documentation.

@Safihre
Copy link
Author

@Safihre Safihre commented Mar 8, 2022

@tiran @pitrou Do you maybe have time for a review? (you reviewed the original PR)
Thank you 🙂

@cpython-cla-bot
Copy link

@cpython-cla-bot cpython-cla-bot bot commented Apr 14, 2022

All commit authors signed the Contributor License Agreement.
CLA signed

@Safihre Safihre changed the title bpo-37355: For nonblocking sockets, add SSLSocket.eager_recv to call SSL_read in a loop gh-81536: For nonblocking sockets, add SSLSocket.eager_recv to call SSL_read in a loop Apr 14, 2022
@ambv ambv removed the CLA signed label Apr 14, 2022
@ambv
Copy link
Contributor

@ambv ambv commented Apr 14, 2022

@hashbrowncipher could you please re-sign the CLA using the button in this PR?

@jakirkham
Copy link

@jakirkham jakirkham commented Apr 14, 2022

Something weird looks like it happened with their original PR ( #81536 ), which makes the user appear deleted. So maybe there is a GitHub issue going on?

@Safihre Safihre force-pushed the saf-bpo37355 branch 2 times, most recently from 7bb1705 to 507742c Compare Apr 19, 2022
@Safihre
Copy link
Author

@Safihre Safihre commented Apr 21, 2022

@vstinner You have made many revisions to the _ssl.c file and the specific function modified here, would you be able to review the change? Thank you very much!

@vstinner
Copy link
Member

@vstinner vstinner commented Apr 21, 2022

I'm not available to review this change.

Safihre added 3 commits Jun 8, 2022
Continue looping until data is exhausted, and only then reacquire the GIL. This
makes it possible to perform multi-threaded TLS downloads without saturating
the GIL. On a test workload performing HTTPS download with 32 threads pinned
to 16 cores, this produces a 4x speedup.

                              before     after
    wall clock time (s)  :    29.637     7.116
    user time (s)        :     8.793    12.584
    system time (s)      :   105.118    30.010
    voluntary switches   : 1,653,065   248,484
    speed (MB/s)         :      4733     19712

Original author: Josh Snyder - hashbrowncipher
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants