SSLContext.load_verify_locations leaks memory on Linux #84904

Open
Recursing mannequin opened this issue May 22, 2020 · 16 comments
Labels
3.7 (EOL) end of life 3.8 only security fixes 3.9 only security fixes stdlib Python modules in the Lib dir topic-SSL type-bug An unexpected behavior, bug, or error

Comments

@Recursing
Mannequin

Recursing mannequin commented May 22, 2020

BPO 40727
Nosy @tiran, @asvetlov, @1st1, @Recursing

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


GitHub fields:

assignee = None
closed_at = None
created_at = <Date 2020-05-22.09:29:10.619>
labels = ['expert-SSL', '3.8', '3.9', 'performance', '3.7', 'library', 'expert-asyncio']
title = 'SSLContext.load_verify_locations leaks memory on Linux in async code'
updated_at = <Date 2020-05-22.10:32:24.242>
user = 'https://github.com/Recursing'

bugs.python.org fields:

activity = <Date 2020-05-22.10:32:24.242>
actor = 'Recursing'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['Library (Lib)', 'asyncio', 'SSL']
creation = <Date 2020-05-22.09:29:10.619>
creator = 'Recursing'
dependencies = []
files = []
hgrepos = []
issue_num = 40727
keywords = []
message_count = 6.0
messages = ['369573', '369578', '369582', '369583', '369584', '369586']
nosy_count = 4.0
nosy_names = ['christian.heimes', 'asvetlov', 'yselivanov', 'Recursing']
pr_nums = []
priority = 'normal'
resolution = None
stage = None
status = 'open'
superseder = None
type = 'resource usage'
url = 'https://bugs.python.org/issue40727'
versions = ['Python 3.6', 'Python 3.7', 'Python 3.8', 'Python 3.9']

@Recursing
Mannequin Author

Recursing mannequin commented May 22, 2020

Minimal code to reproduce:

import ssl
import certifi
import gc
import asyncio


ca_path = certifi.where()
async def make_async_context() -> None:
    context = ssl.SSLContext(ssl.PROTOCOL_TLS)
    context.load_verify_locations(ca_path)
    await asyncio.sleep(1)


async def main(n: int) -> None:
    await asyncio.wait([make_async_context() for _ in range(n)])


gc.collect()
asyncio.run(main(2000))
input("Finished run, still using lots of memory :(")
gc.collect()
input("gc.collect() does not help :(")

Running this code on several Linux machines (with Python versions from 3.6.9 to 3.9.0a5 and OpenSSL versions from 1.1.1 11 Sep 2018 to 1.1.1g 21 Apr 2020) causes a significant memory leak. On Windows, memory usage peaks at around 1 GB but gets freed.

@Recursing Recursing mannequin assigned tiran May 22, 2020
@Recursing Recursing mannequin added 3.7 (EOL) end of life 3.8 only security fixes 3.9 only security fixes stdlib Python modules in the Lib dir topic-asyncio topic-SSL performance Performance or resource usage labels May 22, 2020
@tiran
Member

tiran commented May 22, 2020

Does it also leak without asyncio?

@Recursing
Mannequin Author

Recursing mannequin commented May 22, 2020

Removing the await asyncio.sleep(1) removes the leak, while changing it to await asyncio.sleep(0) seems to keep it

@tiran
Member

tiran commented May 22, 2020

Without asyncio memory consumption stays low and stable for me:

$ ./python -m venv venv
$ ./venv/bin/pip install psutil
$ ./venv/bin/python
>>> import psutil, ssl, os
>>> ssl.OPENSSL_VERSION
'OpenSSL 1.1.1g FIPS  21 Apr 2020'
>>> p = psutil.Process(os.getpid())
>>> cafile = ssl.get_default_verify_paths().cafile
>>> p.memory_info()
pmem(rss=14811136, vms=237223936, shared=8138752, text=2125824, lib=0, data=6701056, dirty=0)
>>> for i in range(1000):
...     ssl.SSLContext(ssl.PROTOCOL_TLS).load_verify_locations(cafile)
... 
>>> p.memory_info()
pmem(rss=17489920, vms=238170112, shared=9863168, text=2125824, lib=0, data=7647232, dirty=0)
>>> for i in range(1000):
...     ssl.SSLContext(ssl.PROTOCOL_TLS).load_verify_locations(cafile)
... 
>>> p.memory_info()
pmem(rss=17489920, vms=238170112, shared=9863168, text=2125824, lib=0, data=7647232, dirty=0)

@tiran
Member

tiran commented May 22, 2020

When I run your example, RSS jumps from 20 MB to about 1,600 MB. There is almost no increase when I run the loop several more times.

>>> p.memory_info()
pmem(rss=19902464, vms=240513024, shared=10014720, text=2125824, lib=0, data=9887744, dirty=0)
>>> asyncio.run(main(2000))
<stdin>:2: DeprecationWarning: The explicit passing of coroutine objects to asyncio.wait() is deprecated since Python 3.8, and scheduled for removal in Python 3.11.
>>> p.memory_info()
pmem(rss=1608568832, vms=1829105664, shared=10014720, text=2125824, lib=0, data=1598480384, dirty=0)
>>> asyncio.run(main(2000))
>>> p.memory_info()
pmem(rss=1608835072, vms=1829367808, shared=10014720, text=2125824, lib=0, data=1598742528, dirty=0)
>>> asyncio.run(main(2000))
>>> p.memory_info()
pmem(rss=1608601600, vms=1829367808, shared=10014720, text=2125824, lib=0, data=1598742528, dirty=0)

Why are you creating so many SSLContext objects anyway? It's very inefficient and really not necessary. I recommend that you create one context in your application and reuse it for all connections. You only ever need additional contexts for different configurations (protocol, verification, trust anchors, ...).
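
For illustration, a minimal sketch of that pattern (the shared-context name, helper function, and use of certifi below are illustrative, not part of this issue):

import ssl
import certifi

# Build one client context at import time and reuse it for every connection;
# additional contexts are only needed for genuinely different configurations.
SHARED_CONTEXT = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
SHARED_CONTEXT.load_verify_locations(certifi.where())

def get_ssl_context() -> ssl.SSLContext:
    # Hand out the already-configured context instead of building a new one.
    return SHARED_CONTEXT

With a single shared context, load_verify_locations runs only once per process, so the repeated allocations that trigger this issue never accumulate.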

@tiran tiran removed their assignment May 22, 2020
@Recursing
Mannequin Author

Recursing mannequin commented May 22, 2020

Without asyncio memory consumption stays low and stable for me

Same for me

RSS jumps from 20 MB to about 1,600 MB.

That is the memory consumption I observe as well; the issue is that it doesn't get freed on Linux.

There is almost no increase when I run the loop several more times.

Same for me, but of course only if I exit the "async" context between runs

Why are you creating so many SSLContext objects anyway? It's very inefficient and really not necessary.

The original issue was observed in a very long-running process (months) that occasionally needed a context, and it was convenient to just create one every time (actually it creates an AsyncClient context, see encode/httpx#978). Even though that is relatively inefficient, it didn't really matter; the surprise was that memory usage slowly grew to 1 GB, which was very unexpected.

@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
@mhils
Contributor

mhils commented May 11, 2022

FWIW I've been running into the same issue independently with pyOpenSSL. One potentially relevant observation is that calling

ctypes.CDLL('libc.so.6').malloc_trim(0)

returns most of the memory to the OS. I suspect that load_verify_locations may only be a strawman here, and that we're actually observing a much smaller memory leak combined with some pathological heap allocation patterns.
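
A rough way to check this after running one of the reproducers, assuming glibc and the psutil package (neither of which this issue requires):

import ctypes
import os
import psutil

def rss_mib() -> float:
    # Resident set size of the current process, in MiB.
    return psutil.Process(os.getpid()).memory_info().rss / 2**20

print(f"RSS before trim: {rss_mib():.1f} MiB")
# glibc-specific: ask the allocator to return free heap pages to the OS.
ctypes.CDLL("libc.so.6").malloc_trim(0)
print(f"RSS after trim:  {rss_mib():.1f} MiB")

If RSS drops sharply here even though gc.collect() had no effect, the pages were already free inside the allocator rather than still referenced by Python objects.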

@iritkatriel iritkatriel added type-bug An unexpected behavior, bug, or error and removed performance Performance or resource usage labels Aug 17, 2022
@sw55555

sw55555 commented Nov 17, 2022

Leaks happen on Python 3.11 too, but only on Linux (compiled with OpenSSL 1.1.1q, the same version as the Windows build). Memory usage after running the test:

Ubuntu 22.04.1 LTS :

Line #    Mem usage    Increment  Occurrences   Line Contents
=============================================================
     8     21.6 MiB     21.6 MiB           1   @profile
     9                                         def leak_test():
    10     21.6 MiB      0.0 MiB           1       ca_certs = certifi.where()
    11     26.7 MiB      5.2 MiB         303       contexts = [SSLContext(PROTOCOL_TLS_CLIENT) for _ in range(300)]
    12    238.5 MiB      0.0 MiB         301       for context in contexts:
    13    238.5 MiB    211.7 MiB         300           context.load_verify_locations(ca_certs)
    14
    15    238.5 MiB      0.0 MiB           1       del contexts
    16    238.5 MiB      0.0 MiB           1       gc.collect()
    17
    18    238.5 MiB      0.0 MiB           1       print('Test complete!')

Windows 10 :

Line #    Mem usage    Increment  Occurrences   Line Contents
=============================================================
     8     22.7 MiB     22.7 MiB           1   @profile
     9                                         def leak_test():
    10     22.9 MiB      0.3 MiB           1       ca_certs = certifi.where()
    11     24.6 MiB      1.6 MiB         303       contexts = [SSLContext(PROTOCOL_TLS_CLIENT) for _ in range(300)]
    12    251.8 MiB      0.0 MiB         301       for context in contexts:
    13    251.8 MiB    227.2 MiB         300           context.load_verify_locations(ca_certs)
    14
    15     35.5 MiB   -216.3 MiB           1       del contexts
    16     35.5 MiB      0.0 MiB           1       gc.collect()
    17
    18     35.5 MiB      0.0 MiB           1       print('Test complete!')

My code to reproduce this bug:

import gc
import certifi
from ssl import SSLContext, PROTOCOL_TLS_CLIENT

from memory_profiler import profile


@profile
def leak_test():
    ca_certs = certifi.where()
    contexts = [SSLContext(PROTOCOL_TLS_CLIENT) for _ in range(300)]
    for context in contexts:
        context.load_verify_locations(ca_certs)

    del contexts
    gc.collect()

    print('Test complete!')


if __name__ == "__main__":
    leak_test()

@sw55555

sw55555 commented Nov 17, 2022

Tested on Python 3.10.6 (Ubuntu 22.04.1 LTS); it leaks too.

@gvanrossum gvanrossum changed the title SSLContext.load_verify_locations leaks memory on Linux in async code SSLContext.load_verify_locations leaks memory on Linux Nov 17, 2022
@gvanrossum
Member

On macOS (Ventura 13.1) I also see most memory being returned, as on Windows 10 above. So it does seem to be Linux-specific. Since the latest repro doesn't involve async code, I removed that from the subject and removed the expert-asyncio label.

@mhils may well be right that this is some other leak together with a weird allocation pattern.

@vminfant

Python 3.10.12 and 3.11.2 on Debian GNU/Linux 12 (bookworm) have the memory leak as well. Is there any progress, or any recommendation for avoiding this issue?

@vEpiphyte

vEpiphyte commented Feb 20, 2024

This still reproduces in 3.12.2 on Ubuntu 22.04, compiled against OpenSSL 3.0.2.

@gvanrossum
Member

@vEpiphyte

This still reproduces in 3.12.2 on Ubuntu 22.04, compiled against OpenSSL 3.0.2.

What exactly did you try?

@vEpiphyte

vEpiphyte commented Feb 21, 2024

@gvanrossum Here is the code I used. It is a variant of the original: I hold a reference to each SSLContext object in a dictionary (to avoid any nebulous object-lifetime issues in the original make_async_context), and the dictionary is then cleared. The resident memory usage was observed with external tools.

With the malloc_trim(0) call from @mhils, the resident memory decreases when the cache is cleared. Without it, the memory stays allocated after the dictionary is cleared. This was consistent with and without del statements on the SSLContext objects.

import gc
import sys
import ssl
import time
import ctypes

import certifi

N = 1000

ca_path = certifi.where()

context_cache = {}

def clear_cache(cache):
    cache.clear()
    print('Calling malloc_trim(0)')
    # Per https://github.com/python/cpython/issues/84904#issuecomment-1123760269 this helps...
    # ctypes.CDLL('libc.so.6').malloc_trim(0)

def make_context(i) -> None:
    context = ssl.SSLContext(ssl.PROTOCOL_TLS)
    context.load_verify_locations(ca_path)
    context_cache[i] = context
    return None

def main(n: int) -> None:
    for i in range(n):
        if i % 100 == 0:
            print(f'{i=}')
        make_context(i)
    print('done creating contexts, sleeping 6')
    time.sleep(6)
    print('clearing cache')
    clear_cache(context_cache)
    print('done clearing, sleeping 6')
    time.sleep(6)


gc.collect()
main(N)
input("Finished run, still using lots of memory :(")
gc.collect()
input("Does gc.collect() help?")
time.sleep(6)
sys.exit(0)

@gvanrossum
Member

But if malloc_trim() releases the memory, the problem you are experiencing is not leakage but fragmentation.
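
For long-running processes on glibc-based Linux, trimming the heap occasionally after bursts of context creation is one possible mitigation; a sketch under that assumption (not a fix proposed in this issue):

import ctypes
import ctypes.util

def trim_heap() -> None:
    # Best effort: only glibc exposes malloc_trim; on musl, macOS, or Windows this is a no-op.
    libc_name = ctypes.util.find_library("c")
    if libc_name is None:
        return
    libc = ctypes.CDLL(libc_name)
    if hasattr(libc, "malloc_trim"):
        libc.malloc_trim(0)

Calling trim_heap() after clearing the context cache in the script above should bring RSS back down, consistent with the malloc_trim(0) observations earlier in the thread.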

@vEpiphyte

vEpiphyte commented Feb 22, 2024

I just wanted to report that the s/leak/fragmentation/ is still observable on Linux with 3.12.2. I do not know enough about Python memory management to say whether that behavior is a Linux-specific artifact, but it does feel a bit odd that it is only observed on that platform.

Projects
Status: Todo
Development

No branches or pull requests

7 participants