bpo-23689: re module, fix memory leak when a match is terminated by a signal or memory allocation failure #32283

Merged
merged 11 commits into from Apr 3, 2022

Contributor

@animalize animalize commented Apr 3, 2022

@serhiy-storchaka

This time I checked it carefully over several rounds, so it should be in a good state.

Tested with the VERBOSE/VVERBOSE macros defined; it builds and runs well.

https://bugs.python.org/issue23689

Member

@serhiy-storchaka serhiy-storchaka left a comment

LGTM.

I have only one question: how to prove that we need only one SRE_REPEAT structure per the REPEAT code?

Modules/_sre.c (outdated, resolved)
self.assertEqual(get_debug_out(r'(?:ab)*(?:cd)*'), '''\
MAX_REPEAT 0 MAXREPEAT
self.assertEqual(get_debug_out(r'(?:ab)*?(?:cd)*'), '''\
MIN_REPEAT 0 MAXREPEAT
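
For readers who want to reproduce the kind of output these truncated assertions compare against, here is a minimal sketch. The definition of `get_debug_out` is not shown in this excerpt, so the helper below, `debug_output`, is a hypothetical stand-in that simply captures what `re.compile(..., re.DEBUG)` prints to stdout. The rest of the dump varies between Python versions, so only the repeat lines quoted above are checked.

```python
import contextlib
import io
import re


def debug_output(pattern):
    """Return the text that re.compile(pattern, re.DEBUG) prints to stdout.

    Hypothetical stand-in for the test helper, whose definition is not
    shown in this conversation.
    """
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        re.compile(pattern, re.DEBUG)
    return buffer.getvalue()


# A greedy repeat shows up as MAX_REPEAT in the parse-tree dump,
# a lazy repeat as MIN_REPEAT.
greedy = debug_output(r'(?:ab)*(?:cd)*')
lazy = debug_output(r'(?:ab)*?(?:cd)*')

print(greedy)
```
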
Member

@serhiy-storchaka serhiy-storchaka Apr 3, 2022

You just read my mind! I was going to propose such a change, but I thought that I was already bothering you too much.

Contributor Author

@animalize animalize Apr 3, 2022

I thought of this just after posting this PR.

> I thought that I was already bothering you too much.

As an inactive contributor, I don't mind at all.
I'm not well practised, so I need to keep improving the patch to get it into a good state.
Whenever I think it's good, I always find its shortcomings afterwards.

> I have only one question: how to prove that we need only one SRE_REPEAT structure per the REPEAT code?

I have to think about how to answer your question.

Contributor Author

@animalize animalize commented Apr 3, 2022

> I have only one question: how to prove that we need only one SRE_REPEAT structure per the REPEAT code?

At any time, a given SRE_OP_REPEAT has only one entry on the stack.

  • When this op is executed, it is pushed onto the stack. SRE_OP_MAX_UNTIL / SRE_OP_MIN_UNTIL may then generate many backtracking points on the stack, but these sit above the SRE_OP_REPEAT entry.
  • When backtracking, it is popped from the stack.

This wouldn't work if the re engine could remember some backtracking states to optimize performance, but then the code of the re module would be much more complicated.
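
The stack discipline described above can be sketched with a toy model. This is only an illustration of the argument, not the code in `Modules/_sre.c`: the function `match_repeat`, its frame labels, and its return values are all hypothetical, and it models a single, non-nested greedy repeat `(?:body)*` followed by a literal tail, so it says nothing about nested repeats.

```python
def match_repeat(body, tail, text):
    """Toy matcher for (?:body)* followed by tail, anchored at text[0].

    Returns (end, peak_depth, repeat_frames), where end is the match end
    offset (or None), peak_depth is the largest stack size reached, and
    repeat_frames is the number of REPEAT frames on the stack at that peak.
    """
    if not body:
        raise ValueError("body must be non-empty in this toy model")

    # The REPEAT frame is pushed exactly once, when the op is executed.
    stack = [("REPEAT", 0)]
    pos = 0

    # Greedy phase: each successful iteration leaves one backtracking
    # point (labelled UNTIL here) above the REPEAT frame.
    while text.startswith(body, pos):
        stack.append(("UNTIL", pos))
        pos += len(body)

    peak_depth = len(stack)
    repeat_frames = sum(1 for kind, _ in stack if kind == "REPEAT")

    # Try the tail; on failure, pop one frame and retry from its position.
    while True:
        if text.startswith(tail, pos):
            return pos + len(tail), peak_depth, repeat_frames
        kind, saved_pos = stack.pop()
        if kind == "REPEAT":
            # Popping the REPEAT frame itself means every option failed.
            return None, peak_depth, repeat_frames
        pos = saved_pos


end, peak_depth, repeat_frames = match_repeat("ab", "abc", "abababc")
print(end, peak_depth, repeat_frames)
```

In this run the stack grows to four frames (one REPEAT frame plus three backtracking points), yet only one of them is a REPEAT frame, and it is the last one popped when the whole match fails.
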

@serhiy-storchaka serhiy-storchaka merged commit 6e3eee5 into python:main Apr 3, 2022
12 checks passed
@animalize animalize deleted the repeat_array2 branch Apr 4, 2022
4 participants