Skip to content

gh-103656: Transfer f-string buffers to parser to avoid use-after-free #103896

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

lysnikolaou
Copy link
Member

@lysnikolaou lysnikolaou commented Apr 26, 2023

@lysnikolaou
Copy link
Member Author

After off-line discussions with @pablogsal, we felt that the best way forward is to add metadata to tokens and use it to pass the fstring debug buffer from the tokenizer to the parser, store it in the arena and ensure that its lifetime is okay for us to use in later passes as well.

cc @sunmy2019 whose review I can't request.

@@ -432,6 +433,15 @@ update_fstring_expr(struct tok_state *tok, char cur)
case ':':
if (tok_mode->last_expr_end == -1) {
tok_mode->last_expr_end = strlen(tok->start);
PyObject *res = PyUnicode_DecodeUTF8(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hummm, not sure if this is the place where we should be doing this. Isn't it better to set the metadata when we create the token itself? This way we don't need to pass down the token all the way to here. This function just updates the pointers and at the point when we have the token ready we set the metadata (because the pointers are ready).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me push a commit to see what I mean

@pablogsal pablogsal force-pushed the fstring-fix-last-expr-buffer-use-after-free branch 2 times, most recently from 02ab63e to 4cbd070 Compare April 26, 2023 23:25
@pablogsal pablogsal force-pushed the fstring-fix-last-expr-buffer-use-after-free branch from 4cbd070 to 51d7d83 Compare April 26, 2023 23:25
@pablogsal
Copy link
Member

pablogsal commented Apr 26, 2023

This is leaking refs in its current state. My guess is that we are not cleaning the python objects for f-strings that don't get debug info.

@pablogsal pablogsal force-pushed the fstring-fix-last-expr-buffer-use-after-free branch from 5920b9f to 326059e Compare April 26, 2023 23:52
@pablogsal
Copy link
Member

Btw as an optimization we should only set the metadata if the f-string has a '='

@pablogsal
Copy link
Member

Found the refleaks, the only thing missing is to fix #103896 (comment)

@lysnikolaou
Copy link
Member Author

Found the refleaks, the only thing missing is to fix #103896 (comment)

I can take that.

@pablogsal
Copy link
Member

Found the refleaks, the only thing missing is to fix #103896 (comment)

I can take that.

I just did it 😉

@pablogsal
Copy link
Member

Found the refleaks, the only thing missing is to fix #103896 (comment)

I can take that.

I just did it wink

Although I may have messed something in Windows?

@lysnikolaou
Copy link
Member Author

Thanks for the help on this @pablogsal. All changes look good.

@lysnikolaou lysnikolaou enabled auto-merge (squash) April 27, 2023 01:22
@lysnikolaou lysnikolaou disabled auto-merge April 27, 2023 01:22
@lysnikolaou lysnikolaou enabled auto-merge (squash) April 27, 2023 01:22
@lysnikolaou lysnikolaou merged commit 9169a56 into python:main Apr 27, 2023
@sunmy2019
Copy link
Member

Looks great!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Out-of-bounds read in unicodeobject.c ascii_decode()
4 participants