New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bpo-30416: Protect the optimizer during constant folding. #4860
bpo-30416: Protect the optimizer during constant folding. #4860
Conversation
It no longer spends much time doing complex calculations and no longer consumes much memory for creating large constants that will be dropped later. This fixes also bpo-21074.
Python/ast_opt.c
Outdated
return limit; | ||
} | ||
|
||
#define MAX_INT_SIZE 128 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you mind to specify the unit? (bytes? bits? digits?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bits.
Python/ast_opt.c
Outdated
|
||
#define MAX_INT_SIZE 128 | ||
#define MAX_COLLECTION_SIZE 256 | ||
#define MAX_STR_SIZE 4096 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto: number of characters?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Characters.
Python/ast_opt.c
Outdated
#define MAX_INT_SIZE 128 | ||
#define MAX_COLLECTION_SIZE 256 | ||
#define MAX_STR_SIZE 4096 | ||
#define MAX_TOTAL_ITEMS 1000000 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow, that's a lot. What's the rationale for so big objects? The old peephole optimizer limited to tuples of 20 items or something like that, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The old peephole optimizer doesn't have such limit (see bpo-30416). This is a limit for the total number of items, including nested tuples and sets. For example ((((((((1,)*20,)*20,)*20,)*20,)*20,)*20,)*20,)*20
contains around 21**8
items.
The size of every tuple and set is limited to 256 (was 20).
What the limit looks reasonable to you?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a matter of tradeoff between CPU (compute at runtime) vs disk space/memory.
If the optimized tuple is in a function which is almost never called, it would be waste to allocate 1000000 in memory, whereas it's never used.
IHMO the maximum size of "a constant" (a constant could be a tuple with nested tuples) should be 4 KB.
sys.getsizeof((None,)*500)
4048
So I suggest to limit to 500 items, not 1000000.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
500 is too small. A tuple of 256 pairs contains 256*3 items total. I have decreased the limit to 1024.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you want to build a constant for a tuple of 768 items? What is the use case? Is it a performance issue to build it at runtime?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Constant tuples or strings of size 256 are used in decoders. Constant dicts of tens elements are used in encoders. In sre_compile.py there is a constant expression b'0' + b'1' * 255
. I think the size 256 is the reasonable limit of the size of constant collections. If the collection contains not primary types, but tuples, the total number of items is multiple times the size of the collection.
Using too small limit can cause a regression. Note that currently there is no limit. 1024 is much lesser than infinity.
Python/ast_opt.c
Outdated
if (vbits == (size_t)-1 || wbits == (size_t)-1) { | ||
return NULL; | ||
} | ||
if (vbits + wbits > MAX_INT_SIZE) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a risk of integer overflow, no? May check "vbits+wbits < SIZE_MAX" with "vbits < SIZE_MAX - wbits" ?
Ok for 1024 items, but why limiting a single tuple to 256? Maybe 1024 should be used for all limits? |
Oh, it already is increased from 20 to 256. I think this is enough. |
It no longer spends much time doing complex calculations and no longer consumes much memory for creating large constants that will be dropped later.
This fixes also bpo-21074.
https://bugs.python.org/issue30416