-
Notifications
You must be signed in to change notification settings - Fork 1.7k
Python: Fix bad fastTC
in ASTNode::contains
#8028
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Between 2.8.0 and 2.8.1, the performance of `ASTNode::contains` degraded considerably for some queries, one such example being the `py/comparison-of-identical-expressions` query. In 2.8.0, the `contains` relation was inlined into the helper predicates for that query, whereas in 2.8.1, the `contains` relation was evaluated in isolation using a `boundedFastTC`: ``` (4s) Starting to evaluate predicate boundedFastTC(AstExtended::AstNode::getAChildNode_dispred#ff,AstExtended::AstNode#class#f)/2@c4bc3flu = HOP boundedFastTC(3646779,2616172) (887s) >>> Relation boundedFastTC(AstExtended::AstNode::getAChildNode_dispred#ff,AstExtended::AstNode#class#f): 12307755 rows using 115 MB ``` I tried rewriting the `contains` predicate to use a manual TC, but this also had poor performance (though still better than `fastTC`). Because `contains` is used in a lot of places, it's inconvenient to manually fix up and magick all of these places, and so I opted to `inline` this predicate instead.
Performance test not looking too promising..? |
Yeah, the trouble with I have a further fix for one of the queries that did this. I don't know if it's sufficient, but I'll push it and rerun the performance tests. |
This was one instance where inlining didn't actually affect the join order (and so the compiler decided to materialise the full `fastTC` without using any of the local context). In this case, it seems to be sufficient to simply manually specialise the `contains` relation to the particular setup needed here. (Note that the first occurrence of `contains` in this query is _not_ the same predicate -- it's a version that works on `StmtList`s.)
Ok, that fix looks sensible, let's see what effect is has.. :-) |
This definition of `getCase` was inlined in `getScope` and then evaluated in its entirety (without any magic) which turned out to be costly. This meant that any use of `getScope` would pay this penalty if not already present in the cache. The rewrite in this commit _should_ have the same outwards behaviour, since the parent of a `Pattern` is either a pattern itself, or a `Case`. However, by restricting to just these classes, we never materialise the entire `contains` relation.
This one had the same bad performance: ``` NestedLoopsSameVariableWithReuse.ql-3:boundedFastTC(AstExtended::AstNode::getAChildNode_dispred#ff,py_stmts_0#higher_order_body) ........ 14m24s ``` Compared to now: ``` NestedLoopsSameVariableWithReuse.ql-3:NestedLoopsSameVariableWithReuse::forLoopContainsName#ff ......... 10ms ```
Hm. Seems my |
Turns out that in isolation, `getCase` is fine. What was killing the performance was the fact that it got inlined into `getScope`, where it resulted in the full evaluation of the flipped `contains` relation. I think `nomagic` is sensible here -- I would not expect there to be so many results for this predicate that evaluating it would become a problem (note for instance that it already is restricted to relating `Pattern`s and `Case`s).
I think the latest evaluation looks good. There are a few individual queries that got slower overall, but the slowdown there is on the order of 10 seconds in total (whereas the overall speedup is between 5% and 9%). |
I am tempted, however, to just do a PR where I add the |
Yes, that could well be a simple fix, avoiding splitting out predicates. |
This is a simplified version of github#8028 consisting of just the `nomagic` fix.
Was #8255 a sufficient fix for this performance problem? |
I think so. I'll close this for now. If we start seeing degradation in the calculation of |
Between 2.8.0 and 2.8.1, the performance of
ASTNode::contains
degraded considerably for some queries (and some databases), one such example being thepy/comparison-of-identical-expressions
query.In 2.8.0, the
contains
relation was inlined into the helper predicates for that query, whereas in 2.8.1, thecontains
relation was evaluated in isolation using aboundedFastTC
:I tried rewriting the
contains
predicate to use a manual TC, but this also had poor performance (though still better thanfastTC
). Becausecontains
is used in a lot of places, it's inconvenient to manually fix up and magic all of these places, and so I opted toinline
this predicate instead.