-
Notifications
You must be signed in to change notification settings - Fork 1.7k
Tree sitter update #8909
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Tree sitter update #8909
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Found 20 vulnerabilities.
ruby/ql/lib/upgrades/9fdd1d40fd3c3f8f9db8fabf5a353580d14c663a/ruby_ast_node_info.ql
Fixed
Show fixed
Hide fixed
ruby/ql/lib/upgrades/9fdd1d40fd3c3f8f9db8fabf5a353580d14c663a/ruby_call_arguments.ql
Fixed
Show fixed
Hide fixed
ruby/ql/lib/upgrades/9fdd1d40fd3c3f8f9db8fabf5a353580d14c663a/ruby_call_arguments.ql
Fixed
Show fixed
Hide fixed
ruby/ql/lib/upgrades/9fdd1d40fd3c3f8f9db8fabf5a353580d14c663a/ruby_call_arguments.ql
Fixed
Show fixed
Hide fixed
ruby/ql/lib/upgrades/9fdd1d40fd3c3f8f9db8fabf5a353580d14c663a/ruby_call_arguments.ql
Fixed
Show fixed
Hide fixed
ruby/ql/lib/upgrades/9fdd1d40fd3c3f8f9db8fabf5a353580d14c663a/ruby_call_receiver.ql
Fixed
Show fixed
Hide fixed
ruby/ql/lib/upgrades/9fdd1d40fd3c3f8f9db8fabf5a353580d14c663a/ruby_call_receiver.ql
Fixed
Show fixed
Hide fixed
ruby/ql/lib/upgrades/9fdd1d40fd3c3f8f9db8fabf5a353580d14c663a/ruby_scope_resolution_def.ql
Fixed
Show fixed
Hide fixed
ruby/ql/lib/upgrades/9fdd1d40fd3c3f8f9db8fabf5a353580d14c663a/ruby_call_receiver.ql
Fixed
Show fixed
Hide fixed
ruby/ql/lib/upgrades/9fdd1d40fd3c3f8f9db8fabf5a353580d14c663a/ruby_call_receiver.ql
Fixed
Show fixed
Hide fixed
bb52ca1
to
99b77c6
Compare
925cf49
to
ccc1864
Compare
Co-authored-by: intrigus-lgtm <60750685+intrigus-lgtm@users.noreply.github.com>
@@ -11,7 +11,7 @@ flate2 = "1.0" | |||
node-types = { path = "../node-types" } | |||
tree-sitter = "0.19" | |||
tree-sitter-embedded-template = "0.19" | |||
tree-sitter-ruby = { git = "https://github.com/tree-sitter/tree-sitter-ruby.git", rev = "1ebfdb288842dae5a9233e2509a135949023dd82" } | |||
tree-sitter-ruby = { git = "https://github.com/tree-sitter/tree-sitter-ruby.git", rev = "1a3936a3545c0bd9344a0bf983fafc7e17443e39" } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternatively, we could just remove the rev
field here and rely on Cargo.lock
to pin the exact revision. Assuming we want to stay relatively up-to-date with tree-sitter-ruby
changes, this would help us do that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would work, although I'd like to be explicit. Things won't actually work if you refreshed the lock file because a minor grammar change might require dbscheme, library and upgrade/downgrade scripts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah but then tests would fail right? Either approach works, but I do worry that we will forget to keep this up-to-date over time unless there's some automation around it.
@@ -252,10 +252,24 @@ class Lambda extends Callable, BodyStmt, TLambda { | |||
|
|||
/** A block. */ | |||
class Block extends Callable, StmtSequence, Scope, TBlock { | |||
/** | |||
* Get a local variable declared by this block. | |||
* For example `local` in `{ | param; local| puts param }`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had no idea this was a feature. Is it new?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a very old feature, just rarely used.
@@ -248,9 +252,6 @@ private module Cached { | |||
casePattern(g) | |||
) | |||
} or | |||
TScopeResolutionMethodCall(Ruby::ScopeResolution g, Ruby::Identifier i) { | |||
isScopeResolutionMethodCall(g, i) | |||
} or |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we removing this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The parser now correctly distinguishes between calls using ::
and scope resolutions for constants, so this code is now dead. expr::identifier
is a call with ::
as the call operator, and expr::Constant
is a scope resolution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah so expr::identifer
is now considered equivalent to expr.identifier
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes indeed. Both are a Call
but their getOperator()
will return a different token.
or | ||
toGenerated(result) = g.getMethod().(Ruby::ArgumentList).getChild(n) | ||
} | ||
final override Expr getArgumentImpl(int n) { toGenerated(result) = g.getArguments().getChild(n) } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like a nice cleanup - weird that Ruby::ArgumentList
was previously considered a "method"...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that made no sense at all, and had been on my list of things to fix for a long time ;-)
# 6| getStmt: [BooleanLiteral] true | ||
# 7| getStmt: [BooleanLiteral] TRUE | ||
# 7| getStmt: [ConstantReadAccess] TRUE |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice 👍
I didn't know we were incorrectly interpreting these as booleans before. Good to see it fixed.
On the upgrade/downgrade scripts: I have a sense of what they're doing but don't know how to ensure that they work properly. Do we have any tests that ensure they give the expected result? We have a "Check DB upgrade/downgrade script" test but I don't know if it does much more than just run them and make sure nothing throws an error. |
The I validated the behaviour of the upgrade scripts by building the These are the steps I used:
For testing the downgrade scripts you can do the opposite, compile a new extractor, but run the QL tests with the old library.
Note that there are some differences in the test output, but these should be benign or otherwise expected. |
ruby/ql/lib/upgrades/9fdd1d40fd3c3f8f9db8fabf5a353580d14c663a/ruby_call_method.ql
Outdated
Show resolved
Hide resolved
RubyScopeResolution() { | ||
exists(RubyConstant name | ruby_scope_resolution_def(this, name)) and | ||
not ruby_call_def(_, this) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was expecting that we'd simply drop rows where the second column is not a @ruby_token_constant
, so this charpred seems like it could be deleted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The not ruby_call_def(_, this)
is still important. It says that things like expr::Foo()
are not scope resolutions. Even though Foo
looks like a constant name, it is not.
ruby_ast_node_info(node, _, _, loc) and | ||
parent instanceof ScopeResolutionMethodCall and | ||
node = | ||
rank[index + 1](RubyAstNode child, int x, int oldIndex | | ||
exists(RubyAstNodeParent oldParent | | ||
ruby_ast_node_info(child, oldParent, oldIndex, _) and | ||
child != parent.(ScopeResolutionMethodCall).getScopeResolution() | ||
| | ||
oldParent = parent and x = 1 | ||
or | ||
oldParent = parent.(ScopeResolutionMethodCall).getScopeResolution() and x = 0 | ||
) | ||
| | ||
child order by x, oldIndex | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain what this is doing, and why?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A call like expr::Foo(args) block
used to be represented as Call{ method: ScopeResolution(expr, Foo), arguments: args, block: block}
. The children of the Call node were:
- ScopeResolution(expr, Foo)
- args
- block
The new parse tree is like Call{ receiver: expr, method: Foo, arguments: args }
. For this Call the children should be:
- expr
- Foo
- args
- block
To fix the parent child relation for the Call we simply take the children of the ScopeResolution
first (x=0
) ordered by oldIndex
followed by the other children (x=1
) of the original Call in their oldIndex
order. The rank
is used to ensure the new index is consecutive even if some of the children are missing.
042bd24
to
ba5b715
Compare
Co-authored-by: Nick Rolfe <nickrolfe@github.com>
ba5b715
to
907c3db
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very nice!
This pull request updates the tree-sitter-ruby grammar used by our extractor. The pull request implements up- and downgrade scripts that are semantics preserving except in case of
complex
literals (1i
,3.14i
,3ri
,etc). In the new grammarcomplex
literals are no longer a simple token but a node with a float, integer, or rational literal as child. As a result an upgrade script would have to create new nodes, which is not possible. Instead the upgrade script simply convertscomplex
literals tofloat
literals. This is wrong, but preserves the structure of the AST so there are no gaps in the control and data flow graphs.