Skip to content

Python: All dict constructor args are relevant #17566

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Oct 4, 2024

Conversation

yoff
Copy link
Contributor

@yoff yoff commented Sep 24, 2024

Pull Request checklist

All query authors

Internal query authors only

  • Autofixes generated based on these changes are valid, only needed if this PR makes significant changes to .ql, .qll, or .qhelp files. See the documentation (internal access required).
  • Changes are validated at scale (internal access required).
  • Adding a new query? Consider also adding the query to autofix.

tausbn
tausbn previously approved these changes Sep 24, 2024
Copy link
Contributor

@tausbn tausbn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. 👍

joefarebrother
joefarebrother previously approved these changes Sep 25, 2024
This is handled by parameter-argument matching
@yoff yoff dismissed stale reviews from joefarebrother and tausbn via 9357762 September 30, 2024 22:03
@yoff
Copy link
Contributor Author

yoff commented Oct 1, 2024

I was not comfortable with the dataflow consistency violations. It turned out we had superflous code, leading to many arguments in the same position. That is fixed now, and I added a test showing that we do see taint through later arguments.

@yoff yoff marked this pull request as ready for review October 1, 2024 08:03
@yoff yoff requested a review from a team as a code owner October 1, 2024 08:03
@yoff yoff added the no-change-note-required This PR does not need a change note label Oct 1, 2024
Comment on lines -375 to -378
exists(int index1, int index2 | ppos.isPositionalLowerBound(index1) and index2 >= index1 |
result.getParameter() = func.getArg(index2 + this.positionalOffset())
)
or
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand why you are removing this bit of code.

From my point of view, it looks like it does the right thing. If you ask for a parameter at position 1.. it will give you all the positional parameters except the first one. Why would we not want that?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see a comment about dataflow inconsistencies. I would recommend to include them in the git history next time, so it's possible to know what they look like... We do currently use arguments/parameters in a slightly different way than most other dataflow libraries, which means we get inconsistency errors. (which is still something we should solve 😮‍💨)

Anyway, can you please show the inconsistency warnings, then we can discuss the right approach?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Anyway, can you please show the inconsistency warnings, then we can discuss the right approach?

Sure, you can see some here: https://github.com/github/semmle-code/actions/runs/11020219625/job/30604524761#step:15:830
(Found via the red x on the commit before they were fixed. But that will expire at some point.)

Interesting idea to quote them in the commit message, I will remember that 👍

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By the same token, I should quote them here rather than link...
For this snippet:

def use_of_apply(func, args):
    apply(func, args)

we get alerts

callable pos parameter msg
Function use_of_apply position 0.. ControlFlowNode for func Parameters with overlapping positions.
Function use_of_apply position 0.. ControlFlowNode for args Parameters with overlapping positions.

stating that all parameters are now in conflict.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting idea to quote them in the commit message, I will remember that 👍

You can commit the files in the CONSISTENCY directory as part of your commit when the warnings are introduced, then you don't need to quote anything in the commit message. (and obviously remove them once you resolve the problems).

(this is all assuming you run the consistency checks locally, which you can do by setting the --consistency-queries flag when using the CLI)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, that is even more structured :-)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I would have liked is some details around why we need this. Here's the details I've gathered myself

  1. we only create these parameter positions when Argument[<n>..] is used in MaD
    TPositionalParameterLowerBoundPosition(int pos) {
    FlowSummaryImpl::ParsePositions::isParsedArgumentLowerBoundPosition(_, pos)
    } or
  2. Currently we only use that in a test-file (see this search)
  3. So the addition of Argument[0..] in this PR caused us to start having a TPositionalParameterPosition available, which highlighted some problems.

We use DataFlowCallable.getParameter in multiple places, for example in ParameterNodeImpl:

predicate isParameterOf(DataFlowCallable c, ParameterPosition ppos) {
this = c.getParameter(ppos)
}

dataflow library generally expects parameters to only be found at one position. (which is why we're now getting the consistency warnings on all positional arguments)

The one place that I could find where removing isPositionalLowerBound support from DataFlowFunction.getParameter could end up having impact is in type-tracking implementation:

/**
* Holds if `nodeFrom` steps to `nodeTo` by being passed as a parameter in a call.
*
* Flow into summarized library methods is not included, as that will lead to negative
* recursion (or, at best, terrible performance), since identifying calls to library
* methods is done using API graphs (which uses type tracking).
*/
predicate callStep(Node nodeFrom, LocalSourceNode nodeTo) {
exists(
DataFlowPrivate::DataFlowCall call, DataFlowPrivate::DataFlowCallable callable,
DataFlowPrivate::ArgumentPosition apos, DataFlowPrivate::ParameterPosition ppos
|
nodeFrom = call.getArgument(apos) and
nodeTo = callable.getParameter(ppos) and
DataFlowPrivate::parameterMatch(ppos, apos) and
callable = call.getCallable()
)
}

However, since the assumption is that isPositionalLowerBound is ONLY used for "library callables", and those are not supported in callStep anyway, the change doesn't have any implications. (and flow-summary-impl uses parameterMatch internally, so no problem there)

That made me think, what about the "simple" library-callables defined with getACallSimple (not using API graph)? I think I found the logic that handles parameter positions for those, and I don't see anything about isPositionalLowerBound. I guess this issue hasn't come up since (a) we haven't used isPositionalLowerBound for anything, and (b) we only use getACallSimple in tests -- but the details are a bit foggy, so I might have gotten something wrong in my reasoning. _(I see some test failures in the PR that introduced isPositionalLowerBound, although the details are not available any longer (and not part of the git history)... I assume that must have been for the handling of flow-summaries for "library callables" that uses the API graph, for which we have tests).


Overall I think it looks good to remove this code from getParameter, but it wasn't really clear to me before spending almost an hour looking through these things. I think we might have a bug in our handling of getACallSimple, which honestly doesn't seem super important.

However, I would strongly recommend a comment explaining why isPositionalLowerBound shouldn't be handled in getParameter (so it doesn't just look like an omission when someone checks if all parameter positions are handled in there)

Co-authored-by: Rasmus Wriedt Larsen <rasmuswriedtlarsen@gmail.com>
@yoff yoff requested a review from RasmusWL October 1, 2024 11:13
tausbn
tausbn previously approved these changes Oct 1, 2024
@yoff yoff requested a review from RasmusWL October 3, 2024 10:19
yoff and others added 2 commits October 4, 2024 14:01
Co-authored-by: Rasmus Wriedt Larsen <rasmuswriedtlarsen@gmail.com>
@yoff yoff merged commit 306b087 into github:main Oct 4, 2024
13 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
no-change-note-required This PR does not need a change note Python
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants