Join GitHub today
GitHub is home to over 50 million developers working together to host and review code, manage projects, and build software together.
Sign upSERVER-41854 Add meta projection $shardName. #1317
Conversation
Thanks for the pull request! We'll look into it. |
Hi @krk , Thanks for submitting this pull request! I've left a few comments inline, but I'll try to summarize the major points of feedback here:
I'd be happy to work with you in order to get these changes in shape for merge, if you're interested! As I mentioned in my message in SERVER-14423, I'd also be happy to suggest other tickets that may be more straightforward to implement and therefore easier for us to accept. Please feel free to reach out if you have any questions about my comments, or if anything is unclear! Best, |
@@ -77,6 +77,18 @@ double textScore(const WorkingSetMember& member) { | |||
} | |||
} | |||
|
|||
StringData shardName(const WorkingSetMember& member) { |
This comment has been minimized.
This comment has been minimized.
dstorch
Oct 9, 2019
Contributor
We are currently rewriting the way that projections are executed for find commands, so that find commands and aggregate commands share a projection executor implementation. Unfortunately, that means that there has been a lot of code change in this file, and I'm afraid this part of the change won't merge.
@@ -119,6 +119,9 @@ ProjectionExec::ProjectionExec(OperationContext* opCtx, | |||
_needsGeoNearDistance = true; | |||
} else if (e2.valuestr() == QueryRequest::metaIndexKey) { | |||
_hasReturnKey = true; | |||
} else if (e2.valuestr() == QueryRequest::metaShardName) { |
This comment has been minimized.
This comment has been minimized.
dstorch
Oct 9, 2019
Contributor
This is another example of code that is going to be deleted soon, as part of our efforts to consolidate the implementations of find projection and agg projection. Ideally, the only place where we would have to change projection execution code would in the $meta agg expression:
mongo/src/mongo/db/pipeline/expression.cpp
Lines 2601 to 2634 in 97ffc69
We would need new code there that would know how to handle a $meta:"shardName" expression.
@@ -0,0 +1,104 @@ | |||
/** |
This comment has been minimized.
This comment has been minimized.
dstorch
Oct 9, 2019
Contributor
The design of the execution engine is to have as few PlanStages as possible for only the most fundamental query execution concepts. (On the other hand, the number of expressions and accumulators should be quite large and should grow to meet the needs of applications.) Therefore, the query team is reluctant to add a new PlanStage just for the purpose of attaching shard name metadata.
This brings up an interesting question of who should be responsible for attaching the shardName metadata. One idea is to attach this in the ShardFilterStage. We should only attach this metadata, however, if static analysis of the query shows that there is a subsequent $meta:"shardName" expression. There are some changes in the works as part of SERVER-43146 that should make it easier to determine whether the query has this "shardName" metadata dependency.
namespace mongo { | ||
|
||
ShardNamerImpl::ShardNamerImpl(ScopedCollectionMetadata md) : _metadata(std::move(md)), | ||
_shardName(isCollectionSharded() ? _metadata->shardId() : StringData("")) { |
This comment has been minimized.
This comment has been minimized.
dstorch
Oct 9, 2019
•
Contributor
I have consulted with one of the lead engineers on the sharding team, and we think that the way in which we obtain the shard name should be slightly different. Rather than getting the name from the ScopedCollectionMetadata, it can be obtained directly from the mongod's sharding state:
mongo/src/mongo/db/s/sharding_state.h
Line 99 in 6978f67
This will work so long as the code which attaches the "shardName" always runs on a mongod, which it should.
} | ||
|
||
private: | ||
ScopedCollectionMetadata _metadata; |
This comment has been minimized.
This comment has been minimized.
dstorch
Oct 9, 2019
Contributor
Another reason to avoid the ShardNamer
interface is that it results in the query plan holding two copies of the ScopedCollectionMetadata
. As initially conceived, the design is for a query plan to hold a single ScopedCollectionMetadata
inside of the ShardFilterStage
.
@@ -622,9 +623,10 @@ Status QueryRequest::validate() const { | |||
BSONObjIterator projIt(_proj); | |||
while (projIt.more()) { | |||
BSONElement projElt = projIt.next(); | |||
if (isTextScoreMeta(projElt)) { | |||
if (isTextScoreMeta(projElt) || isShardNameMeta(projElt)) { |
This comment has been minimized.
This comment has been minimized.
dstorch
Oct 9, 2019
Contributor
I'm not sure that this part of the change is needed. If I'm reading the code correctly, this PR does not add the ability to perform a $meta-sort by "shardName", as in {$sort: {foo: {$meta: "shardName"}}}. Furthermore, the query team is interested in relaxing the rule for "textScore" which says that a "textScore" $meta sort must have a corresponding "textScore" $meta projection. See SERVER-43816.
This comment has been minimized.
This comment has been minimized.
krk
Oct 18, 2019
Author
I have duplicated the implementation of textScore
in most places, that is why this condition was added, removing.
Hi @dstorch, Thank you for the detailed and helpful review. I am interested in making the necessary changes to this PR after the rewriting of the implementation of projection execution is complete. This PR was already created when you offered to suggest areas to work on in SERVER-14423. I will try to get this mergeable if there are not too many conflicts. Otherwise, I would prefer to wait until the rewrite is complete. Also, I am open to other mentored tasks that can be completed in less time than the current PR. Cheers, |
This comment was marked as outdated.
This comment was marked as outdated.
Hi @dstorch, I have removed the The In It is not clear to me if Also, if you could give some pointers on what kind of tests to write first, it would be very helpful. |
Maybe this year this PR will get some love :) @dstorch |
krk commentedJul 6, 2019
•
edited
I have added a new stage called
ShardNameStage
to add shard name to a$meta
projection. It made sense to meld the functionality of the existing stagesShardFilterStage
andTextOrStage
intoShardNameStage
and to modify it to output the current shard name. This might be an incorrect route for implementation, please provide guidance.To start sharded instances:
I have tested it locally with mlaunch. There are no unit tests in this PR as of now, as I might be completely lost while implementing it. If functionality is as expected and modifications fit or be made to fit the expected architecture, I can add some unit tests.