Refresh containerd remotes on containerd restart #36173
Conversation
ping @thaJeztah
LGTM
LGTM 🐯
@thaJeztah Windows doesn't use …
LGTM
Well, CI seems to not like something here....
The Windows CI is definitely unrelated, since this code is not even built on Windows. I see that …
Janky is also stuck waiting for the daemon to start, multiple times now.
The stack trace would help; I can't see how this is causing it so far :?
Force-pushed from d11d95b to 2247756
Force-pushed from 2247756 to 400126f
Found the issue. It was due to recursive locking in …
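For background on that failure mode: Go's `sync.Mutex` is not reentrant, so a method that holds the lock and then calls another method that takes the same lock blocks forever, which is consistent with Janky hanging while waiting for the daemon to start. A minimal sketch of the bug and the usual fix (illustrative names, not the actual moby code):

```go
package main

import (
	"fmt"
	"sync"
)

type remote struct {
	mu      sync.Mutex
	clients []string
}

// The buggy pattern: addClient held r.mu and then called notify,
// which tried to Lock the same mutex. sync.Mutex is not reentrant,
// so the second Lock blocked forever and the daemon appeared hung:
//
//	func (r *remote) addClient(c string) {
//		r.mu.Lock()
//		defer r.mu.Unlock()
//		r.clients = append(r.clients, c)
//		r.notify() // deadlock: notify re-locks r.mu
//	}

// notify takes the lock for external callers...
func (r *remote) notify() {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.notifyLocked()
}

// ...while notifyLocked assumes the caller already holds r.mu.
// Splitting the locked body out like this is the usual fix.
func (r *remote) notifyLocked() {
	fmt.Printf("notifying %d client(s)\n", len(r.clients))
}

func (r *remote) addClient(c string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.clients = append(r.clients, c)
	r.notifyLocked() // safe: we already hold r.mu
}

func main() {
	r := &remote{}
	r.addClient("client-1")
	r.notify()
}
```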
And green.
LGTM 🐸
Install docker 17.12.1-ce-rc2 to fix moby/moby#36173
Merged feature/merge-in-v2.3.5 into feature/add_latest_vault_plugin (commit bcb6fe5, 52 commits, including "Install 17.12.1-ce-rc2 to fix moby/moby#36173")
Merged feature/add_latest_vault_plugin into use-vault-plugin (commit f50fdfc, 55 commits, including the same fix)
In case we are killing and restarting containerd, do try to reconnect and use the new connection. Loosely based on moby#36173 Might help https://bugzilla.redhat.com/show_bug.cgi?id=1746435 Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
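Both the symptom and this fix come down to how the event loop handles a dead connection: re-reading the old handle immediately spins a CPU, while redialing with a backoff recovers once containerd is back. A rough sketch of that pattern (`dial` and `subscribe` are hypothetical stand-ins, not the real containerd client API):

```go
package main

import (
	"context"
	"errors"
	"log"
	"time"
)

// dial and subscribe are hypothetical stand-ins for creating a
// containerd client and opening its event stream.
func dial(ctx context.Context, socket string) (struct{}, error) {
	return struct{}{}, nil
}

func subscribe(ctx context.Context, conn struct{}) (<-chan string, <-chan error) {
	events := make(chan string)
	errs := make(chan error, 1)
	errs <- errors.New("transport is closing") // simulate containerd going away
	return events, errs
}

// processEvents redials and resubscribes whenever the stream fails.
// Without redialing, the loop keeps reading from handles to the old,
// dead socket; without the sleep, the error path spins a full CPU.
func processEvents(ctx context.Context, socket string) {
	const maxBackoff = 2 * time.Second
	backoff := 100 * time.Millisecond
	for {
		conn, err := dial(ctx, socket)
		if err == nil {
			backoff = 100 * time.Millisecond // reset after a successful connect
			events, errs := subscribe(ctx, conn)
		stream:
			for {
				select {
				case <-ctx.Done():
					return
				case ev := <-events:
					log.Println("event:", ev)
				case err := <-errs:
					log.Println("event stream failed, reconnecting:", err)
					break stream
				}
			}
		}
		select {
		case <-ctx.Done():
			return
		case <-time.After(backoff): // back off instead of hot-looping
		}
		backoff *= 2
		if backoff > maxBackoff {
			backoff = maxBackoff
		}
	}
}

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()
	processEvents(ctx, "/run/containerd/containerd.sock")
}
```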
Before this patch, when containerd is restarted (due to a crash, or
kill, whatever), the daemon would keep trying to process the event
stream against the old socket handles. This would lead to a CPU spin due
to the error handling when the client can't connect to containerd.
This change makes sure the containerd remote client is updated for all
registered libcontainerd clients.
This is not necessarily the ideal fix which would likely require a
major refactor, but at least gets things to a working state with a
minimal patch.
Fixes #36002
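The shape of the fix described above: the supervisor that restarts containerd keeps a list of registered libcontainerd clients and, after redialing, hands each of them the fresh connection instead of leaving them on the dead one. A sketch of that bookkeeping (type and method names are illustrative, not the actual daemon API):

```go
package main

import (
	"fmt"
	"sync"
)

// remoteClient is a stand-in for a libcontainerd client that caches
// a connection to containerd.
type remoteClient struct {
	mu   sync.Mutex
	conn string // placeholder for the real gRPC connection
}

func (c *remoteClient) setConn(conn string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.conn = conn
}

// supervisor restarts containerd and owns the list of registered
// clients so it can refresh all of them after a restart.
type supervisor struct {
	mu      sync.Mutex
	clients []*remoteClient
}

func (s *supervisor) register(c *remoteClient) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.clients = append(s.clients, c)
}

// onContainerdRestart hands the new connection to every registered
// client; before the fix, clients kept using handles to the old,
// dead socket.
func (s *supervisor) onContainerdRestart(newConn string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	for _, c := range s.clients {
		c.setConn(newConn)
	}
}

func main() {
	s := &supervisor{}
	c := &remoteClient{conn: "conn-1"}
	s.register(c)
	s.onContainerdRestart("conn-2") // containerd came back on a new connection
	fmt.Println("client now uses:", c.conn)
}
```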