Skip to content

Systemd-resolved proper handling #37485

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Aug 1, 2018
Merged

Systemd-resolved proper handling #37485

merged 3 commits into from
Aug 1, 2018

Conversation

fcrisciani
Copy link
Contributor

- What I did
Added code to handle case where the OS is using systemd-resolved
Cleaned up some of the code in the resolv.conf area
Moved option NetworkControlPlaneMTU under networking options
Added a library to fetch the pid of a process, coming from (https://github.com/kubernetes/kubernetes/tree/master/pkg/util/procfs) minus, things not useful and avoiding other dependencies

- How to verify it
Did not change functionality so current testing should cover it already (moby and libnetwork)

- Description for the changelog

Handle systemd-resolved case providing appropriate resolv.conf to networking layer

- A picture of a cute animal (not mandatory but encouraged)
cute-baby-animals-05

@fcrisciani fcrisciani changed the title Resolv Systemd-resolved proper handling Jul 17, 2018
@fcrisciani
Copy link
Contributor Author

will update the vendoring as the libnetwork side gets merged
cc @yongtang, because some of the recent code changes were done by you
@msabansal @johnstep to verify that moving some of the code I'm not removing things that are needed form windows
@thaJeztah just as a general review and to check on the import of the procfs from kube

Copy link
Member

@johnstep johnstep left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good from a Windows perspective.

@fcrisciani fcrisciani force-pushed the resolv branch 3 times, most recently from d1b4b19 to 643787c Compare July 17, 2018 23:23
@codecov
Copy link

codecov bot commented Jul 17, 2018

Codecov Report

❗ No coverage uploaded for pull request base (master@c3a0207). Click here to learn what that means.
The diff coverage is 40.98%.

@@            Coverage Diff            @@
##             master   #37485   +/-   ##
=========================================
  Coverage          ?   34.96%           
=========================================
  Files             ?      611           
  Lines             ?    44939           
  Branches          ?        0           
=========================================
  Hits              ?    15714           
  Misses            ?    27097           
  Partials          ?     2128

Copy link
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some questions/thoughts 🤗


func getPids(re *regexp.Regexp) []int {
pids := []int{}
filepath.Walk("/proc", func(path string, info os.FileInfo, err error) error {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wondering; would shelling out to pidof work? (would it be slower?), e.g.;

pids, err := exec.Command("pidof", "some-binary").Output()
# check error and len(pids) > 0

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

more worried if the pidof is available on every platform and if it is reachable from the daemon (like is in the $PATH) etc

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure Walk is really what we want here since it traverses every dir recursively.

What do you think about some either hitting the systemd-resolved API or checking /etc/resolv.conf for 127.0.0.53 instead?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

more worried if the pidof is available on every platform and if it is reachable from the daemon (like is in the $PATH) etc

Good point; looking at http://man7.org/linux/man-pages/man1/pidof.1.html, it's part of https://gitlab.com/procps-ng/procps. I think it's fairly common to be installed (at least on common distros), but we may want to check.

I had the same concerns as @cpuguy83; recursively traversing /proc just for detection looks like a lot of overhead. Not sure if string-matching for 127.0.0.53 in /etc/resolv.conf is a good solution though; sounds a bit brittle.

We should consider to what extend we want the automatic-detection to be "perfect", and possibly add a manual configuration, as discussed in moby/libnetwork#2232 (comment)

(...) is checking for a particular binary running the right way to do this? K8s is mainly doing it to issue a warning so that's "safe" in a sense. But transparently substituting in libnetwork doesn't seem robust. Maybe we should do what k8s is doing even more so: require the user to provide an option for a default resolv.conf file when starting up the engine. (...)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I opened a PR on k8s repo here to optimize the pid lookup: kubernetes/kubernetes#66367

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@thaJeztah being in the config, does not automatically exposing it as an option? so if the user is not satisfied on the auto selection can specify it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cpuguy83 will wait for that PR to be merged and will update, thanks for opening it


var isSystemdResolvedPresent bool

func init() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was thinking if we should do this "on demand", e.g.:

var systemdResolvedRunning *bool

func systemdResolvedPresent() bool {
	if systemdResolvedRunning != nil {
		return systemdResolvedRunning
	}
	systemdResolvedRunning = false
	pids, err := procfs.PidOf("systemd-resolved")
	if err != nil {
		logrus.Errorf("unable to check systemd-resolved status: %s", err)
	}
	if len(pids) > 0 && pids[0] > 0 {
		isSystemdResolvedPresent = true
	}
	return systemdResolvedRunning
}

But also noticed that would only need this if conf.ResolvConf == ''; which will be only once (once GetResolvConf() is called, and it's not set, it will be set to conf.ResolvConf = defaultResolvConf)

So perhaps we can just as well inline this in GetResolvConf() ? (or a combination of the above)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think is expensive to check: procfs.PidOf("systemd-resolved") for every container creation

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we do this during daemon initialization instead of package init?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure don't have a, is there a specific point that you can suggest?
Just drop something like this one: https://github.com/moby/moby/blob/master/daemon/daemon.go#L569

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As good a place as any, I suppose.

vendor.conf Outdated
@@ -37,7 +37,7 @@ github.com/mitchellh/hashstructure 2bca23e0e452137f789efbc8610126fd8b94f73b
#get libnetwork packages

# When updating, also update LIBNETWORK_COMMIT in hack/dockerfile/install/proxy accordingly
github.com/docker/libnetwork d00ceed44cc447c77f25cdf5d59e83163bdcb4c9
github.com/docker/libnetwork 05b69afbdd26121d695f9c5a41c34316ebb242bf https://github.com/fcrisciani/libnetwork
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this needs vendoring; let me add that label, otherwise people may overlook that

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep, I put it in this comment: #37485 (comment)

@fcrisciani
Copy link
Contributor Author

@cpuguy83 wdyt of this one?

logrus.Infof("systemd-resolved is running, so using resolvconf: %s", alternateResolvConf)
config.ResolvConf = alternateResolvConf
}
logrus.Infof("systemd-resolved is running, but %s is not present, fallback to %s", alternateResolvConf, defaultResolvConf)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

else?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

even if alternateResolvConf is present, "alternateResolvConf is not present" is printed here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry was looking at the wrong line

@fcrisciani fcrisciani requested a review from tianon as a code owner July 25, 2018 18:11
@fcrisciani fcrisciani force-pushed the resolv branch 5 times, most recently from 22bd786 to cbf69f0 Compare July 26, 2018 18:10
Flavio Crisciani added 2 commits July 26, 2018 11:17
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
Handle the case of systemd-resolved, and if in place
use a different resolv.conf source.
Set appropriately the option on libnetwork.
Move unix specific code to container_operation_unix

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
Copy link
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

two questions/notes, but otherwise looks good

@@ -394,13 +394,10 @@ func (n *networkNamespace) InvokeFunc(f func()) error {
// InitOSContext initializes OS context while configuring network resources
func InitOSContext() func() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did this file end up in the wrong commit?

@@ -63,21 +63,13 @@ func (daemon *Daemon) buildSandboxOptions(container *container.Container) ([]lib

if container.HostConfig.NetworkMode.IsHost() {
sboxOptions = append(sboxOptions, libnetwork.OptionUseDefaultSandbox())
if len(container.HostConfig.ExtraHosts) == 0 {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code is not used for LCOW?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, just see @johnstep's comment; so ignore this one; only check if the vendored file ended up in the right commit

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@johnstep can you confirm the LCOW behavior here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code path cannot be hit on Windows (including LCOW) because IsHost always returns false on Windows.

Copy link
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let me LGTM

@thaJeztah
Copy link
Member

z failure is unrelated; merging

@nickbroon
Copy link

Is this likely to appear in the next release of docker-ce?

@nickbroon
Copy link

nickbroon commented Oct 23, 2018

Answering my own question: Yes, this appears to be in the 18.09 beta.

@kenden
Copy link

kenden commented Nov 5, 2018

List of commits in the beta: https://github.com/docker/docker-ce/releases/tag/v18.09.0-rc1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.