Skip to content

seccomp: allow ptrace(2) for 4.8+ kernels #38137

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Feb 5, 2019

Conversation

tonistiigi
Copy link
Member

4.8+ kernels have fixed the ptrace security issues
so we can allow ptrace(2) on the default seccomp
profile if we do the kernel version check.

torvalds/linux@93e35ef

Signed-off-by: Tonis Tiigi tonistiigi@gmail.com

@tonistiigi
Copy link
Member Author

cc @justincormack @jessfraz

@codecov
Copy link

codecov bot commented Nov 4, 2018

Codecov Report

Merging #38137 into master will decrease coverage by 0.43%.
The diff coverage is 30.76%.

@@            Coverage Diff             @@
##           master   #38137      +/-   ##
==========================================
- Coverage   36.55%   36.12%   -0.44%     
==========================================
  Files         610      610              
  Lines       45275    45242      -33     
==========================================
- Hits        16551    16342     -209     
- Misses      26394    26656     +262     
+ Partials     2330     2244      -86

4.8+ kernels have fixed the ptrace security issues
so we can allow ptrace(2) on the default seccomp
profile if we do the kernel version check.

torvalds/linux@93e35ef

Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
"args": null,
"comment": "",
"includes": {
"minKernel": "4.8.0"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know generally we tried to avoid kernel version checks; Will this be an issue if distros backport this fix? (i.e., would there be another approach so that the default could manually be overridden at compile time?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If a distro backports they will just not get the ptrace support even if it may be secure for their kernel as well. User can still fix it in that case by providing a custom profile or upgrading the kernel. 4.8 is documented the switch point on the man page. Even if I would dynamically detect the behavior of the current system there is no way to mark that condition in the profile.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, not sure if there's a cleaner solution. Was just thinking if (e.g.) Red Hat decides to backport, and Docker wants to ship Docker EE packages for RHEL with a correct profile

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can ship a different default profile anyway, so I don't think this is an issue.

Copy link

@abudnik abudnik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In theory, we might want to specify a range or list of ranges of kernel versions, e.g. ["4.4.0", null], [null, "4.4.0"], [["3.16.0", "4.4.0"], ...]. It would be great to consider adding ranges into the current Seccomp profile format, so that it will be easier to implement ranges in the future in a backward-compatible way.

"ptrace"
],
"action": "SCMP_ACT_ALLOW",
"args": null,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider using empty array [] instead of null for consistency.

@tonistiigi
Copy link
Member Author

I can change it to any syntax that maintainers like to have. The current one came from discussing different variants (including the range one) with @justincormack

@cpuguy83
Copy link
Member

I'm fine with this.

@abudnik
Copy link

abudnik commented Nov 26, 2018

Ranges format makes sense if some severe bug appeared on some kernel version A and fixed later on some version B.

If adding ranges format in the future is not hard and is backward-compatible, then I have nothing against the current format : )

@tonistiigi
Copy link
Member Author

ping @justincormack @tiborvass

Copy link
Contributor

@tiborvass tiborvass left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving this after talking with @ncopa. He's also warning about abuse of the kernel version field.

@@ -95,6 +96,21 @@ func setupSeccomp(config *types.Seccomp, rs *specs.Spec) (*specs.LinuxSeccomp, e

newConfig.DefaultAction = specs.LinuxSeccompAction(config.DefaultAction)

var currentKernelVersion *kernel.VersionInfo
kernelGreaterEqualThan := func(v string) (bool, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure I like defining local closures like this, can you just make it a helper function external to this? (yes, Go should allow local functions...)

@@ -110,6 +126,13 @@ Loop:
}
}
}
if call.Excludes.MinKernel != "" {
if ok, err := kernelGreaterEqualThan(call.Excludes.MinKernel); err != nil {
return nil, err
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in the error case should we not just include the rule anyway? ie fail closed not error.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should fail only in the case where profile had a version that couldn't be parsed. I think that case is similar to the other validation, eg. the call.Names check.

Names: []string{"ptrace"},
Action: types.ActAllow,
Includes: types.Filter{
MinKernel: "4.8.0",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically the kernel version is 4.8, there is not a 4.8.0 release, ir goes 4.8, 4.8.1...

@justincormack
Copy link
Contributor

This should have a test.

@justincormack
Copy link
Contributor

Some comments but LGTM generally. Note that some uses for ptrace will still be blocked by CAP_PTRACE and maybe Yama kernel module, but basic debug etc should work.

@tonistiigi
Copy link
Member Author

@justincormack For the test, what kind are you expecting. A unit test of some component? Integration seems tricky as this behavior is very much host dependent.

@tonistiigi
Copy link
Member Author

@justincormack And just to confirm, no ranges like #38137 (comment) , right? I'd be fine with adding it. It could be useful in the case where a syscall becomes broken in a future version, and a seccomp profile could block that without upgrading the daemon.

@tonistiigi
Copy link
Member Author

ping @justincormack

@abudnik
Copy link

abudnik commented Feb 26, 2019

@tonistiigi Does the current default Docker Seccomp profile allows ptrace syscall on kernel 4.8+ even thought a container user does not have CAP_SYS_PTRACE capability?

		{
			"names": [
				"ptrace"
			],
			"action": "SCMP_ACT_ALLOW",
			"args": null,
			"comment": "",
			"includes": {
				"minKernel": "4.8"
			},
			"excludes": {}
		},
...
		{
			"names": [
				"kcmp",
				"process_vm_readv",
				"process_vm_writev",
				"ptrace"
			],
			"action": "SCMP_ACT_ALLOW",
			"args": [],
			"comment": "",
			"includes": {
				"caps": [
					"CAP_SYS_PTRACE"
				]
			},
			"excludes": {}
		},

@tonistiigi
Copy link
Member Author

@abudnik This PR is not in any release version of Docker yet. Next version 19.03 should allow it.

@abudnik
Copy link

abudnik commented Feb 26, 2019

@tonistiigi Previously, a default profile allowed ptrace syscall only when CAP_SYS_PTRACE capability was enabled for a process. After this change, it seems that ptrace is allowed for a non-privileged containers on Linux 4.8+. Can this lead to a security issue?

@tonistiigi
Copy link
Member Author

@abudnik No it shouldn't. ptrace was blocked in seccomp because kernels before 4.8 allowed to use it for insecure things. But these have been fixed now (see linked commit) so if we know we are on the updated kernel we can allow it. ptrace is still protected by the regular kernel checks for specific arguments(eg. it will not give you same level of controls as CAP_SYS_PTRACE). If you think we might have missed something, let us know.

@AkihiroSuda
Copy link
Member

Wondering this should be disabled when --pid=host
WDYT?

@tonistiigi
Copy link
Member Author

tonistiigi commented Mar 19, 2019

I guess pid=host is already considered insecure as far as visibility into host processes is concerned. (cc @justincormack )

edit: and actually I'm not sure if this exposes anything very bad. You still don't have CAP_PTRACE .

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants