Skip to content

seccomp: whitelist io-uring related system calls #39415

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Sep 11, 2019
Merged

seccomp: whitelist io-uring related system calls #39415

merged 1 commit into from
Sep 11, 2019

Conversation

omegacoleman
Copy link
Contributor

closes #39396

- What I did

I added the io-uring related system call introduced in kernel 5.1 to the seccomp whitelist.

- How I did it

This is merely a follow-up to the upstream

- How to verify it

Build docker with the master branch libseccomp, and will be run programs using io-uring without trouble.

With older kernels or older versions of libseccomp, the configure will be omitted.

- Description for the changelog

seccomp: whitelist io-uring related system calls

- A picture of a cute animal (not mandatory but encouraged)

Would be a waste for such a simple change..

@thaJeztah
Copy link
Member

Looks like CI is failing, because the json version was not updated as part of this PR;

17:32:36 The result of go generate ./profiles/seccomp/ differs
17:32:36 
17:32:36  M profiles/seccomp/default.json
17:32:36 
17:32:36 Please re-run go generate ./profiles/seccomp/

@thaJeztah
Copy link
Member

ping @justincormack @crosbymichael PTAL

@omegacoleman
Copy link
Contributor Author

Looks like CI is failing, because the json version was not updated as part of this PR;

17:32:36 The result of go generate ./profiles/seccomp/ differs
17:32:36 
17:32:36  M profiles/seccomp/default.json
17:32:36 
17:32:36 Please re-run go generate ./profiles/seccomp/

I made a mistake thinking the json file is manually written, and it should be generated. I have fixed that now but some of the ci seems to be failing because of removing some non-existing folder (C:\go), what does that mean?

@thaJeztah
Copy link
Member

Thanks! The RS5 failure can be ignored (see #39403)

@omegacoleman
Copy link
Contributor Author

Thanks! The RS5 failure can be ignored (see #39403)

The only failure left is TestAPISwarmServicesMultipleAgents in janky, which does not seem to be relevant. I read the code but it is not about seccomp.

@thaJeztah
Copy link
Member

yes, looks like that one is sometimes flaky; #23626

I'll restart CI

@omegacoleman
Copy link
Contributor Author

yes, looks like that one is sometimes flaky; #23626

I'll restart CI

This time it's only a few SKIPs & XFAILs, why is it marked as failing again?

@thaJeztah
Copy link
Member

Looks to be an issue with the docker-py tests;

15:06:25 _______________________ ExecTest.test_exec_command_demux _______________________
15:06:25 /docker-py/tests/integration/api_exec_test.py:182: in test_exec_command_demux
15:06:25     assert next(exec_log) == (b'hello out\r\n', None)
15:06:25 E   AssertionError: assert ('hello out\r...rr\r\n', None) == ('hello out\r\n', None)
15:06:25 E     At index 0 diff: 'hello out\r\nhello err\r\n' != 'hello out\r\n'
15:06:25 E     Use -v to get the full diff
15:06:25 - generated xml file: /go/src/github.com/docker/docker/bundles/test-docker-py/results.xml -

I'll kick that build

@@ -160,6 +160,9 @@ func DefaultProfile() *types.Seccomp {
"ioprio_set",
"io_setup",
"io_submit",
"io_uring_setup",
"io_uring_enter",
"io_uring_register",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you sort these alphabetically like the rest please

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done that

@justincormack
Copy link
Contributor

Are there any other new syscalls since the last update?

@thaJeztah
Copy link
Member

ping @omegacoleman could you answer @justincormack's question (and sort the list #39415 (comment))?

@omegacoleman
Copy link
Contributor Author

Are there any other new syscalls since the last update?

The precise answer is "I don't know" since I'm just a network dev not kernel dev, but even though there are, the new system call has to be added to the upstream libseccomp's syscall translation (name -> number) before it could be added to the config here, and these calls were added.

Sorry for the delay as it is well, trival & as the upstream has added the calls I was able to just override the config myself..

@codecov
Copy link

codecov bot commented Sep 5, 2019

Codecov Report

❗ No coverage uploaded for pull request base (master@4ce0402). Click here to learn what that means.
The diff coverage is 0%.

@@            Coverage Diff            @@
##             master   #39415   +/-   ##
=========================================
  Coverage          ?   37.34%           
=========================================
  Files             ?      609           
  Lines             ?    45200           
  Branches          ?        0           
=========================================
  Hits              ?    16882           
  Misses            ?    26030           
  Partials          ?     2288

@GordonTheTurtle GordonTheTurtle added the dco/no Automatically set by a bot when one of the commits lacks proper signature label Sep 5, 2019
@GordonTheTurtle
Copy link

Please sign your commits following these rules:
https://github.com/moby/moby/blob/master/CONTRIBUTING.md#sign-your-work
The easiest way to do this is to amend the last commit:

$ git clone -b "master" git@github.com:omegacoleman/moby.git somewhere
$ cd somewhere
$ git rebase -i HEAD~842354606984
editor opens
change each 'pick' to 'edit'
save the file and quit
$ git commit --amend -s --no-edit
$ git rebase --continue # and repeat the amend for each commit
$ git push -f

Amending updates the existing PR. You DO NOT need to open a new one.

@GordonTheTurtle GordonTheTurtle removed the dco/no Automatically set by a bot when one of the commits lacks proper signature label Sep 5, 2019
Copy link
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for updating; could you try rebasing the PR to get rid of the merge commit?

Signed-off-by: youcai <omegacoleman@gmail.com>
@omegacoleman
Copy link
Contributor Author

Thanks for updating; could you try rebasing the PR to get rid of the merge commit?

done

Copy link
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@thaJeztah
Copy link
Member

windows failure is unrelated; let me get this one in

@thaJeztah thaJeztah merged commit 746eab2 into moby:master Sep 11, 2019
@thaJeztah thaJeztah added this to the 20.03.0 milestone Apr 2, 2020
@cyphar
Copy link
Contributor

cyphar commented Jul 15, 2020

This seems like it's a bad idea -- io_uring allows programs to execute certain syscalls without being limited by seccomp. Currently all of the syscalls you can access through io_uring are permitted in the default profile, but I'm a little worried that this wasn't brought up at the time.

@justincormack
Copy link
Contributor

@cyphar it looks like io_uring will get its own restriction mechanisms see eg https://lwn.net/SubscriberLink/826053/c58110f9c3493661/. I really don't think we can restrict access to a syscall that lots of people want to use based on the fact it might in future add support for unsafe actions.

I am reviewing the whole seccomp policy based on what has been useful in the past.

@cyphar
Copy link
Contributor

cyphar commented Jul 15, 2020

Funnily enough, I found this PR after commenting on that LWN article. The problem with that restriction mechanism is that it doesn't really fulfil the seccomp usecase (as long as you can create a new io_uring ring, you can opt to not enable restrictions for it). But Kees has said he's interested in figuring out how to get seccomp to limit io_uring, so I am hopeful this will happen.

@omegacoleman
Copy link
Contributor Author

Funnily enough, I found this PR after commenting on that LWN article. The problem with that restriction mechanism is that it doesn't really fulfil the seccomp usecase (as long as you can create a new io_uring ring, you can opt to not enable restrictions for it). But Kees has said he's interested in figuring out how to get seccomp to limit io_uring, so I am hopeful this will happen.

From what I got from the news they're working on sth like using eBPF to filter the SQEs upon ktail update. Will try to make that configurable after kernel & libseccomp get that feature.

@AkihiroSuda
Copy link
Member

Shall we revert this until we find the solution?

@cyphar
Copy link
Contributor

cyphar commented Jul 17, 2020

I don't think it's necessary to revert. The default profile allows everything that you can do with io_uring -- and custom profiles should be modified accordingly. I was more just surprised that this wasn't mentioned at the time. io_uring will grow support for more syscalls in the future so we should keep an eye on this.

akerouanton added a commit to akerouanton/docker that referenced this pull request Nov 2, 2023
This syncs the seccomp profile with changes made to containerd's default
profile in [1].

The original containerd issue and PR mention:

> Security experts generally believe io_uring to be unsafe. In fact
> Google ChromeOS and Android have turned it off, plus all Google
> production servers turn it off. Based on the blog published by Google
> below it seems like a bunch of vulnerabilities related to io_uring can
> be exploited to breakout of the container.
>
> [2]
>
> Other security reaserchers also hold this opinion: see [3] for a
> blackhat presentation on io_uring exploits.

For the record, these syscalls were added to the allowlist in [4].

[1]: containerd/containerd@a48ddf4
[2]: https://security.googleblog.com/2023/06/learnings-from-kctf-vrps-42-linux.html
[3]: https://i.blackhat.com/BH-US-23/Presentations/US-23-Lin-bad_io_uring.pdf
[4]: moby#39415

Signed-off-by: Albin Kerouanton <albinker@gmail.com>
@thaJeztah
Copy link
Member

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

io-uring support
7 participants