Skip to content

zfs: fix ebusy on umount etc #35674

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Dec 28, 2017
Merged

zfs: fix ebusy on umount etc #35674

merged 1 commit into from
Dec 28, 2017

Conversation

kolyshkin
Copy link
Contributor

This commit is a set of fixes and improvement for zfs graph driver,
in particular:

  1. Remove mount point after umount in Get() error path, as well
    as in Put() (with MNT_DETACH flag). This should solve "ebusy
    on umount" error reported in Container removal broken with ZFS as of 17.11.0-ce #35642.

  2. Fix/improve error handling in Get() -- do not try to umount
    if refcount > 0

  3. Simplifies unmount in Get(). Specifically, remove call to
    graphdriver.Mounted() (which checks if fs is mounted using
    statfs() and check for fs type) and mount.Unmount() (which
    parses /proc/self/mountinfo). Calling unix.Unmount() is
    simple and sufficient.

  4. Add unmounting of driver's home to Cleanup().

Signed-off-by: Kir Kolyshkin kolyshkin@gmail.com

- How to verify it

- Description for the changelog

  • zfs: fix busy error on container stop

This commit is a set of fixes and improvement for zfs graph driver,
in particular:

1. Remove mount point after umount in `Get()` error path, as well
   as in `Put()` (with `MNT_DETACH` flag). This should solve "failed
   to remove root filesystem for <ID> .... dataset is busy" error
   reported in Moby issue 35642.

To reproduce the issue:

   - start dockerd with zfs
   - docker run -d --name c1 --rm busybox top
   - docker run -d --name c2 --rm busybox top
   - docker stop c1
   - docker rm c1

Output when the bug is present:
```
Error response from daemon: driver "zfs" failed to remove root
filesystem for XXX : exit status 1: "/sbin/zfs zfs destroy -r
scratch/docker/YYY" => cannot destroy 'scratch/docker/YYY':
dataset is busy
```

Output when the bug is fixed:
```
Error: No such container: c1
```
(as the container has been successfully autoremoved on stop)

2. Fix/improve error handling in `Get()` -- do not try to umount
   if `refcount` > 0

3. Simplifies unmount in `Get()`. Specifically, remove call to
   `graphdriver.Mounted()` (which checks if fs is mounted using
   `statfs()` and check for fs type) and `mount.Unmount()` (which
   parses `/proc/self/mountinfo`). Calling `unix.Unmount()` is
   simple and sufficient.

4. Add unmounting of driver's home to `Cleanup()`.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
mountpoint := d.mountPath(id)
if count := d.ctr.Increment(mountpoint); count > 1 {
return containerfs.NewLocalContainerFS(mountpoint), nil
}
defer func() {
if retErr != nil {
if c := d.ctr.Decrement(mountpoint); c <= 0 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there any locks here to prevent another process from increment at the same time this decrements?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, d.ctr is of graphdriver.RefCounter type which does inc/dec under a mutex (see https://github.com/moby/moby/blob/master/daemon/graphdriver/counter.go)

@BlinkyStitt
Copy link

Do you think this PR will fix my issue with ntp or is that different? openzfs/zfs#1810 (comment)

@kolyshkin
Copy link
Contributor Author

kolyshkin commented Dec 4, 2017

Do you think this PR will fix my issue with ntp or is that different? openzfs/zfs#1810 (comment)

@wysenynja yes, it might (if the kernel you use is sufficiently recent). Update: yes, it appears to be fixing it.

@vieux
Copy link
Contributor

vieux commented Dec 5, 2017

SGTM

if mntErr := unix.Unmount(mountpoint, 0); mntErr != nil {
logrus.Errorf("Error unmounting %v: %v", mountpoint, mntErr)
}
if rmErr := unix.Rmdir(mountpoint); rmErr != nil && !os.IsNotExist(rmErr) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe don't rmdir if the unmount failed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unmount may failure for a handful of reasons, not just EBUSY, and Rmdir is harmless (i.e. it won't remove any data, just an empty directory), so it's slightly more robust to try it anyway.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kolyshkin my bad, I read it as a RemoveAll

@kolyshkin
Copy link
Contributor Author

I believe I have addressed all the review comments.

@tonistiigi
Copy link
Member

LGTM

@tonistiigi tonistiigi merged commit daded8d into moby:master Dec 28, 2017
@BlinkyStitt
Copy link

Will this be in the next release? If not, what version should I wait for to get this fix?

@thaJeztah
Copy link
Member

This will be included in the Docker 18.01 release

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants