Skip to content

Adding OS version info to nodes' Info struct and to the system info's API #38349

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jun 7, 2019

Conversation

wk8
Copy link
Contributor

@wk8 wk8 commented Dec 11, 2018

- What I did

This patch adds the OS' version to nodes' Info struct and to the system info's API.

This is needed so that we can add OS version constraints in Swarmkit, which
does require the engine to report its host's OS version (see
moby/swarmkit#2770 (comment)).

- How I did it

The OS version is parsed from the os-release file on Linux, and from the
ReleaseId string value of the SOFTWARE\Microsoft\Windows NT\CurrentVersion
registry key on Windows.

- How to verify it

Unit test included.

- Description for the changelog

Adding OS version info to the system info's API (/info)

- A picture of a cute animal (not mandatory but encouraged)

Copy link
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit concerned about using this for scheduling as we'll now end up with two vectors to schedule (image manifest / platform as advertised by engine info), which may not provide the same information (e.g. Architecture may also have to be taken into account, as well as isolation (hyper-v / process isolation).

In addition; we should consider having a different way to obtain this info, because the /info endpoint is far from lightweight to call

@wk8
Copy link
Contributor Author

wk8 commented Dec 12, 2018

Thanks for the quick review, @thaJeztah , much appreciated :)

In addition; we should consider having a different way to obtain this info, because the /info endpoint is far from lightweight to call

That endpoint seems to be the one, central, place where we gather info on hosts (and is tied directly to the types.Info struct, that is also the one, central place for this); so I'm not sure where else it would make sense to do this; could you please elaborate a bit on where you'd rather have this? Or are you suggesting we should start breaking up types.Info into multiple structs?

I'm a bit concerned about using this for scheduling as we'll now end up with two vectors to schedule (image manifest / platform as advertised by engine info), which may not provide the same information (e.g. Architecture may also have to be taken into account, as well as isolation (hyper-v / process isolation).

Agreed that there is food for thought here, as there is a lot of info we could use to match images & nodes that we don't really make use of just yet (we pretty much only use the architecture for now); however, that seems to be a discussion more relevant on the upcoming swarmkit change. This patch here is only concerned with providing more info to swarmkit.
We've been discussing that scheduling change with @wsong @anshulpundir and @dhiltgen , notably on moby/swarmkit#2770, and this is the best we've come up with. Happy to brain storm more with with you any time though :)

@thaJeztah
Copy link
Member

That endpoint seems to be the one, central, place where we gather info on hosts (and is tied directly to the types.Info struct, that is also the one, central place for this)

Yes, and I think that's a problem, because the endpoint just about touches every subsystem of dockerd (gathering information about containers (and their state), images, which plugins are available, low-level information from the storage-driver). Some of those have caused issues in the past (e.g. due to locks when determining container state).

I think we need a way to get more fine-grained information out of the daemon, especially if this endpoint is gonna be used more.

Agreed that there is food for thought here, as there is a lot of info we could use to match images & nodes that we don't really make use of just yet

I think the key discussion here is (also see #34875) that we need to finish the standardisation of the "platform" information on images, and set the same information on the daemon. I think the last discussion ended with;

  • review/define the fields that are included in the platform
  • make sure that there's a well-defined format to serialize the platform in a string
  • the platform (and string representation) should allow optional fields (e.g. OS version, and OS features may not be present on all platforms)
  • when parsing the string, those fields can be ignored by a scheduler (and when matching (multi-arch) images with the daemon/runtime)

Related issues;

There was a lengthy discussion on the original PR that added the OS and Architecture; #13921

This patch here is only concerned with providing more info to swarmkit.

Correct; it's probably fine to show this information for informational purposes; in which case the field should carry a big disclaimer that it's meant as-such, and that its content (or format) is not part of the API specification. Similar to what we added to the DriverStatus field;

moby/api/swagger.yaml

Lines 3579 to 3592 in bb7de1f

DriverStatus:
description: |
Information specific to the storage driver, provided as
"label" / "value" pairs.
This information is provided by the storage driver, and formatted
in a way consistent with the output of `docker info` on the command
line.
<p><br /></p>
> **Note**: The information returned in this field, including the
> formatting of values and labels, should not be considered stable,
> and may change without notice.

however (glancing over the discussion on the swarmkit issue), IIUC, the purpose of this change is to influence scheduling decisions (manually or automatically). If the information used here does not match the direction that image-selection will use (the discussion on #34875), there is a risk;

  • user specifies constraint to run service on Windows version Foobar.X.Y.Z nodes
  • service fails to deploy, because manifest selection uses a different algorithm
  • or service fails to deploy because the image is not compatible
  • confusion..

Note that for scheduling, another approach would be to set up engine (or node-) labels, and set constraints based on those labels.

@wk8
Copy link
Contributor Author

wk8 commented Dec 13, 2018

@thaJeztah thanks for your very interesting & extensive explanation, I understand much better where you're coming from now.

May I suggest then that while the specs solidify on the image manifest side of things as to how to normalize how to report arch & OS info, we still get this in with a big caveat in the doc pointing out this field is temporary and subject to change without notice? Then when that's normalized I'll come back and change both this and swarmkit accordingly.

I think we need a way to get more fine-grained information out of the daemon, especially if this endpoint is gonna be used more.

Agreed on this, too. We should start doing this as part of the long-term fix too, when it's clear how we want to report host info.

Addressing the rest of your comments ASAP.

@codecov
Copy link

codecov bot commented Dec 13, 2018

Codecov Report

❗ No coverage uploaded for pull request base (master@1d5748d). Click here to learn what that means.
The diff coverage is 63.63%.

@@            Coverage Diff            @@
##             master   #38349   +/-   ##
=========================================
  Coverage          ?   37.06%           
=========================================
  Files             ?      611           
  Lines             ?    45548           
  Branches          ?        0           
=========================================
  Hits              ?    16881           
  Misses            ?    26384           
  Partials          ?     2283

@thaJeztah
Copy link
Member

Discussing if we can come up with alternatives for this use-case, but will have to do a bit more thinking about that

@thaJeztah
Copy link
Member

Discussing in the maintainers meeting (again 😅 );

  • can you add this to the prometheus metrics?
  • can you cache the results?

Other than that, we're ok with this change

@wk8
Copy link
Contributor Author

wk8 commented Jan 24, 2019

Great, thanks @thaJeztah ! :)

@GordonTheTurtle GordonTheTurtle added the dco/no Automatically set by a bot when one of the commits lacks proper signature label Feb 1, 2019
@GordonTheTurtle
Copy link

Please sign your commits following these rules:
https://github.com/moby/moby/blob/master/CONTRIBUTING.md#sign-your-work
The easiest way to do this is to amend the last commit:

$ git clone -b "wk8/os_version" git@github.com:wk8/moby.git somewhere
$ cd somewhere
$ git rebase -i HEAD~842354462472
editor opens
change each 'pick' to 'edit'
save the file and quit
$ git commit --amend -s --no-edit
$ git rebase --continue # and repeat the amend for each commit
$ git push -f

Amending updates the existing PR. You DO NOT need to open a new one.

@GordonTheTurtle GordonTheTurtle removed the dco/no Automatically set by a bot when one of the commits lacks proper signature label Feb 1, 2019
@wk8 wk8 force-pushed the wk8/os_version branch 5 times, most recently from aa4ff36 to 032df29 Compare February 2, 2019 01:32
@wk8
Copy link
Contributor Author

wk8 commented Feb 2, 2019

@thaJeztah : amended as requested, could you please have another look? Thanks!

@wk8
Copy link
Contributor Author

wk8 commented May 30, 2019

@thaJeztah : amended as requested per @cpuguy83 . Thanks.

@wk8 wk8 force-pushed the wk8/os_version branch 11 times, most recently from 2a68360 to ab5ec75 Compare June 6, 2019 18:10
Copy link
Member

@cpuguy83 cpuguy83 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@wk8 wk8 force-pushed the wk8/os_version branch 5 times, most recently from 17723b8 to 9e663e2 Compare June 6, 2019 22:35
This is needed so that we can add OS version constraints in Swarmkit, which
does require the engine to report its host's OS version (see
moby/swarmkit#2770).

The OS version is parsed from the `os-release` file on Linux, and from the
`ReleaseId` string value of the `SOFTWARE\Microsoft\Windows NT\CurrentVersion`
registry key on Windows.

Added unit tests when possible, as well as Prometheus metrics.

Signed-off-by: Jean Rouge <rougej+github@gmail.com>
@wk8 wk8 force-pushed the wk8/os_version branch from 9e663e2 to d363a18 Compare June 6, 2019 22:40
@thaJeztah thaJeztah merged commit 28678f2 into moby:master Jun 7, 2019
thaJeztah added a commit to thaJeztah/docker that referenced this pull request Jun 21, 2019
This was added in d363a18 (moby#38349),
but not yet added to the API history.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
docker-jenkins pushed a commit to docker-archive/docker-ce that referenced this pull request Jul 3, 2019
This was added in d363a1881ec256e1be5a48a90046ff16e84ddc02 (moby/moby#38349),
but not yet added to the API history.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: 53430f5fc14feb6d47d4bab740cf9d376e1c74aa
Component: engine
burnMyDread pushed a commit to burnMyDread/moby that referenced this pull request Oct 21, 2019
This was added in d363a18 (moby#38349),
but not yet added to the API history.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Signed-off-by: zach <Zachary.Joyner@linux.com>
@thaJeztah thaJeztah added this to the 20.03.0 milestone Apr 2, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants