Description
With `docker run --rm -d -p 127.0.0.1:8081:80 traefik/whoami`, it is not expected that hosts on a common network could access the service, but they can due to `iptables` rules managed by Docker.
For example, the container on the docker host (`192.168.42.10`) could be reached at `127.0.0.1:8081` or `172.17.0.3:80` from a separate host at `192.168.42.20`.
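For context, the rules and sysctls involved can be inspected on the docker host; the chain and interface names below assume a default bridge network setup:

```bash
# NAT rules Docker manages for published ports (expect a DNAT entry for the 127.0.0.1:8081 publish):
iptables -t nat -S DOCKER
# FORWARD rules that hand bridge traffic over to the DOCKER filter chain:
iptables -S FORWARD
# Related sysctls: Docker enables IP forwarding; route_localnet on docker0 is tied to the userland-proxy setting:
sysctl net.ipv4.ip_forward
sysctl net.ipv4.conf.docker0.route_localnet
```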
Reproduce
Run the following commands:
- Docker host (`192.168.42.10`):

  Technically only the 2nd container (`8081`) is relevant for reproduction (the container IPs assumed below can be confirmed as sketched after these steps):
  - Public is expected to be accessible via LAN IP.
  - Internal is not expected to be accessible outside of the docker host.
  - Private is not accessible due to no published port (unless the `FORWARD` chain is set to `ACCEPT`).

  ```bash
  # Default binding address: `0.0.0.0`:
  docker run --rm -d -p 8080:80 --name public traefik/whoami
  # Explicitly only accessible internally via localhost (or container IP: 172.17.0.3:80):
  docker run --rm -d -p 127.0.0.1:8081:80 --name internal traefik/whoami
  # Only reachable via container IP (172.17.0.4:80):
  docker run --rm -d --name private traefik/whoami
  ```
- Neighbour host (`192.168.42.20`, same LAN):

  ```bash
  # NOTE: Firewalld prevents access via docker zone
  # Route to the `docker0` bridge at the docker host:
  ip route add 172.17.0.0/16 via 192.168.42.10
  # LAN host successfully connects to container at docker host via published container port:
  curl 172.17.0.3:80
  ```

  ```bash
  # Route to 127.0.0.1 at the docker host:
  ip addr add 127.0.0.2/8 dev lo
  ip addr del 127.0.0.1/8 dev lo
  ip route add 127.0.0.1 via 192.168.42.10
  # NOTE: Alternatively `all` could instead be the common LAN interface (eg: `eth1`):
  sysctl net.ipv4.conf.all.route_localnet=1
  # LAN host successfully connects to container at docker host via published host port:
  curl 127.0.0.1:8081
  ```
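The container IPs referenced in these steps (`172.17.0.3`, `172.17.0.4`) depend on start order; they can be confirmed on the docker host, for example:

```bash
# Print the bridge IP assigned to each test container (values may differ from the examples above):
docker inspect -f '{{.NetworkSettings.IPAddress}}' public internal private
```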
`nmap` can be used to identify which ports are reachable at the docker host (only the ports published by Docker).
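A minimal sketch of such a scan, run from the neighbour host with the routes from the reproduction steps in place (port selections are illustrative):

```bash
# Probe the docker host's LAN IP; the port published on 0.0.0.0 (8080) should be reported open:
nmap -Pn -p 1-9000 192.168.42.10
# With the 172.17.0.0/16 route added, container IPs can be probed directly;
# only containers with a published port accept the connection:
nmap -Pn -p 80 172.17.0.2-10
# With the loopback route trick in place, the port published to 127.0.0.1 (8081) is reachable too:
nmap -Pn -p 8081 127.0.0.1
```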
Expected behavior
I did not expect hosts on the same network to be able to reach private subnets of a separate host, such as `127.0.0.1` or `172.17.0.0/16`.

Ports bound on `127.0.0.1` by services not running in containers were not reachable, since (AFAIK) there are no equivalent `iptables` rules permitting that. It is unclear whether Docker could be more restrictive about this routing requirement.
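A quick way to sanity-check that observation is to compare a plain host service on loopback against the published container port (port 9000 and the exact failure mode are assumptions of this sketch):

```bash
# On the docker host: a non-container service bound only to loopback:
python3 -m http.server 9000 --bind 127.0.0.1 &

# On the neighbour host (with the loopback route trick from the reproduction in place):
curl 127.0.0.1:8081   # container port published to 127.0.0.1 - reachable
curl 127.0.0.1:9000   # plain host service - expected to fail (no DNAT rule, and route_localnet is not set on the LAN interface)
```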
docker version
Client: Docker Engine - Community
Version: 24.0.1
API version: 1.43
Go version: go1.20.4
Git commit: 6802122
Built: Fri May 19 18:07:52 2023
OS/Arch: linux/amd64
Context: default
Server: Docker Engine - Community
Engine:
Version: 24.0.1
API version: 1.43 (minimum version 1.12)
Go version: go1.20.4
Git commit: 463850e
Built: Fri May 19 18:06:17 2023
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: 1.6.21
GitCommit: 3dce8eb055cbb6872793272b4f20ed16117344f8
runc:
Version: 1.1.7
GitCommit: v1.1.7-0-g860f061
docker-init:
Version: 0.19.0
GitCommit: de40ad0
docker info
Client: Docker Engine - Community
Version: 24.0.1
Context: default
Debug Mode: false
Plugins:
buildx: Docker Buildx (Docker Inc.)
Version: v0.10.4
Path: /usr/libexec/docker/cli-plugins/docker-buildx
compose: Docker Compose (Docker Inc.)
Version: v2.18.1
Path: /usr/libexec/docker/cli-plugins/docker-compose
Server:
Containers: 3
Running: 3
Paused: 0
Stopped: 0
Images: 1
Server Version: 24.0.1
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Using metacopy: false
Native Overlay Diff: true
userxattr: false
Logging Driver: json-file
Cgroup Driver: systemd
Cgroup Version: 2
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: io.containerd.runc.v2 runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 3dce8eb055cbb6872793272b4f20ed16117344f8
runc version: v1.1.7-0-g860f061
init version: de40ad0
Security Options:
seccomp
Profile: builtin
cgroupns
Kernel Version: 6.2.12-200.fc37.x86_64
Operating System: Fedora Linux 37 (Server Edition)
OSType: linux
Architecture: x86_64
CPUs: 1
Total Memory: 947.4MiB
Name: vpc-fedora
ID: 91d9ebe9-0988-4d55-9030-e7cff48f5dd2
Docker Root Dir: /var/lib/docker
Debug Mode: false
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
Additional Info
Reproduced with:
- `userland-proxy` enabled / disabled.
- Ubuntu 23.04 (UFW) and Fedora 37 (Firewalld, where the `docker` zone prevents bridge access).
- Two VMware guests (Arch Linux, no firewall) on a NAT network.
- Two VPS instances on Vultr connected with a VPC private network (some vendors apparently prevent this traffic).
The scope is apparently limited to layer 2 network switching (not my area of expertise). Some other cloud providers, such as AWS, are not affected within their VPC by default; allowing this traffic requires opt-in config.
Cause and mitigation options
Docker enables `sysctl net.ipv4.ip_forward=1` and adds `iptables` chain rules (`FORWARD` => `DOCKER`) to support the published ports; this exposure is a side-effect of those changes.
- Routing to `127.0.0.1` can be mitigated via an additional constraint on the `PREROUTING` NAT rule (rough host-side sketches of both mitigations follow below).
- Routing to `172.16.0.0/12` (or similar docker networks) can at least be mitigated via the Firewalld `docker` zone.
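These are not the exact constraints Docker itself would add to its `PREROUTING`/`DOCKER` NAT rules, but a rough host-side sketch of both mitigations (assuming `eth1` is the shared LAN interface; adjust names and subnets to your setup):

```bash
# Block LAN-ingress traffic destined for the docker bridge subnet before the FORWARD/DOCKER rules accept it
# (DOCKER-USER is the filter chain Docker reserves for user-managed rules):
iptables -I DOCKER-USER -i eth1 -d 172.17.0.0/16 -j DROP
# Drop packets arriving on the LAN interface with a loopback destination, so the
# 127.0.0.1:8081 DNAT can no longer be triggered from a neighbouring host:
iptables -t raw -I PREROUTING -i eth1 -d 127.0.0.0/8 -j DROP
```

On Firewalld hosts the `docker` zone already covers the bridge-subnet case, as noted above.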
Related past vulnerability with ip_forward=1
A related issue was reported and resolved years ago. The vulnerability here has a smaller scope (only access to containers via explicitly published ports, instead of all container ports), but may still pose a risk (indirect access to such containers on the docker host's VPN network?).

This is presumably still a valid concern on untrusted networks (cafe / airport wifi), or on trusted networks (home / corporate) if a LAN host were compromised.

You can find comments within existing issues from many years ago detailing how to perform this (a recent example (Nov 2021)), as well as other sources outside of the moby repo that discuss it. Hence the public report, as this had already been disclosed publicly.
Related past vulnerability from route_localnet=1 elsewhere
This started as an investigation into `userland-proxy: false` setting `sysctl net.ipv4.conf.docker0.route_localnet=1`, which doesn't appear to be a risk (as detailed in point 3 here).
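For anyone verifying that, the per-interface values can be listed on the docker host (the expectation below reflects the default bridge and `userland-proxy: false`):

```bash
# List route_localnet per interface; only docker bridge interfaces (e.g. docker0) are expected
# to be 1, while `all` and the LAN interface should remain 0:
sysctl -a 2>/dev/null | grep route_localnet
```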
I had seen a similar vulnerability in Kubernetes which I wanted to verify (I had less familiarity with `route_localnet` at the time):
- CVE-2020-8558: Node setting allows for neighboring hosts to bypass localhost boundary kubernetes/kubernetes#92315
- net.ipv4.conf.all.route_localnet=1 opens security issue kubernetes/kubernetes#90259 (comment)
- net.ipv4.conf.all.route_localnet=1 opens security issue kubernetes/kubernetes#90259 (comment)
- Do not set sysctlRouteLocalnet (CVE-2020-8558) kubernetes/kubernetes#92938 (comment)
- Do not set sysctlRouteLocalnet (CVE-2020-8558) kubernetes/kubernetes#92938 (comment)
- Do not set sysctlRouteLocalnet (CVE-2020-8558) kubernetes/kubernetes#92938 (comment)
- kubelet: block non-forwarded packets from crossing the localhost boundary kubernetes/kubernetes#91569
- Ipvs: NodePort entries for localhost (127.0.0.1, ::1) are created but can't be used kubernetes/kubernetes#96879
That vulnerability, although similar, differs:
- It allowed reaching any port on `127.0.0.1` via the common LAN interface route, while the moby one is constrained to published ports only. moby only sets `route_localnet=1` for its own bridge networks (not all interfaces via `sysctl net.ipv4.conf.all.route_localnet=1` on the docker host, although narrowing that down to the common LAN interface would be sufficient).
- Both firewall frontends protect against the attack (unpublished ports on `127.0.0.1`); quick checks are sketched below:
  - UFW (sets the `INPUT` default policy to `DROP`)
  - Firewalld (LAN interface in a zone with `target: default`)
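Quick checks for those two frontends (the interface and zone names, `eth1` and `FedoraServer`, are illustrative):

```bash
# UFW host: the filter INPUT chain default policy should be DROP:
iptables -S INPUT | head -n 1
# Firewalld host: confirm the LAN interface's zone and that its target is `default`:
firewall-cmd --get-zone-of-interface=eth1
firewall-cmd --info-zone=FedoraServer | grep target
```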