From a container’s point of view, networking on a plain Docker host is simple. A running container is nothing more than a Linux process that is namespaced and constrained with regard to access (SELinux) and resource consumption (cgroups). Inside each container’s network namespace there is a single (virtual) network interface called eth0, which is assigned an IP address chosen by the docker daemon (and a matching MAC address that is guaranteed to be conflict-free). Docker allocates these IP addresses from the RFC1918 private range 172.17.0.0/16. All ingress and egress network packets of the container use this interface. In the container’s file system, the docker daemon overlays the files /etc/hostname, /etc/hosts and /etc/resolv.conf so that network-related services such as DNS behave as expected.
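To see this for a running container, a couple of quick checks are enough. The container name "web" below is only a placeholder, and the mount check assumes the image ships the mount utility:
[code language="plain"]# IP and MAC address that the docker daemon assigned to the container
docker inspect -f '{{ .NetworkSettings.IPAddress }}' web
docker inspect -f '{{ .NetworkSettings.MacAddress }}' web
# The overlaid network files show up as bind mounts inside the container
docker exec web mount | grep -E '/etc/(hostname|hosts|resolv.conf)'
[/code]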
This single virtual network interface does NOT mean that there cannot be dedicated NICs, e.g. for administrative purposes or to isolate storage traffic. Rather, those NICs exist on the host: a container accesses NAS/SAN storage via Linux filesystem and device API calls against the Linux kernel, which then sends the network packets out via the appropriate NIC. The same applies to administrative networks – the administrator connects to the docker host and issues administrative commands to the docker daemon from there.
The container’s virtual eth0 network interface is connected via a peer interface to a virtual bridge device outside of the container’s namespace, usually called docker0. All containers running on the same host can communicate with each other via this bridge. This peer interface is given a random name starting with veth.
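Under the hood this wiring is plain iproute2 plumbing. The following is only a rough sketch of what the docker daemon automates for you; the interface names and the container PID are made up for illustration:
[code language="plain"]# Create a veth pair (docker picks random names, these are illustrative)
ip link add vethHOST type veth peer name vethCONT
# Attach the host-side end to the docker0 bridge and bring it up
ip link set vethHOST master docker0 up
# Move the other end into the container's network namespace (12345 is an example PID)
ip link set vethCONT netns 12345
# Inside that namespace the interface is then renamed to eth0 and given its IP address
[/code]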
The docker host’s physical NICs, on the other hand, are not attached to this bridge (as the brctl listing further below confirms). Instead, the Linux kernel routes traffic between the bridge and the physical network, with IPTables masquerading (source NAT) enabled to allow outbound communication into the host network’s IP address space. For inbound traffic (when a container has been launched with the -p or -P options), destination NAT is set up to forward specific host ports to a container.
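These rules can be inspected directly on the host. The commands below only read the NAT table that docker maintains; the published port mapping 8080→80 is just an example:
[code language="plain"]# Source NAT (masquerading) for outbound traffic from the container IP range
iptables -t nat -L POSTROUTING -n
# Per-container DNAT rules for published ports live in the DOCKER chain
iptables -t nat -L DOCKER -n
# Publishing a port adds such a DNAT rule, e.g. host port 8080 -> container port 80
docker run -d -p 8080:80 nginx
[/code]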
Network flows on a plain docker host are therefore as follows. Most of these steps are not routing hops, but a packet being handed from one virtual layer-2 device to the next on the same node, which keeps the overhead comparatively low:
- Administrative traffic: admin network → Docker Host
- SAN/NAS traffic: Container API call → Linux Kernel → storage network
- Network traffic between Containers: ContainerA eth0 → vethXXX → docker0 bridge → vethYYY → ContainerB eth0
- Outbound from Container: Container eth0 → vethXXX → docker0 bridge → IPTables NAT → host network
- Inbound to Container (“host port”): Host network → IPTables DNAT → docker0 bridge → vethXXX → Container eth0
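These flows can also be observed with tcpdump on the host; the ICMP filter and the destination address below are only examples:
[code language="plain"]# Traffic between two containers stays on the bridge and never touches a physical NIC
tcpdump -n -i docker0 icmp
# Outbound container traffic appears on the physical NIC with the host's (masqueraded) source address
tcpdump -n -i eth0 host 8.8.8.8
[/code]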
A practical example
The following listing shows the network interfaces of a plain docker host (1) loopback, (2) eth0 external, (3) eth1 administration and (4) the docker0 bridge:
[code language="plain"][root@test-rhel7 ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 52:54:00:11:0d:aa brd ff:ff:ff:ff:ff:ff
    inet 192.168.100.22/24 brd 192.168.100.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::5054:ff:fe11:daa/64 scope link
       valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 52:54:00:17:5d:ae brd ff:ff:ff:ff:ff:ff
    inet 192.168.103.22/24 brd 192.168.103.255 scope global eth1
       valid_lft forever preferred_lft forever
    inet6 fe80::5054:ff:fe17:5dae/64 scope link
       valid_lft forever preferred_lft forever
4: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN
    link/ether 02:42:a9:b8:55:0b brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 scope global docker0
       valid_lft forever preferred_lft forever
[root@test-rhel7 ~]#
[/code]
A container’s namespace only knows (1) loopback and (7) eth0, as shown in the following listing. Notice that eth0 has “@if8” appended, which is the interface index of its peer device outside the container’s namespace.
[code language="plain"][root@test-rhel7 ~]# docker run -i -t centos /bin/sh
[…truncated…]
sh-4.2# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
7: eth0@if8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
    link/ether 02:42:ac:11:00:02 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 172.17.0.2/16 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::42:acff:fe11:2/64 scope link
       valid_lft forever preferred_lft forever
sh-4.2#[/code]
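If only the peer index is of interest, the kernel also exposes it via sysfs; the values match the indices shown in the listings:
[code language="plain"]# Inside the container: index of the host-side peer of eth0 (prints 8 in this example)
cat /sys/class/net/eth0/iflink
# On the host: index of the container-side peer of the veth interface (prints 7)
cat /sys/class/net/veth268c73c/iflink
[/code]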
In the host’s namespace, launching the container has created an additional interface, veth268c73c@if7. The first part of the name is randomly chosen; the second part, “@if7”, indicates that it connects directly to the eth0 interface inside the container, which has interface index 7.
[code language="plain"][root@test-rhel7 ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 52:54:00:11:0d:aa brd ff:ff:ff:ff:ff:ff
    inet 192.168.100.22/24 brd 192.168.100.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::5054:ff:fe11:daa/64 scope link
       valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 52:54:00:17:5d:ae brd ff:ff:ff:ff:ff:ff
    inet 192.168.103.22/24 brd 192.168.103.255 scope global eth1
       valid_lft forever preferred_lft forever
    inet6 fe80::5054:ff:fe17:5dae/64 scope link
       valid_lft forever preferred_lft forever
4: docker0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
    link/ether 02:42:a9:b8:55:0b brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 scope global docker0
       valid_lft forever preferred_lft forever
    inet6 fe80::42:a9ff:feb8:550b/64 scope link
       valid_lft forever preferred_lft forever
8: veth268c73c@if7: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP
    link/ether ae:c9:d3:7c:55:36 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet6 fe80::acc9:d3ff:fe7c:5536/64 scope link
       valid_lft forever preferred_lft forever
[root@test-rhel7 ~]#[/code]
That the container’s veth peer is indeed attached to the docker0 bridge can be seen in the following listing:
[code language="plain"][root@test-rhel7 ~]# brctl show docker0
bridge name     bridge id               STP enabled     interfaces
docker0         8000.0242a9b8550b       no              veth268c73c
[root@test-rhel7 ~]#[/code]
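On hosts without the bridge-utils package, recent iproute2 versions can show the same information:
[code language="plain"]# List the interfaces enslaved to the docker0 bridge
ip link show master docker0
# Or use the bridge tool that ships with iproute2
bridge link show
[/code]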