Docker Resource Limits: Memory

1. Stress-Testing Tool

Build a simple test image from the following Dockerfile:

➜  cat Dockerfile
FROM ubuntu:latest

RUN apt-get update && \
    apt-get install -y stress
➜   docker build -t ubuntu-stress:latest .

2. Memory Tests

  • Docker currently supports the following memory limit options:

    • -m, --memory=""
      • Memory limit (format: <number>[<unit>]). Number is a positive integer. Unit can be one of b, k, m, or g. Minimum is 4M.
    • --memory-swap=""
      • Total memory limit (memory + swap, format: <number>[<unit>]). Number is a positive integer. Unit can be one of b, k, m, or g.
    • --memory-swappiness=""
      • Tune a container’s memory swappiness behavior. Accepts an integer between 0 and 100.
    • --shm-size=""
      • Size of /dev/shm. The format is <number><unit>. number must be greater than 0. Unit is optional and can be b (bytes), k (kilobytes), m (megabytes), or g(gigabytes). If you omit the unit, the system uses bytes. If you omit the size entirely, the system uses 64m.
      • Set this according to actual needs; it is not covered in detail here.
    • --memory-reservation=""
      • Memory soft limit (format: <number>[<unit>]). Number is a positive integer. Unit can be one of b, k, m, or g.
    • --kernel-memory=""
      • Kernel memory limit (format: <number>[<unit>]). Number is a positive integer. Unit can be one of b, k, m, or g. Minimum is 4M.
      • Unless there is a specific need, kernel memory does not have to be set.
    • --oom-kill-disable=false
      • Whether to disable OOM Killer for the container or not.
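
The <number>[<unit>] format used by these options can be illustrated with a small conversion helper. This is only a sketch: the function name to_bytes is hypothetical, and it assumes the units are 1024-based, which matches the 104.9 MB that docker stats reports for -m 100M in the tests later in this article.

```shell
#!/usr/bin/env bash
# Convert a Docker-style size string (<number>[<unit>]) to bytes.
# Assumption: units b, k, m, g are case-insensitive and 1024-based;
# a missing unit means bytes. to_bytes is a hypothetical helper name.
to_bytes() {
    local value="$1" number unit
    number="${value%[bBkKmMgG]}"      # strip one trailing unit letter, if any
    unit="${value#"$number"}"         # the stripped letter (may be empty)
    case "$number" in
        ''|*[!0-9]*) echo "invalid size: $value" >&2; return 1 ;;
    esac
    case "$unit" in
        ''|b|B) echo $(( 10#$number )) ;;
        k|K)    echo $(( 10#$number * 1024 )) ;;
        m|M)    echo $(( 10#$number * 1024 * 1024 )) ;;
        g|G)    echo $(( 10#$number * 1024 * 1024 * 1024 )) ;;
    esac
}
```

For example, to_bytes 100M prints 104857600, which is the 100M limit used throughout section 2.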

By default, a container is started with no memory limit at all.

➜  ~ docker help run | grep memory  # tested with Docker 1.10.2 on an Ubuntu 14.04.1 host
  --kernel-memory                 Kernel memory limit
  -m, --memory                    Memory limit
  --memory-reservation            Memory soft limit
  --memory-swap                   Swap limit equal to memory plus swap: '-1' to enable unlimited swap
  --memory-swappiness=-1          Tune container memory swappiness (0 to 100)
➜  ~

2.1 -m ... --memory-swap ...

  • docker run -it --rm -m 100M --memory-swap -1 ubuntu-stress:latest /bin/bash

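This sets a memory limit and sets memory-swap to -1, meaning that the memory available to processes in the container is limited while swap usage is not: the container can use as much swap as the host provides. (A positive --memory-swap value smaller than --memory is not a valid combination; Docker rejects it rather than applying it.)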

➜  ~ docker run -it --rm -m 100M --memory-swap -1 ubuntu-stress:latest /bin/bash
root@4b61f98e787d:/# stress --vm 1 --vm-bytes 1000M  # stress-test the container's memory with the stress tool
stress: info: [14] dispatching hogs: 0 cpu, 0 io, 1 vm, 0 hdd

Use docker stats to check the container's current memory usage:

➜  ~ docker stats 4b61f98e787d
CONTAINER           CPU %               MEM USAGE/LIMIT     MEM %               NET I/O
4b61f98e787d        6.74%               104.8 MB/104.9 MB   99.94%              4.625 kB/648 B

Monitor the memory usage of the stress process in real time with top:

➜  ~ pgrep stress
8209
8210    # the child process is the one to inspect
➜  ~ top -p 8210    # shows that stress has a RES of about 100m and a VIRT of 1007m
top - 17:51:31 up 35 min,  2 users,  load average: 1.14, 1.11, 1.06
Tasks:   1 total,   0 running,   1 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.2 us,  3.1 sy,  0.0 ni, 74.8 id, 21.9 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem:   8102564 total,  6397064 used,  1705500 free,   182864 buffers
KiB Swap: 15625212 total,  1030028 used, 14595184 free.  4113952 cached Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 8210 root      20   0 1007.1m 100.3m   0.6m D  13.1  1.3   0:22.59 stress

The swap usage of the stress processes can also be obtained with the following command:

➜  ~ for file in /proc/*/status ; do awk '/VmSwap|Name/{printf $2 " " $3}END{ print ""}' $file; done | sort -k 2 -n -r | grep stress
stress 921716 kB
stress 96 kB
➜  ~
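
The one-liner above can be split into two small functions that are easier to reuse. This is a sketch only: proc_swap and swap_by_name are hypothetical names, and only the first word of a process name is kept.

```shell
#!/usr/bin/env bash
# Print "<name> <swap-kB>" for one /proc/<pid>/status-style file.
# Files without a VmSwap line (e.g. kernel threads) print nothing.
proc_swap() {
    awk '/^Name:/   { name = $2 }
         /^VmSwap:/ { swap = $2; found = 1 }
         END        { if (found) print name, swap }' "$1"
}

# Report swap usage for every process named $1, largest first.
swap_by_name() {
    local f
    for f in /proc/[0-9]*/status; do
        proc_swap "$f" 2>/dev/null
    done | awk -v n="$1" '$1 == n' | sort -k2,2nr
}
```

On the host, swap_by_name stress would list one line per stress process, with the swap figure in kB.
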
  • docker run -it --rm -m 100M ubuntu-stress:latest /bin/bash

According to the official documentation, if -m is specified without --memory-swap, processes in the container can use 100M of memory and 100M of swap. By default, --memory-swap is set to twice the value of memory.

We set memory limit(300M) only, this means the processes in the container can use 300M memory and 300M swap memory, by default, the total virtual memory size --memory-swap will be set as double of memory, in this case, memory + swap would be 2*300M, so processes can use 300M swap memory as well.

If running the container as above prints the following warning, follow the steps listed after it:

WARNING: Your kernel does not support swap limit capabilities, memory limited without swap.
  • To enable memory and swap on system using GNU GRUB (GNU GRand Unified Bootloader), do the following:
    • Log into Ubuntu as a user with sudo privileges.
    • Edit the /etc/default/grub file.
    • Set the GRUB_CMDLINE_LINUX value as follows:
      • GRUB_CMDLINE_LINUX="cgroup_enable=memory swapaccount=1"
    • Save and close the file.
    • Update GRUB.
      • $ sudo update-grub
    • Reboot your system.
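
After the reboot, one way to confirm that swap accounting is active is to look for the memsw control file in the memory cgroup. A minimal sketch, assuming a cgroup v1 hierarchy mounted at /sys/fs/cgroup/memory (the path used later in this article); swap_accounting_enabled is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Return 0 if the cgroup v1 memory controller exposes swap accounting.
# $1: mount point of the memory cgroup (assumed default: /sys/fs/cgroup/memory)
swap_accounting_enabled() {
    local root="${1:-/sys/fs/cgroup/memory}"
    [ -e "$root/memory.memsw.limit_in_bytes" ]
}
```

Calling swap_accounting_enabled && echo "swap limit supported" on the host should print the message once swapaccount=1 has taken effect.
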
➜  ~ docker run -it --rm -m 100M ubuntu-stress:latest /bin/bash
root@<container-id>:/# stress --vm 1 --vm-bytes 200M # at 200M the stress process is killed immediately
stress: info: [17] dispatching hogs: 0 cpu, 0 io, 1 vm, 0 hdd
stress: FAIL: [17] (416) <-- worker 18 got signal 9
stress: WARN: [17] (418) now reaping child worker processes
stress: FAIL: [17] (452) failed run completed in 2s
root@<container-id>:/# stress --vm 1 --vm-bytes 199M

Resource usage as reported by docker stats and top:

➜  ~ docker stats ed670cdcb472
CONTAINER           CPU %               MEM USAGE / LIMIT     MEM %               NET I/O             BLOCK I/O
ed670cdcb472        13.35%              104.3 MB / 104.9 MB   99.48%              6.163 kB / 648 B    26.23 GB / 29.21 GB
➜  ~ pgrep stress
16322
16323
➜  ~ top -p 16323
top - 18:12:31 up 56 min,  2 users,  load average: 1.07, 1.07, 1.05
Tasks:   1 total,   0 running,   1 sleeping,   0 stopped,   0 zombie
%Cpu(s):  4.8 us,  4.0 sy,  0.0 ni, 69.6 id, 21.4 wa,  0.0 hi,  0.2 si,  0.0 st
KiB Mem:   8102564 total,  6403040 used,  1699524 free,   184124 buffers
KiB Swap: 15625212 total,   149996 used, 15475216 free.  4110440 cached Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
16323 root      20   0  206.1m  91.5m   0.6m D   9.9  1.2   0:52.58 stress
  • docker run -it --rm -m 100M --memory-swap 400M ubuntu-stress:latest /bin/bash
➜  ~ docker run -it --rm -m 100M --memory-swap 400M ubuntu-stress:latest /bin/bash
root@5ed1fd88a1aa:/# stress --vm 1 --vm-bytes 400M  # at 400M the process is killed
stress: info: [24] dispatching hogs: 0 cpu, 0 io, 1 vm, 0 hdd
stress: FAIL: [24] (416) <-- worker 25 got signal 9
stress: WARN: [24] (418) now reaping child worker processes
stress: FAIL: [24] (452) failed run completed in 3s
root@5ed1fd88a1aa:/# stress --vm 1 --vm-bytes 399M  # at 399M the process just manages to run (this is right at the boundary, so it may still be killed)

Resource usage as reported by docker stats and top:

➜  ~ docker stats 5ed1fd88a1aa
CONTAINER           CPU %               MEM USAGE / LIMIT     MEM %               NET I/O             BLOCK I/O
5ed                 12.44%              104.8 MB / 104.9 MB   99.92%              4.861 kB / 648 B    9.138 GB / 10.16 GB
➜  ~ pgrep stress
22721
22722
➜  ~ top -p 22722
top - 18:18:58 up  1:02,  2 users,  load average: 1.04, 1.04, 1.05
Tasks:   1 total,   0 running,   1 sleeping,   0 stopped,   0 zombie
%Cpu(s):  1.4 us,  3.3 sy,  0.0 ni, 73.7 id, 21.6 wa,  0.0 hi,  0.1 si,  0.0 st
KiB Mem:   8102564 total,  6397416 used,  1705148 free,   184608 buffers
KiB Swap: 15625212 total,   366160 used, 15259052 free.  4102076 cached Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
22722 root      20   0  406.1m  84.1m   0.7m D  11.7  1.1   0:08.82 stress

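The tests show that -m is the upper bound on physical memory, while --memory-swap is the sum of memory + swap. When the stress size reaches the --memory-swap limit, the process in the container is OOM-killed outright.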
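
The three cases tested in this subsection can be summarised as simple arithmetic. The sketch below works in MiB and uses a hypothetical helper name, swap_allowance_mib; treating an unset --memory-swap as twice the memory follows the behaviour described above, not a separate guarantee.

```shell
#!/usr/bin/env bash
# Print how much swap a container may use, in MiB, or "unlimited".
# $1: --memory in MiB
# $2: --memory-swap in MiB; -1 = unlimited swap, empty or 0 = not set
#     (assumed to default to 2 * memory, as observed in this article)
swap_allowance_mib() {
    local memory="$1" memory_swap="${2:-0}"
    if [ "$memory_swap" -eq -1 ]; then
        echo unlimited
    elif [ "$memory_swap" -eq 0 ]; then
        echo "$memory"                       # total is 2 * memory
    elif [ "$memory_swap" -lt "$memory" ]; then
        echo "memory-swap must not be smaller than memory" >&2
        return 1
    else
        echo $(( memory_swap - memory ))     # total minus physical memory
    fi
}
```

With -m 100M --memory-swap 400M this gives 300, which is why a 399M stress run survives only by a narrow margin and a 400M run is killed.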

2.2 -m ... --memory-swappiness ...

swappiness can be thought of as the container-level counterpart of the host's /proc/sys/vm/swappiness setting:

Swappiness is a Linux kernel parameter that controls the relative weight given to swapping out runtime memory, as opposed to dropping pages from the system page cache. Swappiness can be set to values between 0 and 100 inclusive. A low value causes the kernel to avoid swapping, a higher value causes the kernel to try to use swap space. Swappiness

--memory-swappiness=0 disables swap for the container (unlike on the host, where setting swappiness to 0 does not guarantee that swap is never used):

  • docker run -it --rm -m 100M --memory-swappiness=0 ubuntu-stress:latest /bin/bash
➜  ~ docker run -it --rm -m 100M --memory-swappiness=0 ubuntu-stress:latest /bin/bash
root@<container-id>:/# stress --vm 1 --vm-bytes 100M  # no leeway at all: killed as soon as it reaches 100M
stress: info: [18] dispatching hogs: 0 cpu, 0 io, 1 vm, 0 hdd
stress: FAIL: [18] (416) <-- worker 19 got signal 9
stress: WARN: [18] (418) now reaping child worker processes
stress: FAIL: [18] (452) failed run completed in 0s
root@<container-id>:/#

2.3 --memory-reservation ...

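The --memory-reservation ... option can be understood as a soft memory limit. If -m is not set, the container's memory usage is effectively unlimited. According to the official documentation, setting a memory reservation ensures that the container does not hold on to a large amount of memory for long periods.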
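
A soft limit is normally combined with a hard limit and set below it. The following sketch checks such a pair before use; check_reservation is a hypothetical helper, both values are in bytes, and the rule that the reservation should not exceed the hard limit is an assumption about how Docker validates these options.

```shell
#!/usr/bin/env bash
# Check a (memory, reservation) pair, both in bytes; 0 means "not set".
# Assumption: a reservation above the hard limit is not a useful setting.
check_reservation() {
    local memory="$1" reservation="$2"
    if [ "$memory" -gt 0 ] && [ "$reservation" -gt 0 ] \
        && [ "$reservation" -gt "$memory" ]; then
        echo "reservation exceeds hard limit" >&2
        return 1
    fi
    return 0
}

# Example invocation with a 500M hard limit and a 200M soft limit:
#   docker run -it --rm -m 500M --memory-reservation 200M ubuntu-stress:latest /bin/bash
```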

2.4 --oom-kill-disable

➜  ~ docker run -it --rm -m 100M --memory-swappiness=0 --oom-kill-disable ubuntu-stress:latest /bin/bash
root@<container-id>:/# stress --vm 1 --vm-bytes 200M  # without --oom-kill-disable this would be OOM-killed outright; with it, the process is not killed even after reaching the memory limit
stress: info: [17] dispatching hogs: 0 cpu, 0 io, 1 vm, 0 hdd

However, adding --oom-kill-disable when the container has no resource limits at all, as below, is rather dangerous:

$ docker run -it --oom-kill-disable ubuntu:14.04 /bin/bash

Because the container's memory is then unlimited and the container cannot be OOM-killed, the system will instead kill system processes to free memory.
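
Whether the OOM killer is disabled for a running container can be read back from its memory.oom_control file in the cgroup v1 hierarchy. A minimal sketch; oom_control_value is a hypothetical helper name, and the file is assumed to contain "key value" lines such as oom_kill_disable and under_oom.

```shell
#!/usr/bin/env bash
# Print the value of one key from a cgroup v1 memory.oom_control file.
# $1: path to memory.oom_control
# $2: key to look up (oom_kill_disable or under_oom)
oom_control_value() {
    awk -v key="$2" '$1 == key { print $2 }' "$1"
}

# Example (path layout as used in section 3 of this article):
#   oom_control_value /sys/fs/cgroup/memory/docker/<container-id>/memory.oom_control oom_kill_disable
```

A result of 1 for oom_kill_disable combined with 1 for under_oom means the container has hit its limit and its processes are being held rather than killed.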

2.5 --kernel-memory

Kernel memory is fundamentally different than user memory as kernel memory can’t be swapped out. The inability to swap makes it possible for the container to block system services by consuming too much kernel memory. Kernel memory includes:

  • stack pages
  • slab pages
  • sockets memory pressure
  • tcp memory pressure

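The description above is quoted directly from the official Docker documentation. Unless there is a specific need, kernel-memory generally does not have to be set, so it is not discussed further here.

3. Docker Source Code for Memory Resource Limits

Docker's resource limits rely mainly on Linux cgroups. For how cgroups implement resource limits, see: Docker背後的核心知識——cgroups資源限制. The relevant options are applied in libcontainer: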

  • github.com/opencontainers/runc/libcontainer/cgroups/fs/memory.go
func (s *MemoryGroup) Set(path string, cgroup *configs.Cgroup) error {
    if cgroup.Resources.Memory != 0 {
        if err := writeFile(path, "memory.limit_in_bytes", strconv.FormatInt(cgroup.Resources.Memory, 10)); err != nil {
            return err
        }
    }
    if cgroup.Resources.MemoryReservation != 0 {
        if err := writeFile(path, "memory.soft_limit_in_bytes", strconv.FormatInt(cgroup.Resources.MemoryReservation, 10)); err != nil {
            return err
        }
    }
    // If MemorySwap is not set, the resulting cgroup value is twice Memory; see the test below.
    if cgroup.Resources.MemorySwap > 0 {
        if err := writeFile(path, "memory.memsw.limit_in_bytes", strconv.FormatInt(cgroup.Resources.MemorySwap, 10)); err != nil {
            return err
        }
    }
    if cgroup.Resources.OomKillDisable {
        if err := writeFile(path, "memory.oom_control", "1"); err != nil {
            return err
        }
    }
    if cgroup.Resources.MemorySwappiness >= 0 && cgroup.Resources.MemorySwappiness <= 100 {
        if err := writeFile(path, "memory.swappiness", strconv.FormatInt(cgroup.Resources.MemorySwappiness, 10)); err != nil {
            return err
        }
    } else if cgroup.Resources.MemorySwappiness == -1 {
        // If MemorySwappiness is -1, nothing is written; testing shows the default is 60 (see the test below).
        return nil
    } else {
        return fmt.Errorf("invalid value:%d. valid memory swappiness range is 0-100", cgroup.Resources.MemorySwappiness)
    }

    return nil
}

Test:

➜  ~ docker run -it --rm -m 100M --memory-swappiness=-1 ubuntu-stress:latest /bin/bash
root@<container-id>:/#

Check the corresponding values in the container's cgroup on the host:

➜  ~ cd /sys/fs/cgroup/memory/docker/fbe9b0abf665b77fff985fd04f85402eae83eb7eb7162a30070b5920d50c5356
➜  fbe9b0abf665b77fff985fd04f85402eae83eb7eb7162a30070b5920d50c5356 cat memory.swappiness           # when swappiness is set to -1, this value defaults to 60
60
➜  fbe9b0abf665b77fff985fd04f85402eae83eb7eb7162a30070b5920d50c5356 cat memory.memsw.limit_in_bytes # twice the configured memory
209715200
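
The manual cat commands above can be wrapped in a helper that prints every memory-related setting found in a container's cgroup directory. A sketch under the same cgroup v1 layout; show_memory_cgroup is a hypothetical name, and files that do not exist (for example memsw when swap accounting is off) are skipped.

```shell
#!/usr/bin/env bash
# Print the memory-related cgroup v1 settings found in one cgroup directory.
# $1: directory such as /sys/fs/cgroup/memory/docker/<full-container-id>
show_memory_cgroup() {
    local dir="$1" f
    for f in memory.limit_in_bytes memory.memsw.limit_in_bytes \
             memory.soft_limit_in_bytes memory.swappiness; do
        if [ -r "$dir/$f" ]; then
            printf '%s=%s\n' "$f" "$(cat "$dir/$f")"
        fi
    done
}
```

The full container ID needed for the directory name can be obtained with docker inspect --format '{{.Id}}' followed by the container name or short ID.
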

Source: http://blog.opskumu.com/docker-memory-limit.html