
docker run and docker exec fail with "context deadline exceeded"

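Symptoms
Both creating a container with docker run -d centos:v1 /bin/bash and entering one with docker exec -it container_name bash fail with the error "/usr/bin/docker-current: Error response from daemon: shim error: context deadline exceeded." Commands such as docker ps, docker stats, and docker info still work.
Environment
Host operating system: CentOS Linux release 7.3.1611 (Core)
Kernel version: 3.10.0-693.el7.x86_64. This kernel already fixes the bug that limited a single host to at most 100 containers (beyond that, an XFS filesystem bug was triggered and the machine rebooted automatically).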
Docker version:
Client:
Version: 1.12.6
API version: 1.24
Package version: docker-common-1.12.6-16.el7.centos.x86_64
Go version: go1.7.4
Git commit: 3a094bd/1.12.6
Built: Fri Apr 14 13:46:13 2017
OS/Arch: linux/amd64

Server:
Version: 1.12.6
API version: 1.24
Package version: docker-common-1.12.6-16.el7.centos.x86_64

Go version: go1.7.4
Git commit: 3a094bd/1.12.6
Built: Fri Apr 14 13:46:13 2017
OS/Arch: linux/amd64
Docker info:
Containers: 68
Running: 39
Paused: 0
Stopped: 0
Images: 38
Server Version: 1.12.6
Storage Driver: devicemapper
Pool Name: docker-253:0-3222085682-pool
Pool Blocksize: 65.54 kB
Base Device Size: 10.74 GB
Backing Filesystem: xfs
Data file: /dev/loop0
Metadata file: /dev/loop1
Data Space Used: 23.86 GB
Data Space Total: 107.4 GB
Data Space Available: 83.51 GB
Metadata Space Used: 48.09 MB
Metadata Space Total: 2.147 GB
Metadata Space Available: 2.099 GB
Thin Pool Minimum Free Space: 10.74 GB
Udev Sync Supported: true
Deferred Removal Enabled: false
Deferred Deletion Enabled: false
Deferred Deleted Device Count: 0
Data loop file: /var/lib/docker/devicemapper/devicemapper/data
WARNING: Usage of loopback devices is strongly discouraged for production use. Use --storage-opt dm.thinpooldev to specify a custom block storage device.
Metadata loop file: /var/lib/docker/devicemapper/devicemapper/metadata
Library Version: 1.02.135-RHEL7 (2016-09-28)
Logging Driver: json-file
Cgroup Driver: systemd
Plugins:
Volume: local
Network: bridge host null overlay
Swarm: inactive
Runtimes: docker-runc runc
Default Runtime: docker-runc
Security Options: seccomp
Kernel Version: 3.10.0-693.el7.x86_64
Operating System: CentOS Linux 7 (Core)
OSType: linux
Architecture: x86_64
Number of Docker Hooks: 2
CPUs: 48
Total Memory: 251.1 GiB
Name: t-docker-02-12
ID: 4OTZ:QXM3:XSQW:ZPQK:2XEF:W25W:R5DN:QL6X:RMXV:63WP:BHAB:NGPK
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Insecure Registries:
registry.sfbest.com
127.0.0.0/8
Registries: docker.io (secure)
Problem analysis
1.1 Log contents
The Docker daemon log contains a large number of errors such as the following:
Jan 9 11:00:32 t-docker-02-12 dockerd-current: time="2018-01-09T11:00:32.482494003+08:00" level=error msg="Error running exec in container: rpc error: code = 2 desc = shim error: context deadline exceeded"
Jan 9 11:00:47 t-docker-02-12 dockerd-current: time="2018-01-09T11:00:47.540579791+08:00" level=error msg="Error running exec in container: rpc error: code = 2 desc = shim error: context deadline exceeded"
Jan 9 11:01:02 t-docker-02-12 dockerd-current: time="2018-01-09T11:01:02.581747742+08:00" level=error msg="Error running exec in container: rpc error: code = 2 desc = shim error: context deadline exceeded"
Jan 9 11:01:17 t-docker-02-12 dockerd-current: time="2018-01-09T11:01:17.614305903+08:00" level=error msg="Error running exec in container: rpc error: code = 2 desc = shim error: context deadline exceeded"
Jan 9 11:01:32 t-docker-02-12 dockerd-current: time="2018-01-09T11:01:32.658808780+08:00" level=error msg="Error running exec in container: rpc error: code = 2 desc = shim error: context deadline exceeded"
Jan 9 11:01:47 t-docker-02-12 dockerd-current: time="2018-01-09T11:01:47.702526455+08:00" level=error msg="Error running exec in container: rpc error: code = 2 desc = shim error: context deadline exceeded"
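The entries above recur roughly every 15 seconds, and a per-minute tally makes that cadence easier to see. The following is only a rough sketch: the function name count_shim_errors_per_minute is made up for illustration, and it assumes log lines in the dockerd-current format shown above (with a time="..." field) arrive on standard input. The log file path in the usage comment is also an assumption.

```shell
#!/bin/sh
# Count "shim error: context deadline exceeded" entries per minute.
# Reads dockerd-current log lines on stdin and prints "<minute> <count>".
count_shim_errors_per_minute() {
  grep 'shim error: context deadline exceeded' |
    # Keep the first 16 characters of the time="..." value, e.g. 2018-01-09T11:00
    sed -n 's/.*time="\([0-9T:-]\{16\}\).*/\1/p' |
    sort | uniq -c |
    # uniq -c prints "<count> <minute>"; swap the columns
    awk '{ print $2, $1 }'
}

# Usage (assuming the daemon logs to syslog at /var/log/messages):
#   count_shim_errors_per_minute < /var/log/messages
```

Fed the six entries shown above, this prints 2 for minute 2018-01-09T11:00 and 4 for minute 2018-01-09T11:01.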
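1.2 Google search
Searching Google for "shim error: context deadline exceeded" shows that other people have hit related problems, but turned up neither a root cause nor a fix. Some reports attribute it to a bug in Docker 1.12, but the bug described there does not appear to be related to the problem at hand: https://bugzilla.redhat.com/show_bug.cgi?format=multiple&id=1443103
1.3 Attempted fixes
1.3.1 docker exec processes
The first suspicion was that many "docker exec -it containerid bash" sessions had not been exited properly, leaving too many "docker exec" processes behind and interfering with docker run and docker exec. All "docker exec" processes were killed, but the problem was not resolved.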
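Before killing anything, the lingering client processes can be listed to see how many there actually are. A minimal sketch, assuming the client shows up in the process table as docker or docker-current with exec as its first argument; filter_docker_exec is a hypothetical helper name, not part of Docker.

```shell
#!/bin/sh
# Read "PID COMMAND ARGS..." lines on stdin and print the PIDs of
# docker client processes whose first argument is "exec".
filter_docker_exec() {
  awk '{
    cmd = $2
    sub(/.*\//, "", cmd)   # strip the directory part, e.g. /usr/bin/
    if ((cmd == "docker" || cmd == "docker-current") && $3 == "exec")
      print $1
  }'
}

# Usage: list the PIDs, review them, then optionally kill them.
#   ps -eo pid=,args= | filter_docker_exec
#   ps -eo pid=,args= | filter_docker_exec | xargs -r kill   # -r is a GNU xargs option
```

Matching on the command name and first argument, rather than on the plain string "docker exec", keeps the daemon process (dockerd-current) out of the list.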
1.3.2 Anomaly in docker info
Docker info:
Containers: 68
Running: 39
Paused: 0
Stopped: 0
Images: 38
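There are 68 containers in total, but only 39 are running; the rest are in the Exited state.
After those Exited containers were removed, docker run and docker exec worked again and the problem was resolved.
The current suspicion is that an excessive number of Exited containers caused the problem.
This is a test host, so it is inevitable that people experiment with containers that may never start at all; a container that fails to start ends up in the Exited state.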
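The gap between total and running containers can be seen at a glance by grouping containers by the first word of their status. A small sketch: summarize_states is a hypothetical helper, and it assumes status strings in the form printed by docker ps, such as "Up 3 hours" or "Exited (1) 2 days ago".

```shell
#!/bin/sh
# Read one container status string per line on stdin and print
# "<state> <count>" for each distinct first word, sorted by state.
summarize_states() {
  awk '{ count[$1]++ } END { for (s in count) print s, count[s] }' | sort
}

# Usage:
#   docker ps -a --format '{{.Status}}' | summarize_states
# To list only the IDs of exited containers:
#   docker ps -a -q -f status=exited
```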
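Preventive measures
Periodically run docker rm $(docker ps -a | grep Exited | awk '{print $1}') to clean up leftover containers.
Add the Docker and system logs to ELK and monitor their contents; if more than 10 entries containing "error" appear within one minute, send an email alert.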
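The periodic cleanup could be wrapped in a small script and run from cron. This is a sketch rather than a tested production job: remove_exited_containers is a hypothetical name, the crontab path is made up, and the script uses the status=exited filter of docker ps in place of grep, which avoids matching the word Exited in other columns such as the command or name.

```shell
#!/bin/sh
# Remove all containers that are in the "exited" state.
remove_exited_containers() {
  # -q prints only container IDs; -f status=exited keeps exited containers
  ids=$(docker ps -a -q -f status=exited) || return 1
  if [ -z "$ids" ]; then
    echo "no exited containers to remove"
    return 0
  fi
  # $ids is intentionally unquoted so each ID becomes a separate argument
  docker rm $ids
}

# When used as a standalone script, call the function on the last line:
#   remove_exited_containers
# Example crontab entry (hypothetical path), running daily at 03:00:
#   0 3 * * * /usr/local/bin/remove-exited-containers.sh
```

Note that this removes every exited container, including ones kept around on purpose for inspection, so on a shared test host it may be worth reviewing the ID list first.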
