PancrasL's Blog

Kubernetes: "can't join IPC of container ... non-shareable IPC" error when running workloads

2019-11-17

docker

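Background

While working with k8s recently, I ran into the error can't join IPC of container...non-shareable IPC. This post records how I tracked it down and fixed it.

The cause is that Docker 19.03 changed the default IpcMode to private. The kubelet starts each application container inside the IPC namespace of the pod's infrastructure container, and a private IPC namespace cannot be joined, so the Docker daemon configuration has to set the default back to shareable.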

Versions

  • OS version
    root@k8s-mst:/# uname -a
    Linux k8s-mst 4.15.0-70-generic #79-Ubuntu SMP Tue Nov 12 10:36:11 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
  • Kubernetes version
    root@k8s-mst:/# kubectl version
    Client Version: version.Info{Major:"1", Minor:"5+", GitVersion:"v1.5.9-beta.0-dirty", GitCommit:"f35802d3a00b37a32476451266af05ce9760fec0", GitTreeState:"dirty", BuildDate:"2019-11-13T06:51:04Z", GoVersion:"go1.7.4", Compiler:"gc", Platform:"linux/amd64"}
  • Docker version
    root@k8s-nod1:/# docker version
    Client: Docker Engine - Community
     Version:           19.03.4
     API version:       1.40
     Go version:        go1.12.10
     Git commit:        9013bf583a
     Built:             Fri Oct 18 15:54:09 2019
     OS/Arch:           linux/amd64
     Experimental:      false

    Server: Docker Engine - Community
     Engine:
      Version:          19.03.4
      API version:      1.40 (minimum version 1.12)
      Go version:       go1.12.10
      Git commit:       9013bf583a
      Built:            Fri Oct 18 15:52:40 2019
      OS/Arch:          linux/amd64
      Experimental:     false
     containerd:
      Version:          1.2.10
      GitCommit:        b34a5c8af56e510852c35414db4c1f4fa6172339
     runc:
      Version:          1.0.0-rc8+dev
      GitCommit:        3e425f80a8c931f88e6d94a8c831b9d5aa481657
     docker-init:
      Version:          0.18.0
      GitCommit:        fec3683

Troubleshooting

  • Checking the pod status with kubectl get pod shows that the pods are failing to start
root@k8s-mst:~/web_sample# kubectl apply -f nginx-deployment.yaml
deployment "nginx-deployment" created
root@k8s-mst:~/web_sample# kubectl get pod
NAME READY STATUS RESTARTS AGE
nginx-deployment-4087004473-4m0br 0/1 RunContainerError 0 7s
nginx-deployment-4087004473-80rqd 0/1 RunContainerError 0 7s
nginx-deployment-4087004473-qfdmw 0/1 RunContainerError 0 7s

  • Use kubectl describe pod to see the detailed error
Events:
FirstSeen LastSeen Count From SubObjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
34s 34s 1 {default-scheduler } Normal Scheduled Successfully assigned nginx-deployment-4087004473-qfdmw to k8s-nod2
31s 31s 1 {kubelet k8s-nod2} spec.containers{nginx} Normal Created Created container with docker id 7fc4f63e0bd6; Security:[seccomp=unconfined]
30s 30s 1 {kubelet k8s-nod2} spec.containers{nginx} Warning Failed Failed to start container with docker id 7fc4f63e0bd6 with error: Error response from daemon: {"message":"can't join IPC of container 2dfd13510b3f816ed0437ea70a8a57daeaa7a38eec0035972044b25dd102f1cd: non-shareable IPC (hint: use IpcMode:shareable for the donor container)"}
28s 28s 1 {kubelet k8s-nod2} spec.containers{nginx} Normal Created Created container with docker id 3c61a9aaa6a4; Security:[seccomp=unconfined]
28s 28s 1 {kubelet k8s-nod2} spec.containers{nginx} Warning Failed Failed to start container with docker id 3c61a9aaa6a4 with error: Error response from daemon: {"message":"can't join IPC of container 2dfd13510b3f816ed0437ea70a8a57daeaa7a38eec0035972044b25dd102f1cd: non-shareable IPC (hint: use IpcMode:shareable for the donor container)"}
18s 18s 1 {kubelet k8s-nod2} spec.containers{nginx} Normal Created Created container with docker id 42970a39a9b4; Security:[seccomp=unconfined]
17s 17s 1 {kubelet k8s-nod2} spec.containers{nginx} Warning Failed Failed to start container with docker id 42970a39a9b4 with error: Error response from daemon: {"message":"can't join IPC of container 2dfd13510b3f816ed0437ea70a8a57daeaa7a38eec0035972044b25dd102f1cd: non-shareable IPC (hint: use IpcMode:shareable for the donor container)"}
32s 3s 4 {kubelet k8s-nod2} spec.containers{nginx} Normal Pulled Container image "nginx:1.7.9" already present on machine
30s 2s 4 {kubelet k8s-nod2} Warning FailedSync Error syncing pod, skipping: failed to "StartContainer" for "nginx" with RunContainerError: "runContainer: Error response from daemon: {\"message\":\"can't join IPC of container 2dfd13510b3f816ed0437ea70a8a57daeaa7a38eec0035972044b25dd102f1cd: non-shareable IPC (hint: use IpcMode:shareable for the donor container)\"}"

2s 2s 1 {kubelet k8s-nod2} spec.containers{nginx} Normal Created Created container with docker id 20ca9e358257; Security:[seccomp=unconfined]
2s 2s 1 {kubelet k8s-nod2} spec.containers{nginx} Warning Failed Failed to start container with docker id 20ca9e358257 with error: Error response from daemon: {"message":"can't join IPC of container 2dfd13510b3f816ed0437ea70a8a57daeaa7a38eec0035972044b25dd102f1cd: non-shareable IPC (hint: use IpcMode:shareable for the donor container)"}

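  • According to the events, the failure is caused by the IPC mode, and the hint recommends the shareable mode for the donor container, so the next step is to change IpcMode to shareable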
"message":"can't join IPC of container 2dfd13510b3f816ed0437ea70a8a57daeaa7a38eec0035972044b25dd102f1cd: non-shareable IPC (hint: use IpcMode:shareable for the donor container)"

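To see which container is refusing to share its IPC namespace, the donor ID can be pulled out of the error text and inspected on the affected node. The sketch below is only an illustration: donor_id_from_error is a hypothetical helper name, and it assumes the message has the same shape as the one quoted above. In this setup the donor is expected to be the pod's infrastructure (pause) container.

```shell
# Hypothetical helper: extract the donor container ID from the daemon error text.
donor_id_from_error() {
  # Keep only the hex ID that follows "can't join IPC of container".
  printf '%s\n' "$1" |
    sed -n "s/.*can't join IPC of container \([0-9a-f][0-9a-f]*\).*/\1/p"
}

# Usage on the node that ran the pod (needs Docker, so shown as comments):
#   id=$(donor_id_from_error "$msg")
#   docker inspect --format '{{.Name}} {{.HostConfig.IpcMode}}' "$id"
```

If the inspect output reports private for that container, the hint in the error message applies as written.
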
Solution

  • Searching the official Docker documentation for docker ipc leads to the docker run reference, which describes how to set the IpcMode of a container
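  • Searching the Docker docs for docker default ipc, to find out how to change the default IPC mode, turns up the Docker 19.03 release notes. They state that the default IPC mode was changed to private, so the default needs to be set back to shareable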

  • Following the link to the GitHub PR, the description explains how to restore the old behavior
Old (bad, but backward-compatible) behavior (i.e. "shareable" containers by default) can be enabled by either using --default-ipc-mode shareable daemon command line option, or by adding a "default-ipc-mode": "shareable" line in docker.json configuration file.
  • Following the hint, edit /etc/docker/daemon.json (create it if it does not exist) and add the following
{
"default-ipc-mode": "shareable"
}
  • Restart Docker
systemctl restart docker
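
The daemon.json step above can be scripted when several nodes need the same change. This is a minimal sketch, not a definitive procedure: write_default_ipc_mode is a hypothetical helper, and it deliberately refuses to touch a file that already has content, on the assumption that such a file holds other settings that should be merged by hand.

```shell
# Hypothetical helper: create daemon.json only when it is missing or empty.
write_default_ipc_mode() {
  conf="${1:-/etc/docker/daemon.json}"
  if [ -s "$conf" ]; then
    # The file already has content; add the key manually to keep other settings.
    echo "refusing to overwrite $conf" >&2
    return 1
  fi
  mkdir -p "$(dirname "$conf")"
  printf '{\n  "default-ipc-mode": "shareable"\n}\n' > "$conf"
}

# Usage on each node, as root (shown as comments):
#   write_default_ipc_mode && systemctl restart docker
```

Chaining the restart with && means Docker is only restarted when the file was actually written.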

Redeploying

  • Redeploy the deployment; the pods now run normally.
root@k8s-mst:~/web_sample# kubectl apply -f nginx-deployment.yaml
deployment "nginx-deployment" created
root@k8s-mst:~/web_sample# kubectl get pods
NAME READY STATUS RESTARTS AGE
nginx-deployment-4087004473-9wjbr 1/1 Running 0 29s
nginx-deployment-4087004473-g9qpr 1/1 Running 0 29s
nginx-deployment-4087004473-z6jkm 1/1 Running 0 29s
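
Besides the pod status, the new default can be confirmed on a node by inspecting a freshly created container. The sketch below assumes the daemon records the effective mode in HostConfig.IpcMode; container_ipc_mode and is_shareable are hypothetical helper names, and the DOCKER variable exists only so the functions can be dry-run against a stand-in command.

```shell
# Print the IPC mode recorded for a container; DOCKER may be overridden for dry runs.
container_ipc_mode() {
  "${DOCKER:-docker}" inspect --format '{{.HostConfig.IpcMode}}' "$1"
}

# Succeed only when the container reports the shareable IPC mode.
is_shareable() {
  [ "$(container_ipc_mode "$1")" = "shareable" ]
}

# Usage on a node after the restart (needs Docker, so shown as comments):
#   cid=$(docker run -d nginx:1.7.9)
#   is_shareable "$cid" && echo "new containers default to shareable"
```

Containers created before the configuration change are likely to keep the mode they were created with, which is why the pods are redeployed rather than restarted in place.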
