Kubernetes – Calico Troubleshooting

By | January 10, 2023

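This post collects the errors I ran into while installing Calico and how to clear each one. In the commands below, k is assumed to be a shell alias for kubectl.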

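Disable the firewall

The firewall is the most common cause of errors, so disable it first.

Once the errors are resolved, re-enable it and open only the specific ports that are required.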
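
As a rough sketch of the "open only specific ports" step, the script below prints the firewall-cmd calls for a small port list. It assumes the host uses firewalld, and the port list (179/tcp for BGP, 4789/udp for VXLAN, 5473/tcp for Typha) is an assumption based on Calico's commonly documented requirements, not something taken from this cluster. The function name and the APPLY variable are made up for illustration. IP-in-IP mode additionally needs IP protocol 4, which is not a port and is not covered here.

```shell
#!/usr/bin/env bash
# Ports commonly needed by Calico (assumption: BGP, optional VXLAN and Typha).
CALICO_PORTS="179/tcp 4789/udp 5473/tcp"

# Print the firewall-cmd calls (dry run) or execute them when APPLY=1.
open_calico_ports() {
  local run="echo"                        # default: only print the commands
  local port
  [ "${APPLY:-0}" = "1" ] && run="sudo"   # APPLY=1 actually runs them
  for port in $CALICO_PORTS; do
    $run firewall-cmd --permanent --add-port="$port"
  done
  $run firewall-cmd --reload
}

open_calico_ports
```

Running it as-is only prints the commands; review them, then rerun with APPLY=1 on each node.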

Remove leftovers from a previous CNI

k describe po calico-node-lqstp -n kube-system
......
Events:
  Type     Reason     Age                     From     Message
  ----     ------     ----                    ----     -------
  Warning  Unhealthy  2m54s (x7432 over 18h)  kubelet  (combined from similar events): Readiness probe failed: 2023-01-10 03:26:10.015 [INFO][340172] confd/health.go 180: Number of node(s) with BGP peering established = 0
calico/node is not ready: BIRD is not ready: BGP not established with 10.44.0.0,172.16.0.202,172.16.0.203,172.16.0.204

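On this node, weave had been installed and removed before calico was installed.
The peer list in the readiness error includes 10.44.0.0, which is the address of a weave interface that still exists on the host, so remnants of weave are clearly left behind.
(flannel leaves similar remnants.)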
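
To make the suspicious peer easier to spot, the peer list can be pulled out of the readiness message. This is a minimal sketch: bgp_peers is a hypothetical helper name, and the sample message is the one from the events above.

```shell
#!/usr/bin/env bash
# Split the peer list out of a "BGP not established with ..." message (stdin),
# printing one address per line.
bgp_peers() {
  sed -n 's/.*BGP not established with \([0-9.,]*\).*/\1/p' | tr ',' '\n'
}

# Message copied from the readiness probe failure shown in this post.
msg='calico/node is not ready: BIRD is not ready: BGP not established with 10.44.0.0,172.16.0.202,172.16.0.203,172.16.0.204'
printf '%s\n' "$msg" | bgp_peers
```

Any address that does not belong to the node subnet (here 172.16.0.x) is worth checking against the local interfaces.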

ifconfig
......
weave: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1376
        inet 10.44.0.0  netmask 255.240.0.0  broadcast 10.47.255.255
        inet6 fe80::e03c:19ff:fe0c:834f  prefixlen 64  scopeid 0x20<link>
        ether e2:3c:19:0c:83:4f  txqueuelen 1000  (Ethernet)
        RX packets 633  bytes 40806 (39.8 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 967058  bytes 40619287 (38.7 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

ip link
......
23: weave: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1376 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether e2:3c:19:0c:83:4f brd ff:ff:ff:ff:ff:ff

Remove the leftover interface.

sudo ifconfig weave down
sudo ip link delete weave
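
When several nodes need the same cleanup, a small filter over `ip -o link` output can list candidate interfaces first. This is only a sketch: leftover_ifaces is a hypothetical helper, and the name prefixes weave and flannel are assumptions about how those CNIs name their interfaces. It prints the delete commands instead of running them.

```shell
#!/usr/bin/env bash
# List interface names that look like weave/flannel leftovers.
# Input: `ip -o link` style lines on stdin, e.g. "23: weave: <...> mtu 1376 ...".
leftover_ifaces() {
  awk -F': ' '{ print $2 }' | grep -E '^(weave|flannel)' || true
}

# Sample line taken from the `ip link` output shown in this post.
sample='23: weave: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1376 qdisc noqueue state UP mode DEFAULT group default qlen 1000'

# Dry run: print the cleanup command for each match instead of executing it.
printf '%s\n' "$sample" | leftover_ifaces | while read -r name; do
  echo "sudo ip link delete $name"
done
```

On a real node, replace the sample with `ip -o link | leftover_ifaces` and review the list before deleting anything.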

Delete the Pod that is in the error state, then inspect the Pod that replaces it.

k delete po calico-node-lqstp -n kube-system
k get po -n kube-system -o wide
k describe po calico-node-426rn -n kube-system
......
Events:
  Type     Reason     Age                From               Message
  ----     ------     ----               ----               -------
  Normal   Scheduled  49s                default-scheduler  Successfully assigned kube-system/calico-node-426rn to es-search05
  Normal   Pulled     49s                kubelet            Container image "docker.io/calico/cni:v3.24.5" already present on machine
  Normal   Created    49s                kubelet            Created container upgrade-ipam
  Normal   Started    48s                kubelet            Started container upgrade-ipam
  Normal   Pulled     48s                kubelet            Container image "docker.io/calico/cni:v3.24.5" already present on machine
  Normal   Created    48s                kubelet            Created container install-cni
  Normal   Started    47s                kubelet            Started container install-cni
  Normal   Pulled     47s                kubelet            Container image "docker.io/calico/node:v3.24.5" already present on machine
  Normal   Created    46s                kubelet            Created container mount-bpffs
  Normal   Started    46s                kubelet            Started container mount-bpffs
  Normal   Pulled     46s                kubelet            Container image "docker.io/calico/node:v3.24.5" already present on machine
  Normal   Created    46s                kubelet            Created container calico-node
  Normal   Started    45s                kubelet            Started container calico-node
  Warning  Unhealthy  43s (x2 over 44s)  kubelet            Readiness probe failed: calico/node is not ready: BIRD is not ready: Error querying BIRD: unable to connect to BIRDv4 socket: dial unix /var/run/calico/bird.ctl: connect: connection refused

connect: connection refused

k get po -n kube-system -o wide
k describe po calico-node-426rn -n kube-system
......
Events:
  Type     Reason     Age                From               Message
  ----     ------     ----               ----               -------
......
calico-node
  Warning  Unhealthy  43s (x2 over 44s)  kubelet            Readiness probe failed: calico/node is not ready: BIRD is not ready: Error querying BIRD: unable to connect to BIRDv4 socket: dial unix /var/run/calico/bird.ctl: connect: connection refused

The error above occurs when the server has several network interfaces and Calico has picked the wrong one.

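kubectl set env daemonset/calico-node -n kube-system 'IP_AUTODETECTION_METHOD=interface=bond.*'

The command above tells Calico to use the externally reachable interface (here, any interface whose name matches bond.*), and the error goes away. The interface value is treated as a regular expression, and quoting it keeps the shell from trying to expand the pattern.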

If you do not know which interface is the externally reachable one,
that usually means your company has a dedicated network administrator,
so ask that person.
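
As a cross-check before asking, one common heuristic is to see which interface the kernel would use to reach an outside address. The sketch below parses `ip route get` output; route_iface is a hypothetical helper, and the sample line is made up for illustration rather than captured from this cluster. The result is only a hint, since the default-route interface is not always the one the cluster nodes use to talk to each other.

```shell
#!/usr/bin/env bash
# Print the interface name that follows "dev" in `ip route get` output (stdin).
route_iface() {
  awk '{ for (i = 1; i < NF; i++) if ($i == "dev") { print $(i + 1); exit } }'
}

# Hypothetical sample line; on a real node use: ip route get 8.8.8.8 | route_iface
sample='8.8.8.8 via 172.16.0.1 dev bond0 src 172.16.0.201 uid 1000'
printf '%s\n' "$sample" | route_iface
```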

After the environment variable changes, the calico-node Pods are recreated automatically.

k logs calico-node-kv8z8 -n kube-system
......
bird: Mesh_172_16_0_205: Connected to table master
bird: Mesh_172_16_0_205: State changed to wait
bird: Mesh_172_16_0_201: Connected to table master
bird: Mesh_172_16_0_201: State changed to wait
bird: Mesh_172_16_0_203: Connected to table master
bird: Mesh_172_16_0_203: State changed to wait
bird: Mesh_172_16_0_202: Connected to table master
bird: Mesh_172_16_0_202: State changed to wait
bird: Graceful restart done
bird: Mesh_172_16_0_201: State changed to feed
bird: Mesh_172_16_0_202: State changed to feed
bird: Mesh_172_16_0_203: State changed to feed
bird: Mesh_172_16_0_205: State changed to feed
bird: Mesh_172_16_0_201: State changed to up
bird: Mesh_172_16_0_202: State changed to up
bird: Mesh_172_16_0_203: State changed to up
bird: Mesh_172_16_0_205: State changed to up
2023-01-10 05:42:50.004 [INFO][100] felix/health.go 242: Overall health status changed newStatus=&health.HealthReport{Live:true, Ready:true

To confirm the BGP peering from the node itself, install calicoctl and check the node status.

curl -L https://github.com/projectcalico/calico/releases/download/v3.24.5/calicoctl-linux-amd64 -o calicoctl
chmod 700 calicoctl
sudo mv calicoctl /usr/bin/
sudo calicoctl node status
Calico process is running.

IPv4 BGP status
+--------------+-------------------+-------+----------+-------------+
| PEER ADDRESS |     PEER TYPE     | STATE |  SINCE   |    INFO     |
+--------------+-------------------+-------+----------+-------------+
| 172.16.0.202 | node-to-node mesh | up    | 05:42:26 | Established |
| 172.16.0.203 | node-to-node mesh | up    | 05:42:25 | Established |
| 172.16.0.204 | node-to-node mesh | up    | 05:42:47 | Established |
| 172.16.0.205 | node-to-node mesh | up    | 05:42:38 | Established |
+--------------+-------------------+-------+----------+-------------+

IPv6 BGP status
No IPv6 peers found.
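
For repeated checks, the status table can be summarized by a short filter. This is a sketch under the assumption that the table keeps the column layout shown above; all_peers_established is a hypothetical helper, and the sample rows are copied from this post's output.

```shell
#!/usr/bin/env bash
# Summarize the peer table from `calicoctl node status` (stdin).
# Succeeds only if at least one peer is listed and every peer is Established.
all_peers_established() {
  awk -F'|' '
    $2 ~ /[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+/ { total++; if ($6 ~ /Established/) ok++ }
    END {
      printf "%d/%d peers established\n", ok, total
      exit !(total > 0 && ok == total)
    }
  '
}

# Two rows copied from the table in this post.
status='| 172.16.0.202 | node-to-node mesh | up    | 05:42:26 | Established |
| 172.16.0.203 | node-to-node mesh | up    | 05:42:25 | Established |'

if printf '%s\n' "$status" | all_peers_established; then
  echo "BGP mesh looks healthy"
fi
```

On a node, the same check would be `sudo calicoctl node status | all_peers_established`.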
