Kubernetes – Calico Troubleshooting
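This post summarizes how to resolve the errors I ran into while installing Calico.
Disable the firewall
The firewall is the most common cause of errors, so disable it first.
Once the errors are resolved, go back and open only the specific ports that are needed.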
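When the firewall is turned back on, the ports to open depend on how Calico is configured. The sketch below assumes firewalld is the firewall in use; the port list (TCP 179 for BGP, UDP 4789 for VXLAN, TCP 5473 for Typha) is my reading of Calico's network requirements, and the function name `calico_firewall_cmds` is hypothetical. It only prints the commands so they can be reviewed before running them.

```shell
#!/bin/sh
# Print (do not run) firewalld commands that open ports Calico commonly needs.
# Assumption: firewalld is in use. Trim the list to match your Calico setup.
# Note: IP-in-IP encapsulation uses IP protocol 4, not a port, and is not covered here.
calico_firewall_cmds() {
    # 179/tcp: BGP, 4789/udp: VXLAN (if used), 5473/tcp: Typha (if used)
    for port in 179/tcp 4789/udp 5473/tcp; do
        echo "sudo firewall-cmd --permanent --add-port=${port}"
    done
    echo "sudo firewall-cmd --reload"
}

calico_firewall_cmds
```

Printing first keeps the step reversible: nothing changes on the node until the reviewed lines are pasted into a shell.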
Remove leftovers from a previous CNI
k describe po calico-node-lqstp -n kube-system
......
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning Unhealthy 2m54s (x7432 over 18h) kubelet (combined from similar events): Readiness probe failed: 2023-01-10 03:26:10.015 [INFO][340172] confd/health.go 180: Number of node(s) with BGP peering established = 0
calico/node is not ready: BIRD is not ready: BGP not established with 10.44.0.0,172.16.0.202,172.16.0.203,172.16.0.204
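On this node, Weave had been installed and then removed before Calico was installed.
The peer list above includes 10.44.0.0, and the output below shows that this address belongs to a leftover weave interface.
(flannel leaves similar remnants behind.)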
ifconfig
......
weave: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1376
inet 10.44.0.0 netmask 255.240.0.0 broadcast 10.47.255.255
inet6 fe80::e03c:19ff:fe0c:834f prefixlen 64 scopeid 0x20<link>
ether e2:3c:19:0c:83:4f txqueuelen 1000 (Ethernet)
RX packets 633 bytes 40806 (39.8 KiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 967058 bytes 40619287 (38.7 MiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
ip link
......
23: weave: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1376 qdisc noqueue state UP mode DEFAULT group default qlen 1000
link/ether e2:3c:19:0c:83:4f brd ff:ff:ff:ff:ff:ff
Remove the leftover interface.
sudo ifconfig weave down
sudo ip link delete weave
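To see whether other interfaces were left behind as well, the output of `ip -o link show` can be filtered by name. This is a minimal sketch: the function name `find_cni_leftovers` is hypothetical, and the name patterns are an assumption about what Weave Net and flannel typically create, not an exhaustive list.

```shell
#!/bin/sh
# List interface names that look like leftovers from Weave Net or flannel.
# Reads `ip -o link show` output on stdin (one interface per line).
# Assumption: the patterns below are typical names, not a complete list.
find_cni_leftovers() {
    # Field 2 is the interface name; drop any "@peer" suffix on veth pairs.
    awk -F': ' '{ split($2, n, "@"); print n[1] }' |
        grep -E '^(weave|datapath|vxlan-6784|vethwe-.*|flannel\..*|cni0)$'
}

# Usage on a node:
#   ip -o link show | find_cni_leftovers
```

Each name it prints can then be removed with `sudo ip link delete <name>`, after confirming that nothing on the node still uses it.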
Delete the Pod that is in the error state.
k delete po calico-node-lqstp -n kube-system
k get po -n kube-system -o wide
k describe po calico-node-426rn -n kube-system
......
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 49s default-scheduler Successfully assigned kube-system/calico-node-426rn to es-search05
Normal Pulled 49s kubelet Container image "docker.io/calico/cni:v3.24.5" already present on machine
Normal Created 49s kubelet Created container upgrade-ipam
Normal Started 48s kubelet Started container upgrade-ipam
Normal Pulled 48s kubelet Container image "docker.io/calico/cni:v3.24.5" already present on machine
Normal Created 48s kubelet Created container install-cni
Normal Started 47s kubelet Started container install-cni
Normal Pulled 47s kubelet Container image "docker.io/calico/node:v3.24.5" already present on machine
Normal Created 46s kubelet Created container mount-bpffs
Normal Started 46s kubelet Started container mount-bpffs
Normal Pulled 46s kubelet Container image "docker.io/calico/node:v3.24.5" already present on machine
Normal Created 46s kubelet Created container calico-node
Normal Started 45s kubelet Started container calico-node
Warning Unhealthy 43s (x2 over 44s) kubelet Readiness probe failed: calico/node is not ready: BIRD is not ready: Error querying BIRD: unable to connect to BIRDv4 socket: dial unix /var/run/calico/bird.ctl: connect: connection refused
connect: connection refused
k get po -n kube-system -o wide
k describe po calico-node-426rn -n kube-system
......
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
......
calico-node
Warning Unhealthy 43s (x2 over 44s) kubelet Readiness probe failed: calico/node is not ready: BIRD is not ready: Error querying BIRD: unable to connect to BIRDv4 socket: dial unix /var/run/calico/bird.ctl: connect: connection refused
The error above occurs when the server has several network interfaces and Calico has selected the wrong one.
kubectl set env daemonset/calico-node -n kube-system IP_AUTODETECTION_METHOD=interface=bond*
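Pointing Calico at the externally reachable interface with the command above clears the error.
If you do not know which interface that is,
your company presumably has a dedicated network administrator,
so ask that person.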
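One way to check this yourself is to ask the kernel which interface it uses to reach another node. The sketch below parses the output of `ip route get`; the function name `route_dev` is hypothetical, and 172.16.0.202 is taken from the peer list earlier in this post as an example destination.

```shell
#!/bin/sh
# Extract the interface name from `ip route get <address>` output on stdin.
# The output contains "... dev <interface> ..."; print the word after "dev".
route_dev() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "dev") { print $(i + 1); exit } }'
}

# Usage on a node (the address is another node from the peer list):
#   ip route get 172.16.0.202 | route_dev
```

The printed name is the candidate for `interface=` in `IP_AUTODETECTION_METHOD`. Calico also documents a `can-reach=<address>` autodetection method, which may be a simpler alternative when interface names differ between nodes.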
After the environment variable changes, the calico-node Pods are recreated automatically.
k logs calico-node-kv8z8 -n kube-system
......
bird: Mesh_172_16_0_205: Connected to table master
bird: Mesh_172_16_0_205: State changed to wait
bird: Mesh_172_16_0_201: Connected to table master
bird: Mesh_172_16_0_201: State changed to wait
bird: Mesh_172_16_0_203: Connected to table master
bird: Mesh_172_16_0_203: State changed to wait
bird: Mesh_172_16_0_202: Connected to table master
bird: Mesh_172_16_0_202: State changed to wait
bird: Graceful restart done
bird: Mesh_172_16_0_201: State changed to feed
bird: Mesh_172_16_0_202: State changed to feed
bird: Mesh_172_16_0_203: State changed to feed
bird: Mesh_172_16_0_205: State changed to feed
bird: Mesh_172_16_0_201: State changed to up
bird: Mesh_172_16_0_202: State changed to up
bird: Mesh_172_16_0_203: State changed to up
bird: Mesh_172_16_0_205: State changed to up
2023-01-10 05:42:50.004 [INFO][100] felix/health.go 242: Overall health status changed newStatus=&health.HealthReport{Live:true, Ready:true
curl -L https://github.com/projectcalico/calico/releases/download/v3.24.5/calicoctl-linux-amd64 -o calicoctl
chmod 700 calicoctl
sudo mv calicoctl /usr/bin/
sudo calicoctl node status
Calico process is running.
IPv4 BGP status
+--------------+-------------------+-------+----------+-------------+
| PEER ADDRESS | PEER TYPE | STATE | SINCE | INFO |
+--------------+-------------------+-------+----------+-------------+
| 172.16.0.202 | node-to-node mesh | up | 05:42:26 | Established |
| 172.16.0.203 | node-to-node mesh | up | 05:42:25 | Established |
| 172.16.0.204 | node-to-node mesh | up | 05:42:47 | Established |
| 172.16.0.205 | node-to-node mesh | up | 05:42:38 | Established |
+--------------+-------------------+-------+----------+-------------+
IPv6 BGP status
No IPv6 peers found.
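For a scripted check, the table above can be scanned for peers whose INFO column is not Established. This is a minimal sketch that assumes the column layout shown above; the function name `unestablished_peers` is hypothetical.

```shell
#!/bin/sh
# Read `calicoctl node status` output on stdin and print the address of any
# peer whose INFO column is not "Established". Exit status is 1 if any is found.
# Assumption: the table has the five columns shown in this post.
unestablished_peers() {
    awk -F'|' '
        # Data rows start with "|"; skip the header row and the "+---" borders.
        substr($0, 1, 1) == "|" && $2 !~ /PEER ADDRESS/ {
            gsub(/ /, "", $2)            # peer address
            gsub(/^ +| +$/, "", $6)      # INFO column
            if ($6 != "Established") { print $2; bad = 1 }
        }
        END { exit bad + 0 }
    '
}

# Usage on a node:
#   sudo calicoctl node status | unestablished_peers
```

No output and exit status 0 mean every listed peer is Established.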