
Was the raft log corrupted, truncated, or lost?

Posted by gabor.auth, 2 August 2025

So: Kubernetes, a bare-metal cluster, and etcd died on one of the control-plane machines after a reboot caused by a hardware failure. The control-plane node is clearly "NotReady", and the log shows:

panic: tocommit(11989253) is out of range [lastIndex(11989215)]. Was the raft log corrupted, truncated, or lost?
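To confirm you are hitting this exact failure mode, grep the crashed etcd container's log for the panic signature. A sketch — the log path and the `crictl` capture step are examples, not part of the original fix:

```shell
# Capture the crashed etcd container's log first (names/paths are examples):
#   crictl logs $(crictl ps -a --name etcd -q | head -1) > /tmp/etcd.log 2>&1
# Then look for the raft panic signature:
grep -E 'panic: tocommit\([0-9]+\) is out of range \[lastIndex\([0-9]+\)\]' /tmp/etcd.log
```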

What is the Kubernetes way? Kill the node, create a new node, add the new node to the cluster, and that's it. But this error is easily fixable anyway:

  1. stop the "kubelet" on the affected node,
  2. remove the broken "etcd" member from the cluster on any healthy node,
  3. remove the broken "etcd" data,
  4. start the "kubelet" on the affected node.
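Step 3 deserves a concrete sketch, because it happens on the affected node itself. With kubeadm the etcd data directory defaults to `/var/lib/etcd` (an assumption — verify against `--data-dir` in `/etc/kubernetes/manifests/etcd.yaml` before deleting anything); moving it aside instead of deleting keeps a backup:

```shell
# On the affected node, after the broken member has been removed from the cluster.
# /var/lib/etcd is the kubeadm default data dir; check --data-dir in the etcd
# manifest first. Moving (not deleting) keeps the old raft state as a backup.
ETCD_DATA_DIR=/var/lib/etcd
mv "$ETCD_DATA_DIR" "$ETCD_DATA_DIR.broken.$(date +%s)"
mkdir -p "$ETCD_DATA_DIR"
```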

In CLI commands:

# systemctl stop kubelet
# kubectl exec --stdin --tty -n kube-system etcd-control-plane1 -- sh
# export ETCDCTL_API=3
# etcdctl --endpoints=https://127.0.0.1:2379 --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/healthcheck-client.crt --key=/etc/kubernetes/pki/etcd/healthcheck-client.key -w table member list
+------------------+---------+----------------+-------------------------------+--------------------------------+------------+
|        ID        | STATUS  |     NAME       |          PEER ADDRS           |         CLIENT ADDRS           | IS LEARNER |
+------------------+---------+----------------+-------------------------------+--------------------------------+------------+
| 3bdfefa0fdf07aec | started | control-plane1 |    https://192.168.74.84:2380 |    https://192.168.74.84:2379  |      false |
| 62c4c2a1fcdcc7f8 | started | control-plane2 |    https://192.168.79.78:2380 |    https://192.168.79.78:2379  |      false |
| fe45461662c93a78 | started | control-plane3 |  https://192.168.212.236:2380 |  https://192.168.212.236:2379  |      false |
+------------------+---------+----------------+-------------------------------+--------------------------------+------------+
# etcdctl --endpoints=https://127.0.0.1:2379 --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/healthcheck-client.crt --key=/etc/kubernetes/pki/etcd/healthcheck-client.key member remove fe45461662c93a78 
Member fe45461662c93a78 removed from cluster 3bdfefa0fdf07aec
# etcdctl --endpoints=https://127.0.0.1:2379 --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/healthcheck-client.crt --key=/etc/kubernetes/pki/etcd/healthcheck-client.key member add control-plane3 \
  --peer-urls https://192.168.212.236:2380
# systemctl start kubelet
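After the kubelet comes back, the re-added member should show up as `started` again in `member list`. A quick sanity check — just a sketch that counts `started` rows in `-w table` output saved to a file (the file path is hypothetical):

```shell
# Count "started" members in saved `etcdctl -w table member list` output;
# a healthy three-node control plane should report 3.
started_members() {
  grep -c '| started |' "$1"
}
# Usage (hypothetical): etcdctl ... -w table member list > /tmp/members.txt
# started_members /tmp/members.txt
```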

That's it... :)

Blog tags
devops
kubernetes
