
Digging Into etcd

Tags: OSS, kubernetes

December 1st, 2019


What Is etcd?

Per the official site, etcd is:

A distributed, reliable key-value store for the most critical data of a distributed system

I presume the name etcd is a play on /etc and the long history of naming daemons with a d suffix (a daemon for your /etc config), though I’ve not yet found proof of this.

Kubernetes uses etcd as the backing store for cluster data, which drove my own interest in collecting the information in this post.

Clearly a lot of clusters out there are using etcd for critical data storage, but how does it work?

History

For a history of etcd see: https://coreos.com/blog/history-etcd

For bonus points: https://www.wired.com/2013/08/coreos-the-new-linux/

Roughly, etcd was created out of a desire for a distributed, reliable data store for coordinating and configuring clusters of machines; the history post linked above covers the motivating issues in detail.

Initially etcd was used by CoreOS’s fleet container orchestration system, but it was quickly adopted for other uses and was later donated to the CNCF.

Architecture

Overview

Data Model

etcd’s upstream documentation is instructive here: github.com/etcd-io/etcd/blob/master/Documentation/learning/data_model.md
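
The short version: etcd v3 keeps a flat keyspace with multi-version concurrency control, so every write bumps a cluster-wide revision and older values stay readable by revision. The sketch below uses the official Go clientv3 package to show this; it assumes a local etcd reachable at localhost:2379, and the import path varies slightly between etcd releases.

```go
package main

import (
	"context"
	"fmt"
	"log"
	"time"

	clientv3 "go.etcd.io/etcd/clientv3" // newer releases use go.etcd.io/etcd/client/v3
)

func main() {
	// Assumes a local single-member etcd listening on the default client port.
	cli, err := clientv3.New(clientv3.Config{
		Endpoints:   []string{"localhost:2379"},
		DialTimeout: 5 * time.Second,
	})
	if err != nil {
		log.Fatal(err)
	}
	defer cli.Close()

	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()

	// Each Put creates a new revision of the entire keyspace.
	first, err := cli.Put(ctx, "/config/feature", "off")
	if err != nil {
		log.Fatal(err)
	}
	if _, err := cli.Put(ctx, "/config/feature", "on"); err != nil {
		log.Fatal(err)
	}

	// A plain Get returns the latest version of the key...
	latest, err := cli.Get(ctx, "/config/feature")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("latest: %s (mod revision %d)\n",
		latest.Kvs[0].Value, latest.Kvs[0].ModRevision)

	// ...while WithRev reads the keyspace as of an earlier revision (MVCC).
	old, err := cli.Get(ctx, "/config/feature", clientv3.WithRev(first.Header.Revision))
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("at revision %d: %s\n", first.Header.Revision, old.Kvs[0].Value)
}
```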

Consensus

Leader election is used to maintain a single leader replica. All requests are routed to the leader internally and committed only after achieving consensus on the request.

Raft is the consensus algorithm used both for requests and for leader elections. The official Raft site is a good reference for understanding how this works. Another great resource linked from the official site is thesecretlivesofdata.com/raft/.

etcd’s raft implementation is widely used and contains some useful documentation.
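
To make "committed only after achieving consensus" a bit more concrete, the sketch below shows the quorum arithmetic a Raft leader uses to advance its commit index: an entry counts as committed once a majority of members have it in their logs. This is an illustrative stand-in rather than etcd's actual code; matchIndex is a hypothetical slice representing the leader's per-member replication progress, and real Raft additionally requires the entry to be from the leader's current term.

```go
package main

import (
	"fmt"
	"sort"
)

// commitIndex returns the highest log index known to be replicated on a
// majority of members, which is the index a Raft leader may mark committed.
// matchIndex[i] is the highest log entry known to be stored on member i
// (the leader counts its own log as well).
func commitIndex(matchIndex []uint64) uint64 {
	if len(matchIndex) == 0 {
		return 0
	}
	sorted := append([]uint64(nil), matchIndex...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i] < sorted[j] })
	// With n members a quorum is n/2+1, so the entry at position (n-1)/2 in
	// the ascending progress list is present on at least a majority of members.
	return sorted[(len(sorted)-1)/2]
}

func main() {
	// 5-member cluster: leader at entry 9, followers at 9, 8, 7, and 3.
	// Entry 8 is on three of five members, so it can be committed; entry 9 cannot yet.
	fmt.Println(commitIndex([]uint64{9, 9, 8, 7, 3})) // prints 8
}
```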

Storage

Data is stored in a memory-mapped B+ tree using bbolt, a fork of Bolt, which was itself inspired by LMDB.
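
For a rough feel of that storage layer, here is a minimal sketch using bbolt directly, the same library etcd's backend builds on. It makes no assumptions about etcd's own bucket layout; it simply opens the memory-mapped file and puts and gets keys through the B+ tree.

```go
package main

import (
	"fmt"
	"log"

	bolt "go.etcd.io/bbolt"
)

func main() {
	// Open (or create) the database file; bbolt keeps everything in a single
	// memory-mapped file organized as a B+ tree of pages.
	db, err := bolt.Open("example.db", 0600, nil)
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Writes happen inside a read-write transaction.
	err = db.Update(func(tx *bolt.Tx) error {
		b, err := tx.CreateBucketIfNotExists([]byte("keys"))
		if err != nil {
			return err
		}
		return b.Put([]byte("/config/feature"), []byte("on"))
	})
	if err != nil {
		log.Fatal(err)
	}

	// Reads use a read-only transaction against a consistent snapshot.
	err = db.View(func(tx *bolt.Tx) error {
		v := tx.Bucket([]byte("keys")).Get([]byte("/config/feature"))
		fmt.Printf("/config/feature = %s\n", v)
		return nil
	})
	if err != nil {
		log.Fatal(err)
	}
}
```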

TODO

Additional Resources

The Carnegie Mellon Database Group “Database of Databases” site has a great page on etcd at dbdb.io/db/etcd.