Distributed Systems · Active · Jun 2026 – Present

DKVS

A fault-tolerant, replicated key-value store written from scratch in C++20 — an original implementation of Raft.

  • 0 external runtime dependencies
  • <1 s leader failover, zero data lost
  • CRC32 per-record WAL integrity

No frameworks, no serialization libraries, no consensus libraries. Leader election and log replication are implemented directly from the Raft paper; persistence is a CRC-checked write-ahead log; the only runtime dependencies are the POSIX sockets and threads provided by the operating system.

01

Why build it from scratch

DKVS exists to understand what replication actually costs: the consistency trade-offs and fault-tolerance mechanics that libraries normally abstract away. Every command a client submits becomes an entry in a replicated log. The leader replicates it to followers; once a majority of nodes (counting the leader) has fsynced it, the entry commits, each node's applier feeds it to the in-memory state machine, and the client handler that submitted it gets the result.
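The commit rule above can be sketched in a few lines. This is a minimal illustration, not DKVS's actual code: the function names and the `durable_index` and `entry_terms` parameters are hypothetical. It also includes the Raft paper's extra condition that a leader only advances its commit index by counting replicas for an entry from its own current term.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical helper: highest log index that a majority of nodes has fsynced.
// `durable_index` has one element per node (leader included).
std::uint64_t majority_durable_index(std::vector<std::uint64_t> durable_index) {
    if (durable_index.empty()) return 0;
    // After sorting in descending order, the element at position (majority - 1)
    // is reached or exceeded by at least `majority` nodes.
    std::sort(durable_index.begin(), durable_index.end(), std::greater<>());
    const std::size_t majority = durable_index.size() / 2 + 1;
    return durable_index[majority - 1];
}

// Hypothetical helper: returns the new commit index for a leader.
// `entry_terms[i - 1]` is the term of the log entry at index i (indices start at 1).
std::uint64_t advance_commit_index(std::uint64_t commit_index,
                                   std::vector<std::uint64_t> durable_index,
                                   const std::vector<std::uint64_t>& entry_terms,
                                   std::uint64_t current_term) {
    const std::uint64_t candidate = majority_durable_index(std::move(durable_index));
    // Terms never decrease along the log, so if the candidate is not from the
    // current term, no smaller index is either.
    if (candidate > commit_index && candidate <= entry_terms.size() &&
        entry_terms[candidate - 1] == current_term) {
        return candidate;
    }
    return commit_index;
}
```

With three nodes whose durable indices are 5, 3, and 4, the majority-durable index is 4: two of the three nodes hold every entry up to index 4.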

The design is strictly layered: a client server (thread per connection) accepts a text protocol, a RaftNode handles consensus RPCs between peers, a storage layer owns metadata and the write-ahead log, and the KVStore state machine applies only committed entries.
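To make the bottom layer concrete, here is a rough sketch of a state machine that only ever sees committed entries. The command verbs (`SET`, `GET`, `DEL`), the reply strings, and the `apply_committed` helper are assumptions made for illustration; the page does not specify the text protocol's grammar.

```cpp
#include <algorithm>
#include <cstddef>
#include <sstream>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical in-memory state machine; verbs and replies are illustrative.
class KVStore {
public:
    // Applies one committed command and returns the reply text.
    std::string apply(const std::string& command) {
        std::istringstream in(command);
        std::string verb, key;
        in >> verb >> key;
        if (verb == "SET") {
            std::string value;
            std::getline(in >> std::ws, value);  // rest of the line is the value
            data_[key] = value;
            return "OK";
        }
        if (verb == "GET") {
            const auto it = data_.find(key);
            if (it == data_.end()) return "NOT_FOUND";
            return "VALUE " + it->second;
        }
        if (verb == "DEL") {
            if (data_.erase(key) > 0) return "OK";
            return "NOT_FOUND";
        }
        return "ERROR unknown command";
    }

private:
    std::unordered_map<std::string, std::string> data_;
};

// Applies log entries in (last_applied, commit_index] and returns the new
// last_applied. The entry with log index i is stored at log[i - 1].
std::size_t apply_committed(KVStore& store, const std::vector<std::string>& log,
                            std::size_t last_applied, std::size_t commit_index) {
    const std::size_t limit = std::min(commit_index, log.size());
    while (last_applied < limit) {
        store.apply(log[last_applied]);
        ++last_applied;
    }
    return last_applied;
}
```

Keeping `apply_committed` as the only caller of `apply` is what enforces the layering: entries beyond `commit_index` may still be overwritten by a new leader, so they must never reach the store.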

02

Correctness under failure

The interesting work is in the failure paths. A deposed leader stranded behind a network partition can never serve stale data as committed: reads are serialized through the log like writes, so a leader that cannot reach a majority cannot commit the read and therefore cannot answer it. A node that crashes mid-write recovers by replaying its log and rejoining the cluster; CRC32 checks detect torn tail writes and discard them safely.
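Torn-tail detection can be sketched as follows. The record layout (a 4-byte little-endian length, a 4-byte little-endian CRC32 of the payload, then the payload) and all function names are assumptions for illustration; the page only states that each record carries a CRC32. The checksum is the standard reflected CRC-32 with polynomial 0xEDB88320.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <string_view>
#include <vector>

// Bitwise CRC-32 (IEEE 802.3, reflected polynomial 0xEDB88320).
std::uint32_t crc32(std::string_view data) {
    std::uint32_t crc = 0xFFFFFFFFu;
    for (unsigned char byte : data) {
        crc ^= byte;
        for (int bit = 0; bit < 8; ++bit) {
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return crc ^ 0xFFFFFFFFu;
}

void put_u32(std::string& out, std::uint32_t v) {
    for (int shift = 0; shift < 32; shift += 8) {
        out.push_back(static_cast<char>((v >> shift) & 0xFFu));
    }
}

std::uint32_t get_u32(std::string_view in, std::size_t pos) {
    std::uint32_t v = 0;
    for (int i = 0; i < 4; ++i) {
        v |= static_cast<std::uint32_t>(static_cast<unsigned char>(in[pos + i])) << (8 * i);
    }
    return v;
}

// Hypothetical record layout: [length][crc32(payload)][payload].
std::string encode_record(std::string_view payload) {
    std::string out;
    put_u32(out, static_cast<std::uint32_t>(payload.size()));
    put_u32(out, crc32(payload));
    out.append(payload);
    return out;
}

struct Replay {
    std::vector<std::string> records;  // payloads that passed the CRC check
    std::size_t valid_bytes;           // length of the intact prefix
};

// Replays records until the first short or corrupted one, then stops.
Replay replay(std::string_view wal) {
    Replay result{{}, 0};
    std::size_t pos = 0;
    while (wal.size() - pos >= 8) {
        const std::uint32_t len = get_u32(wal, pos);
        const std::uint32_t expected = get_u32(wal, pos + 4);
        if (wal.size() - pos - 8 < len) break;  // payload cut short: torn write
        const std::string_view payload = wal.substr(pos + 8, len);
        if (crc32(payload) != expected) break;  // checksum mismatch: corrupted
        result.records.emplace_back(payload);
        pos += 8 + len;
        result.valid_bytes = pos;
    }
    return result;
}
```

In this sketch the caller would truncate the log file to `valid_bytes` before appending again, so that a discarded tail can never be mistaken for a valid record later.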

The project is exercised through a live cluster script — start three nodes, write through the CLI, kill the leader mid-flight, and verify that a new leader answers with data intact.

03

Highlights

  • Leader election and log replication (Raft): commands are acknowledged only after a majority of nodes has durably stored them
  • Crash recovery: every node fsyncs entries to a write-ahead log before acknowledging; torn or corrupted tail writes are detected by per-record CRC32 and discarded safely
  • Automatic failover: kill -9 the leader and the survivors elect a new one in a few hundred milliseconds, with zero committed data lost
  • Linearizable operations: reads go through the replicated log, so a GET observes exactly the writes committed before it — even across leader changes
  • Quorum safety: with a majority of nodes down, the cluster refuses writes rather than diverging, choosing consistency over availability (CP in CAP terms)
  • Client redirects: clients may contact any node; non-leaders answer REDIRECT and the CLI follows automatically
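The redirect behaviour in the last highlight might look roughly like this on the client side. The reply format `REDIRECT <host:port>`, the `Transport` callback, and the hop limit are all assumptions for the sketch; the page only says that non-leaders answer REDIRECT and that the CLI follows it.

```cpp
#include <cstddef>
#include <functional>
#include <optional>
#include <string>

// Hypothetical reply format: "REDIRECT <host:port>".
// Returns the leader address if the reply is a redirect, otherwise nullopt.
std::optional<std::string> redirect_target(const std::string& reply) {
    const std::string prefix = "REDIRECT ";
    if (reply.rfind(prefix, 0) != 0 || reply.size() == prefix.size()) {
        return std::nullopt;
    }
    return reply.substr(prefix.size());
}

// Hypothetical transport: sends one command to one address, returns the reply.
using Transport =
    std::function<std::string(const std::string& address, const std::string& command)>;

// Sends `command`, following redirects up to `max_hops` times.
// Returns nullopt if the hop limit is exhausted (for example, during an election).
std::optional<std::string> send_following_redirects(const Transport& send,
                                                    std::string address,
                                                    const std::string& command,
                                                    int max_hops = 3) {
    for (int hop = 0; hop <= max_hops; ++hop) {
        const std::string reply = send(address, command);
        const std::optional<std::string> target = redirect_target(reply);
        if (!target) return reply;  // a final answer from the leader
        address = *target;
    }
    return std::nullopt;
}
```

The hop limit matters because two nodes can briefly disagree about who the leader is during an election, and a client that follows redirects without a bound could loop between them.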