The wiki exists. The runbook exists. The onboarding guide exists.
None of them describe the system as it currently operates.
The knowledge is not in the documentation. It is in the person who stopped updating the documentation two years ago. This becomes visible when that person goes on vacation. Or when they resign.
⸻
The Documentation Decay
Documentation begins with good intentions. The first version is accurate. It reflects the current state of the system at the moment it was written.
Then the system changes. A configuration is updated. A dependency is swapped. A workaround is introduced during an incident and never removed. The documentation is not updated, because updating documentation is not measured, rewarded, or enforced.
Within six months, the documentation describes a system that no longer exists. Within a year, it actively misleads anyone who reads it.
The system has not lost knowledge. It has something worse: it has wrong knowledge wearing the uniform of authority.
⸻
The Expert Dependency
When documentation decays, knowledge concentrates. It pools in the heads of the two or three engineers who have been on the team long enough to know the real topology, the undocumented failure modes, the actual deployment sequence that differs from the runbook.
These engineers become critical path. Every incident routes through them. Every architecture question ends at their desk. They are indispensable, which means the organization has built a single point of failure out of a human being.
The organization does not see this as a risk. It sees this as a strength. "We have a strong senior engineer." It is a strength until that engineer accepts another offer.
⸻
The Transfer Illusion
When the risk becomes visible, the organization attempts a knowledge transfer. The expert is asked to document their knowledge, give talks, or pair with junior engineers.
This fails, because the knowledge that matters most is not procedural. It is contextual. The expert does not just know what the system does. They know why it was built this way, what alternatives were rejected, and which components are held together by assumptions that are no longer true.
This context cannot be transferred in a wiki page. It was accumulated over years of exposure. Compressing it into a two-week handoff is like compressing an apprenticeship into a slide deck.
⸻
Building Redundancy
Knowledge silos are not solved by documentation drives. They are solved by structural redundancy.
Rotate ownership. Require that no system has fewer than three engineers who can operate it independently. Make incident response a shared duty, not an expert's burden.
The cost is velocity. Two engineers learning a system are slower than one expert maintaining it. But the expert is a single point of failure, and single points of failure are the most expensive components in any architecture.
End.