The Toil Budget

"The toil is not the problem. The problem is that nobody budgets for the toil."

The sprint is planned. The velocity is estimated. The capacity is allocated to features. The plan treats the team's productive hours as available for planned work.

They are not. A significant percentage of every sprint is consumed by toil: manual deployments, certificate rotations, access requests, environment provisioning, data fixes, and the accumulated operational friction that nobody owns.

The Invisible Tax

Toil is not tracked because it is not planned. It does not appear in the backlog. It does not have story points. It does not belong to a project.

The engineer who spends four hours troubleshooting a flaky CI pipeline does not log those hours against a ticket. They absorb the cost personally and show up to standup with less progress than expected. The system reads this as underperformance. In fact it was infrastructure overhead.

Aggregated across the team, toil typically consumes 15 to 30 percent of total engineering capacity. This is not a guess. It is the range that keeps turning up when engineering organizations measure it honestly.

The Measurement Resistance

Engineering leadership resists measuring toil because the number is uncomfortable.

If the data shows that 25 percent of capacity is consumed by operational overhead, the immediate follow-up question is: "Why have you not automated it?" The answer is that automating it requires investment that competes with features, and features always win the prioritization contest.

Measuring the toil makes the cost visible. Visible costs demand justification. The system prefers the cost to remain invisible, distributed across individual contributors who absorb it silently.

The Automation Paradox

The obvious solution is automation. Automate the deployments. Automate the access provisioning. Automate the data fixes.

But automation is itself engineering work. It requires design, implementation, testing, and maintenance. The team that is already at capacity with planned work cannot absorb automation without dropping something from the sprint.

This creates a paradox: the team is too busy to invest in the automation that would make them less busy. The toil persists because eliminating it requires the very capacity that the toil consumes.
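The paradox is easier to argue about with numbers attached. The sketch below is a rough payback calculation, not a prescribed method: the function name and the figures (40 hours to automate, 6 hours of manual work per sprint, 1 hour of upkeep per sprint) are hypothetical and only illustrate the trade-off.

```python
import math

def sprints_to_break_even(automation_hours, toil_hours_per_sprint,
                          upkeep_hours_per_sprint=0.0):
    """Return how many sprints pass before automation pays for itself.

    Returns None when upkeep costs as much as the toil it replaces,
    because the investment then never breaks even.
    """
    saved_per_sprint = toil_hours_per_sprint - upkeep_hours_per_sprint
    if saved_per_sprint <= 0:
        return None
    return math.ceil(automation_hours / saved_per_sprint)

# Hypothetical figures: 40 h to build, 6 h of toil per sprint, 1 h of upkeep.
payback = sprints_to_break_even(40, 6, 1)
print(f"Break-even after {payback} sprints")
```

The up-front cost lands in a single sprint while the savings arrive slowly, which is why the investment keeps losing to features even when the total is clearly positive.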

The Explicit Allocation

Toil will not be eliminated by hoping engineers find time. It requires a budget.

Reserve 20 percent of sprint capacity for unplanned operational work. Do not try to plan what fills this bucket. Let the team use it for whatever manual, repetitive, and undignified work needs to happen that week.
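As a minimal sketch of the arithmetic, assume a hypothetical team of five engineers with 60 working hours each per sprint. Both figures and the function name are invented for the example; only the 20 percent reserve comes from the text.

```python
def split_capacity(engineers, hours_per_engineer, toil_percent=20):
    """Split total sprint hours into a toil reserve and plannable hours."""
    if not 0 <= toil_percent < 100:
        raise ValueError("toil_percent must be between 0 and 99")
    total = engineers * hours_per_engineer
    toil_reserve = total * toil_percent / 100
    return {
        "total": total,
        "toil_reserve": toil_reserve,
        "plannable": total - toil_reserve,
    }

# Hypothetical team: 5 engineers, 60 hours each per sprint.
budget = split_capacity(engineers=5, hours_per_engineer=60)
print(budget)
```

Sprint planning then works from the plannable figure, not the total.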

Then track what fills the bucket. After three sprints, patterns emerge. The patterns reveal where automation investment would have the highest return. Fund those investments explicitly, from the planned capacity you protected.
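Tracking does not need tooling beyond a list of entries. The sketch below assumes a hypothetical log of (sprint, category, hours) records; the categories and hours are invented, and summing hours per category is just one simple way to surface the patterns.

```python
from collections import Counter

# Hypothetical toil log: (sprint, category, hours). All entries are invented.
toil_log = [
    (1, "manual deployment", 6.0),
    (1, "access request", 2.0),
    (1, "flaky CI", 4.0),
    (2, "manual deployment", 5.0),
    (2, "data fix", 3.0),
    (3, "manual deployment", 7.0),
    (3, "flaky CI", 4.5),
    (3, "access request", 1.5),
]

def rank_toil(log):
    """Sum hours per category and return them sorted, largest first."""
    hours = Counter()
    for _sprint, category, spent in log:
        hours[category] += spent
    return hours.most_common()

ranking = rank_toil(toil_log)
for category, spent in ranking:
    print(f"{category}: {spent:.1f} h")
```

The category at the top of the ranking is the first candidate for automation funding.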

The toil budget is not a concession. It is an honest accounting of what the team actually does, followed by a strategy for doing less of it.
