"You can instrument delivery metrics, but the question 'how productive are my engineers' is the most annoying question in engineering leadership because it almost always means something the numbers cannot answer"
"A complete rewrite is a decision that never works out for anyone — I wish I had been experienced enough to see that before I walked into Digg v4."
Evidence from the Archive
Carta
Carta's engineering organization as a context for implementing real IC accountability
Larson's archetypes for staff engineers: the tech lead, the architect, the solver, the right hand
Will Larson is the author of Staff Engineer and An Elegant Puzzle, has led engineering at Stripe, Uber, and Calm, and is currently CTO at Carta, which makes him one of the most cited voices on engineering career paths in the industry. His core argument: the IC career path is real, but companies have failed to define what senior IC roles actually look like, and without accountability the titles become retention gimmicks that devalue the entire track.
The evidence is specific. Larson's archetypes for staff engineers (the tech lead, the architect, the solver, the right hand) give the senior IC role concrete shapes. Against them stands retention-driven title inflation, where 'Staff Engineer' ends up meaning 'senior engineer we wanted to keep'. Carta's engineering organization is the context in which he is implementing real IC accountability.
In Will Larson's own words: "One of the things that I've been pushing on, I wrote my last book, Staff Engineer, about what is the career path for senior engineers. One of the challenges is if we aren't comfortable holding engineers accountable because we just want to retain all the engineers, then we can't put them in senior roles." (Connecting the accountability gap to the inability to create real senior IC roles.)
Digg
Digg v4 launch day: caterers, sushi, and champagne flutes around a table of engineers — because the site was not up. It took a month to come back
Will Larson lived through one of the most famous rewrite failures in startup history, driven by a genuine capability gap (social features) — and it still killed the company
Larson brings the most viscerally specific rewrite horror story in the Lenny archive: Digg v4. The decision to do a complete rewrite was made two and a half years before he joined, driven by a real strategic insight — that Digg was losing to the emerging social networks and needed a social layer the existing architecture couldn't support. It's the strongest possible case for a rewrite: not aesthetics, not accumulated debt, but a genuine capability gap.
And it still failed catastrophically. Launch day was sushi, champagne flutes, and caterers arranged around a table of engineers because the site was not up. It took the team a full month to get the rewritten product functional, and during that month most engineers gave up; only about five were still actively trying to bring it back. The site was eventually restored, but Digg ran out of money and was sold for parts.
In Will's own words: "The decision that was done, I think two and a half years before I joined, and the shift about six months after I joined was they needed to do a complete rewrite in order to get there. This is a decision that never works out for anyone. So I think as someone with more experience, I could have predicted this wasn't going to work out. But I was earlier in my career." (Reflecting on the Digg v4 rewrite.)
Stripe
Will Larson's Stripe incident-management team got so absorbed in analytics they forgot to actually reduce incidents — 'we weren't prioritizing improvements, we were prioritizing measurement'
His broader critique: DORA metrics are diagnostic (where to look) not evaluative (whether you're good), and the dashboard industry has commoditized them without preserving the caveats
Larson comes at this from the opposite end of the spectrum from Forsgren: not as a skeptic of the research, but as someone who has sat in the CTO seat while a board asked him to prove his team was productive enough. He calls productivity measurement 'probably one of the most common and also maybe the most annoying questions eng leaders get.'
His critique has two layers. First, DORA metrics are diagnostic, not evaluative — they tell you where friction is, not whether your org is good. Second, he has personally lived the failure mode of over-measuring. At Stripe, his team worked on incident management and got so caught up in building perfect analytics for incidents that they forgot to actually reduce them. That's the specific trap the skeptical camp warns about: measurement eating the work it's supposed to describe.
In Will's own words: "We weren't actually prioritizing improvements, we were just prioritizing measurement. And you can't keep measuring. There's measure twice, cut once. Sure, but you don't measure infinite times and never get to cut." (Describing how his incident management work at Stripe got stuck in analysis instead of reducing incidents.)