"There is no single metric that works — developer happiness is the true north, and every other metric is just an input trying to predict it"
Evidence from the Archive
GitHub
GitHub Copilot: 1.5M+ developers, 37,000+ organizations, 55% faster coding, 85% more confident in code quality
Shopify: over 1 million lines of code written by Copilot in their codebase
As CPO of GitHub overseeing Copilot -- the first widely adopted AI coding tool with 1.5M+ developers and 37,000+ organizations -- she has more data on how developers actually use AI assistance than almost anyone else in the industry. Her core argument: Copilot, not pilot -- the human must remain in the loop because software development requires systems thinking that AI cannot yet replace.
The evidence is specific. GitHub Copilot: 1.5M+ developers, 37,000+ organizations, 55% faster coding, 85% more confidence in code quality. Shopify: over 1 million lines of code written by Copilot in their codebase. Accenture: 88% of AI-suggested code was retained by developers.
In Inbal Shani's own words: "Copilot is a copilot. It's not a pilot. You still need the human in the loop. But that means that now the software developer or the user of the AI tools to develop software needs to form a different thinking." (Her core thesis on why the copilot model is the right architecture.)
GitHub
GitHub's Copilot is measured by code quality, security catches, and time-to-value — not time saved, because 'you can write really bad code really fast'
Inbal Shani sits on top of more developer telemetry than anyone and concludes there is no single metric to rule them all: developer happiness is the downstream outcome
Shani runs product for GitHub, which means she's sitting on top of more data about what developers actually do every day than almost anyone else. Her answer to 'how do you measure productivity' is the most deflationary on the spectrum: there are no right metrics, no one metric to rule them all.
When she evaluates whether Copilot is doing its job, she explicitly refuses to use time saved as the headline number -- 'you can write really bad code really fast'. Instead she collapses a basket of inputs (code quality, security vulnerabilities caught, collaboration patterns, time-to-value on shipped features) into what she calls the downstream outcome: developer happiness. If your metric regime doesn't ultimately improve how developers feel about their work, you're measuring the wrong things.
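The basket-of-inputs idea can be sketched as a small check that refuses to treat speed as a headline number on its own. This is a minimal illustrative sketch, not GitHub's actual model: every field name, threshold, and flag below is a hypothetical assumption chosen to show why 'time saved' must be cross-checked against the rest of the basket.

```python
from dataclasses import dataclass

@dataclass
class MetricBasket:
    """Hypothetical basket of input metrics; names and scales are assumptions."""
    code_quality: float        # e.g. review pass rate, normalized 0..1
    security_catches: int      # vulnerabilities caught before merge
    collaboration: float       # e.g. PR review participation, normalized 0..1
    time_to_value_days: float  # idea -> shipped feature

def red_flags(basket: MetricBasket, time_saved_pct: float) -> list[str]:
    """Time saved is meaningless in isolation ('you can write really bad
    code really fast'): flag cases where speed rose but the basket degraded.
    Thresholds are illustrative, not empirically derived."""
    flags = []
    if time_saved_pct > 0 and basket.code_quality < 0.5:
        flags.append("faster but lower-quality code")
    if basket.collaboration < 0.3:
        flags.append("speed gained by skipping review")
    return flags
```

The design choice mirrors the quote: speed never appears as an output, only as a condition that other signals must survive.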
In Inbal's own words: "The most easiest one is time, but time is, it's funny what I'm going to say, but time is not quantifiable as a success metrics because you can write really bad code really fast." (Rejecting the most obvious metric — time saved — as meaningless in isolation.)