AI · Featured Debate

5 guests · 6 episodes · 2,974 words

Copilot or Autopilot? The AI Architecture Decision That Defines the Next Decade

Should AI products be copilots or fully autonomous agents?

GitHub named its AI tool "Copilot" -- not "Pilot" -- for a reason. But Bret Taylor thinks the entire market is going toward agents. Scott Wu built Devin to work as a fully autonomous junior engineer that submits its own pull requests. Intercom's AI agent Fin is on track for $100M ARR in under three quarters, priced at $0.99 per resolved ticket. The copilot metaphor is comforting, but is it already outdated?

This is not a philosophical debate about the future of AI. It is a concrete product architecture and business model decision that every team building with AI has to make right now: do you build a tool that assists humans, or a system that acts on their behalf?

Should AI products be designed as copilots (augmenting humans in real time) or as autonomous agents (completing tasks end-to-end independently)? And how does this choice shape your product, pricing, and competitive position?

The 5 Positions

The Market Is Going to Agents -- Outcome-Based Pricing Is the Correct Business Model

Bret Taylor

Bret Taylor, co-founder and CEO of Sierra and board chair of OpenAI, is unequivocal. Taylor co-created Google Maps, invented the Like button at FriendFeed, served as CTO of Facebook, and was co-CEO of Salesforce -- giving him arguably the broadest executive vantage point on software business models of anyone in tech. His argument for agents is not primarily a technology argument. It is a business model argument.

"The whole market is going to go towards agents. I think the whole market is going to go towards outcomes-based pricing. It's just so obviously the correct way to build and sell software."
Bret Taylor
▶ 00:00:07

The core insight is structural. Productivity software -- tools that make humans faster -- is brutally hard to sell. Gains are diffuse, hard to measure, and even harder to attribute. The copilot model inherits this problem: charging per seat for an AI assistant means asking customers to pay for potential, not outcomes.

"It's so hard to sell productivity software, which I learned the hard way."
Bret Taylor
▶ 00:00:21

That hard way was Quip, Taylor's productivity company, which he sold to Salesforce for $750 million -- a solid outcome but one that taught him firsthand how resistant enterprises are to paying for tools that merely augment. Agents break this dynamic entirely. When AI runs autonomously and delivers a measurable result -- a support ticket resolved, a lead qualified, a code task completed -- you can charge for that outcome. The vendor earns money only when the customer gets value. This creates a self-reinforcing flywheel: better AI produces better outcomes, which generates more revenue, which funds better AI. Copilots lack this loop because their value capture is indirect.

However, Taylor's argument contains a hidden assumption: that outcomes are cleanly attributable to the AI. In domains where human and AI contributions are entangled -- creative work, strategic decisions, complex multi-step processes -- isolating the agent's contribution becomes genuinely difficult. And as CEO of Sierra, an AI agent company, Taylor has a direct financial stake in the agent thesis.

When This Applies

This applies when outcomes are clearly measurable and attributable (Intercom's Fin at $0.99/resolution), when tasks are well-defined and repeatable (customer support, code tasks, data processing), when you can charge for outcomes rather than access (Sierra's model), and when asynchronous delegation is natural to the workflow (Cognition's Devin via Slack and Linear).

Copilot, Not Pilot -- The Human Must Remain in the Loop

Inbal Shani

Inbal Shani, then chief product officer of GitHub, defends the copilot model with a systems-thinking argument grounded in unusually broad data on developer behavior. As CPO overseeing GitHub Copilot, used by 1.5 million developers and more than 37,000 organizations, she has observed at scale how humans actually use AI assistance.

"Copilot is a copilot. It's not a pilot. You still need the human in the loop. But that means that now the software developer or the user of the AI tools to develop software needs to form a different thinking."
Inbal Shani
▶ 00:08:02

Shani's argument is that the hardest part of software development is not writing code but understanding systems, architecture, and connected experiences. Those are precisely the capabilities AI cannot yet replicate. Copilots preserve human judgment where it matters most while automating the mechanical work.

More importantly, she argues that copilots elevate rather than diminish the human role. When AI handles code generation, junior developers can "spend more time from the get-go understanding the system, understanding the environment that they're building, understanding the product that they're building, which today they don't have time because they are still learning how to code." The copilot model does not just make developers faster -- it makes them think more like senior engineers from day one.

"Let me be very clear, you cannot cut your people. You have to have a human in the loop. Copilot is a copilot, it is not a pilot. And I keep on saying that sentence again and again."
Inbal Shani
▶ 00:12:23

Her data supports the claim: developers write code 55% faster with Copilot, 85% feel more confident in code quality, and crucially, the most elite engineers love it just as much as juniors do. It is not a crutch -- it is a genuine multiplier at every skill level.

The timing matters here: Shani's interview is from December 2023, before the rapid growth of agentic coding tools throughout 2024-2025. GitHub itself has since moved significantly toward more agentic features, and competitors like Cursor ($300M ARR in two years) suggest that developers want more autonomy in their tools, not less. The copilot framing may have been correct at the time and still turn out to be a transitional stage.

When This Applies

This applies when errors are expensive and irreversible (healthcare, legal, financial transactions), when regulatory requirements mandate human oversight, when your users need to learn alongside the AI (education, creative work), or when your product multiplies expensive human time (GitHub Copilot for developers, Microsoft Copilot for knowledge workers).

Fully Autonomous Agents Working Asynchronously End-to-End

Scott Wu

Scott Wu, co-founder and CEO of Cognition, represents the most aggressive position. Wu built Devin, which Cognition bills as the first autonomous AI software engineer, and his 15-person team runs one of the most agent-native engineering organizations in existence, with each engineer managing up to five Devins simultaneously.

"Devin is a fully autonomous software engineer that is going to work on tasks end to end... you can tag Devin on an issue in Slack, you can tag Devin in Linear, and Devin will make pull requests in your GitHub, and so it's very much built to work with engineering teams as your junior engineer."
Scott Wu
▶ 00:06:10

The critical difference from copilots is the interaction model. You do not pair with Devin in real time. You delegate to it asynchronously, the way you would assign a task to a human engineer. The best workflow, Wu found, is "to work with multiple Devins and to run them asynchronously and to kick them off and to only jump in basically as you needed to provide feedback or steer the plan." This is delegation and review, not collaboration.

Wu is honest about the limits. He embraces the concept of "jagged intelligence" -- agents are dramatically better than humans at some tasks and dramatically worse at others. Devin evolved from "a high school CS student" at launch to "a college intern" to "a junior engineer" over the course of a year. The trajectory matters more than the current state.

"People didn't really believe that an agent was possible. Right. And it was, I mean, it was a very different time."
Scott Wu
▶ 00:07:02

Cognition's own metrics tell the story: about a quarter of their production PRs are authored by Devin today, and they expect this to exceed 50% by year end. But Wu identified a crucial pattern in customer success: companies that thrive with Devin follow the same onboarding process they would use for a new human engineer. Start with small tasks, let the agent get familiar with the codebase, set up the environment, then gradually increase scope. Customers who failed gave Devin massive re-architecture tasks on day one -- just as a new hire would fail at those.

Agents at Scale -- Proof the Business Model Works

Eoghan McCabe

Eoghan McCabe, founder and CEO of Intercom, provides the most compelling existence proof. McCabe founded Intercom 14 years ago, grew it to a multi-billion-dollar valuation, then watched it approach negative growth before transforming it into the market-leading AI agent company. Fin, their customer support agent, is growing at 300%+ year over year and, by Intercom's account, ranks first by customer count, revenue, and benchmarks.

"Fin is our AI agent who will pass 100 million ARR with Fin in less than three quarters."
Eoghan McCabe
▶ 00:00:15

McCabe's most revealing insight is about pricing. Intercom historically had one of the most hated pricing models in SaaS -- complex, opaque, and widely mocked in viral memes. For Fin, they made a deliberate philosophical choice: charge $0.99 per resolution so that revenue was "100% aligned with the value that they attained." Early on, each resolution cost Intercom $1.20 to serve, meaning they lost money on every transaction. McCabe bet that inference costs would drop. They did.

"We charge 99 cents to resolve tickets, customer problems, and we have a higher resolution rate than anyone else... It is the metric by which these agents are assessed, and we wanted our revenue to be 100% aligned with the value that they attained."
Eoghan McCabe
▶ 00:23:33

The $0.99 price was strategically chosen. Their research showed businesses typically spent $20-30 per human-resolved ticket. But McCabe sensed customers would "not value the digital work as much as the human work, even though the digital work is better." Sub-dollar pricing removed the objection entirely: if you will not pay 99 cents for an instant, expert resolution available 24/7, there is no business here.
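The arithmetic behind that bet can be laid out in a few lines. The sketch below uses only the figures quoted above ($0.99 price, $1.20 early cost to serve, $20-30 per human-resolved ticket); the function names are illustrative and not anything Intercom has published.

```python
# Figures quoted in the article; function names are hypothetical.
PRICE_PER_RESOLUTION = 0.99        # what Intercom charges per resolved ticket
EARLY_COST_PER_RESOLUTION = 1.20   # early cost to serve one resolution
HUMAN_COST_RANGE = (20.0, 30.0)    # typical spend per human-resolved ticket


def margin_per_resolution(price: float, cost: float) -> float:
    """Vendor profit on one resolved ticket (negative means a loss)."""
    return round(price - cost, 2)


def cost_drop_to_break_even(price: float, cost: float) -> float:
    """Fraction by which serving cost must fall before the vendor breaks even."""
    return round(max(0.0, (cost - price) / cost), 4)


def customer_savings_range(price: float, human_cost_range: tuple) -> tuple:
    """Customer savings per ticket versus a human resolution (low, high)."""
    low, high = human_cost_range
    return (round(low - price, 2), round(high - price, 2))


margin = margin_per_resolution(PRICE_PER_RESOLUTION, EARLY_COST_PER_RESOLUTION)
needed_drop = cost_drop_to_break_even(PRICE_PER_RESOLUTION, EARLY_COST_PER_RESOLUTION)
savings = customer_savings_range(PRICE_PER_RESOLUTION, HUMAN_COST_RANGE)

print(margin)       # -0.21: a 21-cent loss on every early resolution
print(needed_drop)  # 0.175: cost to serve had to fall 17.5% to break even
print(savings)      # (19.01, 29.01): customer savings per ticket
```

On these numbers the vendor needed only a 17.5% drop in serving cost to break even, while the customer was already saving roughly $19 to $29 per ticket, which is why the bet on falling inference costs looks less reckless than the headline loss suggests.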

McCabe also challenged his own history. Intercom's original mission was "make internet business personal." Now they sell an AI agent. His resolution: "Providing a customer with a highly engaged, instantly available expert, consistent, fast, charismatic, funny, friendly, personal agent available for literally every single customer every minute of the day around the clock is so much more personal than making them wait 2, 3, 4 days for a crappy canned response."

When This Applies

This applies regardless of your starting point. Build observability, audit trails, and confidence scores that let users gradually increase the AI's autonomy as they gain trust. The products that successfully navigate this transition -- from human-in-the-loop to human-on-the-loop to human-over-the-loop -- will capture the most value.
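One way to picture that transition is a gate that sits between the agent and any action it wants to take. The sketch below is a minimal illustration, not a description of any product mentioned here: the `AutonomyGate` class, the 0.9 threshold, and the exact meaning assigned to each oversight mode are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from enum import Enum


class Oversight(Enum):
    # Assumed interpretations of the three stages named in the text.
    IN_THE_LOOP = "human approves every action"
    ON_THE_LOOP = "human approves only low-confidence actions"
    OVER_THE_LOOP = "human reviews the audit trail afterwards"


@dataclass
class AutonomyGate:
    mode: Oversight = Oversight.IN_THE_LOOP
    confidence_threshold: float = 0.9  # hypothetical default
    audit_log: list = field(default_factory=list)

    def decide(self, action: str, confidence: float) -> str:
        """Return 'execute' or 'needs_approval' and record the decision."""
        if self.mode is Oversight.IN_THE_LOOP:
            decision = "needs_approval"
        elif self.mode is Oversight.ON_THE_LOOP:
            above = confidence >= self.confidence_threshold
            decision = "execute" if above else "needs_approval"
        else:
            decision = "execute"
        # Every decision is logged, whichever mode is active.
        self.audit_log.append(
            {"action": action, "confidence": confidence,
             "mode": self.mode.name, "decision": decision}
        )
        return decision


gate = AutonomyGate()
first = gate.decide("refund order", 0.95)    # needs_approval
gate.mode = Oversight.ON_THE_LOOP            # user has gained some trust
second = gate.decide("refund order", 0.95)   # execute
third = gate.decide("close account", 0.60)   # needs_approval
gate.mode = Oversight.OVER_THE_LOOP          # user reviews after the fact
fourth = gate.decide("close account", 0.60)  # execute
```

The design choice worth noting is that the audit log is written in every mode. Autonomy changes who acts first, but the record that lets a user build (or withdraw) trust is always there.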

Both Models Will Coexist on a Spectrum

Aparna Chennapragada

Aparna Chennapragada, chief product officer at Microsoft, offers the most analytical framework. Previously CPO at Robinhood and VP at Google (Google Lens, Search, AI Assistant), she brings experience across consumer and enterprise AI at massive scale.

Rather than picking a side, she defines agents along three dimensions: autonomy (how much you can delegate), complexity (multi-step vs. one-shot tasks), and natural interaction (conversational rather than GUI-based). This reframes the debate from a binary to a continuous spectrum.

"When I think about agents, I think about three things. One, it's autonomy... And it's a spectrum, it's not a zero-one. Second, I think of as complexity. It's not a one-shot, 'Hey, summarize this document,' but it's 'build me this prototype.' And then the third one I think of is it's a much more natural interaction."
Aparna Chennapragada
▶ 00:17:10

Her position on pricing is equally nuanced. She identifies three coexisting models -- seat-based, usage-based, and outcome-based -- and argues at least two should coexist.

"We've just barely scratched the surface of whether you do seat monetization, usage like on tap, and then of course outcome-based stuff... So all three to me are kind of like, great, but at least two out of three should coexist."
Aparna Chennapragada
▶ 00:43:24

She also introduces a concept no one else raises: NLX -- Natural Language Experience as the new UX. Agent products are not just "chat with AI." Conversations have grammars, structures, and invisible UI elements. Plans, progress indicators, follow-ups, and editable workflows all need to be designed as deliberately as any GUI. The companies that master NLX design will build durable advantages regardless of where their products sit on the copilot-to-agent spectrum.

When This Applies

This applies when your product serves users at different skill levels or in different contexts (Microsoft's dual bet on Copilot and agents). Power users may want agent-level automation while new users need copilot-level guidance. The same logic extends to pricing: as Chennapragada suggests, at least two of the three models (seat-based, usage-based, outcome-based) should coexist across tiers.
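To make the three pricing models concrete, the sketch below bills the same hypothetical customer under each one, before and after the AI's resolution count improves. Only the $0.99 per-outcome price comes from the article (it is the Fin price); the seat price, the per-request price, and the customer's volumes are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    per_seat: float = 30.00     # hypothetical monthly seat price
    per_request: float = 0.05   # hypothetical metered price per AI request
    per_outcome: float = 0.99   # per resolved ticket, the Fin price cited earlier


def monthly_bill(rates: Rates, seats: int, requests: int, resolved: int) -> dict:
    """Bill one customer for one month under each of the three models."""
    return {
        "seat": round(seats * rates.per_seat, 2),
        "usage": round(requests * rates.per_request, 2),
        "outcome": round(resolved * rates.per_outcome, 2),
    }


# Same customer, same traffic; only the number of resolved tickets changes.
before = monthly_bill(Rates(), seats=10, requests=2000, resolved=800)
after = monthly_bill(Rates(), seats=10, requests=2000, resolved=1200)

print(before)  # {'seat': 300.0, 'usage': 100.0, 'outcome': 792.0}
print(after)   # {'seat': 300.0, 'usage': 100.0, 'outcome': 1188.0}
```

In this toy example only the outcome-based line moves when the AI resolves more tickets, which is the alignment Taylor and McCabe argue for. The seat and usage lines stay flat, which is also why they give a vendor predictable revenue and a reason to keep at least one of them in the mix.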

Evidence from the Archive

Intercom

Fin: $100M ARR trajectory in <3 quarters, 300%+ YoY growth, #1 by customer count, revenue, and benchmarks

Early Fin economics: $0.99 revenue vs. $1.20 cost per resolution -- a deliberate bet on declining inference costs

Eoghan McCabe · Founder and CEO ▶ 00:00:15

Cognition (Devin)

Cognition's 15-person team: each engineer manages up to 5 Devins, ~25% of production PRs are agent-authored, expected to exceed 50% by year end

Devin integrates with Slack, Linear, and GitHub -- the same tools human engineers use

Scott Wu · Co-founder and CEO ▶ 00:06:10

Microsoft

Cursor hitting $300M ARR in 2 years -- a competitive challenge to Microsoft's Copilot

Microsoft Copilot across Office products (seat-based augmentation) alongside agent research projects

Aparna Chennapragada · Chief Product Officer ▶ 00:43:24

Andreessen Horowitz (a16z)

Linus Torvalds acknowledging AI now codes better than the world's best programmers (holiday break 2025)

Executive/secretary email example: tasks shifted but both jobs persisted with different task bundles

Marc Andreessen · Co-founder and General Partner

GitHub

GitHub Copilot: 1.5M+ developers, 37,000+ organizations, 55% faster coding, 85% more confident in code quality

Shopify: over 1 million lines of code written by Copilot in their codebase

Inbal Shani · Chief Product Officer ▶ 00:08:02

Sierra / OpenAI

Sierra's AI agents handle customer interactions end-to-end with outcome-based pricing

Quip (Taylor's productivity startup, sold to Salesforce for $750M) is cited as evidence of how hard it is to monetize productivity tools

Bret Taylor · Co-founder & CEO of Sierra; Board Chair, OpenAI ▶ 00:00:07

The Synthesis

The copilot-vs-agent debate is really a debate about three things: task measurability, error tolerance, and business model.

Task Measurability

What three variables actually determine whether to build a copilot or an agent?

Task measurability, error tolerance, and business model. Outcome-based pricing creates structurally superior business dynamics, but it requires clearly measurable and attributable AI impact.

Trust Curve

Why does the transition follow trust, not capability?

The copilot-to-agent transition follows a trust curve, not a capability curve. New AI capabilities start as copilots with high oversight and graduate to agents as users build confidence. GitHub Copilot started as code completion and is progressively becoming more agentic. The most successful customers onboard agents like junior engineers: incrementally, through small wins.

Institutional Resistance

Why will copilots persist even where agents are technically superior?

Regulated industries, unionized workforces, and bureaucratic organizations will mandate human-in-the-loop for legal and political reasons long after the technical case for autonomy is settled. ChatGPT may be a better doctor than your doctor today, but it cannot get a license to practice medicine.

Task Unbundling

Is the real unit of analysis the product or the task?

The real unit of analysis is the task, not the product. The copilot-vs-agent choice is about which tasks to shift from human to AI. The job persists longer than the individual tasks. Ten years from now, the job title might simply be "I build products" -- encompassing what we currently separate into PM, engineering, and design.

Which Approach Fits You?

Three questions about your situation point toward the right approach.

Question 1

How measurable and attributable are the outcomes of your AI product?

Question 2

What is the error tolerance in your domain?

Question 3

How do you want to price your AI product?

The Bottom Line

Marc Andreessen, co-founder and general partner at Andreessen Horowitz, adds a structural reason copilots may persist even where agents are technically superior: institutional resistance. "ChatGPT is almost certainly a better doctor than your doctor today, but ChatGPT can't get a license to practice medicine." Regulated industries, unionized workforces, and bureaucratic organizations will mandate human-in-the-loop for legal and political reasons long after the technical case for autonomy is settled.

The non-obvious insight: the copilot-to-agent transition follows a trust curve, not a capability curve. New AI capabilities start as copilots (low trust, high oversight) and graduate to agents (high trust, low oversight) as users build confidence. GitHub Copilot started as code completion and is progressively becoming more agentic. Intercom had a working Fin prototype six weeks after GPT-3.5, but pricing it at $0.99 per resolution was a trust bet that took months of conviction. Scott Wu's most successful customers onboard Devin the same way they would onboard a junior engineer: incrementally, building trust through small wins.

Sources

Bret Taylor — "He saved OpenAI, invented the “Like” button, and built Google Maps: Bret Taylor on the future of careers, coding, agents, and more" — Lenny's Podcast, July 31, 2025
Inbal Shani — "The future of AI in software development | Inbal Shani (CPO of GitHub)" — Lenny's Podcast, December 1, 2023
Scott Wu — "How Devin replaces your junior engineers with infinite AI interns that never sleep | Scott Wu (Cognition CEO)" — Lenny's Podcast, September 8, 2025
Eoghan McCabe — "How Intercom rose from the ashes by betting everything on AI | Eoghan McCabe (founder and CEO)" — Lenny's Podcast, August 21, 2025
Aparna Chennapragada — "Microsoft CPO: If you aren’t prototyping with AI, you’re doing it wrong | Aparna Chennapragada" — Lenny's Podcast, May 18, 2025