Artificial intelligence is not constrained by model capability. It is constrained by economics.

In Christian Catalini’s new paper, “Some Simple Economics of AGI”, the ceiling of AI is defined not by model capability but by two racing cost curves: the cost to automate and the cost to verify. Automation happens only when outputs can be validated at reasonable cost and at scale.

As frontier capabilities expand and enterprises accelerate adoption, the central question is no longer “Can AI do this work?” but “Can the value of this work be proven?” In vertical AI, markets with greater verifiability command a premium over those without it.

Expansive Economics of AI 

Four years after ChatGPT’s introduction, businesses are in an AI arms race. Vertical AI startups are proliferating across sectors, including legal, finance, real estate, insurance, healthcare, construction, home services, and many others.

The goal of the “Some Simple Economics of AGI” research paper is to identify the boundary of what AI can accomplish. Its conclusion is that the real boundary lies along two cost curves: (1) the Cost to Automate and (2) the Cost to Verify.

The Cost to Automate curve is falling rapidly, driven by improvements in compute efficiency, model architectures, and accumulated training data. Benchmarks such as SWE-bench show dramatic year-over-year gains in problem-solving accuracy, and the task horizons that frontier systems can autonomously manage are expanding on sub-annual cycles. The key questions on this axis:

  • Is the task structurally legible?

  • Is performance objectively benchmarkable?

  • Does automation create a material cost advantage?

The Cost to Verify curve, by contrast, bends slowly, constrained by process structures, institutional capacity, and regulatory frameworks. Verification remains tethered to human attention and to the apprenticeship pipelines that produce expertise. The key questions on this axis:

  • Is there objective ground truth?

  • Are outcomes directly and causally attributable?

  • What level of precision is required (e.g., for institutional trust or regulation)?

Some Simple Economics of AGI

Catalini maps economic activities into four structural regimes along these two axes. Durable advantage belongs not to those who generate output but to those who can certify it. The “sweet spot” for startups is activity that is both highly automatable and verifiable.

  • Q1 Safe industrial zone: Structured, rules-based tasks with observable ground truth. Outputs are auditable and scale safely (e.g., coding, customer service).

  • Q2 Economic blind spot / Runaway risk zone: Automatable tasks, but outcomes are multi-causal, delayed, or judgment-dependent. Human oversight limits scale (e.g., sales qualification, marketing campaigns).

  • Q3 Human manual / Artisan zone: Hard-to-automate tasks with tacit knowledge, but outputs are relatively easy to evaluate. AI may assist but does not displace (e.g., design, field services).

  • Q4 Pure tacit zone: Subjective, long-horizon, or preference-driven tasks. Ground truth is diffuse; both automation and verification are difficult (e.g., modelling, management).
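The quadrant mapping can be sketched as a simple classification over the two axes. The scores, thresholds, and example placements below are illustrative assumptions for exposition, not values from the paper:

```python
# Illustrative sketch: classify a task into the four structural regimes
# using two hypothetical 0-1 scores. The 0.5 cutoffs are assumptions.

def classify(automation_ease: float, verification_ease: float) -> str:
    """Map a task to a regime based on how cheap it is to automate
    and how cheap it is to verify. Scores and cutoffs are illustrative."""
    easy_to_automate = automation_ease >= 0.5
    easy_to_verify = verification_ease >= 0.5
    if easy_to_automate and easy_to_verify:
        return "Q1: Safe industrial zone"
    if easy_to_automate:
        return "Q2: Economic blind spot / runaway risk zone"
    if easy_to_verify:
        return "Q3: Human manual / artisan zone"
    return "Q4: Pure tacit zone"

# Example placements (scores invented for illustration):
print(classify(0.9, 0.9))  # coding -> Q1
print(classify(0.8, 0.3))  # marketing campaigns -> Q2
print(classify(0.3, 0.8))  # field services -> Q3
print(classify(0.2, 0.2))  # management -> Q4
```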

Evaluating Vertical AIs (by Level of Verifiability)

For investors, this requires a fundamentally different lens for assessing market structure and value capture in vertical AI than the one applied to vertical SaaS over the last decade.

In the prior SaaS era, vertical software differentiated primarily through industry-specific logic and workflow customization. Two companies with $100M in ARR in different industries could command similar multiples if growth and margins were comparable.

In the new era of vertical AI, market structures can diverge meaningfully based on verifiability. The ability to attribute performance to AI systems will determine competitive positioning and long-term pricing power.

To assess this systematically, vertical AI markets can be evaluated along two axes: (1) Automation Feasibility and (2) Attribution Clarity.

Automation Feasibility: This dimension measures the technical and economic viability of replacing labour with machine execution.

  • What is the surface area of automatable workflows?

  • What is the volume and frequency of those workflows?

  • How technically difficult is automation within this domain?

Industries that are text-heavy, rules-based, and digitally native tend to score highly. Legal drafting, credit underwriting, compliance analysis, and claims processing represent domains where automation potential is substantial, because tasks are legible and benchmarkable.

Attribution Clarity: This dimension measures whether AI performance can be tied directly to auditable economic outcomes.

  • Are system outputs tied directly to measurable KPIs?

  • Can performance be isolated and causally linked to economic outcomes?

  • Is verification objective and repeatable?

Industries where AI outputs are directly tied to measurable financial outcomes tend to score highly on attribution clarity. In areas such as expense management, fraud detection, and booking services, AI output can be tied directly to economic value, letting organizations measure outcomes as core drivers of value creation.

For example, in multifamily housing, AI can sit directly inside revenue and cost flows, managing leasing inquiries, renewal pricing, maintenance triage, vendor dispatch, and delinquency management. These workflows are tied to occupancy, rent collection, staffing ratios, and expense control: verifiable outcomes with clear links to business success.

If response times improve and occupancy rises 150 basis points, the incremental revenue is observable. If maintenance routing reduces technician overtime, the savings show up in payroll. If renewal pricing algorithms lift effective rents, the delta can be audited across the portfolio.
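To make the attribution arithmetic concrete, here is a minimal sketch of the occupancy example. The portfolio size, rent level, and lift are invented for illustration; only the 150 basis point figure comes from the text:

```python
# Hypothetical multifamily portfolio: quantify a 150 bps occupancy lift.
units = 1_000             # assumed portfolio size
avg_monthly_rent = 1_800  # assumed effective rent per unit

occupancy_lift_bps = 150  # 150 basis points = 1.5 percentage points
extra_occupied_units = units * occupancy_lift_bps / 10_000
incremental_annual_revenue = extra_occupied_units * avg_monthly_rent * 12

print(f"{extra_occupied_units:.0f} additional occupied units")
print(f"${incremental_annual_revenue:,.0f} incremental annual revenue")
# -> 15 additional occupied units
# -> $324,000 incremental annual revenue
```

Because every term in the calculation is an audited operating metric (unit count, effective rent, occupancy), the delta is observable rather than inferred.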

In contrast, legal research or investment analysis may speed up drafting or diligence, but ultimate outcomes are multi-causal. Even when quality improves, isolating the economic impact is difficult: productivity gains may reduce billable time rather than increase revenue, and they cannot be directly linked to the outcomes clients ultimately care about.

This distinction does not render legal AI unattractive. It simply places it in a different structural quadrant. Vertical AI companies operating in markets with high outcome verifiability enjoy a structural advantage. When economic impact can be independently measured and audited, defensibility strengthens. When it cannot, competition tends to compress value capture over time.

Avoiding The Goodhart Trap

A common risk in AI evaluation, captured by Goodhart’s law (“when a measure becomes a target, it ceases to be a good measure”), arises when measurable KPIs are treated as objectives rather than proxies for value. As AI systems optimize for these metrics, they can inadvertently maximize the metric without improving the underlying economic outcome.

  • If underwriting approval speed is the focus, throughput may rise at the expense of portfolio health or default risk.

  • If customer engagement scores drive evaluation, systems optimize for interaction volume rather than true problem resolution.

  • If time spent per financial analysis becomes the target, AI may prioritize speed over the depth, accuracy, or strategic insight of the analysis.
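The underwriting bullet can be made concrete with a toy model in which the proxy (approval throughput) always rises while the audited outcome (portfolio profit) peaks and then falls. All numbers and the default-rate assumption are invented for illustration:

```python
# Toy Goodhart sketch: optimizing a proxy (approval rate / throughput)
# can hurt the true outcome (portfolio profit). All figures are assumptions.

def portfolio_profit(approval_rate: float) -> float:
    """Assume marginal approvals are riskier, so the default rate
    rises with the approval rate (an illustrative assumption)."""
    loans = 1_000 * approval_rate          # throughput rises with the proxy
    default_rate = 0.02 + 0.10 * approval_rate
    revenue_per_loan = 500
    loss_per_default = 4_000
    return loans * (revenue_per_loan - default_rate * loss_per_default)

# Throughput grows monotonically, but the audited outcome does not:
for rate in (0.5, 0.7, 0.9):
    print(f"approval rate {rate:.0%}: profit {portfolio_profit(rate):,.0f}")
```

Under these assumptions, pushing the approval rate from 50% to 90% nearly doubles throughput while portfolio profit falls by roughly half, which is exactly the divergence an outcome-based audit would catch and a proxy metric would hide.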

For investors, the key lens is whether AI is optimized for verifiable economic outcomes, not just measurable proxies. High attribution clarity signals that a system influences real value creation (revenue, margin, or risk reduction) rather than simply improving surface-level metrics.

The Test 

  1. Is the software embedded in the transaction (inflow and/or outflow)? If yes, the system sits inside the flow of money and directly drives revenue and cost outcomes.

  2. Can performance be independently verified? If yes, AI outputs can be audited against the customer’s measurable KPIs, creating defensibility and pricing power.

  3. Are the AI-driven processes consistent and repeatable across customers and over time? If yes, the system reliably delivers effective outcomes across similar workflows or scenarios, demonstrating leverage through scalability.
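The three questions above amount to a simple checklist; the rule that a company must answer yes to all three to “pass” is an illustrative assumption:

```python
# Illustrative sketch of the three-part test. The questions come from
# the text; the all-three-yes pass rule is an assumption.

def passes_test(embedded_in_transaction: bool,
                independently_verifiable: bool,
                repeatable_across_customers: bool) -> bool:
    """A company passes only if every answer is yes."""
    return all([embedded_in_transaction,
                independently_verifiable,
                repeatable_across_customers])

print(passes_test(True, True, True))   # True
print(passes_test(True, False, True))  # False: output not verifiable
```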

Conclusion

Markets differ, and the extent to which AI can drive verifiable value will differ across verticals. For investors, over the next 12-24 months, the question will increasingly shift from whether AI can perform tasks to whether it can prove its value to the customer. Companies that pass this test build scalable, defensible businesses; those that do not will remain marginal tools facing immense pricing pressure and commoditization.
