Anjali Shrivastava

Welcome

I'm Anjali, a monetization Data Scientist working my way through the realities of inference economics.

What this project isn't:

speculation about an impending "AI bubble"
analysis of macroeconomic impacts of AI
strategy or advice for VCs

Rather, this is my attempt to describe how AI challenges SaaS economies of scale and to develop an opinion on how software pricing should evolve.

Essays

A token is not a stable unit of cost (Google Doc)

Tim O'Reilly Podcast DSP Podcast Tweet
TL;DR
The Nth token in a conversation is an order of magnitude more expensive than the first
Variable costs destabilize per-token API pricing, as Cursor's and Anthropic's pricing changes indicate
Heterogeneous usage leads to fat-tailed risks that compound with usage growth
Tail risk shows up as lower margins, service degradation and, in extreme cases, system outages
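One simple way to see why a token's cost depends on where it sits in a conversation is to count how much context the model has to process at each turn. The sketch below is a toy illustration, not the essay's own model: it assumes every turn resends the full history with no prompt caching, and the turn count and turn length are hypothetical numbers chosen for readability.

```python
def tokens_processed_per_turn(turn_lengths):
    """Context size the model must process at each turn.

    Assumes the full conversation history is reprocessed on every turn
    (no prompt caching), which is a simplifying assumption.
    """
    processed = []
    history = 0
    for length in turn_lengths:
        history += length          # the new turn is appended to the history
        processed.append(history)  # the whole history is processed again
    return processed


# Hypothetical conversation: 20 turns of 500 tokens each.
turns = [500] * 20
per_turn = tokens_processed_per_turn(turns)

first, last = per_turn[0], per_turn[-1]
ratio = last / first
print(f"turn 1 processes {first} tokens, turn 20 processes {last} ({ratio:.0f}x)")
print(f"total tokens processed: {sum(per_turn)} for {sum(turns)} tokens of conversation")
```

Under these assumptions the work per turn grows linearly with conversation length, so the total grows quadratically even though each turn adds the same number of new tokens.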
Why fat tails emerge at scale (Google Doc)
TL;DR
A few long-context requests can explode memory use since each carries its own non-shareable KV cache
Because real-world traffic has near-infinite variance, the mean cost per request doesn’t converge
Without live telemetry on cache pressure, batching logic and prefill/decode (P/D) ratios, providers can’t optimize unit economics
The emerging pricing fix is priority contracts that turn unpredictable per-request costs into a forecastable aggregate-capacity problem
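The memory claim in the TL;DR above can be made concrete with the standard KV cache size formula for a transformer: keys plus values, per layer, per KV head, per token. This is a rough sketch; the model shape below (32 layers, 32 KV heads, head dimension 128, 16-bit values) is a hypothetical configuration picked for illustration, and the request mix is made up.

```python
def kv_cache_bytes(seq_len, n_layers, n_kv_heads, head_dim, bytes_per_value=2):
    """Approximate KV cache size for one request.

    The leading factor of 2 accounts for storing both keys and values.
    Ignores paging overhead, quantization and cross-request prefix sharing.
    """
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_value * seq_len


GIB = 2 ** 30

# Hypothetical model shape, not taken from the essay.
cfg = dict(n_layers=32, n_kv_heads=32, head_dim=128)

short = kv_cache_bytes(1_000, **cfg)    # a typical short request
long = kv_cache_bytes(100_000, **cfg)   # one long-context request

print(f"per token:          {kv_cache_bytes(1, **cfg) / 2**20:.2f} MiB")
print(f"1k-token request:   {short / GIB:.2f} GiB")
print(f"100k-token request: {long / GIB:.2f} GiB")
print(f"99 short requests:  {99 * short / GIB:.2f} GiB")
```

With these numbers, a single 100k-token request holds more cache memory than 99 short requests combined, which is the sense in which a few long-context requests dominate memory use.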
Opportunities for mechanism design (Suggestions welcome)

Diagrams

What else I'm thinking about:

causal inference to understand the relationship between workload characteristics and resource load
the impact of sequence length on costs, and what entropy reveals about it
credits as an evolutionary step towards value-based pricing and more experimentation

I'm at anjali.shrivastava99@gmail.com and on Twitter at @anjali_shriva.

View the old version of this site at /vintage