Welcome

I'm Anjali, a monetization Data Scientist working my way through the realities of inference economics.

What this project isn't:

  • speculation about an impending "AI bubble"
  • analysis of macroeconomic impacts of AI
  • strategy or advice for VCs

Rather, this is my attempt to describe how AI challenges SaaS economies of scale and to develop an opinion on how software pricing should evolve.


Essays

  1. A token is not a stable unit of cost (Google Doc)
    Tim O'Reilly Podcast DSP Podcast Tweet
    TL;DR
    The Nth token in a conversation is an order of magnitude more expensive than the first
    Variable costs destabilize per-token API pricing, as Cursor's and Anthropic's pricing changes indicate
    Heterogeneous usage leads to fat-tailed risks that compound with usage growth
    Tail risk shows up as lower margins, service degradation and, in extreme cases, system outages
  2. Why fat tails emerge at scale (Google Doc)
    TL;DR
    A few long-context requests can explode memory use since each carries its own non-shareable KV cache
    Because real-world traffic has near-infinite variance, the mean cost per request doesn’t converge
    Without live telemetry on cache pressure, batching logic and prefill/decode (P/D) ratios, providers can’t optimize unit economics
    The emerging pricing fix is priority contracts that turn unpredictable per-request costs into a forecastable aggregate-capacity problem
  3. Pricing experiments under infinite variance (WIP)
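
A rough sketch of the memory arithmetic behind the first two TL;DRs, under loud assumptions: the model dimensions below are hypothetical and not taken from the essays or from any real model, and the names ModelShape, kv_bytes_per_token and kv_cache_bytes are made up for illustration. It covers only KV-cache size, which is one part of what makes a late token costlier than an early one, not the full serving cost.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelShape:
    """Hypothetical transformer dimensions (illustrative, not a real model)."""
    layers: int = 32
    kv_heads: int = 8
    head_dim: int = 128
    bytes_per_value: int = 2  # assumes fp16/bf16 cache entries


def kv_bytes_per_token(m: ModelShape) -> int:
    # Each token stores one key and one value vector per layer per KV head.
    return 2 * m.layers * m.kv_heads * m.head_dim * m.bytes_per_value


def kv_cache_bytes(m: ModelShape, context_tokens: int) -> int:
    # The cache grows linearly with the tokens already in the conversation,
    # and decoding the next token has to read all of it.
    return kv_bytes_per_token(m) * context_tokens


if __name__ == "__main__":
    shape = ModelShape()
    for tokens in (1_000, 10_000, 100_000):
        gib = kv_cache_bytes(shape, tokens) / 2**30
        print(f"{tokens:>7} tokens of context: {gib:.2f} GiB of KV cache")
```

With these made-up dimensions each token holds 128 KiB of cache, so a single 100,000-token conversation pins roughly 12 GiB that no other request can reuse, which is why a handful of long-context requests can dominate memory.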

Diagrams


What else I'm thinking about:


I'm at anjali.shrivastava99@gmail.com and on Twitter at @anjali_shriva.

View the old version of this site at /vintage