On distributed systems, reliability engineering, and what LLM agents are actually good for in infrastructure.
The dependency graph you draw from RPC traffic is real but incomplete. The dependencies that take you down are…
When you fuse many noisy event streams into one model, the hard problems are not throughput; they are identity and…
A pattern worth reusing: when an LLM makes a judgment at scale, do not let it grade itself. Pair a cheap predictor…
Most platforms that reason about a system are really stacks of queries against a model of that system. The recurring…
Two ways to feed a model context: retrieve fuzzy passages by similarity, or call purpose-built tools that return…
Before you wire an LLM into a real workflow, decide how you will know it is good enough - because the eval is the…
A separate model can score outputs you cannot label by hand, but only if you treat it like a measuring instrument: a…
Most items in a large workload are easy, and a few are genuinely hard. The cheapest reliable pipeline is the one that…
When an LLM is the caller, the interface is the prompt. Typed responses, idempotent writes, granular composable…
A correct model or a sharp analysis is necessary but not sufficient. What gets internal AI actually used is the last…
Reliability effort gets spent on whatever feels scary in the room. Here is how to replace that gut feel with a…
A live broadcast turns one source into millions of simultaneous viewers in seconds. The hard part is not the video -…
In 2018 I helped build an RNN model that turned English questions into SQL, trained on WikiSQL and later published…