PydanticAI vs LangChain - Choosing an Agent Framework for Production, Not Demos
When you're choosing an agent framework, popularity is the wrong scorecard. Pick the one that fails loudly in development and gracefully in production - or you'll find out during an audit.
Why Architecture Reviews Need Contracts, Not Chat
This article shows how to build a multi-agent architecture reviewer that produces a structured review artifact: normalized findings with severity, evidence, and recommendations, plus clarifying questions and explicit “needs human judgment” flags.
uv for Faster Teams, Fewer Environment Fires
Python dependency management is not a developer problem. It is a team productivity problem that shows up as slow CI, painful onboarding, and a different tool in every repository.
The 3am Pager - A Scrappy LLM Cost Monitor with Python and ntfy.sh
This article is for anyone building an LLM product who wants to know what it actually costs - per feature, per user, per request - without paying for Datadog or wiring up OpenTelemetry.
Your RAG Pipeline Is Lying to You
We will walk through a concrete RAG example - a pipeline over a corporate annual report - and build the testing layer that most teams skip entirely. The code is real and runnable. The failures are not hypothetical.