Coinbase CEO Brian Armstrong described how the company halved its AI spend even as token usage grew "exponentially". According to him, the key tactics include defaulting to cheaper open-weight models such as GLM 5.2 and Kimi 2.7 via an LLM gateway, using AI-driven prompt preprocessing to select the best-suited model, keeping context lean, and raising cache hit rates from 5% to 60% in tools like LibreChat.
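To get a rough feel for why the cache hit rate matters, here is a minimal sketch of the arithmetic. Only the 5% and 60% hit rates come from the statement above. The function name `blended_input_cost` and the assumption that cached input tokens are billed at 10% of the normal price are hypothetical placeholders, not Coinbase's figures or any specific provider's pricing.

```python
def blended_input_cost(hit_rate: float, cached_price_ratio: float = 0.1) -> float:
    """Relative cost per input token; 1.0 means every token is billed at full price.

    cached_price_ratio is an ASSUMED discount (cached tokens cost 10% of the
    normal price). Real discounts vary by provider and model.
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    # Uncached share pays full price, cached share pays the discounted price.
    return (1.0 - hit_rate) * 1.0 + hit_rate * cached_price_ratio


before = blended_input_cost(0.05)  # 5% cache hit rate
after = blended_input_cost(0.60)   # 60% cache hit rate
print(f"before: {before:.3f}, after: {after:.3f}, ratio: {after / before:.3f}")
```

Under that assumed discount, the relative input cost drops from about 0.955 to about 0.46 per token. The result is sensitive to the discount chosen and covers input tokens only, so it illustrates the mechanism and does not reconstruct the reported savings.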