some cached thoughts on my explorations

teaching a model to manage kv-cache memory

Apr 26, 2026

building an rl environment to learn how kv-cache eviction works in llm serving systems

speeding up diffusion models with first block caching

Aug 13, 2025

how to speed up diffusion inference with minimal quality loss using first block caching