June 25, 2026 · AI
Speeding up LLM inference with MTP and diffusion
MTP and diffusion inference on Gemma 4 and Qwen 3.6, fp8 on one H100
All Posts
June 25, 2026 · AI
MTP and diffusion inference on Gemma 4 and Qwen 3.6, fp8 on one H100
May 27, 2026 · AI
Why spec-driven development is a requirements-inference architecture
May 27, 2026 · AI
Why skills and harnesses overlap in implementation
May 26, 2026 · AI
How harness design determines whether agents actually adapt
May 22, 2026 · AI
Why agent memory moved from RAG storage to the foundation of policy adaptation
May 22, 2026 · AI
Weights, prompts, and code as parameters at different layers of a learnable policy space
May 21, 2026 · AI
Source-level comparison of RAG-Anything, ApeRAG, and EdgeQuake as LightRAG derivatives
May 21, 2026 · DB
Distributed processing (Ray/Daft/Smallpond) and Citus + pg_lake FDW integration
May 21, 2026 · DB
Eight phases building a PostgreSQL-centered Lakehouse: DuckLake's libpq collision and pg_lake
May 21, 2026 · DB
Why Arrow+GPU achieves 10x at SF=100 and Heap+GPU loses to CPU on wide tables