Latent Space
Jaesol Shin

Towards observable, reliable, scalable AI

GitHub

Categories

All Posts 30 Research 1 AI 21 Development 3 DB 5

Tags

AI 11 Benchmark 8 PostgreSQL 8 RAG 7 LLM 6 Agent 5 zsh 4 Claude Code 4 OpenAI 4 GraphRAG 4 LightRAG 4 Harness 4 Multi-account 3 Dotfiles 3 Configuration 3 Developer Workflow 3 API 3 vLLM 2 Productivity 2 GPT-5 2 DeepSeek 2 GraphDB 2 RCTE 2 Neo4j 2 Apache AGE 2 Recursive CTE 2 Memory 2 Skills 2

Archive

2026 30

#MTP

June 25, 2026 · AI

Speeding up LLM inference with MTP and diffusion

MTP and diffusion inference on Gemma 4 and Qwen 3.6, fp8 on one H100

#LLM #vLLM #MTP #Speculative Decoding
Jaesol Shin
