diff --git a/README.md b/README.md
index 75b621c..a85e05c 100644
--- a/README.md
+++ b/README.md
@@ -12,18 +12,27 @@
 ## 202502 Open-Source Week
 We're a tiny team @deepseek-ai pushing our limits in AGI exploration.
-Starting **next week**, we'll open-source 5 repos – one daily drop – not because we've made grand claims,
+Starting **this week** , Feb 24, 2025 we'll open-source 5 repos – one daily drop – not because we've made grand claims,
 but simply as developers sharing our small-but-sincere progress with full transparency.
 These are humble building blocks of our online service: documented, deployed and battle-tested in production.
-No vaporware, just code that moved our tiny moonshot forward.
+No vaporware, just sincere code that moved our tiny yet ambitious dream forward.
 Why? Because every line shared becomes collective momentum that accelerates the journey.
 Daily unlocks begin soon. No ivory towers - just pure garage-energy and community-driven innovation 🔧
 Stay tuned – let's geek out in the open together.
-### Day0: ???
+### Day 1 - FlashMLA
+**Efficient MLA Decoding Kernel for Hopper GPUs**
+Optimized for variable-length sequences, battle-tested in production
+
+🔗 GitHub Repo
+✅ BF16 support
+✅ Paged KV cache (block size 64)
+⚡ Performance: 3000 GB/s memory-bound | BF16 580 TFLOPS compute-bound on H800
+
+### Ongoing Releases...
 ## 2024 AI Infrastructure Paper (SC24)
 ### Fire-Flyer AI-HPC: A Cost-Effective Software-Hardware Co-Design for Deep Learning