mirror of https://github.com/deepseek-ai/open-infra-index.git (synced 2025-04-03 16:54:04 +00:00)
update day 1 - FlashMLA
parent 1010047513
commit 006cdcf4e2
1 changed file with 12 additions and 3 deletions
README.md (15 lines changed)
@@ -12,18 +12,27 @@
 ## 202502 Open-Source Week
 We're a tiny team @deepseek-ai pushing our limits in AGI exploration.
 
-Starting **next week**, we'll open-source 5 repos – one daily drop – not because we've made grand claims,
+Starting **this week** , Feb 24, 2025 we'll open-source 5 repos – one daily drop – not because we've made grand claims,
 but simply as developers sharing our small-but-sincere progress with full transparency.
 
 These are humble building blocks of our online service: documented, deployed and battle-tested in production.
-No vaporware, just code that moved our tiny moonshot forward.
+No vaporware, just sincere code that moved our tiny yet ambitious dream forward.
 
 Why? Because every line shared becomes collective momentum that accelerates the journey.
 Daily unlocks begin soon. No ivory towers - just pure garage-energy and community-driven innovation 🔧
 
 Stay tuned – let's geek out in the open together.
 
-### Day0: ???
+### Day 1 - FlashMLA
+**Efficient MLA Decoding Kernel for Hopper GPUs**
+Optimized for variable-length sequences, battle-tested in production
+
+🔗 <a href="https://github.com/deepseek-ai/FlashMLA"><b>GitHub Repo</b></a>
+✅ BF16 support
+✅ Paged KV cache (block size 64)
+⚡ Performance: 3000 GB/s memory-bound | BF16 580 TFLOPS compute-bound on H800
+
+### Ongoing Releases...
 
 ## 2024 AI Infrastructure Paper (SC24)
 ### Fire-Flyer AI-HPC: A Cost-Effective Software-Hardware Co-Design for Deep Learning