Mirror of https://github.com/deepseek-ai/open-infra-index.git, synced 2025-04-02 16:44:02 +00:00
update day 2 - DeepEP
This commit is contained in:
parent 006cdcf4e2, commit 35446186f6
1 changed file with 14 additions and 2 deletions: README.md
@@ -23,15 +23,27 @@ Daily unlocks begin soon. No ivory towers - just pure garage-energy and communit

Stay tuned – let's geek out in the open together.
### Day 1 - [FlashMLA](https://github.com/deepseek-ai/FlashMLA)

**Efficient MLA Decoding Kernel for Hopper GPUs**

Optimized for variable-length sequences, battle-tested in production
🔗 <a href="https://github.com/deepseek-ai/FlashMLA"><b>FlashMLA GitHub Repo</b></a>

✅ BF16 support

✅ Paged KV cache (block size 64)

⚡ Performance: 3000 GB/s memory-bound | BF16 580 TFLOPS compute-bound on H800
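The paged KV cache bullet above can be illustrated with a minimal sketch of how block-based addressing works. This is not FlashMLA's API; the `slot_for_token` helper and the specific physical block ids are illustrative assumptions — only the block size of 64 comes from the announcement.

```python
# Hypothetical sketch of paged KV cache addressing (not FlashMLA's API):
# a sequence's cache lives in fixed-size blocks scattered across GPU memory,
# and a per-sequence block table maps logical positions to physical slots.

BLOCK_SIZE = 64  # FlashMLA's paged KV cache uses 64-token blocks

def slot_for_token(block_table, pos):
    """Map a logical token position to a physical (block id, offset) slot.

    block_table: per-sequence list of physical block ids, in allocation order.
    pos: logical token index within the sequence.
    """
    return block_table[pos // BLOCK_SIZE], pos % BLOCK_SIZE

# A 130-token sequence needs 3 blocks; suppose the allocator handed out
# non-contiguous physical blocks 7, 2 and 9.
table = [7, 2, 9]
print(slot_for_token(table, 0))    # → (7, 0): first token, first block
print(slot_for_token(table, 64))   # → (2, 0): token 64 starts block two
print(slot_for_token(table, 129))  # → (9, 1): last token, third block
```

Because lookups only touch the block table, sequences of very different lengths can share one memory pool without fragmentation — the property that makes variable-length decoding batches practical.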
### Day 2 - [DeepEP](https://github.com/deepseek-ai/DeepEP)

Excited to introduce **DeepEP** - the first open-source EP communication library for MoE model training and inference.

🔗 <a href="https://github.com/deepseek-ai/DeepEP"><b>DeepEP GitHub Repo</b></a>
✅ Efficient and optimized all-to-all communication

✅ Both intranode and internode support with NVLink and RDMA

✅ High-throughput kernels for training and inference prefilling

✅ Low-latency kernels for inference decoding

✅ Native FP8 dispatch support

✅ Flexible GPU resource control for computation-communication overlapping
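The all-to-all dispatch that DeepEP optimizes can be sketched in plain Python. This is a conceptual model, not DeepEP's API: the `dispatch_counts` helper, the top-1 routing simplification, and the even expert sharding are all assumptions made for illustration.

```python
# Hypothetical sketch of the bookkeeping behind expert-parallel (EP) dispatch
# (not DeepEP's API): each rank counts how many of its tokens are routed to
# experts hosted on every other rank. Exchanging these counts is what sizes
# the all-to-all send/receive buffers before tokens actually move.

def dispatch_counts(expert_ids, num_experts, num_ranks):
    """Return per-destination-rank token counts for one rank's local tokens.

    expert_ids: routed expert id for each local token (top-1 for simplicity).
    Experts are assumed sharded evenly, so expert e lives on rank
    e // (num_experts // num_ranks).
    """
    experts_per_rank = num_experts // num_ranks
    counts = [0] * num_ranks
    for e in expert_ids:
        counts[e // experts_per_rank] += 1
    return counts

# 8 experts over 4 ranks (2 experts per rank); 6 local tokens routed as below.
print(dispatch_counts([0, 3, 3, 5, 7, 1], num_experts=8, num_ranks=4))
# → [2, 2, 1, 1]
```

In a real MoE layer this exchange runs over NVLink within a node and RDMA across nodes, which is why a library that overlaps it with computation matters for both prefill throughput and decode latency.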
### Ongoing Releases...

## 2024 AI Infrastructure Paper (SC24)