Huang Panpan 2025-02-26 11:34:01 +08:00 committed by hpp
parent b4998cec46
commit 92b859fe9f

@@ -48,20 +48,15 @@ Excited to introduce **DeepEP** - the first open-source EP communication library
Introducing DeepGEMM - an FP8 GEMM library that supports both dense and MoE GEMMs, powering V3/R1 training and inference.
⚡ Up to 1350+ FP8 TFLOPS on Hopper GPUs
✅ No heavy dependency, as clean as a tutorial
✅ Fully Just-In-Time compiled
✅ Core logic at ~300 lines - yet outperforms expert-tuned kernels across most matrix sizes
✅ Supports dense layout and two MoE layouts
🔗 GitHub: https://github.com/deepseek-ai/DeepGEMM
### Ongoing Releases...
## 2024 AI Infrastructure Paper (SC24)