DeepSpeed

DeepSpeed
Original author: Microsoft Research
Developer: Microsoft
Release: May 18, 2020
Stable release: v0.19.2 / June 16, 2026
Written in: Python, CUDA, C++
Type: Software library
License: Apache License 2.0
Website: deepspeed.ai
Repository: github.com/microsoft/DeepSpeed

DeepSpeed is an open-source optimization library for the distributed training and inference of deep learning models using PyTorch.[1]

Library

The library is designed to reduce computing power and memory use, and to train large distributed models with better parallelism on existing computer hardware.[2][3] DeepSpeed is optimized for low-latency, high-throughput training. It includes the Zero Redundancy Optimizer (ZeRO) for training models with 1 trillion or more parameters.[4] Features include mixed-precision training; single-GPU, multi-GPU, and multi-node training; and custom model parallelism. The DeepSpeed source code is licensed under the Apache License 2.0 and is available on GitHub.[5]
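As an illustration of how the features above are typically switched on, the following is a minimal sketch of a DeepSpeed-style configuration built in Python. The keys `train_batch_size`, `fp16`, and `zero_optimization` are DeepSpeed configuration keys; the helper name `build_deepspeed_config` and the default values are hypothetical and chosen only for this example. The sketch does not import DeepSpeed or train a model.

```python
import json


def build_deepspeed_config(batch_size=32, zero_stage=2, use_fp16=True):
    """Return a DeepSpeed-style configuration dictionary.

    Hypothetical helper for illustration; the default values are assumptions.
    """
    # ZeRO stages: 0 disables ZeRO; 1, 2, and 3 partition progressively
    # more training state (optimizer states, gradients, parameters).
    if zero_stage not in (0, 1, 2, 3):
        raise ValueError("ZeRO stage must be 0, 1, 2, or 3")
    return {
        "train_batch_size": batch_size,             # global batch size
        "fp16": {"enabled": use_fp16},              # mixed-precision training
        "zero_optimization": {"stage": zero_stage},
    }


config = build_deepspeed_config()

# Serialize to JSON, the format DeepSpeed configuration files use.
config_json = json.dumps(config, indent=2)
print(config_json)
```

A dictionary like this, or the path to a JSON file containing it, would typically be passed to `deepspeed.initialize` through its `config` argument, together with a PyTorch model and its parameters.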

The DeepSpeed team claimed up to a 6.2x throughput improvement, 2.8x faster convergence, and 4.6x less communication.[6]

References

Further reading

  • Rajbhandari, Samyam; Rasley, Jeff; Ruwase, Olatunji; He, Yuxiong (2019). "ZeRO: Memory Optimization Towards Training A Trillion Parameter Models". arXiv:1910.02054 [cs.LG].