Running Llama2 on 8 GPUs with Triton without tensor parallelism
Running Llama2 on 8 GPUs with Triton Without Tensor Parallelism: The need for efficient model deployment has never been more critical, especially with the rise of...
3 min read 23-09-2024 12
How to debug Triton Python, especially Triton-JIT compiler passes?
How to Debug Triton Python, Focusing on Triton JIT Compiler Passes: Debugging code can often be a daunting task, especially when dealing with specialized environme...
2 min read 14-09-2024 73
Build Triton under virtualenv explode memory
Triton Installation Under Virtualenv, Memory Explosions and Solutions: Building Triton, a powerful inference server, can sometimes lead to memory explosions, particu...
2 min read 13-09-2024 18
pip install deepspeed ERROR: error: subprocess-exited-with-error/error: metadata-generation-failed
Conquering "ERROR: error: subprocess-exited-with-error / error: metadata-generation-failed" During DeepSpeed Installation: Are you encountering the frustrating ERR...
2 min read 03-09-2024 12