👋 Welcome

Hey! I’m Baptiste, a Machine Learning Engineer at Hugging Face 🤗 I made this blog to share my AI adventures and weird thoughts. Hope you find something useful here! 😵‍💫

🚀 My Work on Accelerating LLM Inference with TGI on Intel Gaudi

I’m thrilled to share my latest project: adding native Intel Gaudi hardware support directly into Text Generation Inference (TGI), the production-ready serving solution for Large Language Models (LLMs). This integration brings the power of Intel’s specialized AI accelerators to the high-performance inference stack, enabling more deployment options for the open-source AI community 🎉

✨ What I’ve Accomplished

I’ve fully integrated Gaudi support into TGI’s main codebase in PR #3091. Previously, we had to maintain a separate fork for Gaudi devices at tgi-gaudi. This was cumbersome for users and prevented supporting the latest TGI features at launch. By leveraging the new TGI multi-backend architecture, I’ve made it possible to support Gaudi directly in TGI, with no more dealing with a custom repository 🙌 ...

March 28, 2025