Amazon's cloud computing division announced on Wednesday a partnership with AI startup Hugging Face to make it easier to run thousands of AI models on Amazon's custom computing chips.
Hugging Face, valued at $4.5 billion and backed by companies including Amazon, Alphabet's Google, and Nvidia, has become a central platform where AI researchers and developers share chatbots and other AI software. Developers primarily use the platform to obtain and modify open-source AI models, such as Meta Platforms' Llama 3.
However, once developers have adjusted an open-source AI model, they typically want to deploy it in real-world software. The collaboration that Amazon and Hugging Face announced on Wednesday is meant to let developers do that on Inferentia2, a custom chip from Amazon Web Services (AWS).
Jeff Boudier, Head of Product and Growth at Hugging Face, stated, "For us, efficiency is of paramount importance—ensuring that as many people as possible can run models in the most cost-effective way."
Through the partnership, AWS aims to attract more AI developers to its cloud services for deploying AI. While Nvidia dominates the market for training models, AWS claims its chips can run those trained models, a process called inference, at a lower cost.
Matt Wood, Head of AI Products at AWS, said, "You might train these models once a month, but you could perform thousands of inferences per hour. That is where Inferentia2 really shines."