“Sorry it’s kind of messy, because we’re mad scientists here,” joked Rami Sinno, engineering director at Annapurna Labs.
Inside Annapurna’s Austin hub, I’m touring the nerve center of what was once a tiny Israeli chip startup—snapped up by Amazon for $350 million in 2015—that’s now powering the tech giant’s AI ambitions.
I’m led into two gear-filled garages overflowing with circuit boards, server racks, and testing equipment. One room is a quiet workshop, where one engineer boots up chips on test boards to check that they are running properly while another peers through a microscope, soldering iron in hand, delicately adjusting tiny chip components. The other room is so noisy it requires earplugs: large fans cool the air inside as Annapurna staff stress-test the AI chips in seven-foot server racks of the kind that power Amazon’s vast global network of data centers, some of which are as large as eight football fields.
This is where Annapurna designs and tests some of the world’s most advanced semiconductors for Amazon’s all-important AWS cloud division and its customers. It now carries the weight of Amazon’s high-stakes push to create homegrown chips that train and run its AI models, helping the company keep pace with rivals like Google parent Alphabet, Facebook parent Meta, and Microsoft.
The effort is intended to reduce Amazon’s reliance on Nvidia, which dominates the market for chips that power AI. Nvidia’s AI chips, known as GPUs (graphics processing units), cost up to $40,000 each. Considering that tens of thousands of these chips may be required for a single data center, the overall cost is astronomical: at that price, a cluster of 25,000 GPUs would run roughly $1 billion in chips alone. By developing its own chips, Amazon gets greater control and the ability to offer competitive alternatives to customers on AWS.
All of this is critical now as Amazon leans on Annapurna to help supercharge its $8 billion investment in AI startup Anthropic. Amazon’s AWS is Anthropic’s primary cloud provider, which means that Anthropic uses Amazon’s in-house chips to train and deploy models like its popular Claude AI chatbot.
The Amazon-Anthropic partnership includes a recently announced collaboration on Project Rainier, a massive supercomputer designed by Annapurna Labs that would use hundreds of thousands of its Trainium2 chips. Amazon and Anthropic say Project Rainier will deliver more than five times the computing power used to train the current generation of leading AI models.
Having such large computing clusters is still seen by many in the AI industry as an imperative for those hoping to stay at the bleeding edge of AI model development. But some analysts have begun to question whether such massive infrastructure is truly necessary following advances from Chinese company DeepSeek earlier this year. That company emerged from relative obscurity to show it was possible to build highly capable AI models at lower cost and with far fewer chips. It released open-source models that matched or outperformed those of much bigger companies on some AI benchmarks, despite being trained with significantly less computing power.
Nafea Bshara, cofounder of Annapurna Labs, told Fortune that his team deliberately stays out of the spotlight. It’s in sharp contrast to Nvidia, which holds splashy events headlined by a rockstar CEO, complete with signature black leather jacket. “You don’t see much about us—we want to let the product and value speak for themselves, not the flashy stuff,” Bshara said.
But the Amazon-Anthropic partnership has stirred fresh curiosity about Annapurna’s work: Is there really room in the market for Amazon’s solid, cost-efficient chips, which analysts say tend to be a generation behind Nvidia in terms of performance? Amazon is also reportedly attempting to significantly undercut Nvidia on AI chip pricing, but will that really persuade AWS cloud customers to choose Trainium?
Annapurna Labs did not start out working on AI
Annapurna Labs, founded in 2011 by three Intel and Broadcom veterans, did not start out focusing on AI chips for Amazon. Its early work for the company was all about the cloud: building specialized chips to improve how AWS handled things like data storage, network traffic, and routine computing. By 2018, it had scored a win for AWS with Graviton, a family of chips used by over 90% of AWS customers for cloud infrastructure.

But in recent years, as Nvidia GPUs became the go-to chips for AI, big cloud providers like Amazon’s AWS, Google Cloud, and Microsoft Azure began building their own AI chips to reduce reliance on Nvidia and tailor hardware for better performance on their own cloud infrastructure. Google made its first TPU AI chip available in its cloud in 2018, for example, while Annapurna Labs launched its first Inferentia AI chips for AWS in 2019 and then Trainium in 2020. The latest, Trainium2, was unveiled in 2023 and is now generally available.
Sinno showed me examples of how the Trainium2 chips nest into custom-built boards, which are slotted into servers that are stacked into the tall racks that live in AWS data centers. AWS customers access these chips through the cloud, while Annapurna continuously monitors them and makes sure everything works under the hood. It sets the right voltages and frequencies, tracks the health of the chip, the board, and the full system, and looks for signals that something might fail.
“We look for early warnings that will tell us if a chip is about to cause a bad customer experience,” Sinno said. “If they detect a risk, the system automatically shuts that chip down and spins up another one in its place.”
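To make that concrete, here is a minimal sketch of what such a watch-and-replace loop might look like in Python. The sensor readings, thresholds, and scheduler hooks below are hypothetical stand-ins for illustration; they are not Amazon’s actual tooling.

```python
import random
import time

# A minimal, illustrative health-watch loop: flag a chip whose telemetry
# crosses an early-warning threshold and swap in a spare. All names,
# thresholds, and the simulated sensor data here are hypothetical.

TEMP_LIMIT_C = 95       # hypothetical thermal ceiling
ECC_ERRORS_LIMIT = 100  # hypothetical budget for correctable memory errors

def read_telemetry(chip_id: str) -> dict:
    # Stand-in for real board-level sensors (temperature, error counters).
    return {
        "temp_c": random.gauss(70, 12),
        "ecc_errors": random.randint(0, 120),
    }

def drain_and_replace(chip_id: str, spares: list[str]) -> str:
    # Stand-in for the scheduler: stop routing work to the flagged chip
    # and bring a healthy spare into the pool in its place.
    if not spares:
        print(f"{chip_id}: flagged, but no spares available")
        return chip_id
    replacement = spares.pop()
    print(f"{chip_id}: early warning tripped; replaced by {replacement}")
    return replacement

def watch(chips: list[str], spares: list[str], cycles: int = 5) -> None:
    for _ in range(cycles):
        for i, chip_id in enumerate(chips):
            t = read_telemetry(chip_id)
            if t["temp_c"] > TEMP_LIMIT_C or t["ecc_errors"] > ECC_ERRORS_LIMIT:
                chips[i] = drain_and_replace(chip_id, spares)
        time.sleep(0.1)

watch(chips=["chip-0", "chip-1"], spares=["spare-0", "spare-1"])
```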
Efficient, lower-cost options for AWS customers
Gil Luria, head of technology research at investment banking firm D.A. Davidson, told Fortune that Amazon is fortunate to have gotten into the chip business a decade ago, since building competitive chips takes years of development and iteration. It’s not just about manufacturing the chips, either. One of Nvidia’s biggest advantages is that it spent years building a software architecture and system, called CUDA, to help developers and researchers work with its AI chips.
Had Annapurna Labs started its AI chip efforts for Amazon a half-decade later, it would still be years away from a finished product, Luria explained. But now, with its Inferentia and Trainium chips, AWS can offer customers an option. Amazon’s in-house chips may not outperform Nvidia’s latest GPUs, but for some AI work they are more efficient, lower-cost, and fully integrated into the rest of the AWS technology stack. In the complex economics of cloud computing, those are important differentiators.
Nvidia is aware that big cloud providers, which are also some of its biggest customers, are working overtime to reduce their reliance on its GPUs, said Daniel Newman, CEO of analyst firm The Futurum Group. In fact, at Nvidia’s recent GTC conference, for the first time Nvidia CEO Jensen Huang glossed over his company’s long-time partnerships with Amazon, Google, and Microsoft, focusing instead on its partnerships with a new generation of cloud computing companies, including CoreWeave and Lambda Labs. “It was kind of the beginning to prepare for life beyond [big cloud providers],” Newman said.
Still, while Graviton was a big success for AWS for pure cloud computing chips, Newman said the jury is out on Annapurna’s AI offerings. So far, Inferentia and Trainium have been most successful for Amazon’s own internal use, he explained.
While Nvidia’s dominance has been largely driven by its role in training massive AI models, an extremely compute-intensive task, Amazon has taken a segmented approach. Inferentia was developed specifically to generate outputs from models that have already been trained (what’s known in the industry as “inference”), and is used for the voice in Amazon’s Alexa, as well as AI used in Amazon.com’s search and recommendations. Trainium, on the other hand, was designed for both training and inference, including the training of Amazon’s own deep learning models, such as the recently released Nova.
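To illustrate the distinction the chips are named for, here is a toy sketch of training versus inference; the one-weight model below is purely illustrative and is unrelated to Amazon’s or Anthropic’s actual workloads.

```python
# Toy sketch: training adjusts a model's weights; inference runs the
# already-trained model forward. Everything here is hypothetical.

def predict(weight: float, x: float) -> float:
    # Inference: a fixed forward pass through a trained model
    # (the kind of work Inferentia targets).
    return weight * x

def train(xs: list[float], ys: list[float], lr: float = 0.01, steps: int = 1000) -> float:
    # Training: repeatedly nudge the weight to reduce prediction error
    # (the compute-hungry work Trainium also targets).
    weight = 0.0
    for _ in range(steps):
        for x, y in zip(xs, ys):
            error = predict(weight, x) - y
            weight -= lr * error * x  # gradient step for squared error
    return weight

w = train(xs=[1.0, 2.0, 3.0], ys=[2.0, 4.0, 6.0])  # learns y = 2x
print(predict(w, 4.0))  # inference with the trained weight: ~8.0
```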
A mutually beneficial relationship with Anthropic
Annapurna Labs’ Bshara said that when the lead time for delivering new cutting-edge AI chips is so long (18 to 36 months), the key to success is to work with the companies pushing the envelope on the technology. “For that, Anthropic is by far the best one in my view,” he said. He also cited the Amazon team working on artificial general intelligence, or AI that’s as capable as a human at certain tasks, as well as the open-source developers who build and deploy AI models on Trainium, as sources of valuable feedback.
Gadi Hutt, director of product and customer engineering at Annapurna Labs, said the relationship with Anthropic began in 2022. The Anthropic team dedicated to improving AI model training performance asked Annapurna to give it Trainium to play with. “Usually, when we talk to customers, we want to understand what they are going to build so we can guide them,” he said. “But they said, ‘No, just give it to us.’ They came back with a full report of tests that they had run over the weekend.”
Tom Brown, chief compute officer and cofounder of Anthropic, recalled Annapurna giving him access to Trainium on a Friday, after which he tried to implement a few small neural networks (a type of AI model used for deep learning, the foundation of large language models). But some basic functions were not working correctly. “I documented the issues and sent them my feedback,” he told Fortune by email. “They iteratively improved the system as we continued testing it…they took our requirements seriously.”
Brown added that Anthropic’s collaboration with Annapurna is more integrated than typical vendor relationships. “We communicate frequently on everything from low-level optimization work to high-level architectural decisions for next-generation chips,” he said. “It’s mutually beneficial—we both need to work together to build hardware that’s not just effective for today’s models, but also for the models that will come years down the road.”
At Amazon’s re:Invent cloud computing conference in December, Anthropic reported that it had optimized its Claude models to run on Trainium2, achieving 60% faster performance than its baseline for its Claude Haiku 3.5 model.
The pressure is on for both Annapurna Labs and Anthropic
Annapurna’s Bshara admitted to feeling a lot of pressure to deliver for Anthropic and other AWS customers. And it’s not just because of the competition with Nvidia, he added. “A lot of customers took a bet on us and we need to keep earning their trust, both on quality of the product, scale of the product [and the] roadmap philosophy,” Bshara said.
Luria agreed that Anthropic must be successful in order for either company, Amazon through Annapurna Labs or Anthropic itself, to win the AI race. “If Anthropic’s models don’t keep getting bigger, more powerful and more popular with users, neither company is going to win,” he said.
Newman remains bullish on Amazon’s Annapurna Labs investment when it comes to AI. The chip performance is impressive, he said, and although Anthropic and Amazon developers may prefer the latest cutting-edge Nvidia chip, business teams—which want efficiency and lower cost—usually win out.
“In my eyes, this is shots fired,” he said. “Nvidia won’t have the entire market uncontested.” And, it’s still relatively early days for Annapurna Labs, Newman added. “Nvidia spent two decades, Google’s TPU took 10 plus years,” he said. “Annapurna is a few years into the game. Chip development is hard.”