Holy chips! Microsoft’s new AI silicon will power its chatty assistants


A photo of the Microsoft Azure Maia 100 chip that has been altered with splashes of color by the author to look as if AI itself were bursting forth from its silicon substrate.
Enlarge / A photo of the Microsoft Azure Maia 100 chip that has been altered with splashes of color by the author to look as if AI itself were bursting forth from its silicon substrate.
Microsoft | Benj Edwards

reader comments
25 with

On Wednesday at the Microsoft Ignite conference, Microsoft announced two custom chips designed for accelerating in-house AI workloads through its Azure cloud computing service: Microsoft Azure Maia 100 AI Accelerator and the Microsoft Azure Cobalt 100 CPU.

Microsoft designed Maia specifically to run large language models like GPT 3.5 Turbo and GPT-4, which underpin its Azure OpenAI services and Microsoft Copilot (formerly Bing Chat). Maia has 105 billion transistors that are manufactured on a 5-nm TSMC process. Meanwhile, Cobalt is a 128-core ARM-based CPU designed to do conventional computing tasks like power Microsoft Teams. Microsoft has no plans to sell either one, preferring them for internal use only.

As we’ve previously seen, Microsoft wants to be “the Copilot company,” and it will need a lot of computing power to meet that goal. According to Reuters, Microsoft and other tech firms have struggled with the high cost of delivering AI services that can cost 10 times more than services like search engines.

Amid chip shortages that have driven up the prices of highly sought-after Nvidia AI GPUs, several companies have been designing or considering designing their own AI accelerator chips, including Amazon, OpenAI, IBM, and AMD. Microsoft also felt the need to make a custom silicon to push its own services to the fore.

The Verge, the firm has long collaborated on silicon for its Xbox consoles and co-engineered chips for its line of Surface tablets.

No tech company is an island, and Microsoft is no exception. The company plans to continue to rely on third-party chips both out of supply necessity and likely to please its tangled web of business partnerships. “Microsoft will also add the latest Nvidia H200 Tensor Core GPU to its fleet next year to support larger model inferencing [sic] with no increase in latency,” it writes, referring to Nvidia’s recently announced AI-crunching GPU. And it will also add AMD MI300X-accelerated virtual machines to Azure.

So, how well do the new chips perform? Microsoft hasn’t released benchmarks yet, but the company seems happy with the chips’ performance-per-watt ratios, especially for Cobalt. “The architecture and implementation is designed with power efficiency in mind,” Microsoft corporate VP of hardware product development Wes McCullough said in a statement. “We’re making the most efficient use of the transistors on the silicon. Multiply those efficiency gains in servers across all our datacenters, it adds up to a pretty big number.”

Article Tags:
Article Categories:
Technology