Microsoft, which developed silicon for Xbox two decades ago and later co-designed chips for Surface devices, has unveiled two custom chips: Azure Maia for artificial intelligence (AI) servers and Azure Cobalt CPU for cloud workloads. The announcement shows how Microsoft is architecting its cloud hardware stack and why custom silicon is crucial to that design journey.
These homegrown chips, tailored for AI and cloud workloads, aim to work hand-in-hand with software developed to unlock new capabilities and opportunities for Microsoft’s data center services. However, Microsoft has provided few technical details about these in-house chips.
Figure 1 The two custom chips aim to optimize cloud infrastructure for Azure data centers. Source: Microsoft
Below is a sneak peek at these custom chips, designed to power Microsoft’s Azure data centers while enabling significant cost savings for the company and its cloud service users.
Maia 100 AI accelerator
Microsoft Azure Maia 100 is an AI accelerator specifically designed to run training and inference for large language models (LLMs) and generative image tools. It comprises 105 billion transistors and is manufactured on TSMC’s 5-nm node. In a nutshell, it aims to enable higher density for servers at higher efficiencies for cloud AI workloads.
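For a sense of the compute scale such an accelerator must handle, here is a minimal back-of-envelope sketch in Python, assuming the widely used rule of thumb that a decoder-only transformer needs roughly 2 × N FLOPs per generated token for a model with N parameters; the model size and throughput figures are illustrative assumptions, not Microsoft’s numbers.

```python
# Back-of-envelope: why dense, efficient accelerators matter for LLM serving.
# Rule of thumb: a decoder-only transformer needs ~2 * N FLOPs per generated
# token for a model with N parameters (forward pass only). The model size and
# throughput below are illustrative assumptions, not Microsoft's figures.

params = 175e9             # assumed model size (parameters), GPT-3 class
flops_per_token = 2 * params
tokens_per_second = 50     # assumed serving throughput for one user stream

required_flops = flops_per_token * tokens_per_second
print(f"~{required_flops / 1e12:.1f} TFLOPs/s per stream")  # ~17.5 TFLOPs/s
```

Multiply that per-stream demand by millions of concurrent users, and the case for packing more useful compute into each rack becomes clear.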
Named after a bright blue star, Maia is part of Microsoft’s multi-billion-dollar partnership with OpenAI; the two companies are collaborating to jointly refine and test Maia on OpenAI models. Currently, it’s being tested on GPT-3.5 Turbo, the model that powers ChatGPT, Bing AI workloads, and GitHub Copilot.
Figure 2 Maia 100 paves the way for training more capable models and making those models cheaper. Source: Microsoft
Microsoft and rivals like Alphabet are currently grappling with the high cost of AI services, which, according to some estimates, is 10 times that of traditional services like search engines. Microsoft executives claim that by optimizing silicon for AI workloads on Azure, the company can overhaul its entire cloud server stack to optimize performance, power, and cost.
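To illustrate why that cost gap matters at cloud scale, here is a hypothetical back-of-envelope calculation using the 10× figure cited above; the per-query baseline cost and daily query volume are assumptions chosen purely for illustration.

```python
# Illustrative arithmetic behind the cost pressure described above, assuming
# the "10x" estimate cited in the article. The baseline per-query cost and
# query volume are hypothetical placeholders, not published numbers.

search_cost_per_query = 0.002   # assumed baseline: $0.002 per search query
ai_cost_multiplier = 10         # per the estimate cited in the article
queries_per_day = 1e9           # assumed daily query volume

daily_search_cost = search_cost_per_query * queries_per_day
daily_ai_cost = daily_search_cost * ai_cost_multiplier
print(f"AI: ${daily_ai_cost / 1e6:.0f}M/day vs. search: "
      f"${daily_search_cost / 1e6:.0f}M/day")
# -> AI: $20M/day vs. search: $2M/day
```

Under these assumptions, even a modest percentage gain in silicon efficiency translates into millions of dollars per day, which is the economics driving custom chips.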
“We are rethinking the cloud infrastructure for the era of AI, and literally optimizing every layer of that infrastructure,” said Rani Borkar, head of Azure hardware systems and infrastructure at Microsoft. She told The Verge that Maia chips will nestle onto custom server boards, which will be placed within tailor-made racks that fit easily inside existing Microsoft data centers.
That’s how Microsoft aims to reimagine the entire stack and think through every layer of its data center footprint. However, Microsoft executives are quick to note that the development of Maia 100 won’t impact the existing partnerships with AI chipmakers like AMD and Nvidia for Azure cloud infrastructure.
Azure Cobalt 100 CPU
Microsoft’s second in-house chip, Azure Cobalt CPU, named after the blue pigment, appears to be an answer to the in-house Graviton chips offered by its chief cloud rival, Amazon Web Services (AWS). The 128-core chip, built on an Arm Neoverse CSS design, is meant to power general cloud services on Azure. And, like Azure Maia 100, the Cobalt CPU is manufactured on TSMC’s 5-nm node.
Microsoft, currently testing the Cobalt CPU on workloads like Microsoft Teams and SQL Server, claims a 40% performance boost over commercial Arm server chips during initial testing. “We made some very intentional design choices, including the ability to control performance and power consumption per core and on every single virtual machine,” Borkar said.
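Microsoft hasn’t published Cobalt’s control interface, but per-core performance and power control is a familiar idea on server CPUs. As a loose analogy only, the sketch below caps one core’s maximum frequency through Linux’s standard cpufreq sysfs interface, a real mechanism on Arm and x86 servers; it is not Microsoft’s mechanism.

```python
# Analogy for per-core performance/power control: cap a single core's
# frequency via Linux's cpufreq sysfs interface (requires root on a host
# with cpufreq support). This is NOT Cobalt's actual control interface,
# which Microsoft has not disclosed.

from pathlib import Path

def cap_core_frequency(core: int, max_khz: int) -> None:
    """Limit the maximum frequency of one CPU core, in kHz."""
    path = Path(f"/sys/devices/system/cpu/cpu{core}/cpufreq/scaling_max_freq")
    path.write_text(str(max_khz))

# Example: cap core 0 at 2.0 GHz to trade peak performance for lower power.
# cap_core_frequency(0, 2_000_000)
```

Exposing this kind of knob per core, and per virtual machine, is what lets a cloud operator tune power draw to each tenant’s workload rather than running every core flat out.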
Figure 3 Cobalt CPU is seen as an internal cost saver and an answer to AWS-designed custom chips. Source: Microsoft
The Maia 100 AI accelerator and Cobalt 100 CPU will arrive in 2024 and be kept in-house. Microsoft hasn’t shared design specifications or performance benchmarks for these chips. However, their naming conventions suggest that second-generation Maia and Cobalt custom chips might already be in the works.
“We are making the most efficient use of the transistors on the silicon,” said Wes McCullough, Microsoft’s corporate VP of hardware product development. “Now multiply those efficiency gains in servers across all our data centers, and it adds up to a pretty big number,” he wrote on the company’s blog.