Micron Technology is sampling the industry’s first 8-high 24-GB HBM3 Gen2 memory with bandwidth greater than 1.2 TB/s and pin speed over 9.2 Gbit/s, claiming up to a 50% improvement over current solutions. Boasting a 2.5× performance/watt improvement over earlier generations, the new HBM offering is said to set records for artificial intelligence (AI) data center metrics for performance, capacity and power efficiency. The improvements reduce training times of large language models like GPT-4, deliver efficient infrastructure use for AI inference and lower total cost of ownership (TCO).
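As a rough sanity check on how the pin speed and bandwidth figures relate, the short sketch below converts a per-pin data rate into per-stack bandwidth. It assumes the 1,024-bit-wide stack interface defined for HBM3, which the article does not state, and uses decimal units (1 TB/s = 1,000 GB/s). The function name and defaults are illustrative, not Micron's.

```python
def stack_bandwidth_gb_per_s(pin_rate_gbit_s: float,
                             interface_width_bits: int = 1024) -> float:
    """Peak bandwidth of one HBM stack in GB/s.

    interface_width_bits=1024 is an assumption based on the HBM3
    interface width; it is not a figure quoted in the article.
    """
    # Each pin moves pin_rate_gbit_s gigabits per second; divide by 8 for bytes.
    return pin_rate_gbit_s * interface_width_bits / 8


# Bandwidth at exactly 9.2 Gbit/s per pin.
bandwidth_at_9_2 = stack_bandwidth_gb_per_s(9.2)

# Pin rate needed to reach 1.2 TB/s (1,200 GB/s) under the same assumption.
pin_rate_for_1_2_tb = 1200 * 8 / 1024

print(f"{bandwidth_at_9_2:.1f} GB/s at 9.2 Gbit/s per pin")
print(f"{pin_rate_for_1_2_tb:.3f} Gbit/s per pin needed for 1.2 TB/s")
```

Under these assumptions, a pin rate of exactly 9.2 Gbit/s works out to a little under 1.2 TB/s, so the two quoted figures line up only when the pin speed is somewhat above 9.2 Gbit/s, as the "over 9.2 Gbit/s" wording suggests.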
At the heart of Micron’s high-bandwidth memory (HBM) solution is its 1β (1-beta) DRAM process node, enabling a 24-Gb DRAM die to be assembled into an 8-high cube in an industry-standard package dimension. Micron delivers 50% more capacity for a given stack height than competitive solutions, according to the company.
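The capacity figures follow from simple arithmetic, sketched below: eight 24-Gb dies give a 24-GB cube. The comparison against a 16-Gb die is an assumption about what competing 8-high stacks use, included only to show where a 50% advantage could come from. The article does not name the competing die density.

```python
def stack_capacity_gbyte(die_density_gbit: int, dies_per_stack: int) -> float:
    """Capacity of one HBM cube in gigabytes (8 bits per byte)."""
    return die_density_gbit * dies_per_stack / 8


# Figures from the article: 24-Gb die, 8-high stack.
micron_cube_gb = stack_capacity_gbyte(24, 8)

# Hypothetical baseline: an 8-high stack built from 16-Gb dies (assumption).
baseline_cube_gb = stack_capacity_gbyte(16, 8)

capacity_gain = micron_cube_gb / baseline_cube_gb - 1

print(f"{micron_cube_gb:.0f} GB vs {baseline_cube_gb:.0f} GB "
      f"({capacity_gain:.0%} more capacity at the same stack height)")
```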
The higher memory capacity speeds up training relative to current solutions, reducing training time for LLMs by more than 30%, Micron said.
The HBM3 Gen2 performance-to-power ratio and pin speeds help manage the extreme power demands of today’s AI data centers, the company said. Improved power efficiency is possible given Micron’s doubling of through-silicon vias (TSVs) over competitive HBM3 offerings, thermal impedance reduction through a 5× increase in metal density and an energy-efficient data path design.
The Micron HBM3 Gen2 memory’s performance is driving cost savings for AI data centers. For example, for an installation of 10 million GPUs, every five watts of power saved per HBM cube is estimated to reduce operational expenses by up to $550 million over five years.
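The savings estimate depends on inputs the article does not give, such as the number of HBM cubes per GPU, facility overhead and the electricity price. The sketch below is one way to back out the electricity price implied by the quoted figures under stated assumptions. The `cubes_per_gpu` and `pue` parameters are hypothetical inputs, and the calculation assumes continuous operation.

```python
HOURS_PER_YEAR = 8760  # 365 days; assumes round-the-clock operation


def energy_saved_kwh(gpus: int, watts_saved_per_cube: float,
                     cubes_per_gpu: int, years: float,
                     pue: float = 1.0) -> float:
    """Energy saved across the installation, in kWh.

    cubes_per_gpu and pue (power usage effectiveness) are assumptions;
    the article specifies neither.
    """
    watts_saved = gpus * watts_saved_per_cube * cubes_per_gpu * pue
    return watts_saved / 1000 * HOURS_PER_YEAR * years


def implied_price_per_kwh(total_savings_usd: float, kwh: float) -> float:
    """Electricity price at which the energy saved equals the quoted savings."""
    return total_savings_usd / kwh


# Article figures: 10 million GPUs, 5 W per cube, $550 million over five years.
kwh_one_cube = energy_saved_kwh(10_000_000, 5, cubes_per_gpu=1, years=5)
kwh_six_cubes = energy_saved_kwh(10_000_000, 5, cubes_per_gpu=6, years=5)

for label, kwh in (("1 cube/GPU", kwh_one_cube), ("6 cubes/GPU", kwh_six_cubes)):
    price = implied_price_per_kwh(550e6, kwh)
    print(f"{label}: {kwh:.3e} kWh saved, implied ${price:.3f}/kWh")
```

Because the implied price moves in inverse proportion to the assumed cube count and overhead, the $550 million figure is best read as an upper-bound illustration rather than a number that can be reproduced from the article alone.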
In support of the effort, TSMC is working with Micron on further evaluation and testing for next-generation high-performance computing (HPC) applications.