Intel has reportedly introduced the Gaudi 3 AI accelerator, a processor designed for AI workloads that aims to undercut Nvidia's H100 GPU on price-performance. Gaudi 3 is built from two dies with a total of 64 tensor processor cores (TPCs), eight matrix multiplication engines (MMEs), 96MB of on-chip SRAM cache, and 24 × 200 GbE network interfaces; each accelerator carries 128GB of HBM2e memory, providing up to 3.67 TB/s of bandwidth.
In terms of raw performance, the H100 keeps a lead in some benchmarks, especially those that exploit sparsity acceleration. Gaudi 3, however, may come out ahead on energy efficiency thanks to its 5nm process. It also holds a memory-capacity advantage, with 128GB of HBM2e versus the H100's 80GB of HBM3. In BF16 performance, Gaudi 3 delivers a 4x improvement over the previous generation and has the potential to surpass the H100 in BF16-optimized workloads. Intel has also doubled the network bandwidth to 600 GB/s; while slightly lower than the H100's 800 GB/s, this underscores the importance Intel places on interconnect capability. Notably, Intel claims significant gains over the H100 in large-model training and inference speed as well as energy efficiency: in some cases training is 40% faster, inference is 50% faster, and energy efficiency improves by up to 2.3 times. This makes Gaudi 3 competitive for handling large-scale generative AI models.
Figure: Intel unveils Gaudi 3 AI accelerator (Source: Intel)
In terms of real-world performance, the Gaudi 3 accelerator will have to compete with AMD's Instinct MI300 series as well as Nvidia's H100 and B100/B200 processors. So far, Intel has shown slides claiming that Gaudi 3 offers a significant price-performance advantage over Nvidia's H100. Intel also noted that an accelerator kit based on eight Gaudi 3 processors is priced at around $125,000, which works out to roughly $15,625 per Gaudi 3, while an Nvidia H100 card currently sells for about $30,678.
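The per-accelerator arithmetic behind Intel's price claim can be sanity-checked directly. A minimal sketch using only the figures quoted above (both are vendor/press figures, not negotiated street prices):

```python
# Back-of-envelope price comparison using the numbers cited in the article.
# These are list/press prices, not independently verified market prices.

GAUDI3_KIT_PRICE = 125_000    # eight-accelerator kit, USD (Intel figure)
ACCELERATORS_PER_KIT = 8
H100_CARD_PRICE = 30_678      # H100 card price cited in the article, USD

gaudi3_unit_price = GAUDI3_KIT_PRICE / ACCELERATORS_PER_KIT
price_ratio = H100_CARD_PRICE / gaudi3_unit_price

print(f"Gaudi 3 per accelerator: ${gaudi3_unit_price:,.0f}")
print(f"H100 costs {price_ratio:.2f}x as much per card")
```

On these figures an H100 costs roughly twice as much per card, which is the basis of Intel's price-performance pitch; whether that translates into a real advantage depends on the workload-level performance comparisons above.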
In addition, Gartner expects global AI chip revenue to grow 33% in 2024, with AI accelerators in servers accounting for $21 billion of that. The AI accelerator market is growing rapidly, and the launch of Intel's Gaudi 3 is aimed precisely at capturing this opportunity.
Overall, while Gaudi 3 may trail Nvidia's H100 somewhat in peak performance, its low-price strategy could appeal to businesses and research institutes that are cost-sensitive but still need efficient AI computing power. As the AI chip market continues to expand and mature, we may see more products like Gaudi 3 that bring additional choice and competition to AI computing.
Justin Hotard, executive vice president and general manager of Intel's Data Center and AI Group, said, "The demand for AI is driving a dramatic shift in data centers, requiring changes in hardware, software, and development tools. With the introduction of Xeon 6 with P-cores and Gaudi 3 AI accelerators, Intel is building an open ecosystem that enables our customers to implement all workloads with greater performance, efficiency, and security."
Intel's Gaudi 3 AI accelerator will reportedly be available through IBM Cloud and the Intel Tiber Developer Cloud. In addition, Intel Xeon 6 and Gaudi 3-based systems will be generally available from Dell, HPE, and Supermicro in Q4, with Dell and Supermicro systems shipping in October and HPE's in December.