With prices surging 30% in a single week and individual units selling for over $100,000, the NVIDIA A800 has become a hot commodity, driven by booming interest in artificial intelligence and the ongoing chip shortage.

Taking advantage of this wave, NVIDIA's CEO, Jensen Huang, has swiftly emerged as one of the most notable entrepreneurs, standing shoulder to shoulder with Elon Musk. NVIDIA's market capitalization has also soared to $1.17 trillion, making it one of the hottest tech companies in the market.

However, while high-end AI chips are scarce and the demand for computing power skyrockets amid the "Model War," numerous competitors, big and small, are striving to catch up. They hope to break the prevailing perception that "AI chip equals NVIDIA," aiming to claim a share in this trillion-dollar market.

Intel introduced its AI processor Gaudi 2 for the Chinese market on July 11. The chip is positioned against NVIDIA's 100-series GPUs, the A100 and H100, and is designed specifically for training large language models.

The launch of Gaudi 2 marks another giant's entry into the AI chip market. Intel follows in the footsteps of AMD, which has already launched its own AI products. The three chip powerhouses, NVIDIA, Intel, and AMD, are once again clashing, this time in the AI era.

The AI chip market will not be dominated by a single entity. As more giants enter the arena, a fresh competition has commenced.

Showdown

By launching Gaudi 2, Intel has directly challenged NVIDIA.

The shrinking PC market and the softening data center business have put pressure on Intel's performance. Intel, once the top player in the server chip market, is losing market share to competitors like AMD. The surge in demand for computing power driven by the AI wave has presented Intel with a new opportunity.

Gaudi 2, designed by Habana Labs, which Intel acquired for $2 billion in 2019, has been built from the ground up to enhance deep learning training efficiency in the cloud and data centers.

At the launch event, Sandra Rivera, Intel's Executive Vice President and General Manager of the Data Center and AI Business Unit, spent a considerable amount of time comparing Gaudi 2's performance with that of NVIDIA's high-end A100 and H100 GPUs.

According to the data presented, Gaudi 2 delivers 1.7 times the performance of NVIDIA's A100 when pre-training the BERT model. As for the more advanced H100, Habana Labs' COO Eitan Medina stated that Gaudi 2 is currently the only viable alternative to it for large language model (LLM) training. In the MLPerf Training 3.0 benchmark, Gaudi 2 and the H100 were the only chips to post results for GPT-3 training.

At present, Gaudi 2 still trails the H100 in GPT-3 training, with a single H100 delivering 3.6 times the performance of a single Gaudi 2. However, Medina predicts that once Intel releases FP8 software support and other new features in September, Gaudi 2 will surpass the H100 in cost-effectiveness.
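To make the cost-effectiveness argument concrete, here is a rough back-of-the-envelope sketch in Python. It assumes that "cost-effectiveness" means training performance per dollar of chip price, and it takes the 3.6x per-chip gap cited above at face value. The function names and the H100 price are hypothetical placeholders chosen for illustration; they are not figures from Intel, NVIDIA, or this article.

```python
# Illustrative sketch only. The price used below is a made-up placeholder,
# not a real market price for either chip.

H100_SPEEDUP_OVER_GAUDI2 = 3.6  # per-chip GPT-3 training gap cited in the article


def perf_per_dollar(relative_perf: float, price: float) -> float:
    """Relative training performance delivered per dollar of chip price."""
    return relative_perf / price


def break_even_price(h100_price: float,
                     speedup: float = H100_SPEEDUP_OVER_GAUDI2) -> float:
    """Gaudi 2 price at which it exactly matches the H100 on performance per dollar."""
    return h100_price / speedup


if __name__ == "__main__":
    hypothetical_h100_price = 36_000.0  # assumption, picked for easy arithmetic
    limit = break_even_price(hypothetical_h100_price)
    print(f"Break-even Gaudi 2 price: ${limit:,.0f}")
    print(f"That is {1 / H100_SPEEDUP_OVER_GAUDI2:.0%} of the assumed H100 price")
```

In other words, with a 3.6x gap Gaudi 2 would need to sell for less than about 28% of the H100's price to come out ahead on this measure. Any FP8 gains of the kind Medina anticipates would shrink the gap and raise that threshold. The sketch ignores power, networking, and software costs, all of which affect real purchasing decisions.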

Cost-effectiveness is Gaudi 2's key advantage in its competition against NVIDIA's A100 and H100. Medina told Business Times that when running ResNet-50, Gaudi 2's performance per watt is approximately twice that of NVIDIA's A100. When running the 176-billion-parameter BLOOMZ model, Gaudi 2's performance per watt is roughly 1.6 times that of the A100.

In other words, while offering respectable performance, Gaudi 2 clearly beats NVIDIA's A100 on power efficiency and poses a credible challenge to the H100 on cost-effectiveness. Consequently, Intel has emerged as the most potent contender among NVIDIA's challengers.

As with NVIDIA's A100, the version of Gaudi 2 sold in China differs from the international version in order to comply with US Bureau of Industry and Security regulations. However, Medina says the performance difference between the Chinese and international versions is minimal. The planned 5nm Gaudi 3, set to launch next year, will also be available to Chinese customers, provided it meets compliance requirements.

Intel has so far partnered with domestic server manufacturers such as Inspur Information, New H3C, and Superfusion, as well as companies like Baidu Intelligent Cloud. Liu Jun, Inspur's VP and General Manager of the AI & HPC product line, said that Inspur and Intel would jointly launch the NF5698G7, a new-generation AI server that supports eight Gaudi 2 chips.

Additionally, Rivera revealed that by 2025, Intel plans to merge the Gaudi AI chip and GPU product lines into a more complete next-generation GPU product, giving it a broader lineup to address a wider range of customer needs.

Race

Intel is not the first chip titan to challenge NVIDIA.

Last June, AMD announced the Instinct MI300, built on a combined CPU+GPU architecture, to venture into the AI training market. Then in June this year, AMD unveiled the MI300X, with a massive 192GB of HBM3 memory, to further optimize large model training.

AMD's Data Center Hardware Director, Forrest Norrod, stated that the AI craze led by ChatGPT has taken AMD by surprise. The industry has been eager for a competitor that can offer an alternative to NVIDIA. "Before, we were just one of the competitors," Norrod said. "Now we are the alternative."

However, AMD's strategy is distinct from Intel's. The company believes that the combination of CPU and GPU is the future. "AMD's architecture design is based on the synergy between CPU and GPU, leveraging our Infinity Fabric technology. When we look at the GPU competition, we are not just comparing GPU to GPU, but also considering CPU+GPU performance," Norrod emphasized.

Despite the bold remarks, AMD's efforts in the AI field have not been as successful as expected. In the MLPerf 3.0 benchmark test, AMD's performance failed to reach the top tier. AMD's products also cannot handle GPT-3 training, a critical capability in the current "Model War."

Consequently, although AMD introduced products into the AI market ahead of Intel, it seems that Intel may be the more potent challenger to NVIDIA.

Rising Stars

NVIDIA, Intel, and AMD aren't the only players in the AI chip market.

China's Cambricon has also released its high-end AI chip MLU270 in a bid to compete with the major players. Cambricon's CEO, Chen Tianshi, said that the MLU270 is compatible with the most popular AI software platforms and can provide excellent performance for many AI applications.

Similarly, Graphcore, a UK-based start-up, launched its second-generation chip, the IPU-M2000, last year. The IPU-M2000 aims to increase performance while reducing power consumption, two critical aspects of AI chip design. According to Graphcore, the IPU-M2000 has performed well in the MLPerf 3.0 benchmark test, comparable to NVIDIA's A100.

Emerging AI chip companies like these aim to offer more choices for the market and drive more competition. Still, it remains to be seen whether these smaller companies can effectively challenge the established giants.

Outlook

The market for AI chips is thriving, and competition is heating up as different players strive for market share.

As the AI industry continues to evolve, the market is sure to see more contenders and innovations. NVIDIA, as the current market leader, will have to defend its position against numerous challengers, including Intel, AMD, and various emerging players. Meanwhile, new developments in AI technology and the needs of AI users will continue to drive the market forward.

In the face of such fierce competition, NVIDIA's dominant position is not guaranteed. The future of the AI chip market remains uncertain, but what is clear is that the battle of the AI chip titans has just begun.