Apple's latest move in the artificial intelligence arena has raised eyebrows, as the tech giant disclosed that its AI models were trained on Google's custom chips, rather than Nvidia's widely used GPUs. This revelation, detailed in a recently published technical paper by Apple, marks a significant shift in the landscape of AI training infrastructure, underscoring the competitive and strategic decisions being made by leading technology firms.
The paper reveals that Apple's artificial intelligence system, Apple Intelligence, utilized Google's Tensor Processing Units (TPUs) for pretraining its AI models. This choice highlights a growing trend among big tech companies to seek alternatives to Nvidia's graphics processing units (GPUs), which have long been the industry standard for high-end AI training. Nvidia's GPUs are in such high demand that they have become difficult to procure in the necessary quantities, prompting companies like Apple to look elsewhere.
In its 47-page document, Apple noted that its Apple Foundation Models (AFM), both the AFM-on-device and AFM-server variants, were trained on "Cloud TPU clusters," implying that Apple rented servers from a cloud provider, presumably Google, to conduct the intensive calculations required for AI training. This approach allows Apple to train its models efficiently and at scale.
This move comes amid broader industry discussions about the potential overinvestment in AI infrastructure. Both Meta CEO Mark Zuckerberg and Alphabet CEO Sundar Pichai have recently commented on the substantial investments being made in AI, acknowledging the high business risks of falling behind in this critical technological race. "The downside of being behind is that you're out of position for like the most important technology for the next 10 to 15 years," Zuckerberg remarked on a podcast with Bloomberg's Emily Chang.
While Apple's technical paper did not explicitly mention Nvidia, the absence of any reference to Nvidia's hardware suggests a deliberate choice to bypass the industry leader. Instead, Apple relied on two types of Google's TPUs: 2,048 TPUv5p chips for its on-device AI models and 8,192 TPUv4 processors for its server AI models. Nvidia, known for its GPUs, focuses on selling its chips and systems as standalone products, whereas Google offers access to its TPUs through the Google Cloud Platform, requiring customers to build software within Google's cloud environment to use the chips.
Apple's use of Google's infrastructure was first reported by Reuters in June, but the full extent of that reliance was only detailed in the recent research paper. The disclosure follows Apple's unveiling of new AI features at its June developer conference, where it announced the integration of OpenAI's ChatGPT technology into its software suite.
Apple's strategic pivot to Google's TPUs underscores the shifting dynamics in the AI training market. Nvidia's dominance, while still substantial, faces challenges as companies like Apple seek alternatives that offer scalability and efficiency. The decision to use Google's TPUs, which are optimized for AI workloads, reflects a calculated move to harness the capabilities of specialized hardware designed for AI tasks.
This development also highlights the collaborative yet competitive nature of relationships among tech giants. While Apple and Google are fierce competitors in many areas, this collaboration in AI training infrastructure demonstrates the complex interplay of competition and cooperation that defines the tech industry.
Apple's decision to train its AI models on Google's TPUs could have far-reaching implications for the future of AI development. As the demand for advanced AI capabilities continues to grow, the choices made by leading tech companies in their AI training infrastructure will likely influence broader industry trends. With Apple rolling out portions of Apple Intelligence to beta users this week, the tech world will be watching closely to see how these new AI features perform and how they leverage the capabilities of Google's TPUs.