Guoxin Securities: AI Arms Race Among Cloud Service Providers Accelerates Development of Intelligent Computing Center Architecture

Zhitongcaijing · 08/25 07:33

According to the Zhitong Finance App, Guoxin Securities released a research report stating that, as capital expenditure (Capex) at global cloud service providers (CSPs) continues to grow rapidly, intelligent computing center connectivity technology is making a generational leap from 400G to 1.6T. Leading AI chip companies are driving the rapid development of intelligent computing centers, while internet cloud providers are developing their own ASIC chips and computing clusters to find paths suited to their own AI development. Intelligent computing center interconnection mainly relies on optical communication and copper cable/copper backplane connections. The report recommends focusing on optical module makers, optical component makers, copper interconnect suppliers, and communication equipment makers.

Guoxin Securities' main views are as follows:

The AI arms race among internet cloud providers (CSPs) has entered its 2.0 era, and intelligent computing center interconnection technology is iterating rapidly

Since ChatGPT (GPT-3.5) ignited the “large model revolution” in 2023, AI development has attracted wide attention, and major technology companies have invested in large model research and development and stepped up construction of intelligent computing centers. Based on the Capex guidance of the CSPs, the four overseas providers Amazon, Google, Microsoft, and Meta are estimated to raise combined Capex to 361 billion US dollars in 2025, an increase of more than 58% over the previous year; in China, the combined Capex of ByteDance, Tencent, and Alibaba is expected to exceed 360 billion yuan. In the early stage of this AI wave, AI chips from Nvidia, the leading AI chip company, were in short supply. As cloud providers continued to increase investment in intelligent computing centers, self-developed ASIC computing chips with better cost performance became the core of a new round of the AI arms race, and the interconnect technology of AI chip clusters has also been upgraded at an accelerating pace. This article mainly discusses the development of intelligent computing center network architecture and future new technologies.
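As a rough check on the overseas Capex figures above, the growth rate can be inverted to back out the implied 2024 baseline. The sketch below assumes the 58% figure is a simple year-over-year rate applied to the combined total; because the report says "more than 58%", the result is an upper bound on the baseline, not a reported number. The function and variable names are illustrative only.

```python
def implied_prior_year(current: float, growth_rate: float) -> float:
    """Back out last year's value from this year's value and the YoY growth rate."""
    return current / (1.0 + growth_rate)


# Figures quoted in the report (combined Amazon, Google, Microsoft, Meta).
capex_2025_usd_bn = 361.0
yoy_growth = 0.58  # "more than 58%", so treated here as a lower bound on growth

capex_2024_usd_bn = implied_prior_year(capex_2025_usd_bn, yoy_growth)
increase_usd_bn = capex_2025_usd_bn - capex_2024_usd_bn

print(f"Implied 2024 Capex: at most about {capex_2024_usd_bn:.1f} billion USD")
print(f"Implied increase:   at least about {increase_usd_bn:.1f} billion USD")
```

Under these assumptions the 2024 combined baseline works out to roughly 228 billion US dollars, so the 2025 guidance implies more than 130 billion US dollars of additional spending in a single year.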

AI chip leader Nvidia is accelerating the iteration of its AI chips, driving the rapid development of intelligent computing centers

The upgrade cadence of Nvidia's P/V/A/H/B chip architectures has accelerated from once every 4 years initially to once every 2 years. Over the past 3 years, AI computing clusters have also evolved from cabinets of 64 AI chips to clusters of 256 or even 288/576 AI chips, and the network connection rate between chips has risen from 400G to the 1.6T now in use. Optical communication, copper cable/backplane connections, liquid cooling, and other technologies involved in intelligent computing center interconnection are all benefiting significantly from the industry's growth. Driven by the development of the AI industry, leading AI chip companies such as Huawei and AMD have successively released supernode computing cluster projects of their own design.

Internet cloud providers (CSPs) are developing their own ASIC chips and computing clusters to find paths suited to their own AI development

(1) Google developed its own ASIC chip, the TPU, as early as 2015 and is currently planning its seventh-generation TPU. Starting with TPU v4, it introduced its original OCS all-optical switching architecture, and starting with TPU v6, it has used 1.6T optical modules for transmission.

(2) Amazon (AWS) has planned its self-developed Trainium chip through the third generation. At the end of last year, the Trainium2 cluster interconnect attracted wide attention for its use of AEC copper cables, and the Trainium3 cluster architecture planned for next year will begin using copper backplane connections.

(3) Meta is developing its own MTIA chips for the first time, but it has been deeply involved in designing data center architectures for many years. The well-known early CLOS-based data center architecture came from Meta, and Meta has also designed dedicated cabinets for Nvidia and AMD chips.

(4) Broadcom, Marvell, and other vendors actively support data center construction for cloud providers around the world.

(5) In China, cloud providers such as Tencent (ETH-X), Alibaba (ALS), and ByteDance are all designing data center architectures around their own needs, and Luxshare and other vendors are actively participating in the design of interconnection solutions.

The optical communication and copper connection markets are growing rapidly, and new technologies such as CPO, copper backplanes, PCIe Switch, OCS, OIO, and DCI hold promise for the future

ASIC chip shipments continue to increase. The firm estimates that global shipments of 800G optical modules will reach 40 million units next year and that shipments of 1.6T optical modules will exceed 7 million units. By 2029, the CPO penetration rate is expected to reach 50% (LightCounting forecast), the OCS market is expected to exceed 1.6 billion US dollars (Cignal AI forecast), the PCIe Switch market is expected to reach 5 billion US dollars (ABI forecast), and the DCI market is expected to reach 28.4 billion US dollars (Mordor Intelligence forecast).
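To put the two shipment estimates on a common footing, unit counts can be weighted by nominal line rate. The sketch below is illustrative arithmetic only: it assumes 800 Gb/s and 1,600 Gb/s per module, takes the 1.6T figure at its stated floor of 7 million units, and ignores every other module speed, so the shares are relative to these two types alone.

```python
# Unit forecasts quoted in the report for next year (1.6T is "more than 7 million").
shipments_units = {"800G": 40_000_000, "1.6T": 7_000_000}
# Nominal line rate per module in Gb/s (assumption: headline rate, no overhead).
rate_gbps = {"800G": 800, "1.6T": 1_600}

# Aggregate nominal capacity shipped per module type.
capacity_gbps = {name: units * rate_gbps[name] for name, units in shipments_units.items()}
total_units = sum(shipments_units.values())
total_capacity_gbps = sum(capacity_gbps.values())

for name in shipments_units:
    unit_share = shipments_units[name] / total_units
    capacity_share = capacity_gbps[name] / total_capacity_gbps
    print(f"{name}: {unit_share:.1%} of units, {capacity_share:.1%} of nominal capacity")
```

Under these assumptions, 1.6T modules account for about 15% of the combined units but about 26% of the combined nominal bandwidth, which is one way to read the report's emphasis on the 1.6T upgrade.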

Risk warning: AI development and investment may fall short of expectations; industry competition may intensify; global geopolitical risks; new technologies may bring changes to the industrial chain.