The Zhitong Finance App has learned that CITIC Construction Investment released a research report stating that agents have become a key driving force for global technology giants. Products on both the consumer side (C-side) and the business side (B-side) have begun iterating rapidly, and all types of agents are expected to enter a phase of rapid deployment next year. Investment in the computing power sector falls into two categories: overseas boom investment and domestic autonomy and control. Overseas boom investment calls for attention to new technologies and incremental changes, while the core of the domestic autonomy and control trend is AI chips, where the report recommends focusing on companies that lead in shipment volume, ecosystem, and product strength. More AI applications will gradually be deployed as domestic large-model capabilities improve, transmission prices drop, and policy support continues.
CITIC Construction Investment's main views are as follows:
Looking ahead to 2025, investments in the computing power sector are divided into two categories: overseas boom investment and domestic autonomy and control:
Overseas boom investment:
1) Valuation fluctuations: The global valuation system for AI computing power is benchmarked against Nvidia and the pace of TSMC's CoWoS capacity expansion. The firm is optimistic about high growth for Nvidia in 2025 and steady growth in 2026;
2) Focus on incremental changes and new-technology investment: the most important development in 2025 is that Nvidia NVL36 and NVL72 cabinets will begin shipping, and the number of AI chips interconnected in a single cabinet will keep rising to support training of models with larger parameter counts. Copper connections, liquid cooling, and power supplies see the biggest changes, and their earnings delivery period begins in 2025. New technologies such as CPO and MPO will continue to mature in 2025;
3) Investment around share changes: as the industrial chain deepens, supplier shares in optical modules, PCBs, and similar segments will shift next year.
Copper connections: Domestic manufacturers have advantages in delivery capacity and product quality, and their share is still increasing.
Based on the overall shipment volume of NVL36 and NVL72 cabinets next year (the NVL72 equivalent is estimated at 40,000 units, and the NVL36 versions total about 30,000 units), the market for in-cabinet high-speed copper cable alone will reach 3.5 billion+. Cables outside the cabinet are also shipping in large quantities, so the incremental demand for high-speed copper cable is significant. Moreover, AMD, Google (TPU), and other major manufacturers will also adopt cabinet solutions next year, and their use of high-speed copper cable will be substantial as well.
Power supply: As the total power consumption of servers (especially AI servers) rises rapidly, server power supplies under the OCP ORV3 standard must keep up by raising power density while maintaining a high conversion efficiency (Titanium level, over 96%).
Better materials, better topologies, and higher integration are the main ways to raise power density. The power supply industry therefore benefits not only from rapidly rising demand driven by higher total power consumption, but also from a higher price per watt driven by material changes and stricter cooling requirements, which together expand the industry's market size quickly. It is worth noting that, given the recent increase in the difficulty of taping out chips for mainland China and the rising share of chips taped out within the mainland, cabinets equipped with domestic AI chips may have higher power requirements.
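To illustrate why conversion efficiency matters at cabinet scale, the following is a minimal sketch that estimates how much power is lost as heat in the power supply for a given load. The 100 kW load and the 94% comparison point are hypothetical values chosen for illustration, not figures from the report; only the 96% efficiency threshold comes from the text above.

```python
def input_power_kw(load_kw: float, efficiency: float) -> float:
    """Power drawn from the grid to deliver `load_kw` at a given efficiency."""
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    return load_kw / efficiency


def conversion_loss_kw(load_kw: float, efficiency: float) -> float:
    """Power dissipated as heat inside the power supply."""
    return input_power_kw(load_kw, efficiency) - load_kw


if __name__ == "__main__":
    LOAD_KW = 100.0  # hypothetical cabinet load, for illustration only
    for eff in (0.94, 0.96):  # 0.94 is an assumed comparison point
        loss = conversion_loss_kw(LOAD_KW, eff)
        print(f"efficiency {eff:.0%}: loss {loss:.2f} kW")
```

Under these assumptions, moving from 94% to 96% efficiency cuts the heat generated in the power supply by roughly a third, which is heat the cooling system no longer has to remove.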
Liquid cooling: As per-card power consumption rises and more cards are packed into a single cabinet, cooling is being upgraded from air to liquid.
1) The Nvidia GB200 consists of two 1200 W GPUs and one 300 W CPU, for a total power consumption of up to 2.7 kW. With single-chip power consumption doubling, its heat dissipation requirements far exceed the capacity of traditional air cooling;
2) The NVL72 cabinet carries 36 GB200 superchips (72 GPUs) at a higher level of integration. System power consumption can reach 120 kW, further raising cooling requirements;
3) Policies strictly control PUE and require more efficient cooling solutions;
4) From a full life cycle perspective, liquid cooling systems have strong operating-cost advantages for a fixed IT load, although they cost more to build than air cooling systems. Assuming liquid cooling costs 9,500 to 10,500 yuan/kW (including outdoor cooling sources) and air cooling costs 3,500 yuan/kW, a single NVL72 cabinet consuming about 120 kW requires a liquid cooling system costing 1.14 to 1.26 million yuan (about 160,000 US dollars), roughly 780,000 yuan more than an air cooling system.
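The power and cost figures in the points above can be reproduced with simple arithmetic. The sketch below is an illustrative check, not part of the report. It assumes a liquid cooling range of 9,500 to 10,500 yuan/kW and an air cooling cost of 3,500 yuan/kW, which is the reading that matches the quoted 1.14 to 1.26 million yuan total at 120 kW. All function and constant names are hypothetical.

```python
# Figures quoted in the report (names are illustrative)
GPU_WATTS = 1200
CPU_WATTS = 300
GPUS_PER_GB200 = 2
GB200_PER_NVL72 = 36
CABINET_KW = 120

# Assumed unit costs in yuan per kW (see the lead-in)
LIQUID_YUAN_PER_KW = (9_500, 10_500)
AIR_YUAN_PER_KW = 3_500


def gb200_watts() -> int:
    """Power of one GB200: two GPUs plus one CPU."""
    return GPUS_PER_GB200 * GPU_WATTS + CPU_WATTS


def cooling_cost_yuan(yuan_per_kw: int, cabinet_kw: int = CABINET_KW) -> int:
    """Construction cost of a cooling system sized for one cabinet."""
    return yuan_per_kw * cabinet_kw


def liquid_premium_yuan() -> int:
    """Midpoint liquid cooling cost minus air cooling cost."""
    low, high = (cooling_cost_yuan(rate) for rate in LIQUID_YUAN_PER_KW)
    return (low + high) // 2 - cooling_cost_yuan(AIR_YUAN_PER_KW)


if __name__ == "__main__":
    print("GB200 power (W):", gb200_watts())
    print("GB200 power per cabinet (kW):", gb200_watts() * GB200_PER_NVL72 / 1000)
    print("Liquid cooling (yuan):", [cooling_cost_yuan(r) for r in LIQUID_YUAN_PER_KW])
    print("Air cooling (yuan):", cooling_cost_yuan(AIR_YUAN_PER_KW))
    print("Liquid premium at midpoint (yuan):", liquid_premium_yuan())
```

Note that the 36 GB200 units account for about 97 kW on their own; the remainder of the roughly 120 kW system figure would have to come from other components in the cabinet, which the report does not itemize.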
Domestic autonomy and control:
Under a package of rules issued by the US Department of Commerce's Bureau of Industry and Security (BIS) at the end of 2023, the highest-performing AI chip available domestically is essentially at the H20 level. Since the H20's FP16 computing power is only 6.7% of the B200's, its performance is not enough to support domestic exploration of models with larger parameter counts, which underscores the urgency of developing domestic AI chips. Going forward, the leading domestic AI chip companies will be those that lead in shipment volume, ecosystem, and product strength. Furthermore, given domestic chip manufacturing processes, and given that domestic Internet customers will start promoting cabinet solutions next year, it is recommended to pay attention to domestic power supply, liquid cooling, and other related investment targets.
Agent: Agents have become a key driving force for global tech giants, with examples including Claude 3.5 Sonnet on the PC side, AutoGLM on mobile phones, and agent products from Salesforce and Microsoft in enterprise business flows. At the same time, swarm intelligence based on multi-agent collaboration is gradually being commercialized. For example, Baidu's Miaoda uses agents to carry out complex, multi-step tasks.
C-side and B-side agents will differ. C-side personal assistants emphasize general capability and the ability to handle everyday scenarios, while B-side agents emphasize expertise. A B-side deployment needs not only a core agent that has an overall view and can accurately generate a business execution flow for a given task, but also a large number of agents with independent skills and expertise that handle specific tasks and communicate with one another. As agents spread, the consumption of inference computing power will rise dramatically, and when multiple agents communicate and collaborate, the consumption of tokens and computing power will increase exponentially.
Autonomous driving: Tesla is expected to release FSD V13 to customers who are not Tesla employees in the last week of November. The main features of this version include native AI4 inputs and neural network architecture, a 3x larger model, a 3x longer model context length, 4.2x more training data, and 5x more training compute (enabled by the Cortex training cluster).
Compared with v12.5.4, v13 increases the mileage between necessary interventions by 4 times. As large models spread through autonomous driving, the stiff behavior of earlier rule-based systems has changed. The experience is now closer to human driving, which reduces the number of takeovers. Measured against L4 driverless operation, however, Tesla's average mileage between takeovers still falls well short of human drivers. Full autonomous driving cannot currently be achieved through single-vehicle intelligence alone, so attention should be paid to domestic vehicle-road-cloud construction.
AI applications empower thousands of industries: Another main battleground for large-model applications is industry applications.
The “Outline of the Strategic Plan for Expanding Domestic Demand (2021-2035)” calls for firmly implementing the strategy to expand domestic demand, cultivating a complete domestic demand system, and promoting the deep integration of 5G, artificial intelligence, big data, and other technologies with transportation and logistics, energy, ecological and environmental protection, water conservancy, emergency response, public services, and other areas, to help improve the governance capacity of the related industries. AI has begun to be deployed in finance, industry, education, transportation, the military, medical care, and other fields.
In finance, large models are gradually becoming better investment research assistants, virtual wealth managers, financial knowledge bases, and so on. In industry, large models have begun to provide human-computer interaction, AIGC-generated samples, and similar functions in CAD and other software. In robotics, robots are rapidly becoming more intelligent once connected to large models, and they have begun to replace people in simple tasks in settings such as factories.
In the military field, Palantir, overseas, has successfully deployed large models as battlefield assistants.
In the field of education, AI is gradually becoming a virtual teacher in more subjects. In the field of transportation, vehicle-road cloud collaboration places higher demands on infrastructure, empowers intelligent traffic management, and can effectively reduce the cost of intelligent driving.
In the medical field, AI based on traditional models already had mature applications in medical imaging, new drug development, and related areas. The advent of generative models has deepened AI development in these areas further. Overall, however, overseas R&D leans toward pharmaceuticals while domestic R&D leans toward health management, and the two application directions differ depending on how effective the large models are.
Risk warning: Expectations of a recession in North America are gradually rising, and the macro environment is highly uncertain. Changes in the international environment may affect the normal production and delivery of relevant companies, causing their shipments to fall short of expectations; demand and capital expenditure for informatization and digitization may fall short of expectations; market competition may intensify, leading to a rapid decline in gross margin; prices of major raw materials may rise, causing gross margin to fall short of expectations; exchange rate fluctuations may affect the exchange gains and gross margins of export-oriented enterprises; the results of large-model algorithm updates and iterations may fall short of expectations, which may affect the evolution and expansion of large models and in turn their commercialization; and progress in automotive and industrial intelligence may fall short of expectations.