On the evening of June 6, Facewall Intelligence released MiniCPM 4.0, an on-device (edge-side) large model. The company said that, running on its self-developed CPM.cu inference framework, the new model is up to 220 times faster in extreme scenarios and 5 times faster in regular ones, and supports deployment on frameworks such as vLLM, SGLang, and LLAMAFactory.

Zhitongcaijing · 06/07 01:09