Huawei Debuts AI Inference Tech With China UnionPay, Promises 90% Cut in Fir...

Time: 2025-08-13 15:07:15

　　AsianFin — Huawei has unveiled a new AI inference technology designed to slash latency, lower costs and boost the commercial viability of large AI models, as demand shifts from training to inference workloads.

　　The system, called UCM Inference Memory Data Manager, aims to improve the speed and efficiency of AI by caching previously processed results and retrieving them from high-performance shared storage rather than recalculating from scratch. Huawei says the approach can cut “first-token” latency by up to 90%, increase tokens processed per second by as much as 22 times in long-sequence scenarios, and reduce per-token costs — all without major new hardware investments.
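　　The caching idea described above can be illustrated with a minimal sketch. This is not Huawei's implementation: the class `PrefixCache`, its methods, and the placeholder "state" strings are hypothetical, and a plain dictionary stands in for shared storage. The sketch only shows why a request that shares a prefix with an earlier one needs less fresh computation before its first token.

```python
import hashlib


class PrefixCache:
    """Toy stand-in for a KV cache keyed by token prefix (hypothetical design)."""

    def __init__(self):
        self.store = {}    # prefix hash -> cached state (placeholder string)
        self.computed = 0  # number of tokens actually processed so far

    @staticmethod
    def _key(tokens):
        # Hash the exact token sequence so equal prefixes map to the same key.
        return hashlib.sha256(repr(tuple(tokens)).encode("utf-8")).hexdigest()

    def longest_cached_prefix(self, tokens):
        # Search from the longest prefix down to the shortest.
        for n in range(len(tokens), 0, -1):
            if self._key(tokens[:n]) in self.store:
                return n
        return 0

    def prefill(self, tokens):
        """Process a prompt, reusing any cached prefix.

        Returns how many tokens had to be computed from scratch.
        """
        start = self.longest_cached_prefix(tokens)
        for n in range(start + 1, len(tokens) + 1):
            self.computed += 1
            self.store[self._key(tokens[:n])] = f"state-after-{n}-tokens"
        return len(tokens) - start


cache = PrefixCache()
first = cache.prefill(["system", "prompt", "question-1"])
second = cache.prefill(["system", "prompt", "question-1", "follow-up"])
print(first, second)
```

　　In this toy model the second request recomputes only the one new token, because the three-token prefix is already cached. A real system caches attention key/value tensors rather than strings, and the savings depend on how much of each prompt is shared.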

　　UCM consists of three key modules: connectors that integrate with popular inference engines, an accelerator library for hierarchical KV Cache management, and an adapter that speeds up access to professional shared storage. By coordinating inference frameworks, computing power, and storage, Huawei says the system addresses industry pain points of “slow” and “expensive” inference.
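　　As a rough illustration of what hierarchical cache management can mean, the sketch below keeps a small fast tier and demotes evicted entries to a larger slow tier instead of discarding them. The two-tier layout, the least-recently-used eviction policy, and the name `TieredCache` are assumptions made for this example; the article does not describe UCM's actual tiers or policies.

```python
from collections import OrderedDict


class TieredCache:
    """Two-tier cache sketch: small LRU fast tier plus a large slow tier."""

    def __init__(self, fast_capacity):
        self.fast = OrderedDict()  # stands in for accelerator or host memory
        self.slow = {}             # stands in for shared storage
        self.fast_capacity = fast_capacity

    def put(self, key, value):
        self.fast[key] = value
        self.fast.move_to_end(key)  # mark as most recently used
        while len(self.fast) > self.fast_capacity:
            # Demote the least recently used entry rather than dropping it.
            old_key, old_value = self.fast.popitem(last=False)
            self.slow[old_key] = old_value

    def get(self, key):
        """Return (value, tier), where tier is 'fast', 'slow', or 'miss'."""
        if key in self.fast:
            self.fast.move_to_end(key)
            return self.fast[key], "fast"
        if key in self.slow:
            value = self.slow.pop(key)
            self.put(key, value)  # promote back into the fast tier
            return value, "slow"
        return None, "miss"


cache = TieredCache(fast_capacity=2)
for name in ("a", "b", "c"):
    cache.put(name, name.upper())
print(cache.get("a"))  # "a" was demoted when "c" arrived, so it comes from the slow tier
```

　　A miss in both tiers is the only case that would force recomputation, which is the trade the article describes: storage reads in place of repeated compute.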

　　Huawei is piloting UCM with China UnionPay in high-frequency financial scenarios, where response time and accuracy are critical. UnionPay reported that using UCM cut model inference times for customer-service classification from 600 seconds to under 10 seconds, an improvement of at least 60x by those figures, while boosting classification accuracy from under 10% to 80%.

　　According to Huawei, demand for inference computing power now exceeds training demand, accounting for 58.5% of workloads. But China’s AI sector faces higher latency, slower output speeds and smaller context windows than leading overseas models, partly due to lower infrastructure investment and limited access to advanced chips.

　　Huawei plans to open source UCM in September, making it compatible with multiple inference engines, storage systems, and hardware vendors. The company says it hopes to rally industry players around common standards for AI inference acceleration.
