
Huawei Debuts AI Inference Tech With China UnionPay, Promises 90% Cut in Fir...

Time: 2025-08-13 15:07:15

  
 


  AsianFin — Huawei has unveiled a new AI inference technology designed to slash latency, lower costs and boost the commercial viability of large AI models, as demand shifts from training to inference workloads.

  The system, called UCM Inference Memory Data Manager, aims to improve the speed and efficiency of AI by caching previously processed results and retrieving them from high-performance shared storage rather than recalculating from scratch. Huawei says the approach can cut “first-token” latency by up to 90%, increase tokens processed per second by as much as 22 times in long-sequence scenarios, and reduce per-token costs — all without major new hardware investments.

  UCM consists of three key modules: connectors that integrate with popular inference engines, an accelerator library for hierarchical KV Cache management, and an adapter that speeds up access to professional shared storage. By coordinating inference frameworks, computing power, and storage, Huawei says the system addresses industry pain points of “slow” and “expensive” inference.
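  The caching idea described above can be illustrated with a toy two-tier cache. This is a minimal sketch under stated assumptions, not Huawei's UCM code or API: the names `TieredPrefixCache`, `prefix_key` and `fake_prefill` are hypothetical, the "fast" tier stands in for on-device memory, the "shared" tier stands in for shared storage, and the cached value is a placeholder rather than real KV state.

```python
import hashlib
from collections import OrderedDict


def prefix_key(token_ids):
    """Hash a token prefix so identical prefixes map to the same cache entry."""
    raw = ",".join(str(t) for t in token_ids).encode("utf-8")
    return hashlib.sha256(raw).hexdigest()


class TieredPrefixCache:
    """Hypothetical two-tier cache: a small LRU fast tier in front of a shared tier."""

    def __init__(self, fast_capacity=2):
        self.fast = OrderedDict()  # stand-in for on-device memory (assumption)
        self.shared = {}           # stand-in for shared storage (assumption)
        self.fast_capacity = fast_capacity
        self.computed = 0          # number of times the result was recomputed

    def _put_fast(self, key, value):
        self.fast[key] = value
        self.fast.move_to_end(key)
        # Demote the least recently used entries instead of discarding them.
        while len(self.fast) > self.fast_capacity:
            old_key, old_value = self.fast.popitem(last=False)
            self.shared[old_key] = old_value

    def get_or_compute(self, token_ids, compute):
        """Return (value, source), where source is 'fast', 'shared' or 'computed'."""
        key = prefix_key(token_ids)
        if key in self.fast:
            self.fast.move_to_end(key)
            return self.fast[key], "fast"
        if key in self.shared:
            value = self.shared.pop(key)
            self._put_fast(key, value)  # promote back into the fast tier
            return value, "shared"
        value = compute(token_ids)      # cache miss: pay the full cost once
        self.computed += 1
        self._put_fast(key, value)
        return value, "computed"


def fake_prefill(token_ids):
    """Placeholder for the expensive prefill step that would build KV state."""
    return [t * 2 for t in token_ids]
```

  In this sketch a repeated prefix is served from one of the two tiers instead of being recomputed, which is the mechanism the article credits for lower first-token latency. A real system would store attention key/value tensors and handle partial prefix matches, both of which are omitted here.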

  Huawei is piloting UCM with China UnionPay in high-frequency financial scenarios, where response time and accuracy are critical. UnionPay reported that using UCM cut model inference times for customer-service classification from 600 seconds to under 10 seconds, a reported 50x improvement, while boosting classification accuracy from under 10% to 80%.

  According to Huawei, demand for inference computing power now exceeds training demand, accounting for 58.5% of workloads. But China’s AI sector faces higher latency, slower output speeds and smaller context windows than leading overseas models, partly due to lower infrastructure investment and limited access to advanced chips.

  Huawei plans to open source UCM in September, making it compatible with multiple inference engines, storage systems, and hardware vendors. The company says it hopes to rally industry players around common standards for AI inference acceleration.
