AsianFin — Huawei has unveiled a new AI inference technology designed to slash latency, lower costs and boost the commercial viability of large AI models, as demand shifts from training to inference workloads.
The system, called UCM Inference Memory Data Manager, aims to improve the speed and efficiency of AI by caching previously processed results and retrieving them from high-performance shared storage rather than recalculating from scratch. Huawei says the approach can cut “first-token” latency by up to 90%, increase tokens processed per second by as much as 22 times in long-sequence scenarios, and reduce per-token costs — all without major new hardware investments.
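The caching idea described above can be illustrated with a small sketch. This is not Huawei's code or API, which had not been published at the time of writing; the names `PrefixCache`, `block_keys`, `prefill`, and the block size of 4 tokens are all hypothetical, and the cached values are placeholders rather than real key-value tensors. The sketch only shows why a repeated prompt prefix can be looked up instead of recomputed.

```python
import hashlib
from typing import Dict, List, Tuple

BLOCK = 4  # tokens per cached block (hypothetical; chosen small for readability)


def block_keys(tokens: List[int]) -> List[str]:
    """Return one key per full block; each key hashes the whole prefix up to that block."""
    keys = []
    h = hashlib.sha256()
    full = len(tokens) - len(tokens) % BLOCK  # ignore a trailing partial block
    for i in range(0, full, BLOCK):
        h.update(",".join(map(str, tokens[i:i + BLOCK])).encode() + b";")
        keys.append(h.copy().hexdigest())
    return keys


class PrefixCache:
    """Toy cache: maps prefix-block keys to a placeholder for stored KV data."""

    def __init__(self) -> None:
        self.store: Dict[str, str] = {}

    def cached_prefix_len(self, tokens: List[int]) -> int:
        """Number of leading tokens whose blocks are already cached."""
        n = 0
        for key in block_keys(tokens):
            if key not in self.store:
                break
            n += BLOCK
        return n

    def insert(self, tokens: List[int]) -> None:
        for key in block_keys(tokens):
            self.store.setdefault(key, "kv-placeholder")


def prefill(cache: PrefixCache, tokens: List[int]) -> Tuple[int, int]:
    """Return (tokens reused from cache, tokens that still need computing)."""
    reused = cache.cached_prefix_len(tokens)
    cache.insert(tokens)
    return reused, len(tokens) - reused


cache = PrefixCache()
first = prefill(cache, list(range(1, 11)))              # nothing cached yet
second = prefill(cache, list(range(1, 9)) + [99, 100])  # shares the first 8 tokens
```

In this toy run the second request reuses two full blocks and computes only its last two tokens, which is the mechanism behind lower first-token latency on prompts that share long prefixes.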
UCM consists of three key modules: connectors that integrate with popular inference engines, an accelerator library for hierarchical KV Cache management, and an adapter that speeds up access to professional shared storage. By coordinating inference frameworks, computing power, and storage, Huawei says the system addresses industry pain points of “slow” and “expensive” inference.
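As a rough illustration of what hierarchical KV Cache management can mean, the sketch below keeps a small fast tier and spills older entries to a larger slow tier. It is an assumption-laden toy, not the UCM accelerator library: the class name `TieredKVCache`, the least-recently-used eviction policy, and the use of in-memory dictionaries to stand in for device memory and shared storage are all illustrative choices.

```python
from collections import OrderedDict
from typing import Dict, Optional


class TieredKVCache:
    """Toy two-tier cache: a small LRU fast tier backed by a larger slow tier."""

    def __init__(self, fast_capacity: int) -> None:
        self.fast_capacity = fast_capacity
        # Stands in for fast memory close to the accelerator (assumption).
        self.fast: "OrderedDict[str, bytes]" = OrderedDict()
        # Stands in for shared storage (assumption).
        self.slow: Dict[str, bytes] = {}

    def put(self, key: str, value: bytes) -> None:
        self.fast[key] = value
        self.fast.move_to_end(key)  # mark as most recently used
        while len(self.fast) > self.fast_capacity:
            old_key, old_value = self.fast.popitem(last=False)
            self.slow[old_key] = old_value  # demote the least recently used entry

    def get(self, key: str) -> Optional[bytes]:
        if key in self.fast:
            self.fast.move_to_end(key)
            return self.fast[key]
        if key in self.slow:
            value = self.slow.pop(key)
            self.put(key, value)  # promote back into the fast tier
            return value
        return None  # miss in both tiers: the caller must recompute


kv = TieredKVCache(fast_capacity=2)
kv.put("a", b"A")
kv.put("b", b"B")
kv.put("c", b"C")      # "a" is demoted to the slow tier
hit = kv.get("a")      # found in the slow tier and promoted; "b" is demoted
```

The design point is that a miss in the fast tier costs a storage read instead of a full recomputation, so capacity can grow without adding accelerator memory.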
Huawei is piloting UCM with China UnionPay in high-frequency financial scenarios, where response time and accuracy are critical. UnionPay reported that using UCM cut model inference times for customer-service classification from 600 seconds to under 10 seconds, a speedup of roughly 60x on those figures, while boosting classification accuracy from under 10% to 80%.
According to Huawei, demand for inference computing power now exceeds training demand, accounting for 58.5% of workloads. But China’s AI sector faces higher latency, slower output speeds and smaller context windows than leading overseas models, partly due to lower infrastructure investment and limited access to advanced chips.
Huawei plans to open source UCM in September, making it compatible with multiple inference engines, storage systems, and hardware vendors. The company says it hopes to rally industry players around common standards for AI inference acceleration.