
NeuReality boosts AI accelerator utilization with NAPU

Israeli startup NeuReality is working to overhaul AI inference systems in data centers. The company plans to replace the traditional host CPU with a new class of chip, the Network Addressable Processing Unit (NAPU), which implements typical CPU functions, such as the hypervisor, in hardware. NeuReality's goal is to eliminate the performance bottlenecks caused by existing CPUs, significantly increasing the utilization of AI accelerators and in turn reducing operating costs and energy consumption. NeuReality's CEO says the NAPU can drive AI accelerators to 100 percent utilization.

AI accelerator utilization varies from application to application, both in the cloud and in local testing. In some cases, GPU or ASIC utilization may reach only 25 to 30 percent; in other cases, such as running large language models (LLMs), the CPU sits largely idle because the system is limited mainly by GPU and memory-interface performance. Either way, the economics of current servers look questionable when accelerators specialized for inference are in use.
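The economics argument above can be made concrete with a toy cost model. The figures below (server cost per hour, peak query rate) are illustrative assumptions, not NeuReality or vendor numbers; the point is only that cost per query scales inversely with accelerator utilization.

```python
# Hypothetical cost model: why low accelerator utilization inflates the
# effective cost per inference query. All numbers are illustrative
# assumptions, not vendor figures.

def cost_per_million_queries(server_cost_per_hour: float,
                             peak_qps: float,
                             utilization: float) -> float:
    """Cost to serve 1M queries when the accelerator sustains only
    `utilization` (0..1) of its peak queries-per-second."""
    effective_qps = peak_qps * utilization
    hours_per_million = 1_000_000 / (effective_qps * 3600)
    return server_cost_per_hour * hours_per_million

# Same server, same hourly cost: only utilization differs.
low = cost_per_million_queries(server_cost_per_hour=4.0, peak_qps=1000, utilization=0.30)
high = cost_per_million_queries(server_cost_per_hour=4.0, peak_qps=1000, utilization=1.00)

print(f"cost/1M queries at 30% utilization:  ${low:.2f}")
print(f"cost/1M queries at 100% utilization: ${high:.2f}")
print(f"cost ratio: {low / high:.2f}x")
```

At 30 percent utilization, each served query costs roughly 3.3 times what it would at full utilization, which is the gap NeuReality claims the NAPU closes.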

Current AI server configurations typically pair each AI accelerator with two CPUs and a network interface controller (NIC), and sometimes a data processing unit (DPU) or SmartNIC. Such a server can host multiple virtual machines, with the CPU handling tasks such as network termination, maintaining quality of service between clients, and pre-processing data before it is sent to the accelerator. As AI accelerator performance continues to increase, the underutilization problem will become more pronounced, since the CPU remains the performance constraint. CPUs are powerful but essentially general-purpose processors: they are not designed specifically for AI tasks and therefore cannot exploit the full performance of AI accelerators when processing AI requests.

NeuReality addresses the utilization problem by decoupling the AI processing flow from the CPU. The company hardens CPU tasks such as network termination and quality of service into a heterogeneous compute chip designed for AI inference workloads in large-scale production environments. "The NAPU is not an 'AI CPU' but dedicated silicon designed for AI inference servers in data centers, built to handle the volume and diversity of queries that characterize modern AI inference," the CEO emphasized, adding that the NAPU is network-attached, meaning AI queries can be sent to it directly over Ethernet.

Performance data the company has provided for its first-generation NAPU, the NR1, shows that replacing the host CPU with the NR1 improves an AI accelerator ASIC's performance per watt by a factor of about eight. While the NR1 was designed around the IBM AIU, it is general-purpose and can be used with any AI accelerator.

NeuReality's NAPU product line comes in two forms: the NR1-S appliance, designed for servers that do not require a CPU, and the NR1-M module, which plugs into CPU-equipped server racks to offload the CPU's processing tasks.

The company's current focus spans automatic speech recognition (ASR), natural language processing (NLP), fraud detection, secure telemedicine services, patient AI query search, and computer vision. The CEO believes, however, that larger market opportunities will emerge as generative AI inference is developed and deployed at scale. He emphasized that driving widespread adoption of generative AI in key industries requires reducing costs and improving affordability, and that NeuReality is committed to making traditional AI applications more economically sustainable, creating conditions conducive to the development of generative AI.
