Scientists from the Korea Advanced Institute of Science and Technology (KAIST) have developed an energy-efficient NPU technology that demonstrates significant performance improvements in laboratory testing.
In controlled experiments, their specialised AI chip ran 60% faster while using 44% less electricity than the GPUs that currently power most AI systems.
The research, led by Professor Jongse Park from KAIST's School of Computing in collaboration with HyperAccel Inc., tackles one of the most pressing challenges in modern AI infrastructure: the enormous energy and hardware requirements of large-scale generative AI models.
Current systems such as OpenAI's ChatGPT-4 and Google's Gemini 2.5 demand not only high memory bandwidth but also substantial memory capacity, driving companies like Microsoft and Google to purchase hundreds of thousands of NVIDIA GPUs.
Tackling the memory bottleneck
The core innovation lies in the team's approach to the memory bottleneck problems that plague existing AI infrastructure. Their energy-efficient NPU technology focuses on "lightweight" inference while minimising accuracy loss, a critical balance that has proven difficult for previous solutions.
PhD student Minsu Kim and Dr Seongmin Hong of HyperAccel Inc., serving as co-first authors, presented their findings at the 2025 International Symposium on Computer Architecture (ISCA) in Tokyo. The paper, titled "Oaken: Fast and Efficient LLM Serving with Online-Offline Hybrid KV Cache Quantization," details their approach to the problem.
The technology centres on quantizing the KV cache, which the researchers identify as accounting for most of the memory usage in generative AI systems. By optimising this component, the team can deliver the same level of AI infrastructure performance using fewer NPU devices than traditional GPU-based systems.
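To see why the KV cache dominates memory usage, it helps to run the numbers. The sketch below uses the standard KV cache sizing formula for a transformer decoder; the model dimensions are illustrative assumptions for a generic 7B-class model, not figures from the KAIST paper.

```python
# Back-of-the-envelope KV cache sizing for a transformer decoder.
# All model dimensions below are illustrative assumptions, not figures
# from the Oaken paper.

def kv_cache_bytes(layers: int, kv_heads: int, head_dim: int,
                   seq_len: int, batch: int, bytes_per_elem: int) -> int:
    """Memory for keys + values across all layers (hence the factor 2)."""
    return 2 * layers * kv_heads * head_dim * seq_len * batch * bytes_per_elem

# A hypothetical 7B-class model: 32 layers, 32 KV heads of dim 128,
# serving a batch of 16 requests at 4,096 tokens each.
fp16 = kv_cache_bytes(32, 32, 128, 4096, 16, 2)       # 16-bit cache
int4 = fp16 // 4                                       # ~4-bit quantized cache

print(f"fp16 KV cache: {fp16 / 2**30:.1f} GiB")        # 32.0 GiB
print(f"~4-bit KV cache: {int4 / 2**30:.1f} GiB")      # 8.0 GiB
```

At these (assumed) dimensions the fp16 KV cache alone consumes 32 GiB, far more than a 7B model's weights, which is why quantizing it to roughly 4 bits is such a high-leverage optimisation.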
Technical innovation and architecture
The KAIST energy-efficient NPU technology employs a three-pronged quantization algorithm: threshold-based online-offline hybrid quantization, group-shift quantization, and fused dense-and-sparse encoding. This approach allows the system to integrate with existing memory interfaces without requiring changes to the operational logic of current NPU architectures.
The hardware architecture incorporates page-level memory management techniques to make efficient use of limited memory bandwidth and capacity. In addition, the team introduced new encoding techniques specifically optimised for the quantized KV cache, addressing the unique requirements of their approach.
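The general idea behind threshold-based dense-and-sparse quantization can be sketched as follows: values inside a threshold (which could be chosen offline) are quantized to a low-bit format, while the rare outliers beyond it are stored separately at full precision. This is a simplified illustration of the concept only; the Oaken paper's actual algorithm, with its online-offline hybrid thresholds, group-shift quantization, and fused encoding, is considerably more involved.

```python
import numpy as np

def quantize_hybrid(x: np.ndarray, threshold: float, bits: int = 4):
    """Split x into low-bit inliers plus a sparse set of fp32 outliers."""
    outlier_mask = np.abs(x) > threshold
    inliers = np.where(outlier_mask, 0.0, x)
    scale = threshold / (2 ** (bits - 1) - 1)      # map [-t, t] onto int range
    q = np.clip(np.round(inliers / scale),
                -(2 ** (bits - 1)), 2 ** (bits - 1) - 1).astype(np.int8)
    # Sparse side: only the outliers' positions and full-precision values.
    sparse = (np.nonzero(outlier_mask), x[outlier_mask])
    return q, scale, sparse

def dequantize_hybrid(q, scale, sparse):
    x = q.astype(np.float32) * scale
    idx, vals = sparse
    x[idx] = vals                                   # restore exact outliers
    return x

rng = np.random.default_rng(0)
x = rng.normal(size=1024).astype(np.float32)
q, scale, sparse = quantize_hybrid(x, threshold=2.0)
err = np.abs(dequantize_hybrid(q, scale, sparse) - x).max()
print(f"max reconstruction error: {err:.4f}")       # bounded by scale / 2
```

Note that the 4-bit values are held in an int8 array here for simplicity; real implementations pack two values per byte, which is where the memory saving actually comes from.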
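Page-level KV cache management can be sketched in a few lines: the cache is carved into fixed-size pages that are allocated on demand as each sequence grows, instead of reserving worst-case memory up front. This is a generic illustration in the spirit of paged attention, not the specific hardware mechanism in the KAIST NPU.

```python
class PagedKVAllocator:
    """Toy page-level allocator for a KV cache (illustrative sketch)."""

    def __init__(self, num_pages: int, tokens_per_page: int):
        self.free = list(range(num_pages))   # pool of physical page ids
        self.tokens_per_page = tokens_per_page
        self.page_table = {}                 # seq_id -> list of page ids
        self.lengths = {}                    # seq_id -> token count

    def append_token(self, seq_id: int) -> None:
        """Reserve KV space for one more token of the given sequence."""
        n = self.lengths.get(seq_id, 0)
        if n % self.tokens_per_page == 0:    # current page full (or first token)
            if not self.free:
                raise MemoryError("KV cache exhausted")
            self.page_table.setdefault(seq_id, []).append(self.free.pop())
        self.lengths[seq_id] = n + 1

    def release(self, seq_id: int) -> None:
        """Return a finished sequence's pages to the free pool."""
        self.free.extend(self.page_table.pop(seq_id, []))
        self.lengths.pop(seq_id, None)

alloc = PagedKVAllocator(num_pages=4, tokens_per_page=16)
for _ in range(20):                          # 20 tokens spill into a 2nd page
    alloc.append_token(0)
print(len(alloc.page_table[0]), "pages used")   # 2 pages used
alloc.release(0)
print(len(alloc.free), "pages free")            # 4 pages free
```

Because pages are only committed as tokens arrive, short sequences never strand the memory a maximum-length reservation would have claimed.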
"Through joint research with HyperAccel Inc., we found a solution in lightweight generative AI inference algorithms and developed a core NPU technology that can solve the memory problem," Professor Park explained.
"With this technology, we implemented NPUs with more than 60% improved performance compared to the latest GPUs by combining quantization techniques that reduce memory requirements while maintaining inference accuracy."
Sustainability implications
The environmental impact of AI infrastructure has become a growing concern as generative AI adoption accelerates. The energy-efficient NPU technology developed at KAIST offers a potential path toward more sustainable AI operations.
With 44% lower energy consumption than current GPU solutions, widespread adoption could meaningfully reduce the carbon footprint of AI cloud services. However, the technology's real-world impact will depend on several factors, including manufacturing scalability, cost effectiveness, and industry adoption rates.
The researchers acknowledge that while their solution represents a significant step forward, widespread implementation will require continued development and industry cooperation.
Industrial context and future outlook
The timing of this energy-efficient NPU breakthrough is particularly relevant as AI companies face mounting pressure to balance performance with sustainability. The current GPU-dominated market has created supply chain constraints and rising costs, making alternative solutions increasingly attractive.
Professor Park noted that the technology "demonstrates the possibility of implementing high-performance, low-power AI infrastructure specialised for generative AI, and is expected to play a key role not only in AI cloud data centres but also in AI transformation (AX) environments, such as agentic AI."
The research represents a significant step toward more sustainable AI infrastructure, but its ultimate impact will be determined by how effectively it can be scaled and deployed in commercial environments. As the AI industry continues to grapple with energy consumption concerns, innovations like KAIST's energy-efficient NPU technology offer hope for a more sustainable future in artificial intelligence computing.
(Photo from Korea Advanced Institute of Science and Technology)
See also: 6 practices that ensure more sustainable data center operations


Want to learn more about cybersecurity and the cloud from industry leaders? Check out Cyber Security & Cloud Expo taking place in Amsterdam, California, and London.
Explore other upcoming enterprise technology events and webinars powered by TechForge here.