Norway's National Library Develops a Sovereign LLM Using Huawei Flash Storage
In brief
- Norway’s National Library is building its own large language model (LLM) to understand the Norwegian language.
- The project uses 2 petabytes of Huawei OceanStor Dorado flash storage for training data.
- Marius Husnes, Head of IT Platform at the library, revealed that no commercial LLM provider currently offers a local Norwegian-language model.
- This puts countries with unique languages at a disadvantage since globally trained English models miss out on local history, culture, and news.
- The National Library was tasked by Norway’s Ministry of Culture to develop this sovereign AI due to its extensive digital collection of Norwegian books, newspapers, and web content.
- The library has digitized over 20 PB of data under its legal deposit mandate, stored in a 3-2-1 preservation system (three copies, two media types, one off-site).
- This unique access gives the library an edge over private companies.
- The main challenges involve data quality and pipeline throughput, not compute power.
- The library uses an Nvidia DGX H200 system, a CPU cluster, and Huawei flash arrays for preprocessing.
- Once ready, data is sent to Norway’s national supercomputer, Sigma2 Olivia, for training.
- The project shows how a national cultural-heritage collection can underpin a sovereign AI model, provided the data quality and pipeline hurdles are overcome.
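The 3-2-1 rule mentioned in the brief (three copies, two media types, one off-site) can be written as a simple check. The sketch below is illustrative only: the `StoredCopy` record, its fields, and the media labels are assumptions made for the example and do not describe the library's actual preservation system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredCopy:
    """One stored copy of a digital object (hypothetical record)."""
    medium: str    # illustrative label such as "disk" or "tape"
    offsite: bool  # True if the copy is held away from the primary site


def satisfies_3_2_1(copies: list[StoredCopy]) -> bool:
    """Return True if the copies meet the 3-2-1 rule."""
    enough_copies = len(copies) >= 3                    # at least three copies
    enough_media = len({c.medium for c in copies}) >= 2  # at least two media types
    has_offsite = any(c.offsite for c in copies)         # at least one copy off-site
    return enough_copies and enough_media and has_offsite
```

For example, two on-site copies on different media plus one off-site copy pass the check, while three copies on the same medium do not.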
Terms in this brief
- Flash Storage
- A storage technology that keeps data in solid-state flash memory, with no moving parts. It is faster than traditional hard drives and is used where quick access to data is crucial, such as in servers and smartphones.
Read full story at Hacker News →
More briefs
Major Update to EAGLE 3.1 Enhances Speculative Decoding Capabilities
A new version of the EAGLE algorithm, EAGLE 3.1, has been released, focusing on improving robustness and efficiency in speculative decoding. Previous versions faced issues like "attention drift," where models lost focus during complex tasks. The update introduces two key changes: FC normalization after each hidden state and feeding normalized states into the next step, making the model more stable. This results in better performance across various scenarios, including longer contexts and diverse prompts, with up to double the acceptance length compared to EAGLE 3. Now integrated with TorchSpec for easier training and vLLM for deployment, EAGLE 3.1 sets a new standard for large language models, promising more reliable AI interactions in real-world applications.
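"Acceptance length" in speculative decoding refers to how many tokens are accepted per draft-and-verify cycle. The sketch below illustrates the idea for the simplified greedy case, where a draft token is accepted only if it matches the target model's choice at the same position and acceptance stops at the first mismatch. It is not EAGLE's implementation: the function names and the token lists are hypothetical, and conventions differ on whether the extra token emitted by the target model is counted (this sketch counts accepted draft tokens only).

```python
def accepted_prefix_length(draft_tokens, target_tokens):
    """Count draft tokens accepted before the first mismatch (greedy verification)."""
    count = 0
    for drafted, verified in zip(draft_tokens, target_tokens):
        if drafted != verified:
            break  # everything after the first mismatch is discarded
        count += 1
    return count


def mean_acceptance_length(cycles):
    """Average accepted draft tokens over (draft, target) pairs, one pair per cycle."""
    lengths = [accepted_prefix_length(d, t) for d, t in cycles]
    return sum(lengths) / len(lengths) if lengths else 0.0
```

A longer average accepted prefix means fewer verification passes by the large model for the same amount of generated text, which is why the metric is used to compare draft methods.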
NFL Team Embraces AI with Microsoft Copilot
The New York Jets' front office has rapidly adopted Microsoft Copilot, with over 91% of staff now using the AI tool daily. Just a few months ago, only a handful of employees used it, but today they’re averaging two to three prompts each day. This shift marks a cultural change toward "AI-first" thinking in business and football analytics. While AI offers efficiency gains, concerns remain about job displacement and its long-term societal impact. The Jets aim to leverage AI for success, but time is tight: 30 more Super Bowl chances before a potential major disruption.
AI-Generated History Vloggers Gain Popularity
YouTube channels such as Chloe VS History have drawn more than 15 million views by using AI video generation tools to recreate historical settings. The goal is to get younger people more interested in history. Audiences for AI-generated history videos are expected to keep growing.
AI Agents Get Smarter Memories
AI agents now have access to a new type of memory system called Governed Evolving Memory (GEM). This breakthrough addresses four major issues that plague current systems: uncontrolled growth, missing semantic revision, capacity-driven forgetting, and read-only retrieval. GEM operates on the state trajectory rather than individual records, making it more efficient and reliable. The new system introduces four key operations (ingestion, revision, forgetting, and retrieval) that work together to manage memory effectively. This approach ensures that AI agents can learn from past decisions without losing important information or becoming overwhelmed by data. The researchers behind GEM also developed a prototype called MemState, which demonstrates the feasibility of this novel memory management approach. Looking ahead, the team identifies three key areas for future research: improving the efficiency of state-level operations, developing a native engine for GEM, and exploring how this new memory model can be applied across different industries. These advancements could pave the way for AI agents with truly long-term, evolving memories.
Datavault AI Spins Out Acoustic Sciences
Datavault AI plans to spin out its Acoustic Sciences division into a separate entity. The company is also preparing to launch multiple data tokenization exchanges, with the aim of building proprietary data exchange platforms across regulated markets. These moves mark a shift in how Datavault AI organizes its operations and approaches data monetization. The company recently reported first-quarter 2026 revenue of US$3.42 million and a net loss of US$53.13 million.