
NVIDIA and Google Cut AI Inference Costs with New Hardware

AI News, April 23, 2026

In brief

At the Google Cloud Next conference, Google and NVIDIA revealed a new hardware plan to make AI inference more affordable.
The companies introduced A5X bare-metal instances powered by NVIDIA's Vera Rubin NVL72 systems.
This collaboration aims to cut costs substantially, by up to a factor of ten for certain tasks, through optimized hardware-software design.
The move is crucial as AI models grow larger, making inference expenses a major concern for businesses and researchers.
By streamlining the infrastructure, this partnership could democratize access to high-performance AI, enabling more developers to deploy advanced models without prohibitive costs.
Looking ahead, the A5X instances are expected to be available in 2024.
This breakthrough could shift how AI is deployed globally, making it faster and cheaper for organizations across industries to integrate intelligent systems into their operations.

Terms in this brief

Vera Rubin NVL72: A high-performance AI system developed by NVIDIA and designed to optimize AI inference tasks. It powers Google's A5X bare-metal instances, with the aim of reducing costs and improving efficiency for businesses and researchers using large AI models.

