Qualcomm Enters Data Center Market with Dragonfly Platform
In brief
- Qualcomm is entering the data center market with its Dragonfly platform.
- The company aims to make $15 billion in revenue by 2029.
- Qualcomm's strategy is based on three key strengths.
- It has a novel memory-first architecture for superior efficiency.
- The company also acquired Modular to provide a hardware-agnostic software stack.
- Qualcomm has deep expertise in connectivity to address data center bottlenecks.
- The company will offer Arm-based Oryon CPUs and custom silicon.
- Microsoft and Meta have already made early commitments.
- Qualcomm is now a strong contender in the AI landscape.
- It will compete in the data center market next year.
Terms in this brief
- Dragonfly Platform
- The Dragonfly Platform is Qualcomm's entry into the data center market, designed to compete with companies like NVIDIA. It features a novel memory-first architecture for efficiency and offers Arm-based Oryon CPUs and custom silicon. Microsoft and Meta have already made early commitments.
- Modular
- Qualcomm acquired Modular to provide a hardware-agnostic software stack, challenging NVIDIA's CUDA. The stack is meant to let the Dragonfly Platform work across different hardware types. Separately, Qualcomm is relying on its connectivity expertise to address data center bottlenecks.
Read full story at Forbes →
More briefs
Utah Launches AI Prescription Pilot Program
Utah has launched a pilot program to allow AI chatbots to fill prescriptions for common health conditions. The program allows AI to fill birth control prescriptions and medication for asthma, diabetes, and other conditions. The pilot program aims to automate routine prescription renewals, which could lighten clinician workload and improve patient access to medication. For example, the program could help patients with diabetes get their prescriptions refilled more quickly and easily. The program has generated criticism from the medical sector, with concerns about liability and the potential for life-threatening reactions to medications. The future of AI in healthcare will likely depend on addressing these concerns and finding a balance between innovation and patient safety.
Amtrak Embraces Artificial Intelligence
Amtrak is using artificial intelligence to modernize its operations, including ticket reservations and other processes. The company carried 35 million passengers last year. Making this work requires good data and effective change management, and Amtrak has already consolidated its human resources and supply chain databases, which helps the company make better decisions. It plans to keep using technology to modernize its systems, improve productivity, and make life easier for its thousands of employees.
AI Infrastructure Shifts to Heterogeneous Racks
Arm is rethinking the AI CPU as AI infrastructure becomes more specialized. The first phase of generative AI infrastructure focused on accelerator scale. The next phase is about rack-scale system composition, with heterogeneous AI racks optimized for different phases of the workflow. This shift matters because inference is changing structurally, with more time spent coordinating work across accelerators, memory, storage, networking and software services. AI infrastructure is becoming more specialized at each stage of the inference pipeline, with a growing separation between prefill and decode phases. The future of AI infrastructure will be shaped by these specialized heterogeneous racks.
AI Inference Gets a Memory Boost: New Techniques Reduce GPU Bottlenecks
AI models are getting bigger, and so are the demands they place on GPUs. Traditionally, these powerful graphics cards have been the workhorses for running inference tasks like image generation or natural language processing. But as models grow more complex, their memory needs outpace what even high-end GPUs can offer. Now, researchers are experimenting with ways to split AI workloads across multiple GPUs, effectively pooling their resources to handle larger datasets and more intricate computations. This development is crucial for developers building pipelines for media generation and other computationally intensive tasks. By distributing the workload, these new techniques aim to make large language models and generative AI more accessible, even with hardware limitations. While the exact performance improvements are still being tested, early results suggest a significant boost in efficiency without sacrificing model quality. Looking ahead, experts predict that this multi-GPU approach will become standard as AI models continue to evolve. Users can expect to see more tools and frameworks optimized for distributed inference, making it easier to scale up their projects without hitting memory walls.
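To make the idea of pooling GPU memory concrete, here is a minimal Python sketch of the underlying arithmetic. It is not taken from the research described above. The function names, the 80% usable-memory headroom, and the example model size are all illustrative assumptions. It estimates how many GPUs are needed just to hold a model's weights, then assigns contiguous groups of layers to each GPU.

```python
from math import ceil


def min_gpus_for_weights(param_count, bytes_per_param, gpu_mem_bytes,
                         usable_fraction=0.8):
    """Estimate how many GPUs are needed to hold the weights alone.

    usable_fraction is an assumed headroom factor: the rest of each GPU's
    memory is left for activations, caches, and runtime overhead.
    """
    weight_bytes = param_count * bytes_per_param
    usable_bytes = gpu_mem_bytes * usable_fraction
    return ceil(weight_bytes / usable_bytes)


def assign_layers(num_layers, num_gpus):
    """Split layers into contiguous, near-equal groups, one group per GPU."""
    base, extra = divmod(num_layers, num_gpus)
    groups = []
    start = 0
    for gpu in range(num_gpus):
        # The first `extra` GPUs take one additional layer each.
        size = base + (1 if gpu < extra else 0)
        groups.append(range(start, start + size))
        start += size
    return groups


# Hypothetical example: 70 billion parameters at 2 bytes each on 80 GB GPUs.
gpus = min_gpus_for_weights(70_000_000_000, 2, 80_000_000_000)
for gpu, layers in enumerate(assign_layers(80, gpus)):
    print(f"GPU {gpu}: layers {layers.start} to {layers.stop - 1}")
```

In this hypothetical case the weights alone come to 140 GB, which does not fit on one 80 GB card, so the layers are spread over three. Real systems also have to account for communication between GPUs, which this sketch ignores.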
AWS Reduces Vector Search Costs for AI Applications
AWS has announced a significant update to its AI infrastructure, focusing on cost efficiency and security. The company is replacing Amazon OpenSearch Serverless with Amazon S3 Vectors, which can reduce vector storage and query costs by up to 90% in moderate workloads. This move aims to make agentic AI applications more accessible while maintaining strict data governance. The new architecture introduces several key improvements: It uses Amazon S3 Tables with Apache Iceberg support, governed by AWS Lake Formation, which boosts transaction speed by up to ten times compared to self-managed solutions. Additionally, it enforces fine-grained access control across all layers of the data interaction chain, ensuring secure data handling from query execution to response synthesis. This update highlights AWS's commitment to supporting scalable and secure AI applications. Developers can now build more efficient and controlled systems for tasks like customer service automation. Watch for further updates on how these changes impact AI adoption in various industries.