GTC 2025 Expo Experience: AI Hardware, Software, and Cloud Innovations

Hardware & AI Servers

CoreWeave – GPU Cloud Challenges

At booth 315, CoreWeave showcased its GPU cloud services, but with a critical difference from competitors: cluster setup is manual and takes 6-8 weeks, making it a slower option than others.

Nebius – Automated, Configurable GPU Cloud

Nebius, an AI cloud provider originally tied to Yandex ("Russian Google," according to a person at the booth), offers automated cluster setup for AI workloads. Its clusters scale from 8 to 512 GPUs, mostly H200s and H100s, and can be deployed much more quickly than CoreWeave's.

MSI AI Server – MGX Modular AI Systems

MSI featured its AI server lineup built on MGX, Nvidia's modular architecture, shown in a 4-GPU configuration expandable to 8 GPUs via PCIe slots. These servers are managed by MGX-qualified BMCs, with an optional hardware Root of Trust (RoT) securing the BMC and CPU BIOS, though notably not the GPUs. Nvidia self-certifies these MGX systems, which raises questions about independent security verification.

Nvidia DGX GB300 & GB200 Supercomputing Racks

A highlight of the expo was Nvidia's high-density DGX racks, which drew long lines. The major configurations on display:

  • GB300 – Liquid-cooled, ultra-dense design with 16 compute trays, each containing 4 Blackwell GPUs and 2 ARM CPUs, maintaining a 2:1 GPU-to-CPU ratio.

  • GB300 NVLink Switch – Connects all GPUs at an aggregate 130TB/s, offering 144 NVLink ports (100GB/s each).

  • B300 – Air-cooled version with only 4 compute trays instead of 16 due to cooling limitations, and x86 CPUs instead of ARM.
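A quick sanity check on the NVLink figures above: 144 ports at 100GB/s each works out to 14.4TB/s per switch, so the 130TB/s aggregate implies roughly nine switch trays per rack (my inference; the booth staff didn't break the number down this way).

```python
# Consistency check on the quoted NVLink numbers (assumption: 130 TB/s
# is the aggregate across all NVLink switch trays in the rack).
ports_per_switch = 144
port_bw_gb_s = 100                                    # GB/s per NVLink port
switch_bw_tb_s = ports_per_switch * port_bw_gb_s / 1000
print(switch_bw_tb_s)                                 # 14.4 TB/s per switch
print(130 / switch_bw_tb_s)                           # ~9 switches for 130 TB/s
```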

AI Cooling & Power Solutions

Cool IT – Cold Plates & Liquid Cooling

Cooling high-density AI workloads is an evolving challenge. Cool IT offers cold plates for GPUs and CPUs at $10K-$20K each, and also manufactures a large-scale Coolant Distribution Unit (CDU) that cools 6-8 racks for around $100K.

With high-density racks drawing 160-180kW today, future 300kW racks will require even more advanced cooling solutions. Due to tariff concerns, Cool IT is expanding from Canada into Arizona.

Software & AI Cloud Ecosystems

Databricks – Open Source AI & Data Lakehouse

Databricks reinforced its vision of Lakehouse architecture, blending data lakes and warehouses.

  • Data-sharing models were a major focus, illustrated by an analogy: Disney as a data provider (rich consumer data), with advertisers such as toy companies or theme parks acting as consumers.

  • Unity Catalog offers access control and lineage tracking but lacks built-in AI security, leaving security to third-party solutions.

  • No model scanning: the platform assumes users bring safe, trusted models (e.g., LLaMA), though Databricks acknowledges Hugging Face as a malware risk.

Snowflake & Secure ETL Sharing

Databricks' approach mirrors Snowflake's data-sharing model, where companies securely share data across partners while maintaining stable edges: the controlled intersection of datasets (e.g., Disney + Sony).
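The "stable edge" idea can be sketched as a toy intersection protocol: each party hashes its identifiers with a shared salt, so only the overlap is revealed, never the raw customer lists. This is a minimal illustration of the concept, not either vendor's actual sharing mechanism; the salt and party names are hypothetical.

```python
import hashlib

def hashed_ids(ids, salt):
    # Each party hashes its customer IDs with a pre-agreed salt before
    # sharing, so raw identifiers never leave either side.
    return {hashlib.sha256((salt + i).encode()).hexdigest() for i in ids}

salt = "shared-secret"  # hypothetical pre-agreed value
provider_a = hashed_ids({"alice", "bob", "carol"}, salt)   # e.g., Disney
provider_b = hashed_ids({"bob", "carol", "dave"}, salt)    # e.g., Sony

overlap = provider_a & provider_b  # the "stable edge": only the intersection
print(len(overlap))                # 2 shared customers
```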

Physical AI & Robotics

A newer trend in AI was Physical AI, essentially robotics under a new brand name. Notable demos included:

  • Chemical-mixing robots for automated lab experiments.

  • Robotic arms with human-controlled motion tracking, allowing remote operation for industrial or medical applications.

  • Laparoscopic AI-assisted surgery robots, where the robot’s camera/light follows the surgeon’s tools, reducing the need for an assistant.

AI Models & LLM Infrastructure

AI21 Labs – Custom LLMs for Secure AI

AI21 Labs introduced Jamba, an entirely self-developed LLM designed for text-heavy applications.

  • Focuses on banks and security-conscious enterprises needing on-prem AI models instead of public APIs or cloud-hosted AI.

Deloitte – AI Assistant with RAG (But Poor UX)

Deloitte showcased an AI assistant built for retrieval-augmented generation (RAG) on internal documents.

  • However, when asked, "Can I ask you a security question?", it oddly responded, "Come on, you know I can't tell you that?", suggesting gaps in tone and intent handling.

  • Asking about Deloitte’s security solutions for AI resulted in generic, unhelpful answers.

GPU Virtualization & OS Innovations

Backend AI (Korea) – OS for GPUs?

A Korean company, Backend AI, presented a GPU-centric OS layer that claims to be faster than Nvidia GSX for fractional GPU allocation.

  • Notably, no one actually runs an OS on a GPU: GPUs still require a Linux host (e.g., Ubuntu) with CUDA drivers, which raises questions about the performance claims.

Nvidia Workload Orchestration – Run.ai Integration

Nvidia showcased its workload orchestration solution, based on Run.ai (a recent acquisition).

  • Key features:

    • Automated workload management

    • Hardware/software failure isolation (e.g., detecting liquid cooling leaks)

    • Dynamic cluster provisioning & power optimization

cuPyNumeric – GPU Acceleration for NumPy Users

A welcome addition for NumPy users, cuPyNumeric allows easy porting of NumPy-based code to Nvidia GPUs.

  • Simple installation via Conda: conda install -c conda-forge cupynumeric.
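Since cuPyNumeric is designed as a drop-in NumPy replacement, porting often means changing only the import line. A minimal sketch (the try/except fallback is mine, so the snippet also runs on machines without the library installed):

```python
# cuPyNumeric aims to be NumPy-compatible: swap the import and existing
# array code dispatches to the GPU. Fall back to NumPy if unavailable.
try:
    import cupynumeric as np  # GPU-accelerated, NumPy-compatible API
except ImportError:
    import numpy as np        # CPU fallback for machines without it

x = np.array([1.0, 4.0, 9.0])
y = np.sqrt(x) + 1.0          # elementwise ops, same syntax as NumPy
print(y.tolist())             # [2.0, 3.0, 4.0]
```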

Final Thoughts

Too bad my GTC Expo ticket was only valid for Friday, and I had not realized the expo ran only 11AM-2PM; $100 for three hours sounded pricey, but in hindsight I am glad I went.

The GTC 2025 Expo was packed with innovations across hardware, AI infrastructure, cooling, robotics, and software.

Highlights included Nvidia's liquid-cooled DGX GB300 racks, AI21's Jamba model for hosting financial-application AI, and Backend AI's eye-catching virtualization claims. Meanwhile, shortcomings such as Deloitte's AI assistant and Databricks' lack of integration with model-scanning tools highlighted the ongoing challenges in AI user experience and AI pipeline security.

The industry is racing toward high-density AI computing, but power consumption, cooling, and security remain critical issues that companies must address moving forward.
