
Accelerating AI Development: Q&A with NVIDIA and Google Cloud

Published: 2026-05-20 16:37:14 | Category: Education & Careers

The collaboration between NVIDIA and Google Cloud now reaches more than 100,000 developers through a joint community launched at Google I/O last year. The community provides curated learning paths, hands-on labs, and events to help builders use the full-stack NVIDIA AI platform on Google Cloud. In this Q&A, we explore the community's offerings, this year's additions, and how developers are building production-ready AI applications with tools such as JAX, NVIDIA Dynamo, and Google DeepMind's models.

What Exactly Is the Joint Developer Community from NVIDIA and Google Cloud?

The joint developer community is a dedicated hub for developers, data scientists, and machine learning engineers who want to sharpen their AI skills using the latest technologies from both companies. Launched at Google I/O last year, the community offers curated learning paths, hands-on labs, and events that guide members in building with the full-stack NVIDIA AI platform on Google Cloud. Over the past year, it has become a go-to resource for AI builders who use NVIDIA-accelerated tools for data science and machine learning. Members gain access to insights on production-ready retrieval-augmented generation (RAG) applications and real-world use cases such as sports analytics and enterprise data pipelines. The community also fosters collaboration, enabling developers to experiment with large language models and hybrid on-premises and cloud inference.

Source: blogs.nvidia.com

What New Resources Are Rolling Out This Year for Community Members?

New additions this year include a learning path focused on using the JAX library on NVIDIA GPUs, a new NVIDIA Dynamo codelab specifically for inference optimizations, and monthly developer livestreams. The JAX learning path helps developers run and scale JAX workloads—from single-GPU experiments to multi-rack deployments—while maintaining strong performance on NVIDIA infrastructure within Google Cloud. Meanwhile, the Dynamo codelab enables builders to optimize large-scale inference, including for mixture-of-experts models, making AI applications more efficient. These resources complement existing content on the Google Cloud AI platform, allowing members to explore everything from open frameworks to practical deployment strategies on Google Kubernetes Engine (GKE).
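To make the idea of scaling a JAX workload across devices concrete, here is a minimal sketch. It is not taken from the learning path: the layer function, the array shapes, and the choice of simple data-parallel sharding are all assumptions made for illustration. The same code runs on one device or several, because the mesh is built from whatever `jax.devices()` reports.

```python
import numpy as np
import jax
import jax.numpy as jnp
from jax.sharding import Mesh, NamedSharding, PartitionSpec as P


@jax.jit
def dense_relu(x, w, b):
    # One dense layer followed by a ReLU; a stand-in for a real model.
    return jnp.maximum(x @ w + b, 0.0)


def run(batch_per_device=8, features=16, hidden=4):
    # Hypothetical helper: shapes and sizes are illustrative only.
    devices = jax.devices()  # GPUs when available, otherwise CPU
    n = len(devices)

    # A one-dimensional mesh with a single "data" axis over all devices.
    mesh = Mesh(np.array(devices), axis_names=("data",))

    kx, kw = jax.random.split(jax.random.PRNGKey(0))
    x = jax.random.normal(kx, (batch_per_device * n, features))
    w = jax.random.normal(kw, (features, hidden))
    b = jnp.zeros((hidden,))

    # Split the batch across devices; replicate the parameters on each one.
    x = jax.device_put(x, NamedSharding(mesh, P("data", None)))
    w = jax.device_put(w, NamedSharding(mesh, P()))
    b = jax.device_put(b, NamedSharding(mesh, P()))

    # jit compiles one program that respects the input shardings.
    return dense_relu(x, w, b), n


if __name__ == "__main__":
    y, n = run()
    print(f"devices={n}, output shape={y.shape}")
```

On a single GPU the mesh has one entry and the batch is not split; with more devices only the mesh grows, while `dense_relu` stays unchanged. Multi-host setups need additional initialization that this sketch does not cover.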

How Does the Community Help Developers Create Production-Ready AI Applications?

The community provides tools and labs that combine NVIDIA libraries, open models, and Google Cloud's AI platform, enabling faster development of optimized, production-ready applications. For instance, developers can use the NVIDIA cuDF library in Google Colab Enterprise or Dataproc to accelerate data science and analytics. They can also deploy multi-agent applications by combining Google DeepMind’s Gemma 4 models, NVIDIA Nemotron open models, and the Google Agent Development Kit with G4 VMs powered by NVIDIA RTX PRO 6000 Blackwell GPUs, which are available on Google Cloud Run or with spot instances. The result is a streamlined workflow from prototype to production, as seen in applications such as retrieval-augmented generation on GKE and observability for agent-based workloads. Practical guides and labs give developers hands-on experience with these technologies.
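As a small illustration of the cuDF point, the sketch below uses cuDF's pandas accelerator mode, which lets existing pandas code run on the GPU where possible. The data, the column names, and the `mean_points_by_team` helper are hypothetical, and the fallback to plain pandas when cuDF is not installed is an assumption added so the example runs anywhere.

```python
# Enable cuDF's pandas accelerator if cuDF is installed; otherwise the
# same code runs on ordinary CPU pandas.
try:
    import cudf.pandas
    cudf.pandas.install()
    ACCELERATED = True
except ImportError:
    ACCELERATED = False

import pandas as pd

# Made-up sample data for illustration only.
GAMES = [
    {"team": "Hawks", "points": 10},
    {"team": "Hawks", "points": 20},
    {"team": "Owls", "points": 30},
]


def mean_points_by_team(records):
    # Standard pandas groupby; with the accelerator active, supported
    # operations are dispatched to the GPU.
    df = pd.DataFrame(records)
    return df.groupby("team")["points"].mean().sort_index()


if __name__ == "__main__":
    print("GPU accelerated:", ACCELERATED)
    print(mean_points_by_team(GAMES))
```

In a notebook, the same mode is usually enabled with the `%load_ext cudf.pandas` magic before importing pandas.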

What Real-World Use Cases Are Developers Exploring with These Tools?

Developers in the community are experimenting with diverse real-world applications, including sports analytics and enterprise data pipelines. Using the NVIDIA AI platform on Google Cloud, they build hybrid on-premises and cloud inference solutions for large language models and prototype new research directions. For example, members have built production-ready RAG applications on Google Kubernetes Engine (GKE) and are refining observability for agent-based workloads. These use cases demonstrate the platform's flexibility, from handling large-scale data processing to delivering low-latency inference for time-sensitive domains such as live sports. The community also supports experimentation with multi-agent systems, enabling workflows that combine multiple AI models and tools.

How Do NVIDIA and Google Cloud Collaborate on Open Frameworks Like JAX?

NVIDIA and Google Cloud have deep integration with open frameworks such as JAX, allowing developers to build, scale, and productize JAX workloads seamlessly on NVIDIA AI infrastructure within Google Cloud. This collaboration extends to Google Cloud AI Hypercomputer, where the MaxText framework leverages JAX optimizations to train large models efficiently on NVIDIA GPUs. Additionally, NVIDIA Dynamo on GKE helps optimize inference for large-scale models, including mixture-of-experts architectures. By providing consistent performance and a unified experience across single-GPU experiments and multi-rack deployments, the partnership ensures developers can move from research to production with minimal friction. The monthly livestreams and updated learning paths keep the community abreast of these advancements.

What Specific Tools Are Available for Inference Optimization and Large-Scale Deployment?

Key tools include the NVIDIA Dynamo codelab focused on inference optimizations, which helps developers serve AI applications more efficiently using NVIDIA accelerated infrastructure on Google Cloud. The integration of Dynamo with Google Kubernetes Engine (GKE) enables scaling of mixture-of-experts models. Additionally, the new learning path for JAX on NVIDIA GPUs allows for seamless scaling from single-GPU experiments to multi-rack deployments. For deployment, developers can utilize Google Cloud's G4 VMs powered by NVIDIA RTX PRO 6000 Blackwell GPUs—available on Cloud Run or with spot instances. These resources, combined with the community's hands-on labs and livestreams, equip builders to handle large-scale inference and production workloads effectively.