
Krakow, Poland, 17 - 19 June 2026
RESOURCE_EXHAUSTED - Managing Google Cloud infrastructure for Cost-Effective AI and Agentic Workloads
Conference (INTERMEDIATE level)
Room 4B
The gold rush for AI has transformed cloud infrastructure. If you've tried to spin up a GPU or TPU lately, there's a high chance you've been stopped dead in your tracks by a dreaded RESOURCE_EXHAUSTED error.
As we transition into the era of agentic workflows, where thousands of LLM calls, vector searches and autonomous loops run concurrently, even standard CPU and memory limits are being pushed to their breaking points. Throwing money at the problem is no longer a guaranteed solution, so it's time to learn the features and tricks available to keep your compute running smoothly.
In this session, we will move past the Google Cloud marketing slides and dive into concrete, practical strategies to keep your workloads running without draining your company's wallet.
Bonus: Most of the techniques and features presented can be applied to non-AI infrastructure as well.
We will cover:
- Navigating Scarcity: How to leverage Dynamic Workload Scheduler, Reservation Sharing, and queuing systems to guarantee GPU/TPU availability.
- Agentic Bottlenecks: Mitigating CPU and memory spikes when running highly concurrent AI agents.
- FinOps for AI: Mixing Spot VMs, Committed Use Discounts (CUDs), and secondary node pools to optimize price-to-performance.
- TPUs: What is a TPU, and is it a viable alternative to a GPU?
Who's this session for:
Anyone responsible for acquiring and managing the necessary resources for their teams and products.
Venue address
ICE Krakow, ul. Marii Konopnickiej 17
Phone
+48 691 793 877
info@devoxx.pl
