As infrastructure concerns shift away from training and toward inference, AI-focused cloud providers are facing a new set of customer demands and a new set of options for building out their data center footprints. Sheer bandwidth isn't enough when you're serving a wide range of users and use cases, from LLM prompts to full agentic pipelines.
In this discussion, Konstantinos Mouzakitis, principal solution architect at Nscale, explains how the AI cloud provider is evolving its business and infrastructure alongside changing customer needs. These needs include distributing inference jobs across thousands of GPUs, adapting to AI-native pricing models, and preparing for a robotics-heavy future.
You can watch the full discussion with Konstantinos here.
To hear even more from him and Nscale, register for our upcoming webinar — Scaling the Intelligence Frontier: From GPU Scarcity to Production Sovereignty — taking place on Feb. 19.