Building the home for AI requires more than scale. It demands resilience by design

How Khazna NexOps is institutionalizing accountability, data-led operations, and performance consistency across 30+ hyperscale data centers.

[Source photo: Krishna Prasad/ Fast Company Middle East]

Long before a server overheats or an alarm escalates, the true test of a data center has already begun inside the operating model that governs every shift, every task, and every decision. As digital infrastructure becomes foundational to AI economies, public services, and national-scale compute, reliability is no longer defined by infrastructure alone. It is judged by whether capacity remains dependable, consistent, and controlled under pressure.

That philosophy underpins Khazna NexOps, a dedicated, insourced operations organization established by Khazna Data Centers to set a new benchmark for consistency, responsiveness, and operational excellence across its growing global footprint. The move signals a decisive shift: from a vendor-driven model to a unified, in-house operating system spanning a portfolio approaching 0.5 GW.

NexOps is not simply an insourcing initiative. It is Khazna’s operating system for reliability, designed to reduce variance, strengthen governance, and ensure that high standards are applied consistently across every site and shift.

For Bart Holsters, Managing Director of Khazna NexOps, the rationale is straightforward. “In a world where minutes of downtime are unacceptable, operations can’t be an afterthought,” he says. “Bringing operations in-house allows us to deliver predictable outcomes: safety discipline, uptime, speed of response, and consistent execution across every site and shift.”

FROM SCALE TO ACCOUNTABILITY

Outsourcing, Holsters acknowledges, can deliver rapid scale for repeatable tasks. But hyperscale infrastructure is not a commodity business. “The value isn’t just in getting work done,” he explains. “It’s delivering predictable performance.”

Insourcing places accountability, governance, and operational decision-making directly within Khazna’s control. The company has developed more than 5,000 operational documents to standardize audit-ready practices across its portfolio. It also embeds competency-linked execution, meaning critical work orders can be performed only by trained, certified, and approved personnel.
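As a rough illustration of what competency-linked execution could look like in software, here is a minimal Python sketch. It is not Khazna's system: the `Technician` and `WorkOrder` types, their fields, and the `can_perform` function are all hypothetical names, and the rule that non-critical work is unrestricted is an assumption made for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Technician:
    # Hypothetical record: certifications held, plus an explicit approval flag.
    name: str
    certifications: frozenset[str] = field(default_factory=frozenset)
    approved: bool = False


@dataclass(frozen=True)
class WorkOrder:
    # Hypothetical record: a critical order lists the certifications it requires.
    order_id: str
    critical: bool
    required_certifications: frozenset[str] = field(default_factory=frozenset)


def can_perform(tech: Technician, order: WorkOrder) -> bool:
    """Return True if the technician may execute the work order.

    Assumption: only critical orders are gated. A critical order requires
    the technician to be approved and to hold every required certification.
    """
    if not order.critical:
        return True
    return tech.approved and order.required_certifications <= tech.certifications


if __name__ == "__main__":
    order = WorkOrder("WO-1", critical=True,
                      required_certifications=frozenset({"hv-switching"}))
    certified = Technician("A", frozenset({"hv-switching"}), approved=True)
    unapproved = Technician("B", frozenset({"hv-switching"}), approved=False)
    print(can_perform(certified, order), can_perform(unapproved, order))
```

The point of the sketch is that the check sits in the task-allocation path itself, so an uncertified or unapproved assignment is rejected before work begins rather than caught in a later audit.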

This structured capability model spans staffing, processes, KPIs, and governance, with phased digitalization continuing through 2026. It also creates clearer escalation paths, faster decision-making, and stronger oversight. As data centers are judged not only on capacity, but on continuity under pressure, that discipline becomes critical. 

The result, Holsters says, is “a new operating system for reliability”—one that ensures global hyperscale standards are applied consistently, regardless of location.

DATA-LED OPERATIONS IN AN AI HUB

Technology is central to the NexOps blueprint. Working with Presight, Khazna is implementing an AI-powered command-and-control platform from a secure hub in Abu Dhabi. The system continuously monitors energy, cooling, equipment performance, and security, predicting anomalies before they escalate and optimizing site performance around the clock.

This is where resilience becomes operational, not theoretical. It is managed continuously through visibility, disciplined workflows, and the ability to identify and address risk before it affects customers.

Climate intelligence is also embedded at the core of operations through collaboration with AlphaGeo. By integrating physical climate projections, resilience indicators, and socio-economic data, Khazna equips its teams with forward-looking environmental insights. Heat, water, energy availability, and long-term climate conditions are no longer treated as external variables, but as factors that shape maintenance planning, cooling performance, site design, and engineered safeguards.

“Climate becomes an operating input,” Holsters explains. In the short term, it sharpens maintenance planning and cooling optimization. Over the long term, it informs site design, thermal strategies, and engineered safeguards. “Designing for yesterday’s baseline is no longer enough.”

Robotic patrol units further augment operations, supporting inspection routines and identifying early warning signals such as heat anomalies, leaks, or vibration patterns. Yet Holsters is unequivocal about the human role. “Collecting signals is the easy part. Making the right call under uncertainty, that’s where experience matters.”

Rather than replacing operators, advanced analytics enhance their “line of sight,” enabling predictive maintenance, sharper prioritization, and faster intervention. “We’re not planning for human versus machine,” he says. “We’re planning for human operators empowered by better intelligence.”

MEASURABLE GAINS IN PERFORMANCE AND READINESS

The operational reset is already producing tangible results. Khazna has recorded 5.45 million hours free of lost-time injuries (LTIs), with both its lost-time injury frequency rate (LTIFR) and its total recordable incident rate (TRIR) reduced to zero, reinforcing that safety discipline is not separate from operational resilience but central to it.

Efficiency metrics have also moved in the right direction. Despite operating in one of the world’s most challenging climates for thermal management, the company achieved an additional ~2.3% improvement in Power Usage Effectiveness, building on already aggressive baselines. At hyperscale, marginal percentage gains translate into meaningful energy savings and resilience benefits across an entire portfolio.
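To make the scale of such a gain concrete, the following Python sketch works through the arithmetic. Power Usage Effectiveness is the ratio of total facility energy to IT equipment energy. The article does not give Khazna's baseline PUE or IT load, so the figures below are purely hypothetical, and the sketch assumes the ~2.3% figure is a relative reduction in the PUE ratio.

```python
def pue(total_facility_kwh: float, it_kwh: float) -> float:
    """Power Usage Effectiveness: total facility energy / IT equipment energy."""
    if it_kwh <= 0:
        raise ValueError("IT energy must be positive")
    return total_facility_kwh / it_kwh


def facility_energy_saved_kwh(it_kwh: float, baseline_pue: float,
                              improvement_fraction: float) -> float:
    """Facility energy saved for the same IT load after a relative PUE gain.

    Assumption: improvement_fraction is a relative reduction in the PUE
    ratio (0.023 means the new PUE is 97.7% of the baseline).
    """
    improved_pue = baseline_pue * (1.0 - improvement_fraction)
    return it_kwh * (baseline_pue - improved_pue)


if __name__ == "__main__":
    # Hypothetical inputs, not Khazna's figures: 1 GWh of IT energy, PUE 1.40.
    saved = facility_energy_saved_kwh(1_000_000, 1.40, 0.023)
    print(f"Approximate facility energy saved: {saved:,.0f} kWh")
```

With these illustrative inputs, the saving is roughly 32,200 kWh for every gigawatt-hour of IT energy, which is why small percentage gains matter at portfolio scale.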

Equally significant is the improvement in operational readiness. By linking competency directly to task allocation and standardizing procedures fleet-wide, Khazna has strengthened training completion rates, compliance outcomes, and audit consistency. As compute densities increase and tolerance margins tighten, these controls serve as practical risk-management tools, ensuring that the right expertise is applied at precisely the right time.

Operational discipline is ultimately proven in the details: work orders completed correctly, incidents escalated consistently, personnel certified before they act, and portfolio-wide learnings captured before issues repeat.

“Performance consistency becomes a system,” Holsters says. “Standard definitions of ‘good,’ clear escalation paths, measurable readiness—that’s how you scale without drifting.”

BUILDING A HOME FOR AI

Khazna’s long-term roadmap centers on delivering higher-density, more resilient digital infrastructure at scale, and on doing so efficiently, predictably, and sustainably, particularly in hot-climate environments.

Holsters views NexOps as foundational to that ambition. “AI readiness is an operating specification,” he says. “As workloads become more dynamic, the margin for inconsistency shrinks. NexOps ensures we can scale our footprint without diluting quality.”

In a sector where headlines often focus on megawatts and square meters, Khazna’s strategy reframes the conversation around operational discipline. Capacity may power the digital economy, but it is consistency, measured, standardized, and data-led, that ultimately sustains it.

Through NexOps, Khazna is building the systems, skills, and controls needed to make resiliency repeatable across the infrastructure layer that AI economies will depend on.

ABOUT THE AUTHOR

FastCo Works is Fast Company's branded content studio. Advertisers commission us to consult on projects, as well as to create content and video on their behalf.
