About Nebius:
Nebius is leading a new era in cloud infrastructure for the global AI economy. We are building a full-stack AI cloud platform that supports developers and enterprises from data and model training through to production deployment, without the cost and complexity of building large in-house AI/ML infrastructure.
Built by engineers, for engineers. From large-scale GPU orchestration to inference optimization, we own the hard problems across compute, storage, networking and applied AI.
Listed on Nasdaq (NBIS) and headquartered in Amsterdam, we have a global footprint with R&D hubs across Europe, the UK, North America and Israel. Our team of 1,500+ includes hundreds of engineers with deep expertise across hardware, software and AI R&D.
The role:
We are looking for a Technical Program Manager to own the operational readiness and ongoing health of our fleet of data centers, spanning both colocation (COLO) and build-to-suit (BTS) sites. In this role you will be the single point of accountability for ensuring each site runs as expected (SLAs met, maintenance executed on schedule, and audits passed) across a growing portfolio of landlord-operated and purpose-built facilities. You will operate as the primary interface between Nebius and our data center landlords and operators, and you will partner closely with the Nebius IT team to translate site-level operations into reliable infrastructure for our customers.
This is an individual contributor position for someone who is equally comfortable in a contractual SLA review, a maintenance window planning call, and a physical site audit. You will define the mechanisms that keep our sites accountable and surface risk before it becomes an incident.
Responsibilities:
- Own the operational health of Nebius COLO and BTS sites, ensuring each facility runs to expectation across power, cooling, space, connectivity, security, and environmental controls.
- Track, monitor, and enforce SLA compliance across landlords and colocation providers; identify breaches, drive remediation, and hold providers accountable to contractual commitments.
- Manage and coordinate site maintenance schedules — preventive and corrective — including planning and approving maintenance windows, reviewing Methods of Procedure (MOPs), and minimizing risk to live workloads.
- Plan and drive site audits covering compliance, capacity, power/cooling performance, physical security, and safety; track findings to closure.
- Serve as the primary day-to-day interface with data center landlords and operators, managing the operational relationship, escalations, and coordination of on-site activity.
- Partner closely with the Nebius IT team on deployments, capacity planning, incident response, and change management at each site.
- Build reporting mechanisms and dashboards that give leadership clear visibility into site health, SLA performance, maintenance status, and open risk across the portfolio.
- Lead incident coordination and post-incident follow-up, including root cause analysis and corrective action tracking with landlords and internal teams.
- Track and manage contractual operational obligations, deliverables, and timelines across multiple sites and providers simultaneously.
About the team:
The Data Center team is responsible for the physical infrastructure that underpins Nebius' AI cloud. We manage the full lifecycle of our COLO and BTS footprint — from bringing new capacity online to keeping live sites running reliably at scale. We work at the intersection of facilities operations, vendor management, and IT infrastructure, and we move fast because our customers' AI workloads depend on the reliability we deliver.
Basic qualifications:
- 10+ years of experience in technical program management, data center operations, or critical facilities/infrastructure management.
- Experience managing data center infrastructure and operations (power, cooling, space, connectivity) in colocation, build-to-suit, or owned environments.
- Experience managing third-party vendors, landlords, or service providers against SLAs and contractual obligations.
- Demonstrated ability to manage multiple programs, sites, or workstreams simultaneously and drive them to measurable outcomes.
- Bachelor's degree in a relevant field, or equivalent practical experience.
Preferred qualifications:
- Direct experience with colocation (COLO) and build-to-suit (BTS) data center models, including operating across multiple landlords and operators.
- Working knowledge of data center SLAs, MOPs/SOPs, maintenance regimes, and audit and compliance frameworks (e.g., Uptime Institute Tier standards, SOC 2, ISO 27001).
- Experience supporting AI/HPC, GPU cluster, or other high-density compute infrastructure.
- Strong familiarity with incident management and root cause analysis in a critical facilities context.
- Experience building reporting mechanisms, dashboards, or operational scorecards for infrastructure health and risk.
- PMP, Uptime Institute Accredited Tier Designer (ATD), or equivalent program/operations certification.
- Proficiency with program and ticketing tools (e.g., Jira, ServiceNow) and comfort working with operational data.
- Willingness to travel to sites as needed.
Benefits & Perks:
- Competitive compensation
- Career growth and learning opportunities
- Flexibility and ownership
- Collaborative and innovative culture
- Opportunity to work on impactful AI projects
- International environment and talented teams
What's it like to work at Nebius:
Fast moving - Bold thinking - Constant growth - Meaningful impact - Trust and real ownership - Opportunity to shape the future of AI
Equal Opportunity Statement:
Nebius is an equal opportunity employer. We are committed to fostering an inclusive and diverse workplace and to providing equal employment opportunities in all aspects of employment. We do not discriminate on the basis of race, color, religion, sex (including pregnancy), national origin, ancestry, age, disability, genetic information, marital status, veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by applicable law.
Applicants must be authorized to work in the country in which they apply and will be required to provide proof of employment eligibility as a condition of hire.
If you need accommodations during the application process, please let us know.