Senior DevOps/Site Reliability Engineer

4 years ago

Who We Are

Take-Two develops and publishes some of the world's biggest games. Our Rockstar label creates Grand Theft Auto and Red Dead Redemption, two of the most critically acclaimed gaming franchises in history. Our 2K label creates games like NBA 2K, WWE 2K, BioShock, Borderlands, Evolve, XCOM, and the beloved Sid Meier's Civilization. Our Private Division label publishes Kerbal Space Program and will publish upcoming titles with Obsidian Entertainment, Panache Digital Games, and more.

Take-Two Direct to Consumer

The Direct to Consumer team is a (well-funded) startup within Take-Two. We have offices in San Francisco, and we're launching a new office in Vancouver while creating a culture that enables remote work. We're building a commerce and distribution platform for our game labels, and we're creating it from the ground up to support our studios. Our team is small and agile, and we deploy to production constantly. We user test every week and focus on automation. We believe in giving our studios the flexibility they need to create the world's greatest games, so we plan to offer UI, SDKs, and GraphQL interfaces for our services. We focus on working software over PowerPoint.

Key Responsibilities

Work as an embedded member of the core team that builds the platform for our direct-to-consumer service, building tools to automate the infrastructure for engineering productivity.
Design, build, and run our next-generation, globally distributed infrastructure that will power our direct-to-consumer experience for millions of concurrent players.
Drive service reliability, scalability, and performance for the infrastructure by creating tools for observability into service health metrics, traceability of transactions across distributed systems, and fast identification of the root causes of failures.
Automate anything and everything, wherever possible.
Be part of a rotating on-call team that helps triage, diagnose, and find solutions to live issues.
Obsess over performance and over providing an excellent experience for millions of users.

Qualifications

8+ years of professional experience, with a proven track record of bringing highly scalable, robust, large-scale distributed infrastructure to production.
Strong experience administering Linux at scale, with implementations using containers and/or a service mesh.
Strong experience with CI/CD practices, automating build pipelines and deployments of distributed services using tools like Jenkins and Artifactory/Maven, and source control systems like Perforce/Git.
Strong experience with automating infrastructure as code with tools like Pulumi/Terraform.
Experience building automation tools in any of the following: Python, Go, JavaScript, Java, etc.
Experience in monitoring, reporting and alerting with tools like Grafana, Prometheus, Splunk etc.
Experience running large-scale distributed databases like MongoDB / DynamoDB and/or traditional RDBMSs like Postgres / MySQL.
Experience in building and deploying systems on top of AWS and/or GCP.
Self-starter, self-driven to produce results and continually improve.
Experienced in, and comfortable with, supporting a live service environment.
Able to work with a distributed team.
BA / MS degree in computer science or a related field.
