Senior Manager, Site Reliability Engineering-Tools
PlayStation isn’t just the Best Place to Play; it’s also the Best Place to Work. We’ve thrilled gamers since 1994, when we launched the original PlayStation. Today, we’re recognized as a global leader in interactive and digital entertainment. The PlayStation brand falls under Sony Interactive Entertainment, a wholly owned subsidiary of Sony Corporation.
It is an exciting time to be part of SIE’s Site Reliability Engineering (SRE) leadership team. SRE teams operate at the intersection of Software Engineering and Infrastructure Engineering. These teams strive to make the PlayStation Network Platform a highly reliable, scalable, operable, and secure product and service. As an SRE Manager, you will be responsible for leading and enabling a team of engineers who build and operate services that are “always on”, highly performant, and the foundation on which the PlayStation Network is delivered to customers.
The Site Reliability Tools team within SIE’s Platform Hosting Engineering organization provides critical services used across all platform teams to give visibility into the performance and availability of PlayStation Network services for our players, partners, and other customers. SREs on Site Reliability Tools teams work closely with developers, operations teams, and leadership to ensure we have the right set of tools to generate, collect, analyze, visualize, and alert on operational data. With those tools we know exactly what is happening across the PlayStation ecosystem, can spot problems before they occur, and can address them as quickly as possible.
Responsibilities
- Lead teams of software and systems engineers that deliver critical logging, monitoring, tracing, and alerting services across the PlayStation Network Platform, and be directly responsible for PlayStation's stellar uptime record
- Own end-to-end availability and performance of global logging, monitoring, tracing, and alerting services
- Drive teams to build automation to prevent problem recurrence and automate responses to errors and alerts
- Collaborate across the global organization to gather requirements and ensure service delivery aligns with the needs of all partner teams and overall business objectives
- Manage relationships with third-party providers of software and services to align partner roadmaps with the team’s needs
- Improve upon and deliver the vision for Site Reliability Tools through collaboration with stakeholders and team members across the entire PlayStation ecosystem
- Lead by example, care for your team, and establish credibility with the quality of your team's technical execution
Key Qualifications
- Ability to lead a team of managers and highly technical and skilled engineers developing software, delivering services, and operating critical systems at large scale
- Strong collaboration and communication skills, with the ability to partner with and influence other managers, engineers, and executives
- Proven track record of building, growing, and leading technical teams that effectively deliver services following agile principles
- Equally adept at software development and systems engineering/operations
- Hands-on experience in running complex, large-scale distributed systems while improving the “ilities” (reliability, availability, serviceability) of those systems
- Ability to design and provide operational and infrastructural requirements that promote uptime, speed, and security at all phases of the software lifecycle on a global scale
Required Skills
- Fluency in running performant distributed services at scale
- Demonstrated experience following software engineering best practices
- Experience with automation and configuration management tools
- Experience with public cloud services and deployment (AWS experience preferred)
- Strong software development experience in one of these languages: Go, Perl, Python or Java
- Knowledge of the software development lifecycle with experience integrating Open Source tools
- Ability to lead teams to solve complex issues across a cloud-based microservices environment
- Knowledge and experience with logging and monitoring tools such as Splunk, CA, Datadog, CloudWatch, ELK, Sensu, Zabbix
- Experience deploying and operating logging and monitoring systems at scale preferred
- Strong hands-on experience in building and maintaining infrastructure for microservices
- Experience with Continuous Integration and Continuous Delivery/Deployment tools like Jenkins, Bamboo, or similar
- Experience developing tools for system configuration, deployment, and monitoring
- Strong belief in driving operational excellence, with ownership of efficiency and automation at the core of operations
- OBSESSIVE desire to automate and improve everything, including improving processes and standardizing tools and technologies!
Required Soft Skills
- Desire to champion developer needs and integrate them into your teams’ priorities
- Methodical and systematic problem-solving approach
- Complete ownership of end-to-end solutions and managing their life cycle
- Execution oriented and results driven
- Customer and peer relationship focused with strong interpersonal and communication skills
- Demonstrated ability to effectively partner with local and remote groups of internal customers
- Ability to thrive in a fast-paced team environment
- Ability to learn new skills/technologies quickly and independently
Experience
- BS in Computer Science, Software Engineering, or equivalent experience
- 12+ years of professional experience operating systems at scale
- 7+ years of experience managing teams
This role requires occasional travel to other SIE locations around the world.
Sony is an Equal Opportunity Employer. All persons will receive consideration for employment without regard to race, color, religion, gender, pregnancy, national origin, ancestry, citizenship, age, legally protected physical or mental disability, covered veteran status, status in the U.S. uniformed services, sexual orientation, marital status, genetic information or membership in any other legally protected category.
We strive to create an inclusive environment, empower employees and embrace diversity. We encourage everyone to respond.
We sincerely appreciate the time and effort you spent in contacting us and we thank you for your interest in PlayStation.