Monitoring & Incident Management:
- Improve the studio’s reliability through monitoring, rapid response, communication and coordination.
- Develop and manage the deployment architecture for the application; design the monitoring architecture and implement monitoring agents, dashboards, escalations and alerts.
- Routinely identify operational problems by observing and studying system architecture, functionality and performance results. Troubleshoot procedures with the studio architect, investigate surfaced issues and handle incidents.
- Identify operational priorities by assessing operational objectives. Determine project objectives such as efficiency, cost savings, energy conservation, operator convenience, safety and environmental quality, and estimate their relevance, time and costs.
Development & Data Analysis:
- Develop operational solutions by defining, studying, estimating, and screening alternative solutions; calculating economics; determining impact on all systems.
- Create new tools to facilitate automated monitoring of the studio’s operational environment.
- Anticipate operational problems by studying operating targets, modes of operation, unit limitations; monitoring unit performance.
- Improve operational quality results by studying, evaluating and recommending process re-architecting; implementing changes; contributing information and opinion to unit design and modification teams.
- Provide operational management information by collecting, analyzing, and summarizing operating and engineering data and trends.
- Update job knowledge by participating in educational opportunities; reading professional publications; maintaining personal networks; participating in professional organizations.
- Accomplish engineering and organization mission by completing related results as needed.
Operations Engineer Skills and Qualifications:
Mastery of Linux systems and networking administration
- High level understanding of Linux/Unix operating systems
- Strong systems engineering and troubleshooting skills
- Strong understanding of TCP/IP, SSL and DNS
- Ability to create and maintain technical documentation
- Good understanding of web server configuration and management (Apache, Nginx)
- Knowledge of load balancing concepts
- Experience with service performance monitoring and automation
- Experience with systems and application security
- Ability to analyze and troubleshoot networking, performance, system and infrastructure issues using standard Linux/Unix tools
- Ability to administer network firewalls
Cloud Management
- AWS expertise (EC2, VPC, S3, RDS, Route 53 integration (DNS), CodeDeploy, IAM, ACM)
Monitoring Systems
- Nagios, Sensu, Grafana, Munin, Check_MK, CloudWatch and/or Datadog
- Backend - Graphite, Prometheus, InfluxDB
- Writing checks & scripts
- Log/application level (Splunk, Elasticsearch, Apache)
- Ability to diagnose infrastructure as a whole
Database fundamentals
- Administer and maintain MySQL and other open source databases
- Write and perform basic queries to evaluate database stability, integrity and performance
- Knowledge of NoSQL databases (Couchbase, MongoDB, etc.) is good to have
Scripting
- Shell scripting (Bash)
- Python
Configuration management
- Chef, Ansible or Puppet
- Provisioning - Packer, Terraform, CloudFormation
- Containerisation - Docker Swarm, Kubernetes, or AWS ECS/EKS
- CI/CD - Jenkins, AWS CI/CD
Source code management
Bonus to have (Recommended, but not required):
- Basic knowledge of containers (e.g. Docker, Kubernetes)
- PHP
What we offer you:
- Work in a studio that has complete P&L ownership of games
- Competitive salary, discretionary annual bonus scheme and Zynga RSUs
- Full medical, accident as well as life insurance benefits
- Catered breakfast, lunch and evening snacks
- Child care facilities for women employees and discounted facilities for male employees
- Well-stocked pantry
- Generous paid maternity/paternity leave
- Employee Assistance Programs
- Active Employee Resource Groups – Women at Zynga
- Frequent employee events
- Additional leave options for most employees
- Flexible working hours on many teams
- Casual dress every single day
- Work with cool people and impact millions of daily players!