Site Reliability Engineer (SRE)

Zynga

Bengaluru, India

4 years ago

Monitoring & Incident Management:

Improve the studio’s reliability through monitoring, rapid response, communication and coordination.
Develop and manage the deployment architecture for the application, develop the monitoring architecture, and implement monitoring agents, dashboards, and alerts for critical issues.
Routinely identifies problems by observing and studying system architecture, functionality and performance results. Fixes procedures with the overall studio architect, investigates surfaced issues, and handles incidents.
Identifies operational priorities by assessing operational objectives; determining project objectives such as efficiency, cost savings, energy conservation, operator convenience, safety, and environmental quality; estimating relevance, time, and costs.

Development & Data Analysis:

Develops operational solutions by defining, studying, estimating, and screening alternative solutions; calculating economics; determining impact on total system.
Build new tools to facilitate automated monitoring of operational environment.
Anticipates operational problems by studying operating targets, modes of operation, unit limitations; monitoring unit performance.
Improves operational quality results by studying, evaluating, and recommending process redesign; implementing changes; contributing information and opinion to unit design and modification teams.
Provides operational management information by collecting, analyzing, and summarizing operating and engineering data and trends.
Updates job knowledge by participating in educational opportunities; reading professional publications; maintaining personal networks; participating in professional organizations.
Accomplishes engineering and organization mission by completing related results as needed.

Skills and Qualifications:

Deep understanding of Linux and network administration
Solid grasp of systems engineering and troubleshooting skills
Scripting (Bash and PHP)
Strong TCP/IP understanding and ability to produce detailed documentation
Write new technical documentation and maintain existing documentation
Ability to administer networking firewalls, routers, and switches
S3 Maintenance, Apache maintenance, Load Balancer Management
Puppet Management
Cloud Management
AWS expertise (VPC, RDS, Route 53 DNS integration)

Database fundamentals

Administer MySQL and other open-source databases
Write and perform basic queries to evaluate database stability, integrity and performance
Large/Big Data Management
Administer and maintain Aurora infrastructure

Monitoring Systems

System Level (Nagios, Munin, Check_MK)
Writing checks & scripts
Log/Application Level (Splunk, Elasticsearch, Apache)
Ability to diagnose infrastructure as a whole!

Extra credit:

Java
C++
ElastiCache
Vertica

What we offer you:

Work in a studio that has complete P&L ownership of games
Competitive salary, discretionary annual bonus scheme and Zynga RSUs
Full medical, accident as well as life insurance benefits
Catered breakfast, lunch and evening snacks
Child care facilities for women employees and discounted facilities for male employees
Well stocked pantry
Generous Paid Maternity/Paternity leave
Employee Assistance Programs
Active Employee Resource Groups - Women at Zynga
Frequent employee events
Additional leave options for most employees
Flexible working hours on many teams
Casual dress every single day
Work with cool people and impact millions of daily players!
