Senior SRE Software Engineer
2 years ago
AppLovin’s leading marketing software provides mobile app developers a powerful set of solutions to grow their mobile apps. AppLovin’s technology platform enables developers to market, monetize, analyze and publish their apps. The company’s first-party content includes more than 200 popular, engaging apps, and its technology brings that content to millions of users around the world. AppLovin is headquartered in Palo Alto, California, with several offices globally.
AppLovin was named one of the Hottest Adtech Companies of 2021 by Business Insider, as well as a Certified Great Place to Work in 2021 and 2022. The San Francisco Business Times and Silicon Valley Business Journal named AppLovin one of the Bay Area’s Best Places to Work in 2019, 2020, and 2021. Our team members are regularly recognized for their work and leadership, including recent award wins in Business Insider’s Rising Stars of Adtech 2022, Glassdoor’s Top CEOs 2019, and the 2021 Women in Content Marketing Awards.
Machine Zone (An AppLovin Company) is a global leader in mobile gaming, with a track record of delivering some of the world’s most successful mobile games including Game of War, Mobile Strike and Final Fantasy XV: A New Empire. We combine the power of technology and creative vision to create experiences that connect people from all corners of the globe. Machine Zone was acquired by AppLovin in May 2020.
The SRE team helps other engineering and product teams improve the reliability and scalability of our games and strive for continuous uptime. We are looking for a Senior SRE Software Engineer to help the team reduce day-to-day operational overhead and automate everything possible. You’ll have the opportunity to build large-scale infrastructure that serves millions of users around the world, as well as distributed systems for multiple teams to leverage. This is a highly visible role that requires extreme ownership and a strong desire for excellence.
Responsibilities
- Develop and maintain shared infrastructure used by product and engineering teams
- Measure SLIs and take necessary action to make systems meet organizational SLOs
- Champion process improvements and automation to reduce toil and improve operational health
- Drive continuous improvement of existing systems and tooling (testing, monitoring, alerting, security, release, etc.)
- Provide input to help other engineering teams design reliable and maintainable systems
- Help define standards, guidelines, and best practices
Requirements
- Experience designing, implementing, operating, and maintaining distributed systems
- Familiarity with SRE principles and practices
- First-principles approach to problem solving
- Knowledge of Kubernetes (broad and deep)
- Strives for simplicity but is comfortable with complexity
- Love of learning
Nice to Have
- Active contributor to the open source and cloud-native communities
- Design and implementation of Kubernetes custom resources and controllers
- Experience with: public cloud platforms (e.g. GCP, AWS, Azure); infrastructure as code (e.g. Terraform, Puppet, Ansible); monitoring and alerting (e.g. Prometheus, Thanos, Cortex, Alertmanager, Grafana); logging (e.g. Fluentd, Fluent Bit, Logstash, Elasticsearch, OpenSearch, Loki); datastores and caches (e.g. MySQL, PostgreSQL, CockroachDB, Redis, Memcached); data pipelines (e.g. Kafka, Hadoop, Airflow)
AppLovin is an equal opportunity employer and considers qualified applicants without regard to race, gender, sexual orientation, gender identity or expression, genetic information, national origin, age, disability, medical condition, religion, marital status or veteran status, or any other basis protected by law.