Senior Site Reliability Engineer, Observability
WHAT MAKES US EPIC?
At the core of Epic’s success are talented, passionate people. Epic prides itself on creating a collaborative, welcoming, and creative environment. Whether it’s building award-winning games or crafting engine technology that enables others to make visually stunning interactive experiences, we’re always innovating.
Being Epic means being a part of a team that continually strives to do right by our community and users. We’re constantly innovating to raise the bar of engine and game development.
Infrastructure Engineering
What We Do
Our Observability team is looking for a Senior SRE to help us build and operate the infrastructure our teams rely on to keep our platforms, games and online services running. Our Observability team works closely with teams across Epic to implement industry best practices and develop new monitoring capabilities.
What You'll Do
As an SRE on Observability, you will tackle problems that impact how we understand and operate our products at scale. Part of this role is advancing the state of the art for observability at Epic by building tooling that standardizes our systems and makes them easier to understand. You will also build and operate the systems that process and transport the large volumes of telemetry data generated by services at Epic.
In this role, you will
- Service Ownership - At Epic we embrace a Service Owner (You build it, you run it) mentality. In this role you will work together with other members of the Observability team to operate the infrastructure our developers depend on to operate their own services.
- Develop and Ship - You will work to modernize key portions of our observability infrastructure, building new data processing pipelines for telemetry data and writing software to automate processes and generate new insights.
- Collaborate - You will work with teams across Epic as an observability subject matter expert to provide guidance on observability best practices.
What we're looking for
- Experience executing meaningful change in a fast-paced, interrupt-driven environment
- Self-starter who approaches challenges creatively and methodically, seeing them through to final resolution
- Experience working across teams in a collaborative environment
- Ability to adapt and be effective in new situations within a highly dynamic environment
- Experience working with large-scale systems in AWS
- Ability to write code for simple services and process automation
- Familiarity with application/service monitoring strategies and technologies, including projects such as OpenTelemetry, Prometheus, Grafana, Fluentd, New Relic, Datadog, Honeycomb, and Sumo Logic
ABOUT US
Epic Games spans 19 countries with 55 studios and 4,500+ employees globally. For over 25 years, we’ve been making award-winning games and engine technology that empowers others to make visually stunning games and 3D content that bring environments to life like never before. Epic’s award-winning Unreal Engine technology not only gives game developers the ability to build high-fidelity, interactive experiences for PC, console, mobile, and VR, but is also a tool embraced by content creators across a variety of industries such as media and entertainment, automotive, and architectural design. As we continue to build our Engine technology and develop remarkable games, we strive to build teams of world-class talent.
Like what you hear? Come be a part of something Epic!
Epic Games deeply values diverse teams and an inclusive work culture, and we are proud to be an Equal Opportunity employer. Learn more about our Equal Employment Opportunity (EEO) Policy here.