Junior Site Reliability Engineer

3 days ago

At AccelByte, our mission is to empower game creators by providing them with the backend platform and tools required to make scalable, reliable AAA-quality games. The company was founded in 2016 by industry veterans who have engineered online systems for some of the largest game and distribution platforms in the world including Fortnite, Epic Store, Xbox Live, PlayStation Network, and EA Origin. We are backed by top investors including Softbank, Sony Interactive Entertainment, Galaxy Interactive, NetEase, and Krafton. Our latest Series B funding has firmly solidified our place as a top player in the gaming industry. AccelByte’s talent has decades of experience building and shipping some of the largest game and distribution platforms in the world.

We believe that the best companies empower employees to make decisions, obsess about the best user experience, and are not afraid to make and learn from their mistakes. Our culture is based on humility, openness to feedback, drive, and collaboration, which we feel results in the best performing teams. As a company that values diversity, inclusion, and employee growth, our employees have opportunities to work with and learn from teams all over the world. We offer competitive salaries, a full range of health benefits, social activities, career growth opportunities, and an amazing team. Come join us!

Position Summary:

AccelByte is building a 24x7 operations team for AAA multiplayer video games. For this position, we need a driven Site Reliability Engineer who can actively participate in the day-to-day combat of maintaining high reliability of our services and driving the prioritization of fixes for what is broken today. The role also calls for someone who can envision, design, and implement processes and technologies that improve our ability to identify, isolate, correlate, and mitigate service-impacting problems in the system. Restoring service and making customers happy are not enough: you must also know enough coding to automate routine tasks in gathering, correlating, organizing, and presenting service metrics, and be able to perform detailed, in-depth root cause analysis.

Essential Functions/Responsibilities:

The Junior Site Reliability Engineer is accountable for the following functions and responsibilities:

  • Build and run service deployments to Kubernetes (K8s)
  • Containerize service deployments using Docker, pod, or other container engines
  • Provision infrastructure on AWS, GCP, or other cloud providers
  • Troubleshoot and support our infrastructure
  • Perform any other design-related duties as required
  • Maintain infrastructure-related documentation and SRE runbooks

Qualifications/Experience Required:

  • Minimum 1-2 years of Linux administration experience
  • Degree in Computer Science or equivalent experience
  • Fundamental knowledge of networking, SRE, and DevOps
  • Robust knowledge of and experience with at least one cloud provider (AWS or GCP preferred)
  • Experience with containerization principles and frameworks such as Docker, Container, Kubernetes, etc.
  • Experience building infrastructure as code (e.g., Terraform), configuration management, and package management (e.g., Helm charts)
  • Experience with automation and CI/CD tools such as Jenkins, GitLab, and GitHub
  • Software development and scripting experience with Bash, Python, and/or Golang
  • Fluent in English both spoken and written
  • Good communication skills (escalation, explaining the incident)
  • Willing to work in shifts (24/7 coverage)

Qualifications/Experience Preferred:

  • Experienced with AWS, GCP, or other cloud providers
  • Experienced with GitOps approach & tools like Flux or ArgoCD
  • Experience with monitoring and alerting tools such as Prometheus, Grafana, ELK/EFK, Splunk, Datadog, OpsGenie, PagerDuty, etc.
  • Experience with monitoring systems and strategies (System Admin)
  • Contributions to open-source projects and participation in technical communities
  • Experience working for or with AAA game studios
  • JVM tuning and troubleshooting
  • Experience with web services
  • Experience with Google Cloud Platform, Azure, or other clouds
  • Deep interest in game development
  • A solid foundation in distributed systems
  • Solid performance-tuning and troubleshooting skills

AccelByte Inc is an Equal Employment Opportunity Employer. All qualified candidates and applicants will receive consideration for employment without regard to race, religion, gender, national origin, sexual orientation, marital status, age, or disability. Our culture is innovative and inclusive, and we value our people above all.

Please visit our career page for a complete listing of our open positions: https://accelbyte.io/careers
