Senior Site Reliability Engineer, Storage

4 years ago

Magic Leap is an eclectic group of people who share a magical vision of the future. And we’re growing. Our mission is to harmonize people and technology to create a better, more unified world. Our vision is to amplify the best parts of you and to advance the human spirit.

Job Description

Provide engineering and support for enterprise-level systems, storage, and backup solutions, including technical support and design. Act as lead for storage automation projects. Resolve backup and storage-related problems by gathering diagnostic information for submission to the vendor support team, applying and testing fixes, and performing corrective maintenance. Support software and hardware upgrades, and perform daily storage and backup tasks such as provisioning storage and adding new systems to storage and backup solutions. Work with system administrators and network teams.

Principal Duties and Responsibilities

Manage storage focused trouble tickets end-to-end
General management of EMC Isilon storage environment
Act as lead for storage automation projects
NFS/SMB troubleshooting and resolution. Responsible for resolving storage issues
Coordinate and manage migrations/cutovers aligning to ITIL and client requirements
Document and provide best practices on maintenance, configuration, and internal processes

Experience

Overall storage administration experience of 5+ years.
Strong hands-on experience with EMC Isilon administration
Strong hands-on experience in storage automation (Perl, Python, REST API)
Exposure to, and some experience with, other storage technologies (e.g. Rubrik, Nimble) is a plus.
Exposure to building and managing storage systems.
Experience in data center migrations and consolidations for storage and backup technology
Experience in installation/upgrade for EMC Isilon clusters
Basic understanding of network technologies and architecture
Basic understanding of UNIX and Windows administration
Basic understanding of backup technologies
Automation using scripting (shell/Perl/Python)
Experience in a large, geographically dispersed environment
Experience in monitoring solutions (hardware and OS) for storage systems
Experience in solution design, configuration, evaluation, validation, and deployment
Exposure to ITIL processes (incident, problem, change) and working in a process environment is a must
Ability to work with minimal direction or contribute in a team environment
Excellent knowledge in
Excellent verbal and written communication and interpersonal skills
Strong analytical and problem-solving skills.
Excellent planning and coordination skills
EMC InsightIQ, SyncIQ experience
Understanding of networking and core Internet protocols (e.g. TCP/IP, BGP, IS-IS, DHCP, NAT, IPsec, ECMP, DNS, TLS, SMTP, HTTP)
Experience using a modern language (Go, Java, Node.js, Ruby, etc.)
Ability to script in a shell language (Bash or POSIX Shell)
Experience with public cloud providers (AWS, Google Cloud Platform, etc.) is a plus.
Experience working with containers (Docker, Kubernetes, ECS, etc.) is a plus.
Comfort with frequent, incremental code testing and deployment
Understanding of the role of automation tools (Terraform, Jenkins, Concourse CI, Bitbucket Pipelines, etc.)
Comfort with collaboration, open communication and reaching across functional borders
Ability to remain calm under pressure and take command of a recovery effort.

Education

BA/BS in Computer Science or equivalent experience

Additional Information

All your information will be kept confidential according to Equal Employment Opportunities guidelines.
