Senior Site Reliability Engineer, Storage
4 years ago
Magic Leap is an eclectic group of people who share a magical vision of the future. And we’re growing. Our mission is to harmonize people and technology to create a better, more unified world. Our vision is to amplify the best parts of you and to advance the human spirit.
Job Description
Provide engineering, design, and technical support for enterprise-level systems, storage, and backup solutions. Act as lead for storage automation projects. Resolve backup and storage-related problems by gathering diagnostic information for submission to the vendor support team, applying and testing fixes, and performing corrective maintenance. Support software and hardware upgrades, and perform daily storage and backup tasks such as storage provisioning and adding new systems to storage and backup solutions. Must be able to work with system and network administrators.
Principal Duties and Responsibilities
- Manage storage-focused trouble tickets end-to-end
- General management of EMC Isilon storage environment
- Act as lead for storage automation projects
- Troubleshoot and resolve NFS/SMB and other storage issues
- Coordinate and manage migrations/cutovers aligning to ITIL and client requirements
- Document and provide best practices on maintenance, configuration, and internal processes
Experience
- 5+ years of overall storage administration experience
- Strong hands-on experience with EMC Isilon administration
- Strong hands-on experience in storage automation (Perl, Python, REST API)
- Exposure to, and some experience with, other storage technologies (e.g. Rubrik, Nimble) is a plus
- Exposure to building and managing storage systems
- Experience in data center migrations and consolidations for storage and backup technology
- Experience in installation/upgrade for EMC Isilon clusters
- Basic understanding of network technologies and architecture
- Basic understanding of UNIX and Windows administration
- Basic understanding of backup technologies
- Automation using scripting languages (shell, Perl, Python)
- Experience in a large, geographically dispersed environment
- Experience with monitoring solutions (hardware and OS) for storage systems
- Experience in solution design, configuration, evaluation, validation, and deployment
- Exposure to ITIL processes (incident, problem, change) and working in a process environment is a must
- Ability to work with minimal direction or contribute in a team environment
- Excellent knowledge in
- Excellent verbal and written communication and interpersonal skills
- Strong analytical and problem-solving skills
- Excellent planning and coordination skills
- EMC InsightIQ, SyncIQ experience
- Understanding of networking and core Internet protocols (e.g. TCP/IP, BGP, IS-IS, DHCP, NAT, IPsec, ECMP, DNS, TLS, SMTP, HTTP)
- Experience using a modern language (Go, Java, Node.js, Ruby, etc.)
- Ability to script in a shell language (Bash or POSIX Shell)
- Experience with public cloud providers (AWS, Google Cloud Platform, etc.) is a plus
- Experience working with containers (Docker, Kubernetes, ECS, etc.) is a plus
- Comfort with frequent, incremental code testing and deployment
- Understanding of the role of automation tools (Terraform, Jenkins, Concourse CI, Bitbucket Pipelines, etc.)
- Comfort with collaboration, open communication, and reaching across functional borders
- Ability to remain calm under pressure and take command of a recovery effort
Education
- BA/BS in Computer Science or equivalent experience
Additional Information
- All your information will be kept confidential according to Equal Employment Opportunities guidelines.