Job Description
A large enterprise e-commerce company is looking for a Site Reliability Engineer to join its growing team. As a member of the CI/CD and Cloud Reliability team, this role works to ensure the platform is high-performing, highly available, and highly reliable. This person will be part of an execution-oriented, results-driven team that enables service development by designing, building, deploying, and operating cloud infrastructure and CI/CD services at scale. The role also offers the opportunity to exercise troubleshooting skills on anything from code issues to packet loss in the network. Primary responsibilities include contributing to the implementation and delivery of the end-to-end automation platform that supports continuous integration and continuous delivery (CI/CD), with a focus on developer self-service capabilities. This position requires extensive technical expertise and deep knowledge of the CI/CD platform domain, especially in cloud-based service environments. 80% of this role will be focused on supporting JIRA tickets and ad-hoc requests, while the remainder will be focused on feature improvements. This role is hybrid, with occasional on-site meetings required in Southern California.
We are a company committed to creating diverse and inclusive environments where people can bring their full, authentic selves to work every day. We are an equal opportunity/affirmative action employer that believes everyone matters. Qualified candidates will receive consideration for employment regardless of their race, color, ethnicity, religion, sex (including pregnancy), sexual orientation, gender identity and expression, marital status, national origin, ancestry, genetic factors, age, disability, protected veteran status, military or uniformed service member status, or any other status or characteristic protected by applicable laws, regulations, and ordinances. If you need assistance and/or a reasonable accommodation due to a disability during the application or recruiting process, please send a request to HR@insightglobal.com. To learn more about how we collect, keep, and process your private information, please review Insight Global's Workforce Privacy Policy: https://insightglobal.com/workforce-privacy-policy/.
Required Skills & Experience
BS in Computer Science or equivalent experience
3+ years of professional software engineering experience operating at large scale in fast-paced environments
Hands-on experience with AWS, Kubernetes, Infrastructure as Code, monitoring and alerting
Experience building out a Kubernetes cluster from scratch, preferably using EKS, and experience with Kubernetes add-ons
Extensive use of automation for Infrastructure as Code, preferably via Terraform
Strong scripting experience in Python, Go, or a similar language
Experience with continuous integration and continuous delivery/deployment tools such as Jenkins and Argo CD
Strong communication and interpersonal skills
Nice to Have Skills & Experience
Experience with Terraform
Argo CD experience
Experienced user of one or more source code management tools, preferably Git
Benefit packages for this role will start on the 31st day of employment and include medical, dental, and vision insurance, as well as HSA, FSA, and DCFSA account options, and 401k retirement account access with employer matching. Employees in this role are also entitled to paid sick leave and/or other paid time off as provided by applicable law.