Provide consulting services for improved system stability, availability, performance and reliability.
· Assist in determining the impact of operational issues and provide input into their resolution via data extraction and quantification.
· Work through day-to-day support issues, ensure effective and timely resolution of issues in the production environment, and troubleshoot customer-impacting issues.
· Support multiple applications, specifically Kubernetes/Gloo/AWS/Apigee/PCF/GCP/Java-based systems running in an enterprise environment.
· Support Gloo running on Kubernetes, Apigee OPDK and SaaS, Grafana, Prometheus, Cassandra, Postgres, and Spring Boot or Java-based applications running on Kubernetes, PCF, and Java application servers.
· Apply GitOps principles to manage infrastructure and application configurations.
· Implement monitoring and create complex alerts and dashboards for production systems.
· Provide capacity analysis and tuning analysis for Apigee and Java applications hosted on Linux and container platforms.
· Be available to provide 24x7 on-call support on a rotating basis with other team members.
· Lead efforts in troubleshooting, recovery, and root cause investigation.
· Perform analysis of user requirements and problems to automate or improve systems and review system capabilities, workflow, and scheduling limitations.
· Follow and develop detailed work plans, schedules, project estimates, resource plans, and status reports.
· Facilitate HA (High Availability) / DR (Disaster Recovery) exercises to ensure that the team is fully prepared for any event.
· Lead root cause analysis sessions to understand what causes issues in production, and produce an RCA report with solutions that will prevent them from recurring.
· Ensure documentation is created and remains up to date for any related work.
· Maintain a strong understanding of UNIX operating systems and at least one scripting language.
· Forecast and plan for a rapidly growing environment.
· Evaluate new software product and service solutions.
We are a company committed to creating inclusive environments where people can bring their full, authentic selves to work every day. We are an equal opportunity employer that believes everyone matters. Qualified candidates will receive consideration for employment opportunities without regard to race, religion, sex, age, marital status, national origin, sexual orientation, citizenship status, disability, or any other status or characteristic protected by applicable laws, regulations, and ordinances. If you need assistance and/or a reasonable accommodation due to a disability during the application or recruiting process, please send a request to
HR@insightglobal.com. The EEOC "Know Your Rights" Poster is available here.
To learn more about how we collect, keep, and process your private information, please review Insight Global's Workforce Privacy Policy: https://insightglobal.com/workforce-privacy-policy/
Expertise in analyzing and troubleshooting large-scale distributed systems.
· Strong experience with Kubernetes container orchestration, Gloo, AWS, and the Apigee API gateway.
· Experience with REST, SOAP, and GraphQL API support.
· Experience with tools such as Git, GitLab, Docker, Postman, Splunk, AppDynamics, Imperva WAF, and CI/CD tools.
· Good experience with GitOps processes, performance measurement and tuning, capacity planning and management, contingency planning, and disaster recovery.
· Good understanding of and strong experience with Unix/Linux operating systems.
· Ability to debug, optimize code, and automate routine tasks.
· Systematic problem-solving approach coupled with effective communication skills.
· Strong scripting knowledge and experience.
· Good understanding of networking, routing, and TLS/SSL.
Master's degree in Information Technology, Computer Science, Computer Information Systems, Computer Applications, a related field, or its equivalent, and 3 years of relevant work experience.
Bachelor's degree in Information Technology, Computer Science, Computer Information Systems, Computer Applications, a related field, or its equivalent, and 5 years of relevant work experience.
Benefit packages for this role will start on the 31st day of employment and include medical, dental, and vision insurance, as well as HSA, FSA, and DCFSA account options, and 401k retirement account access with employer matching. Employees in this role are also entitled to paid sick leave and/or other paid time off as provided by applicable law.