An employer is looking to hire Ansible Developers to join their Enterprise SRE COE team. This team is responsible for creating development standards, compliance, and metrics for all of the organization's development teams, in order to reduce outages and issues in production. This role supports the team's failover automation initiative and is critical in transforming manual, complex failover processes into streamlined, push-button automation using Ansible, Ansible Tower, and modern DevOps practices. You'll work closely with database teams, application owners, and network engineers to build a robust automation framework that supports multi-database failover, network rerouting, and application-level resilience. You will assist in the effort to create a push-button failover system that enables real-time disaster recovery across the organization's critical applications, and help create a dashboard-driven automation suite that empowers teams to manage failovers with confidence, reduce toil, and improve customer experience during outages.
Responsibilities:
· Design, develop, and maintain Ansible playbooks and workflows in Ansible Tower for disaster recovery and failover automation.
· Collaborate with DBAs and infrastructure teams to automate failover processes across various relational databases (Oracle, MySQL, PostgreSQL, SQL Server).
· Integrate with tools like Pronghorn to automate DNS failovers and routing logic.
· Build self-healing scripts and automation patterns for large-scale, asynchronous, message-driven applications.
· Develop templates and reusable patterns for database, application, and network failover scenarios.
· Contribute to the design of a centralized failover console with visual indicators (red/green status) and dependency mapping.
· Work with GitLab for version control and CI/CD integration of automation scripts.
· Support Kubernetes-based scaling strategies during failover (ramp down/up workloads).
· Participate in building a framework for operational readiness, including blue-green deployments, observability, and non-functional requirements.
We are a company committed to creating inclusive environments where people can bring their full, authentic selves to work every day. We are an equal opportunity employer that believes everyone matters. Qualified candidates will receive consideration for employment opportunities without regard to race, religion, sex, age, marital status, national origin, sexual orientation, citizenship status, disability, or any other status or characteristic protected by applicable laws, regulations, and ordinances. If you need assistance and/or a reasonable accommodation due to a disability during the application or recruiting process, please send a request to Human Resources Request Form. The EEOC "Know Your Rights" Poster is available here.
To learn more about how we collect, keep, and process your private information, please review Insight Global's Workforce Privacy Policy: https://insightglobal.com/workforce-privacy-policy/
Qualifications:
Previous experience working on an Enterprise / Shared Service DevOps team
5+ years of experience in Software Development and DevOps best practices
Proficiency in scripting languages, especially Bash and Python, for automation.
Experience with CI/CD tools like Jenkins and GitLab CI/CD, and strong pipeline management skills
Familiarity with version control systems, particularly Git, and collaboration platforms
Knowledge of infrastructure-as-code (IaC) principles and tools, such as Ansible and Terraform
Experience with Ansible and Ansible Tower for infrastructure automation in production environments
Deep understanding of relational databases (MySQL, SQL Server, Oracle, PostgreSQL) and experience with database failover strategies
Familiarity with load balancing, networking, and asynchronous messaging systems
Experience with Kubernetes and container orchestration
Proven ability to automate complex, multi-step manual processes
Experience with Amazon (AWS) Cloud technologies
Experience with Git versioning and release strategies
Strong problem-solving skills, a proactive approach to system health, and the ability to troubleshoot complex issues
Familiarity with observability tools, monitoring, and alerting systems
A commitment to balancing reliability concerns with continuous innovation and development
Excellent collaboration and communication skills across cross-functional teams
Benefit packages for this role will start on the 31st day of employment and include medical, dental, and vision insurance, as well as HSA, FSA, and DCFSA account options, and 401(k) retirement account access with employer matching. Employees in this role are also entitled to paid sick leave and/or other paid time off as provided by applicable law.