Top Secret AI Operations Engineer SME

Post Date

Nov 26, 2025

Location

Ashburn, Virginia

ZIP/Postal Code

20147

Job Type

Contract-to-perm

Job Description

We are seeking a motivated, career- and customer-oriented SME AI Ops Engineer to join our AI innovation team in Ashburn, VA. This is currently a hybrid position with 2-3 days onsite per week. Some formal training is available for mission knowledge and data-driven solutions, but much of the learning happens through on-the-job training and individual initiative. We are therefore looking for self-starters who can independently explore and acquire knowledge, build informal support networks, and appreciate the significant challenge of securing the Nation's borders.

Responsibilities include but are not limited to:
- Lead the design and operational deployment of Large Language Model (LLM) based applications and AI agents across AWS or Google Cloud environments.
- Implement and optimize Model Context Protocol (MCP) for context-aware orchestration and efficient scaling of AI services.
- Design and manage embedding generation, storage, and retrieval to enable semantic search and recommendation.
- Develop monitoring, logging, and observability solutions leveraging the Elastic Stack (Elasticsearch, Logstash, Kibana).
- Engineer graph-oriented deployment architectures for dependency mapping, relationship-aware data flow, and advanced troubleshooting.
- Automate deployment, scaling, and monitoring using modern MLOps and AIOps practices (CI/CD, containerization, orchestration).
- Ensure data quality, lineage, and governance across multiple systems and pipelines.
- Collaborate with data scientists, ML engineers, and DevOps teams to deliver high-availability AI-powered applications.
- Build monitoring and alerting solutions to detect anomalies, performance bottlenecks, and failures in AI/ML services.
- Research and implement emerging AI infrastructure, tools, and best practices to continuously improve system reliability and scalability.

We are a company committed to creating diverse and inclusive environments where people can bring their full, authentic selves to work every day. We are an equal opportunity/affirmative action employer that believes everyone matters. Qualified candidates will receive consideration for employment regardless of their race, color, ethnicity, religion, sex (including pregnancy), sexual orientation, gender identity and expression, marital status, national origin, ancestry, genetic factors, age, disability, protected veteran status, military or uniformed service member status, or any other status or characteristic protected by applicable laws, regulations, and ordinances. If you need assistance and/or a reasonable accommodation due to a disability during the application or recruiting process, please send a request to HR@insightglobal.com. To learn more about how we collect, keep, and process your private information, please review Insight Global's Workforce Privacy Policy: https://insightglobal.com/workforce-privacy-policy/.

Required Skills & Experience

- Active Top Secret clearance and ability to obtain Public Trust.
- BS/BA and 12+ years or MS/MA/MBA and 9+ years or PhD/Doctorate and 7+ years.
- Experience in Data Engineering, DevOps, or AI/ML Ops roles.
- Strong coding skills in Python or Java.
- Hands-on experience with vector search and retrieval-augmented generation (RAG).
- Knowledge of security, compliance, and access controls for data and AI systems.
- Strong foundation in data engineering and distributed systems.
- Deep understanding of search, indexing, and graph databases (Elasticsearch, Neo4j, JanusGraph, or equivalent).
- Experience with LLM deployment frameworks (LangChain, RAG pipelines, vector databases like Pinecone, Weaviate, or FAISS).
- Proficiency with cloud platforms and infrastructure as code.
- Hands-on experience with containers and orchestration (Docker, Kubernetes).
- Familiarity with monitoring and observability tools (Grafana, ELK).
- Knowledge of CI/CD pipelines (Harness, Jenkins) and best practices for AI/ML deployment.

Nice to Have Skills & Experience

- Strong foundation in data engineering and distributed systems.
- Deep understanding of search, indexing, and graph databases (Elasticsearch, Neo4j, JanusGraph, or equivalent).
- Experience with LLM deployment frameworks (LangChain, RAG pipelines, vector databases like Pinecone, Weaviate, or FAISS).
- Proficiency with cloud platforms and infrastructure as code.
- Hands-on experience with containers and orchestration (Docker, Kubernetes).
- Familiarity with monitoring and observability tools (Grafana, ELK).
- Knowledge of CI/CD pipelines (Harness, Jenkins) and best practices for AI/ML deployment.

Benefit packages for this role start on the first day of employment and include medical, dental, and vision insurance; HSA, FSA, and DCFSA account options; and 401(k) retirement account access with employer matching. Employees in this role are also entitled to paid sick leave and/or other paid time off as provided by applicable law.