We are seeking a mid-level data engineer to support healthcare data transformation and integration work involving legacy clinical systems. You'll work closely with JSON and Parquet extracts from the RPMS/VistA ecosystem and help translate these into structured HL7v2 and FHIR-compliant representations.
Your work will directly enable semantic interoperability across tribal and federal health systems, supporting public health delivery for some of the most underserved populations in the country.
We are a company committed to creating inclusive environments where people can bring their full, authentic selves to work every day. We are an equal opportunity employer that believes everyone matters. Qualified candidates will receive consideration for employment opportunities without regard to race, religion, sex, age, marital status, national origin, sexual orientation, citizenship status, disability, or any other status or characteristic protected by applicable laws, regulations, and ordinances. If you need assistance and/or a reasonable accommodation due to a disability during the application or recruiting process, please send a request to Human Resources Request Form. The EEOC "Know Your Rights" Poster is available here.
To learn more about how we collect, keep, and process your private information, please review Insight Global's Workforce Privacy Policy: https://insightglobal.com/workforce-privacy-policy/
- 2-5 years of experience in data engineering or healthcare data roles.
- Strong Python skills, especially with pandas, json, and transformation of semi-structured data.
- Experience with, or strong interest in, healthcare data standards (HL7v2, FHIR, or both).
- Familiarity with reading and transforming Parquet and JSON datasets.
- Ability to reason through and normalize undocumented, legacy healthcare data.
- Comfortable using Git and collaborating via GitLab or GitHub.
- Strong communication and documentation habits.
- Comfortable working in Docker-based environments for development or testing.
- Familiarity with Git-based CI workflows (e.g., GitHub Actions, GitLab CI).
- Writes modular, testable code and is comfortable debugging pipelines in containerized setups.
- Experience managing environments via requirements.txt, pyproject.toml, or similar.
- Familiarity with HL7v2 segments (e.g., PID, OBR, OBX) and FHIR bundles/resources.
- Experience transforming clinical data to meet interoperability or public health reporting standards.
- Exposure to Azure Synapse Pipelines, Spark, or other big data frameworks.
- Experience with Nix, container-based dev environments, or CI/CD workflows.
- Prior experience with federal, tribal, or public health systems.
- Experience with AWS services (e.g., S3, Lambda, Glue) or container orchestration tools like Kubernetes.
- Bonus points for Linux-first workflows and familiarity with Neovim or terminal-based tooling.
- Experience using Nix or other reproducible development environments is a huge plus.
Benefit packages for this role will start on the 31st day of employment and include medical, dental, and vision insurance, as well as HSA, FSA, and DCFSA account options, and 401(k) retirement account access with employer matching. Employees in this role are also entitled to paid sick leave and/or other paid time off as provided by applicable law.