Job Description
Insight Global is seeking a skilled Entity Resolution Engineer to build a custom Master Data Management solution on top of Databricks using Splink, an open-source Python library, with a primary focus on mastering provider domain data. This is a senior-level, hands-on engineering role requiring the ability to hit the ground running on an aggressive timeline with minimal ramp-up time. The ideal candidate brings strong Python and Databricks engineering fundamentals, a solid understanding of MDM concepts, and ideally experience in the healthcare provider domain — and must demonstrate excellent verbal and written communication skills to collaborate effectively with a cross-functional team of engineers and client stakeholders. If you believe you are the right fit for the role, we welcome you to apply!
Key Responsibilities
• Design, build, and maintain a custom Master Data Management solution on Databricks using the Splink open-source Python library
• Develop and configure entity resolution and record linkage pipelines to master provider domain data
• Write clean, production-quality Python code to support MDM pipeline development and ongoing enhancements
• Implement CI/CD workflows using Databricks Asset Bundles to manage code promotion and deployment across environments
• Collaborate with data engineers and governance stakeholders to ensure mastered data assets are properly integrated and exposed
• Perform QA and validation of MDM pipelines to ensure accuracy and reliability of provider data matching and deduplication
• Participate in design discussions and contribute to architectural decisions for the custom MDM platform
• Operate independently and with accountability — proactively communicating blockers, timelines, and delivery status to the team lead
We are a company committed to creating diverse and inclusive environments where people can bring their full, authentic selves to work every day. We are an equal opportunity/affirmative action employer that believes everyone matters. Qualified candidates will receive consideration for employment regardless of their race, color, ethnicity, religion, sex (including pregnancy), sexual orientation, gender identity and expression, marital status, national origin, ancestry, genetic factors, age, disability, protected veteran status, military or uniformed service member status, or any other status or characteristic protected by applicable laws, regulations, and ordinances. If you need assistance and/or a reasonable accommodation due to a disability during the application or recruiting process, please send a request to HR@insightglobal.com. To learn more about how we collect, keep, and process your private information, please review Insight Global's Workforce Privacy Policy: https://insightglobal.com/workforce-privacy-policy/.
Required Skills & Experience
• 5+ years of hands-on experience with Databricks as a core engineering platform
• 5+ years of Python development experience, with strong proficiency in writing production-grade code
• Demonstrated experience with Master Data Management (MDM) concepts, including entity resolution, record linkage, and deduplication — whether via Splink or comparable approaches
• Experience with Apache Spark for large-scale data processing within a Databricks environment
• Experience implementing CI/CD pipelines, preferably using Databricks Asset Bundles or equivalent code promotion tooling
• Strong accountability mindset — ability to self-manage, deliver independently, and proactively communicate status and blockers without waiting for direction
Nice to Have Skills & Experience
• Public/private sector provider services experience is a huge plus
• Healthcare/Medi-Cal/Medicaid/Medicare experience
Benefit packages for this role will start on the 1st day of employment and include medical, dental, and vision insurance, as well as HSA, FSA, and DCFSA account options, and 401k retirement account access with employer matching. Employees in this role are also entitled to paid sick leave and/or other paid time off as provided by applicable law.