Job Description
ABOUT THE ROLE
• Joins a rapidly growing legal tech incubation team building AI-powered software products and managed services focused on claims management, claims evaluation, and document intelligence.
• Primarily responsible for designing and maintaining ETL pipelines that ingest, parse, and transform complex customer documents — including legal filings, claims records, and structured/unstructured data sources — into usable data for downstream analysis and application consumption.
• Works alongside MLOps engineers, product directors, and implementation leads in a collaborative, fast-moving environment where automation quality and data fidelity are mission-critical.
RESPONSIBILITIES
• Builds and maintains robust ETL pipelines for extracting, transforming, and loading data from a wide variety of customer document formats, including PDFs, structured forms, and unstructured legal text.
• Develops and refines custom document parsing logic using Python and Regex to accurately extract claims-relevant data fields at scale.
• Designs and exposes backend data services via FastAPI, supporting downstream application and reporting needs across the product suite.
• Collaborates with the DevOps and MLOps teams to ensure pipelines are containerized, scalable, and reliably deployed within the team's cloud infrastructure.
• Monitors pipeline performance, identifies data quality issues, and iterates on parsing logic as new document types and client requirements are introduced.
We are a company committed to creating diverse and inclusive environments where people can bring their full, authentic selves to work every day. We are an equal opportunity/affirmative action employer that believes everyone matters. Qualified candidates will receive consideration for employment regardless of their race, color, ethnicity, religion, sex (including pregnancy), sexual orientation, gender identity and expression, marital status, national origin, ancestry, genetic factors, age, disability, protected veteran status, military or uniformed service member status, or any other status or characteristic protected by applicable laws, regulations, and ordinances. If you need assistance and/or a reasonable accommodation due to a disability during the application or recruiting process, please send a request to HR@insightglobal.com. To learn more about how we collect, keep, and process your private information, please review Insight Global's Workforce Privacy Policy: https://insightglobal.com/workforce-privacy-policy/.
Required Skills & Experience
WHAT WE'RE LOOKING FOR
• 3–5 years of engineering experience with a strong focus on ETL pipeline development and document parsing — hands-on Python proficiency and Regex pattern design are non-negotiable.
• Solid FastAPI experience for building lightweight, performant backend services; comfort with PostgreSQL for data storage, querying, and schema design.
• Familiarity with containerized deployment environments (Kubernetes, Docker) and cloud infrastructure (AWS S3, EC2) is expected at this level.
• Experience parsing complex, variable-format documents — legal, insurance, or financial document backgrounds are a strong plus given the nature of the work.
• Detail-oriented with a strong sense of data quality ownership; able to work autonomously on technically ambiguous problems while communicating progress clearly to non-technical stakeholders.
Benefit packages for this role will start on the first day of employment and include medical, dental, and vision insurance, as well as HSA, FSA, and DCFSA account options, and 401(k) retirement account access with employer matching. Employees in this role are also entitled to paid sick leave and/or other paid time off as provided by applicable law.