Deidentification Data Engineer

Post Date

Nov 07, 2023

Location

Durham, North Carolina

ZIP/Postal Code

27709

Job Type

Perm

Job Description

This position reports to the Data Partnerships Director of Data and Analytics Platforms. The role is responsible for managing and executing the deidentification processes applied to the data assets to be included in the Federated Clinical Applications Platform (FCAP), and for managing and administering the FCAP deidentification cloud environment.

The position will be a member of the Data Partnerships Data Engineering team and will also provide expertise in developing data integration and delivery pipelines that bring new data modalities into the FCAP and Data Lake. These solutions will capitalize on technologies to improve the value of analytical data, improve the effectiveness of information stewardship, and streamline the flow of data in the organization. Solutions will focus on state-of-the-art data and analytics tools, including traditional and near-real-time data warehousing, big data, and relational and document-based databases, using extract, load, transform (ELT) toolsets as well as REST APIs and FHIR. The ideal candidate will be comfortable with data science platforms and have proven experience leveraging DevOps and automation/orchestration tools.

Job Responsibilities:

* Create and follow defined procedures for the deidentification of patient medical information

* Maintain and tune the deidentification environment to perform optimally and comply with policies and standards

* Collaborate with partners on improving the deidentification programs and processes, and work with the partner and the Cloud Team to troubleshoot issues

* Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.

* Recommend designs for analytics solutions that improve data integration, data quality, and data delivery, with an eye towards reusable components

* Create and maintain an optimal data pipeline architecture

Required Skills & Experience

Must Haves:

* 5+ years of experience in a hands-on Data Engineer role

* Bachelor's degree in a related field

* Experience with relational SQL and NoSQL databases

* Writing and executing Python programs and shell scripts on Linux

* Linux administration experience

* Data Engineering on Microsoft Azure

* Experience with data pipeline and orchestration tools such as Azure Data Factory and SQL Server Integration Services

* Developing on cloud-based analytic platforms such as Azure Synapse

Nice to Have Skills & Experience

Plusses:

* Prior experience in health care IT

* Working knowledge of Azure DevOps & Automation/Orchestration

* Knowledge of open source software solutions and open source as a business model

* Technical breadth across application development, enterprise architecture, or application integration

* Understanding of Agile methodology

* Knowledge of APIs, API Integration, and API Management

Benefit packages for this role will start on the 31st day of employment and include medical, dental, and vision insurance, as well as HSA, FSA, and DCFSA account options, and 401k retirement account access with employer matching. Employees in this role are also entitled to paid sick leave and/or other paid time off as provided by applicable law.