Job Description
One of our large pharmaceutical clients is seeking a Data Steward to support its growing project work. As a member of the centralized Research Data Standards & Governance team, the Research Data Steward is responsible for managing the end-to-end lifecycle of research data operational activities by establishing and maintaining data standards and governance practices that ensure data quality, accuracy, retention, security, and compliance with regulatory requirements.
This role supports a causal biology / target identification program by driving data stewardship and (potentially) data ownership across research (non-clinical and clinical) datasets, with a primary focus on public and externally sourced data. The consultant will establish and promote data standards (ontologies, vocabularies, metadata), enable consistent modeling and integration of internal and purchased/public datasets, and oversee governance that helps multiple research “tech builds” scale safely and consistently (e.g., multi-omics, sample/asset metadata, and research data warehousing).
Key Responsibilities
· Serve as a data steward (and potentially a data owner) for causal biology / target identification research datasets, ensuring data is well-defined, governed, and usable across the organization (this is not a clinical data role).
· Define, standardize, and promote controlled vocabularies, ontologies, and data standards for research and multi-omics datasets (e.g., genetics, transcriptomics / gene expression, and related omics domains).
· Support integration and analysis readiness of external datasets (including biobank data) alongside internal data sources.
· Partner with scientists, data product teams, and platform stakeholders to align what is captured for samples vs. omics experiments and ensure consistent linking of assets and metadata.
· Contribute to data modeling efforts and standardization that support downstream visualization and reporting tools.
· Drive stakeholder engagement and influence: communicate standards clearly, advocate for adoption, and coordinate across teams to reduce fragmentation and improve consistency.
· Support multiple research “tech builds” by defining governance, metadata, and standards that enable consistent capture and reuse across projects (e.g., large molecule design initiatives, lab equipment data pipelines, and sample management/collection: what samples are, how they are used, and how they are traced).
· Enable research data capture and storage for cell lines, organoids, and related model systems, including guidance for how to attribute patient populations/indications to molecular models of disease and how to store associated molecular/omics data.
· Support scaling of perturbation datasets by helping build and govern a large warehouse of collected data, ensuring consistent structure, metadata, lineage, and quality controls.
· Oversee public/external data management, including intake of purchased/partnered public datasets, documentation of provenance, and alignment to internal standards.
· Define and operationalize guidance for data terms of use (licenses, permitted use, sharing restrictions, retention) and set/communicate policy expectations for research teams consuming public data.
· Classify datasets and use-cases into high / medium / low risk categories and ensure appropriate governance controls are in place (e.g., access expectations, review steps, and documentation).
We are a company committed to creating diverse and inclusive environments where people can bring their full, authentic selves to work every day. We are an equal opportunity/affirmative action employer that believes everyone matters. Qualified candidates will receive consideration for employment regardless of their race, color, ethnicity, religion, sex (including pregnancy), sexual orientation, gender identity and expression, marital status, national origin, ancestry, genetic factors, age, disability, protected veteran status, military or uniformed service member status, or any other status or characteristic protected by applicable laws, regulations, and ordinances. If you need assistance and/or a reasonable accommodation due to a disability during the application or recruiting process, please send a request to HR@insightglobal.com. To learn more about how we collect, keep, and process your private information, please review Insight Global's Workforce Privacy Policy: https://insightglobal.com/workforce-privacy-policy/.
Required Skills & Experience
- Bachelor’s degree or higher in Biology, Bioinformatics, Computational Biology, Data Science (with strong life-sciences focus), or a related field.
- Hands-on experience with data stewardship and/or data governance concepts (e.g., stewardship operating model, data quality, metadata, ownership, and standards).
- Experience working with data standards, ontologies, vocabularies, or controlled terminologies in a research or life-sciences setting.
- Knowledge of cloud platforms, specifically Google Cloud.
- Understanding of multi-omics data types and workflows (e.g., genetics, transcriptomics/gene expression) and how sample/assay metadata ties to experimental outputs.
- Working knowledge of data modeling and structuring datasets for analysis and reuse.
Nice to Have Skills & Experience
- Familiarity and/or experience with concepts related to interoperability, cloud storage platforms, and creating integrated data assets from internally generated and externally purchased data.
- Knowledge of data privacy laws and regulations, contributing to compliance and best practices in data stewardship.
Benefit packages for this role will start on the 1st day of employment and include medical, dental, and vision insurance, as well as HSA, FSA, and DCFSA account options, and 401(k) retirement account access with employer matching. Employees in this role are also entitled to paid sick leave and/or other paid time off as provided by applicable law.