Data Scientist

Post Date

Mar 27, 2026

Location

Atlanta, Georgia

ZIP/Postal Code

30308
US
May 26, 2026 Insight Global

Job Type

Contract

Category

Architect

Req #

BIR-09ec175d-ec1a-4a68-982f-0a0730f79070

Pay Rate

$68–$85 (hourly estimate)

Job Description

Synthesize.ai – Technical Requirements Document (External Vendor Version)
1. Project Overview
Synthesize.ai is Southern Company’s advanced analytics and machine-learning platform
designed to improve the quality, continuity, and analytical usefulness of AMI interval data
across APC, GPC, and MPC. The platform will use AI/ML to perform data gap-fill, usage
reconstruction, and short-term/long-term forecasting, and enable downstream analytics.
The primary objective for this vendor engagement is to deliver a production-grade, scalable,
accurate ML-based gap-fill and forecasting engine that can be integrated into Southern
Company's data ecosystem.
2. Scope of Work
2.1 Core Functional Requirements
A. Data Gap-Fill Engine
Develop or extend models that can:
• Fill missing AMI interval data at 15-minute and hourly resolutions
• Handle short gaps (<2 hours) and long gaps (1–72 hours) with model-based
reconstruction.
• Produce reconstruction confidence scores for each interval.
• Incorporate:
o Weather (temperature, humidity, solar irradiance)
o Calendar effects (weekday/weekend, season)
o Outage periods (explicitly excluding outage windows from prediction)
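To make the gap-fill requirement concrete, here is a deliberately simplistic, stdlib-only sketch of the idea — not the vendor's actual model. It reconstructs missing intervals from the historical mean for the same time-of-day and day-type (a stand-in for the calendar effects above; weather and outage handling are omitted) and attaches a confidence score to each filled interval. The function name `fill_gaps` and the confidence formula are illustrative assumptions.

```python
from datetime import datetime
from statistics import mean

def fill_gaps(readings):
    """Illustrative gap-fill: missing intervals (value None) are
    reconstructed from the historical mean of the same time-of-day
    and day-type (weekday vs. weekend), with a confidence score
    based on how many historical samples support the estimate.

    readings: list of (datetime, kwh_or_None), sorted by timestamp.
    Returns list of (datetime, kwh, confidence); confidence is 1.0
    for observed intervals.
    """
    # Build a per-(slot, daytype) history from observed intervals only.
    history = {}
    for ts, kwh in readings:
        if kwh is None:
            continue
        key = (ts.hour, ts.minute, ts.weekday() >= 5)
        history.setdefault(key, []).append(kwh)

    filled = []
    for ts, kwh in readings:
        if kwh is not None:
            filled.append((ts, kwh, 1.0))
            continue
        key = (ts.hour, ts.minute, ts.weekday() >= 5)
        samples = history.get(key, [])
        if not samples:
            filled.append((ts, 0.0, 0.0))  # no basis for reconstruction
            continue
        est = mean(samples)
        # Confidence grows with sample count, capped below 1.0 so filled
        # intervals stay distinguishable from observed ones.
        conf = min(0.95, len(samples) / (len(samples) + 2))
        filled.append((ts, est, conf))
    return filled
```

A production engine would replace the per-slot mean with a learned model conditioned on weather and outage windows, but the interface — filled value plus per-interval confidence — matches the requirement.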
B. Predictive Usage Modeling
• Short-term usage prediction (next 24–48 hours)
• Longer-horizon predictions (up to 7–14 days)
• Ability to run inference on millions of meter records across all OpCos
C. MLOps and Deployment
• Run natively on Databricks within Southern Company’s secure cloud (no external
hosting).
• Provide automated pipelines for:
o Training
o Batch inference
o Monitoring and accuracy drift detection
• Deliver source code, notebooks, CI/CD scripts, and documentation.
3. Data Inputs & Volumes
3.1 Data Sources
• AMI interval data
• Weather feeds
• Outage metadata
• Billing calendar markers (not used for billing VEE; for analytics only)
3.2 Data Characteristics
• 96 intervals/day per meter (15-min)
• Multi-year historical availability (1–3 years)
• High presence of gaps, including consecutive missing intervals
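Distinguishing short gaps from long consecutive runs (which drive the 1–72 hour requirement above) starts with locating the runs themselves. A minimal sketch, assuming missing intervals are represented as `None` in an ordered per-meter series; the name `find_gap_runs` is hypothetical:

```python
def find_gap_runs(values):
    """Scan an ordered list of interval readings (None = missing) and
    return (start_index, length) for each run of consecutive missing
    intervals. At 96 intervals/day, a run of length 8 is a 2-hour gap.
    """
    runs = []
    start = None
    for i, v in enumerate(values):
        if v is None:
            if start is None:
                start = i  # a new gap begins here
        elif start is not None:
            runs.append((start, i - start))  # gap just ended
            start = None
    if start is not None:
        runs.append((start, len(values) - start))  # series ends mid-gap
    return runs
```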
4. Performance Requirements
4.1 Model(s) Accuracy
Measurable model KPIs for:
• MAE / RMSE for filled intervals
• Gap-fill bias (interval-level and daily aggregated)
• Performance across:
o 15-minute kWh
o Hourly kWh
o Golden meters (consecutive gap conditions)
o Non-golden meters (random missing intervals)
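The KPIs above can be computed on held-out (deliberately masked) intervals. A minimal sketch, assuming equal-length actual/predicted kWh lists; the function name and return layout are illustrative:

```python
from math import sqrt

def gap_fill_kpis(actual, predicted, intervals_per_day=96):
    """Interval-level MAE, RMSE, and signed bias, plus daily-aggregated
    bias, for a set of filled intervals held out for evaluation.
    """
    n = len(actual)
    errs = [p - a for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errs) / n
    rmse = sqrt(sum(e * e for e in errs) / n)
    bias = sum(errs) / n  # signed: positive = over-reconstruction
    # Daily-aggregated bias: compare day totals rather than intervals,
    # since interval errors can cancel at the daily level.
    daily_bias = []
    for d in range(0, n, intervals_per_day):
        a_day = sum(actual[d:d + intervals_per_day])
        p_day = sum(predicted[d:d + intervals_per_day])
        daily_bias.append(p_day - a_day)
    return {"mae": mae, "rmse": rmse, "bias": bias, "daily_bias": daily_bias}
```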
4.2 Scalability
Model(s) must:
• Process 4.4M meters (15-min resolution) in batch production cycles
• Support parallelization / Spark-based architecture
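In production this sharding would be done by Spark on Databricks (e.g., repartitioning by meter ID so tasks process disjoint shards); the stdlib sketch below only illustrates the idea of deterministic partitioning across batch workers. The function name is an assumption.

```python
from zlib import crc32

def partition_meters(meter_ids, n_partitions):
    """Deterministically assign meter IDs to partitions by a stable hash
    of the ID, so every batch run (or Spark task) processes the same
    disjoint shard of the meter population.
    """
    shards = [[] for _ in range(n_partitions)]
    for mid in meter_ids:
        # crc32 is stable across processes, unlike Python's built-in hash().
        shards[crc32(mid.encode()) % n_partitions].append(mid)
    return shards
```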
4.3 Operational Expectations
• Runtime targets defined for daily and weekly pipelines
• Monitoring hooks for:
o Model drift
o Data anomalies
o Input/outage alignment issues
5. Deliverables
1. ML Models & Codebase
o Gap-fill models (primary deliverable)
o Forecasting models
o Modular architecture for future extensions
2. Documentation
o Model(s) documentation
o Data dictionaries
o Deployment runbooks
3. MLOps Pipelines
o Orchestrated Databricks jobs
o Git-based CI/CD workflows
4. Dashboards / Visual QA Tools
o Before/after gap-fill visualizations
o Overlay and comparison tools
5. Training & Knowledge Transfer
o Sessions for AMI Data Science team
o Code walkthroughs
o Handover documentation
6. Non-Functional Requirements
6.1 Security
• All work must remain within Southern Company’s cloud environment
• No external data movement permitted
6.2 Governance
• Conform to Southern Company metadata, tagging, and logging standards
• Models must not be used for billing (analytics-only)
6.3 Vendor Collaboration Expectations
• Weekly progress meetings with Solomon, Joyce S. and project team
• Transparent issue escalation
• Ability to collaborate using Jira (AMI team instance)
• A project manager is already in place
• One Data Scientist from the AMI DSA team will work on this project
7. Evaluation Criteria for Vendors
• Technical strength and ML methodology
• Scalability and cloud architecture alignment
• Experience with utility AMI datasets
• Clarity of proposed MLOps approach
• Documentation quality
• Speed to delivery
• Total cost and licensing structure

We are a company committed to creating diverse and inclusive environments where people can bring their full, authentic selves to work every day. We are an equal opportunity/affirmative action employer that believes everyone matters. Qualified candidates will receive consideration for employment regardless of their race, color, ethnicity, religion, sex (including pregnancy), sexual orientation, gender identity and expression, marital status, national origin, ancestry, genetic factors, age, disability, protected veteran status, military or uniformed service member status, or any other status or characteristic protected by applicable laws, regulations, and ordinances. If you need assistance and/or a reasonable accommodation due to a disability during the application or recruiting process, please send a request to HR@insightglobal.com. To learn more about how we collect, keep, and process your private information, please review Insight Global's Workforce Privacy Policy: https://insightglobal.com/workforce-privacy-policy/.

Required Skills & Experience

3–5 years of experience as a Data Scientist (or similar title)
3+ years of experience with AI/ML
Strong Python programming experience
Understanding of and experience with AMI data
Background in the utility industry
Experience with LLMs (large language models)
Ability to deliver source code, notebooks, CI/CD scripts, and documentation

Nice to Have Skills & Experience

Databricks

Benefit packages for this role will start on the 1st day of employment and include medical, dental, and vision insurance, as well as HSA, FSA, and DCFSA account options, and 401k retirement account access with employer matching. Employees in this role are also entitled to paid sick leave and/or other paid time off as provided by applicable law.