Job Description
• Implement and optimize parameter-efficient fine-tuning (PEFT) approaches such as LoRA and QLoRA to adapt foundation models to PG&E's domain
• Develop systematic prompt engineering methodologies specific to utility operations, regulatory compliance, and technical documentation
• Create reusable prompt templates and libraries to standardize interactions across multiple LLM applications and use cases
• Implement prompt testing frameworks to quantitatively evaluate and iteratively improve prompt effectiveness
• Establish prompt versioning systems and governance to maintain consistency and quality across applications
• Apply model compression techniques such as knowledge distillation, quantization, and pruning to reduce memory footprint and inference costs
• Tackle memory constraints using techniques such as sharded data parallelism, GPU offloading, or CPU+GPU hybrid approaches
• Build robust retrieval-augmented generation (RAG) pipelines with vector databases, embedding pipelines, and optimized chunking strategies
• Design advanced prompting strategies including chain-of-thought reasoning, conversation orchestration, and agent-based approaches
• Collaborate with the MLOps engineer to ensure models are efficiently deployed, monitored, and retrained as needed
We are a company committed to creating diverse and inclusive environments where people can bring their full, authentic selves to work every day. We are an equal opportunity/affirmative action employer that believes everyone matters. Qualified candidates will receive consideration for employment regardless of their race, color, ethnicity, religion, sex (including pregnancy), sexual orientation, gender identity and expression, marital status, national origin, ancestry, genetic factors, age, disability, protected veteran status, military or uniformed service member status, or any other status or characteristic protected by applicable laws, regulations, and ordinances. If you need assistance and/or a reasonable accommodation due to a disability during the application or recruiting process, please send a request to HR@insightglobal.com. To learn more about how we collect, keep, and process your private information, please review Insight Global's Workforce Privacy Policy: https://insightglobal.com/workforce-privacy-policy/.
Required Skills & Experience
• Deep Learning & NLP: Proficiency with PyTorch/TensorFlow, Hugging Face Transformers, DSPy, and advanced LLM training techniques
• GPU/Hardware Knowledge: Experience with multi-GPU training, memory optimization, and parallelization strategies
• LLMOps: Familiarity with workflows for maintaining LLM-based applications in production and monitoring model performance
• Technical Adaptability: Ability to interpret research papers and implement emerging techniques (PhD-level mathematics not necessarily required)
• Domain Adaptation: Skills in creating data pipelines for fine-tuning models with utility-specific content
Benefit packages for this role will start on the 31st day of employment and include medical, dental, and vision insurance, as well as HSA, FSA, and DCFSA account options, and 401(k) retirement account access with employer matching. Employees in this role are also entitled to paid sick leave and/or other paid time off as provided by applicable law.