Remote Senior Performance Engineer

Post Date

Nov 10, 2025

Location

Palo Alto, California

ZIP/Postal Code

94304

Job Type

Contract

Job Description

Insight Global is looking to hire a Senior Performance Engineer for a client in the quantum computing space. This is a fully remote contract opportunity with potential for extension and/or conversion to full-time employment. Key responsibilities for this role will include:

- Lead the design and build of specialized HPC environments.
- Scale machine learning models on GPU clusters.
- Fine-tune GPU kernels for performance optimization.
- Collaborate closely with scientists to support computational needs.
- Train and scale AI/ML models.
- Increase simulation speed.
- Optimize the software that runs on the client's HPC environment.

We are a company committed to creating diverse and inclusive environments where people can bring their full, authentic selves to work every day. We are an equal opportunity/affirmative action employer that believes everyone matters. Qualified candidates will receive consideration for employment regardless of their race, color, ethnicity, religion, sex (including pregnancy), sexual orientation, gender identity and expression, marital status, national origin, ancestry, genetic factors, age, disability, protected veteran status, military or uniformed service member status, or any other status or characteristic protected by applicable laws, regulations, and ordinances. If you need assistance and/or a reasonable accommodation due to a disability during the application or recruiting process, please send a request to HR@insightglobal.com. To learn more about how we collect, keep, and process your private information, please review Insight Global's Workforce Privacy Policy: https://insightglobal.com/workforce-privacy-policy/.

Required Skills & Experience

- Proven experience working in HPC development, with a focus on parallel and distributed computing, and experience with scalable systems.
- 5 or more years of experience in scientific software development.
- Deep experience with GPU architecture and parallel computing.
- Background in kernel optimization and HPC systems.
- Proficiency in CUDA and familiarity with NVIDIA’s HPC software stack, including cuDNN, NCCL, and TensorRT.
- Expertise in MPI, OpenMP, and other parallel programming paradigms.
- Strong understanding of scaling clusters and optimizing software for distributed multi-node environments.
- Strong proficiency in C/C++, Python, and Ansible.
- Familiarity with profiling and performance optimization tools (e.g., NVIDIA Nsight, AMD μProf, Intel VTune).

Nice to Have Skills & Experience

- Bachelor's or master's degree in Computer Science, Electrical Engineering, Computational Science, or a related field.
- Advanced degrees or relevant certifications are also a plus.
- Terraform, Ansible, Packer

Benefit packages for this role will start on the 31st day of employment and include medical, dental, and vision insurance, as well as HSA, FSA, and DCFSA account options, and 401k retirement account access with employer matching. Employees in this role are also entitled to paid sick leave and/or other paid time off as provided by applicable law.