EU - Lead DBA

Post Date

Apr 18, 2025

Location

Novato, California

ZIP/Postal Code

94949

Job Type

Contract

Job Description

1. Database Design, Tuning & Scaling
Architect and optimize complex PostgreSQL and MySQL database clusters to handle
high-velocity OLTP workloads.
Implement horizontal scaling strategies (e.g., sharding, logical partitioning) to support
increased load.
Tune vertical scaling parameters such as connection pools, caching layers, buffer sizes,
and memory allocation.
Continuously improve schema design and normalization strategies for performance and
flexibility.
2. Query Optimization & Workload Profiling
Analyze query patterns, slow query logs, and application metrics.
Identify long-running or inefficient queries and collaborate with developers to refactor
them.
Implement advanced indexing strategies (partial indexes, BRIN, GIN, etc.) to reduce
query latency.
Conduct proactive workload profiling to prevent performance degradation at scale. This
includes regularly benchmarking representative workloads, identifying emerging hotspots
or contention areas, analyzing I/O and CPU usage under various load patterns, and
forecasting capacity requirements. The consultant will develop profiling automation
scripts, simulate growth scenarios, and implement preemptive optimizations to safeguard
against latency spikes and throughput bottlenecks as the player base or feature
complexity grows.
3. Infrastructure & Platform Engineering
Manage and optimize databases on AWS (RDS, Aurora), Google Cloud Platform
(CloudSQL), and MySQL/Postgres on data center platforms.
Design multi-region, fault-tolerant, and geo-replicated systems to support global game
availability.
Support both cloud-native and hybrid infrastructure environments, including
Kubernetes-based deployments.
Contribute to IaC tooling (Terraform, Ansible) to automate DB provisioning, scaling, and
configuration.
4. High Availability, Replication, and DR
Design and implement high availability (HA) setups using PostgreSQL streaming
replication, MySQL GTID-based replication, failover nodes, etc.
Ensure low-latency and durable replication strategies across regions and cloud
providers.
Define and test comprehensive backup/recovery and disaster recovery (DR)
procedures.
Minimize RTO/RPO for game-critical workloads.
5. Monitoring, Observability & Incident Response
Build and maintain observability dashboards using Datadog, Grafana, CloudWatch,
and native DB tools.
Set up and fine-tune alerts for key metrics (e.g., replication lag, query timeouts, cache hit
ratios).
Participate in 24/7 on-call rotations, triaging incidents and driving resolution.
Perform root cause analysis (RCA) and contribute to post-mortems for high-impact
issues.
6. Team Collaboration & CI/CD Integration
Work closely with DevOps, backend engineering, and SRE teams to ensure seamless
integration into CI/CD pipelines.
Provide tooling and standards for DB migrations, version control, and rollback safety.
Mentor engineers on DB best practices, data modeling, and query tuning.

We are a company committed to creating inclusive environments where people can bring their full, authentic selves to work every day. We are an equal opportunity employer that believes everyone matters. Qualified candidates will receive consideration for employment opportunities without regard to race, religion, sex, age, marital status, national origin, sexual orientation, citizenship status, disability, or any other status or characteristic protected by applicable laws, regulations, and ordinances. If you need assistance and/or a reasonable accommodation due to a disability during the application or recruiting process, please submit a request through the Human Resources Request Form. The EEOC "Know Your Rights" poster is also available.

To learn more about how we collect, keep, and process your private information, please review Insight Global's Workforce Privacy Policy: https://insightglobal.com/workforce-privacy-policy/ .

Required Skills & Experience

7+ years managing PostgreSQL and MySQL in production-grade environments.
Hands-on experience supporting OLTP systems with millions of concurrent requests.
Mastery of schema design, performance tuning, and deep query analysis.
Proficiency with AWS (RDS, Aurora), GCP (CloudSQL), and hybrid deployments.
Solid scripting knowledge (Bash, Python) for DB automation and incident tooling.
Experience with observability platforms and instrumentation.
Ability to handle incidents calmly under pressure and provide rapid resolution.
Knowledge of multi-region replication, HA, and DR strategies.

Nice to Have Skills & Experience

Prior experience in gaming infrastructure or other latency-sensitive, high-throughput
domains. This includes working on systems such as online multiplayer games, financial
trading engines, streaming media platforms, or real-time logistics and telemetry services.
Such environments require exceptional response times, resilience under concurrent
load, strict consistency models, and advanced troubleshooting practices to meet
demanding SLAs.
Familiarity with Kubernetes, containers, and DB orchestration in microservices
environments.
Exposure to infrastructure-as-code tools (Terraform, Ansible, Pulumi, Puppet).
Understanding of security, RBAC, AD, and compliance in cloud-hosted DBs.

Benefit packages for this role will start on the 31st day of employment and include medical, dental, and vision insurance, as well as HSA, FSA, and DCFSA account options, and 401k retirement account access with employer matching. Employees in this role are also entitled to paid sick leave and/or other paid time off as provided by applicable law.