AI Performance Software Engineer Job at Signify Technology, San Francisco, CA

  • Signify Technology
  • San Francisco, CA

Job Description

AI Performance Engineer – CUDA & PyTorch Focus

Location: San Francisco, CA

Compensation: $200,000-$300,000

A stealth-mode AI systems company is reimagining how large-scale inference is done. With generative AI workloads scaling rapidly, inference efficiency has become a critical bottleneck. We're building an integrated hardware-software platform that brings breakthrough performance and usability to production-scale LLM applications.

This is an opportunity to work on a highly technical team spun out of top-tier academic research, focused on the cutting edge of AI, distributed systems, and performance optimization.

What You’ll Do:

  • Drive core research and implementation of performance optimizations for modern AI models
  • Implement advanced techniques like FlashAttention, KV caching, quantization, and model compression
  • Design and build scalable, distributed compute strategies across GPU-based systems
  • Profile, benchmark, and optimize CUDA kernels and AI runtime performance across inference stacks
  • Work across frameworks like PyTorch, ONNX, and vLLM to improve end-to-end efficiency

What We're Looking For:

  • Strong background in CUDA and low-level GPU performance tuning
  • Proven experience building with PyTorch and deploying high-performance ML models
  • Proficiency in Python and C++
  • Experience with large-scale distributed systems in cloud environments (AWS, GCP, or Azure)
  • Exposure to AI compilers or frameworks like MLIR is a plus
  • Interest in system design, scalability, and accelerating LLM workloads in real production environments

If you’ve spent your time making large models faster, leaner, and more efficient—and want to solve hard technical problems at the core of GenAI infrastructure—this role is for you.

Reach out to learn more.
