Applied Research Engineer
Location: San Francisco, United States / Remote (US)
Employment: Full-time
Department: Engineering - Backend
Experience: 6+ years
Salary: $180K - $250K USD per year
Equity: 1.00% - 1.50%
Technologies and Requirements
Amazon Web Services (AWS), C++, Go, Python, Rust, Torch/PyTorch, LLMs, US citizen/visa only
Overview
Zep is the memory and context layer for AI agents. As a Senior Applied Research Engineer, you will explore novel approaches to memory, context, and context generation, then own those ideas all the way to production. This is a research role with a strong applied focus. The company is looking for engineers who can run rigorous experiments, train and evaluate models, and ship production code that customers depend on.
How We Work
- Small, distributed team working closely together.
- Pair programming on hard problems.
- Design reviews.
- Learning is part of the job.
- Asking questions of customers and teammates, and questioning assumptions, is encouraged.
- Fix pain points when you find them.
- You are expected to ask questions early, push back when you disagree, and care about the people who use the API.
What You'll Do
- Explore novel approaches to memory, context, and context generation; define problems; run experiments; ship results.
- Own the path from research to production end to end, including dataset creation and curation, experiment design, evaluation, training and fine-tuning, and production deployment.
- Train, fine-tune, and evaluate models on Zep's domain.
- Build evaluation harnesses that catch regressions before shipping.
- Work with the model serving stack to operate inference at low latency and reasonable cost on AWS.
What We're Looking For
- 6+ years of production engineering with strong backend systems background; experience shipping services with real throughput and latency requirements.
- Master's degree in Computer Science or equivalent.
- Strong research skills including methodology, dataset creation/curation, experiment design and evaluation; ability to frame open problems and design experiments that answer questions.
- Hands-on experience with model fine-tuning; familiarity with transformer architectures, training and fine-tuning workflows, and evaluation; experience using PyTorch and OpenAI Triton for experimentation.
- Experience with model serving technologies such as vLLM, SGLang, or Triton Inference Server, including operating inference in production.
- Proficiency in Python plus one of Rust, C++, or Go for critical-path and performance-sensitive code (Python alone is not sufficient).
- Hands-on AWS experience in production including deployments, monitoring, scaling, cost/reliability tradeoffs.
Nice to Have
- Published or open-source work in retrieval, memory systems or LLM evaluation.
Tech Stack
Python, Rust/C++/Go, PyTorch, vLLM/SGLang, AWS.
This Role Is Probably NOT a Fit If
- You are an ML researcher or model trainer who hasn't shipped research to production.
- Your background is primarily Python application work without lower-level systems experience.
- You haven't operated production backend systems with real latency or throughput requirements.
Interview Process
- Screening Call with Daniel (Founder)
- Team Calls (2–3 hours back-to-back; may include a presentation)
- Decision Call with Daniel again
Zep assembles context from chat history, business data, and user behavior to build personalized, fast, and reliable agents via a unified context graph and simple APIs. It supports real-time applications with enterprise-grade compliance.