Applied Research Engineer
San Francisco / Remote (US)
Full Time
4 hours ago
Senior Level, Engineering, Worldwide
Over $120K USD per year

Job Description

Zep AI

Agent Context Is Hard. We Fixed It.

Applied Research Engineer

$180K - $250K
1.00% - 1.50%
San Francisco, United States / Remote (US)
Job type: Full-time
Role: Engineering, Backend
Experience: 6+ years
Visa: US citizen/visa only
Skills: Amazon Web Services (AWS), C++, Go, Python, Rust, Torch/PyTorch, LLMs

Daniel Chalef, Founder

About the role

Zep is the memory and context layer for AI agents. As a Senior Applied Research Engineer, you'll explore novel approaches to memory, context, and context generation, then own those ideas all the way to production. This is a research role with a hard applied bent. We're not hiring ML researchers chasing publications. We're hiring engineers who can run rigorous experiments, train and evaluate models, and ship the result as production code our customers depend on.

How we work

We're a small, distributed team that works closely together. We pair on hard problems, review each other's designs, and treat learning as part of the job rather than something that happens after hours. We ask a lot of questions: of customers, of teammates, of our own assumptions. When we find pain, we go fix it. We expect the same back: ask questions early, push back when you disagree, and care about the people on the other end of the API.

What you'll do

  • Explore novel approaches to memory, context, and context generation. Define the problem, run the experiments, ship the result.
  • Own research to production end-to-end: dataset creation and curation, experiment design, evaluation, training and finetuning, and production deployment.
  • Train, finetune, and evaluate models on Zep's domain. Build the eval harnesses that catch regressions before they ship.
  • Work with our model serving stack to operate inference at low latency and reasonable cost on AWS.

What we're looking for

  • 6+ years of production engineering with a strong backend systems background. You've shipped services with real throughput and latency requirements.
  • Master's in Computer Science or equivalent.
  • Strong research skills: methodology, dataset creation and curation, experiment design, and evaluation. You can frame an open problem and design experiments that actually answer the question.
  • Hands-on experience with model finetuning. Working familiarity with transformer architectures, training and finetuning workflows, and evaluation. PyTorch and OpenAI Triton for experimentation.
  • Working experience with model serving technologies: vLLM, SGLang, or Triton Inference Server. You've operated inference in production.
  • Python, plus high proficiency in one of Rust, C++, or Go. You can work in critical-path code and on performance. Python-only is not enough.
  • Hands-on AWS experience in production: deployments,...
About Zep

Zep assembles context from chat history, business data, and user behavior to build personalized, fast, and reliable agents via a unified context graph and simple APIs. It supports real-time applications with enterprise-grade compliance.
