Job Description for Research Manager, Interpretability at Anthropic

About Anthropic

Anthropic’s mission is to create reliable, interpretable, and steerable AI systems that are safe and beneficial for users and society. The team includes researchers, engineers, policy experts, and business leaders focused on building beneficial AI systems.

About the Interpretability Team

Mission: Reverse engineer how trained models work.
Focus: Mechanistic interpretability to discover how neural network parameters map to meaningful algorithms.
Approach: Treat neural networks the way biology or neuroscience treats living systems, or as compiled binary programs to be reverse engineered.
Goals: Create a scientific foundation for mechanistically understanding neural networks and making them safe.
Research highlights include resolving "superposition" issues, decomposing models into interpretable components, and building circuits to understand model computation.
Notable publications and resources are available for deeper understanding.

About the Role

Position: Manager on the Interpretability team.
Responsibilities: Support expert researchers and engineers working toward a mechanistic understanding of large language models.
Importance: Accelerate research by managing team execution, careers, performance, relationships, and hiring.
Collaboration: Partner closely with an individual contributor research lead.
Note: Individual contributor roles (Research Scientist or Engineer) are available if preferred.

Responsibilities

Partner with research lead on direction, project planning, execution, hiring, and people development.
Maintain high standards for execution speed and quality; improve team processes.
Coach and support team members’ impact and career development.
Drive recruiting efforts including planning, process improvements, sourcing, and closing hires.
Identify and support collaboration opportunities across Anthropic teams.
Communicate updates and results to other teams and leadership.
Maintain deep understanding of technical work and AI safety implications.

Candidate Fit

Required:

Experienced manager (2–5 years) of technical research or engineering teams.
Background in machine learning, AI, or related technical field.
Enjoys people management; skilled in coaching, mentorship, performance evaluation, career development, and hiring.
Strong project management skills including prioritization and cross-functional collaboration.
Experience managing teams through ambiguity and change.
Quick learner with ability to understand complex technical topics; motivated to learn about Anthropic’s research.
Strong verbal and written communication skills.
Passionate about the transformative impact of advanced AI systems and about ensuring positive outcomes.

Strong Candidates May Also Have:

Experience scaling engineering infrastructure.
Experience with open-ended exploratory research agendas aimed at foundational insights.
Familiarity with mechanistic interpretability work.

Location Policy

Expected in San Francisco office 3 days per week.

Compensation

Annual Salary Range: $350,000 - $500,000 USD

Logistics

Education: At least a Bachelor’s degree in a related field or equivalent experience required.
Hybrid policy: Staff expected in office at least 25% of the time; some roles may require more presence.
Visa sponsorship is available but not guaranteed for every role or candidate; candidates who receive an offer get support from an immigration lawyer.

Additional Notes

Anthropic encourages applications even if candidates do not meet every qualification. The company values diversity and representation due to AI’s social/ethical implications. Safety tips regarding recruitment communications are provided.