Senior Site Reliability Engineer
Senior Site Reliability Engineer, Data Platform Engineering
Summary
The Wikimedia Foundation is looking for a Senior SRE to join our team, reporting to the Engineering Manager of the Data Platform Engineering SRE team. As a Senior SRE, you will be responsible for operating the systems supporting our data-oriented teams (Kubernetes, >6PB Hadoop, OpenSearch, Airflow, Superset, Kafka, etc.), helping design and implement new systems and solutions, and ensuring that our systems scale to meet demand. In this role, you will interact with our client teams, support them in whatever adventure they are on, investigate incidents, migrate services to Kubernetes, …
You are responsible for:
- Simplifying our operations by standardizing how we deploy services and how we benefit from virtualizing and containerizing our applications
- Supporting our users, removing roadblocks, and making them more productive!
- Monitoring of systems and services, optimization of performance, and resource utilization
- Proactively identifying sources of instability in distributed systems and analyzing how complex systems fail from a reliability and resilience perspective.
- Automation and streamlining of tasks, as well as identifying process gaps
- Collaborating with a global and asynchronously communicating team (don’t worry if you have never worked remotely, we’ll help you get used to it)
- Mentoring peers in your areas of technical and operational strength
- Traveling domestically or potentially internationally 2-3 times a year for team gatherings and conferences
Our backlog has even more details.
Skills and Experience:
- 5+ years of experience in an SRE/Operations/DevOps or software engineering role
- Experience with running applications and services at scale
- Proficiency with shell and a programming language used in an SRE/Operations engineering context (Python, Go, Ruby, etc.)
- Comfort with open source configuration management and orchestration tools (Puppet, Ansible, Terraform, etc.)
- Communicative technical English
- Virtualization of data and compute
Qualities that are important to us:
- Share our values, appreciate our code of conduct, support our team norms, and work in accordance with all three
- Customer-oriented. We’re here to help, not to block.
- Strong English language skills and ability to work independently, as an effective part of a globally distributed team
- Comfortable working in the open
- Passionate about supporting our communities
Additionally, we’d love it if you have:
- Experience with Kubernetes and Ceph
- Experience with operating a data platform
About the Wikimedia Foundation
The Wikimedia Foundation is the nonprofit organization that operates Wikipedia and the other Wikimedia free knowledge projects. Our vision is a world in which every single human can freely share in the sum of all knowledge. We believe that everyone has the potential to contribute something to our shared knowledge, and that everyone should be able to access that knowledge freely. We host Wikipedia and the Wikimedia projects,...
The Wikimedia Foundation is a charitable,...
*As an equal opportunity employer*,...diverse workforce,...
The Wikimedia Foundation is a remote-first organization with staff members including contractors based in 40+ countries. Salaries at...US$ 113082 to US$ 175725 ...
*Please note...countries:* Australia; Austria; Bangladesh; Belgium; Brazil; Canada; Colombia; Costa Rica; Croatia; Czech Republic; Denmark; Egypt; Estonia; Finland; France; Germany; Ghana; Greece; India; Indonesia; Ireland; Israel; Italy; Kenya; Mexico; Netherlands; Nigeria; Peru; Poland; Singapore; South Africa; Spain; Sweden; Switzerland; Uganda; United Kingdom; United States of America ...
Our non-US employees are hired through a local third party Employer of Record (EOR)...
All applicants can reach out...
If you are a qualified applicant requiring assistance or an accommodation...contact us at recruiting@wikimedia.org or +1 (415)8396885.
We are the nonprofit that hosts Wikipedia. We support the people, technology, and policies that enable reliable information to be shared with the world.