Platform Engineering – Staff Site Reliability Engineer

at SolarWinds

Bangalore, India

Req ID: 200458

At SolarWinds, we’re a people-first company. Our purpose is to enrich the lives of the people we serve, including our employees, customers, shareholders, partners, and communities. Join us in our mission to help customers accelerate business transformation with simple, powerful, and secure solutions.

The ideal candidate thrives in an innovative, fast-paced environment and is collaborative, accountable, ready, and empathetic. We’re looking for individuals who believe they can accomplish more as a team and create lasting growth for themselves and others. We hire based on attitude, competency, and commitment. Solarians are ready to advance our world-class solutions in a fast-paced environment and accept the challenge to lead with purpose. If you’re looking to build your career with an exceptional team, you’ve come to the right place. Join SolarWinds and grow with us!

Your Role:

We are seeking a Staff Site Reliability Engineer (Infrastructure & Site Reliability Engineering) with extensive experience in AWS, Azure, Kubernetes, and GitOps to lead our Site Reliability Engineering (SRE) team. The successful candidate will have a deep understanding of SRE practices and a track record of implementing them to a high standard (SLAs, SLOs, proactive alert management, incident response and review, postmortems, etc.).

In this role, you will work with our SRE and cross-functional engineering teams to build and operate our development and production infrastructure.

Your Impact:

  • Work collaboratively with software engineering to define infrastructure and deployment requirements
  • Be the driving force behind our automation and observability initiatives
  • Build and maintain operational tools for deployment, monitoring, and analysis of cloud (AWS & Azure) infrastructure and systems
  • Lead the response to production incidents, conduct postmortems, drive continuous improvement, and participate in the on-call rotation
  • Establish and drive operations performance through SLOs
  • Provide project management, sprint planning, and road-mapping support to the SRE team
  • Apply expert-level technical skills and mentor team members
  • Follow the team practices that maximize our development velocity, including but not limited to continuous integration/deployment and code review via GitHub pull requests

Ideal Attributes:

  • Strong customer orientation
  • Excellent interpersonal and organizational skills
  • Attention to detail and focus on quality
  • Strong communication skills to effectively liaise with both technical and non-technical staff
  • Ability to act decisively and work well under pressure
  • Must be a collaborative problem solver
  • Strong bias for ownership and action

Your Experience:

  • 8+ years of experience designing, building, and maintaining SaaS environments
  • 5+ years of experience designing, building, and maintaining AWS/Azure infrastructure with Terraform
  • Experience building and running Kubernetes clusters
  • Experience with observability and monitoring (logging, tracing, metrics)
  • Experience with GitOps CI/CD processes
  • Experience scripting in Python, Go (Golang), Bash, or PowerShell, and working with AWS CLI tools
  • Experience with security operations: security policies, infrastructure, key management, and setup of encryption at rest and in transit

SolarWinds is an Equal Employment Opportunity Employer. SolarWinds will consider all qualified applicants for employment without regard to race, color, religion, sex, age, national origin, sexual orientation, gender identity, marital status, disability, veteran status or any other characteristic protected by law.

All applications are treated in accordance with the SolarWinds Privacy Notice: https://www.solarwinds.com/applicant-privacy-notice