Staff Software Engineer, Reliability - Slack
Denver, CO / Reno, NV / Philadelphia, PA / Phoenix, AZ / Tampa, FL / Manchester, NH / Salt Lake City, UT / Atlanta, GA / Remote, OR / Irving, TX / Detroit, MI ...
Posted 1 day ago
Job Description

To get the best candidate experience, please consider applying for a maximum of 3 roles within 12 months to ensure you are not duplicating efforts.

Job Category

Software Engineering

Job Details

About Salesforce

We're Salesforce, the Customer Company, inspiring the future of business with AI + Data + CRM. Leading with our core values, we help companies across every industry blaze new trails and connect with customers in a whole new way. And, we empower you to be a Trailblazer, too - driving your performance and career growth, charting new paths, and improving the state of the world. If you believe in business as the greatest platform for change and in companies doing well and doing good - you've come to the right place.

About the team
The Reliability and Incident Automation team builds tools and products that underpin Reliability, Service Ownership and Incident Management at Slack.

We seek diverse perspectives and strategies with a focus on how to keep Slack reliable, empower service owners and learn from incidents. We collaborate with product and infrastructure engineering teams to continuously improve shared technology and processes, and maintain incident management as a foundational skill set of all engineering teams at Slack.

Slack has a positive, diverse, and supportive culture. We want people who are curious, inventive, and inspired to do their best work every single day. In our work together we aim to be smart, humble, hardworking and, above all, collaborative. If this sounds like a good fit for you, please apply and connect with our team.

What you will be doing:

  • Lead engineering development on internal products and tools with a focus on prototyping and iteration for high velocity. Engage with teams and users to build features that have a delightful user experience and make their lives better.

  • Build tooling and services that handle failure gracefully, without interrupting incident response, in an environment that demands rock-solid reliability. These tools interact with a variety of external systems, such as Observability, Monitoring, Alerting and Ticketing, to provide real-time information to incident responders.

  • Provide mentorship and guide the team forward through technical expertise.

  • Facilitate and participate in incident investigations and reviews (aka postmortems) for major incidents at Slack and drive program improvements for Incident Analysis and Review across Slack Engineering.

  • Run training and workshops to teach Incident Responders and Commanders across Slack about the principles of incident management and the tactical ways in which we perform incident response. Be a peer and mentor to engineers who are new to on-call work and various roles in incident response.

  • Be a service owner for the software and tooling we write and develop. You will participate in an on-call rotation, assist with triage, address production issues, and respond to incidents. Participate as an Incident Commander at Slack.


What you should have:

  • You have 7+ years of experience in Reliability, Incident Management and/or operating distributed systems at scale.

  • You have experience with functional or imperative programming languages - e.g., PHP, Python, Ruby, or Go.

  • You write understandable, testable code with an eye towards maintainability.

  • You are a strong communicator with a positive attitude and empathy. Explaining complex technical concepts to designers, support, and other engineers is no problem for you.

  • You possess strong computer science fundamentals: data structures, algorithms, programming languages, distributed systems, and information retrieval.

  • Strong UX and design sensibilities, and a desire to sweat the small stuff.

  • Self-awareness and a desire to continually improve.

  • Experience with large scale distributed systems and cloud-based environments.

  • You enjoy helping onboard new team members, mentoring, and teaching others.

  • You have a Bachelor's degree in Computer Science, Engineering or related field, or equivalent training, fellowship, or work experience.


Bonus points:

  • You are passionate about Site Reliability Engineering (SRE), Resilience Engineering and Learning from Incidents

  • Experience building tools or applications with Python and Go

  • Curiosity for gaining valuable insights via analytics and metrics

  • You have experience in responding to and coordinating incidents in previous roles

Accommodations

If you require assistance due to a disability when applying for open positions, please submit a request via this Accommodations Request Form.

Posting Statement

At Salesforce we believe that the business of business is to improve the state of our world. Each of us has a responsibility to drive Equality in our communities and workplaces. We are committed to creating a workforce that reflects society through inclusive programs and initiatives such as equal pay, employee resource groups, inclusive benefits, and more. Learn more about Equality at www.equality.com and explore our company benefits at www.salesforcebenefits.com.

Salesforce is an Equal Employment Opportunity and Affirmative Action Employer. Qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender perception or identity, national origin, age, marital status, protected veteran status, or disability status. Salesforce does not accept unsolicited headhunter and agency resumes. Salesforce will not pay any third-party agency or company that does not have a signed agreement with Salesforce.

Salesforce welcomes all.

For Colorado-based roles, the base salary hiring range for this position is $185,800 to $269,500.

Compensation offered will be determined by factors such as location, level, job-related knowledge, skills, and experience. Certain roles may be eligible for incentive compensation, equity, and benefits. More details about our company benefits can be found at the following link: https://www.salesforcebenefits.com.
Salesforce.com and Salesforce.org are Equal Employment Opportunity and Affirmative Action Employers. Qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender perception or identity, national origin, age, marital status, protected veteran status, or disability status. Headhunters and recruitment agencies may not submit resumes/CVs through this Web site or directly to managers. Salesforce.com and Salesforce.org do not accept unsolicited headhunter and agency resumes. Salesforce.com and Salesforce.org will not pay fees to any third-party agency or company that does not have a signed agreement with Salesforce.com or Salesforce.org.

 

Job Summary
Start Date: As soon as possible
Employment Term and Type: Regular, Full Time
Required Education: Bachelor's Degree
Required Experience: 7+ years