#48 Cutting Down "Toil" aka Manual Work in Software

Reliability Enablers

25-06-2024 • 44 mins

Sebastian and I scoured Chapter 5 of the Site Reliability Engineering (2016) book to find nuggets of wisdom on how to reduce toil.

We hit the jackpot with concepts like:

* what toil is, according to a 5-point set of criteria

* why you should even care about toil

* where you can find toil in your software system

* Google’s goal for how much work (%) should be toil

* the fact that toil isn’t always all that bad

Don’t have time to listen to what we learned and what we added to these concepts?

Check out the takeaways toward the end of this email.

But first…

Before we jump into the takeaways, here’s a new segment I’m trying out for newsletters. I’ll highlight a new reliability tool that I think could help you.

Do you struggle to visualize your Kubernetes workloads?

In that case, have you heard of kube-ops-view?

It helps you visualize your complex K8s clusters and everything inside them.

For a deeper rundown, visit the LinkedIn post I made about kube-ops-view, which shares a few more details.

Back to our original programming…

Here are key takeaways from our chat

* Define and Identify Toil

Regularly evaluate your tasks. Identify work that is manual, repetitive, and potentially automatable. Recognize it as toil and prioritize its reduction.

* Prioritize Automation

Look for repetitive tasks in your workflow and automate them using tools and scripts to reduce manual interventions and increase efficiency.

* Embrace the Role of an SRE

Realize that the role of an SRE is to improve system reliability proactively. Focus on long-term improvements rather than just responding to immediate issues.

* Address Common Sources of Toil

Identify frequent sources of toil like context switching, on-call duties, and release processes. Implement solutions to automate and streamline these areas.

* Adopt a Toil Elimination Mindset

Cultivate a mindset focused on eliminating toil. Regularly discuss and explore automation opportunities with your team to improve processes.

* Develop a Culture of Continuous Improvement

Encourage a culture that values reducing manual, repetitive work. Advocate for proactive problem-solving and continuous process enhancement within teams.
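To make the first takeaway a little more concrete, here is a minimal sketch of a toil audit in Python. It assumes you keep a rough list of recurring tasks with weekly hours. The `Task` fields, the sample tasks, and the helper names are all hypothetical and only for illustration. The three flags mirror the "manual, repetitive, and potentially automatable" test from the takeaway above, and the 50% cap reflects the goal stated in the SRE book's toil chapter. Treat it as a starting point, not a definitive method.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """One recurring piece of work (hypothetical structure for this sketch)."""
    name: str
    hours_per_week: float
    manual: bool
    repetitive: bool
    automatable: bool


def is_toil(task: Task) -> bool:
    # Simplified test: count a task as toil only if all three flags are set.
    return task.manual and task.repetitive and task.automatable


def toil_percentage(tasks: list[Task]) -> float:
    # Share of total weekly hours spent on tasks flagged as toil.
    total = sum(t.hours_per_week for t in tasks)
    if total == 0:
        return 0.0
    toil = sum(t.hours_per_week for t in tasks if is_toil(t))
    return 100.0 * toil / total


# Assumption: the cap on toil as a share of each SRE's time, per the SRE book.
TOIL_CAP_PERCENT = 50.0

# Made-up sample data, purely illustrative.
tasks = [
    Task("restart stuck batch job by hand", 6, True, True, True),
    Task("design new alerting strategy", 10, True, False, False),
    Task("copy release notes between tools", 4, True, True, True),
    Task("capacity planning review", 20, True, False, False),
]

share = toil_percentage(tasks)
toil_names = [t.name for t in tasks if is_toil(t)]
over_cap = share > TOIL_CAP_PERCENT

print(f"Toil share: {share:.1f}% (cap {TOIL_CAP_PERCENT:.0f}%)")
for name in toil_names:
    print(f"Automation candidate: {name}")
```

Running an audit like this once a quarter gives the team a number to discuss and a short list of automation candidates, ordered however you prefer (for example, by hours per week).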

Until next time, happy toil hunting!


