As I’ve mentioned before, I’m a huge fan of the Toyota Way. One of the biggest lessons I’ve taken from it is the Five Whys. The idea is that if you encounter a problem and ask “why” five times, you’ll arrive at the root cause of the issue. You can then focus on solving the root cause rather than the symptoms, which has a greater impact: if the root cause goes unsolved, the symptoms just reappear later.
There’s also what’s called a Kata. Kata is a Japanese term for an exercise you practice. The idea, from martial arts, is that if you practice the Kata enough, it becomes a reflex you use when you need it. It’s a wax-on, wax-off type of deal that can be applied to just about anything.
Both are well documented, so I recommend reading up on them on the wiki if you’re interested in learning more.
While this blog post can be read on its own without any prior knowledge, it is part of our Lessons in Factorio series. If this is your first time reading a post in the series, I recommend starting from the first one and working your way through them. The first post can be found here: Why I Learned from Factorio: Lean Networking
Applying Five Whys to Factorio.
In Factorio, the need to solve the root cause is a lot more obvious than in real life. This is mostly because all the work is visible as inventory, which makes identifying constraints pretty easy. This was great because practicing the Five Whys there as a Kata is what really built the habit of solving the root cause, a habit that applies to my day job as much as to Factorio.
Improving Throughput in Factorio.
When improving the throughput of my factory, I always follow the same workflow:
- Identify which science is the constraint.
- Do this by looking at your science labs. The science you’re missing in some of your labs is the constraint. If you increase the throughput there, you increase the throughput of your research.
- Trace back why that science isn’t being made faster.
- Look at the area where you are producing that science.
- If all the machines are active, simply build more and stop here.
- If they are not all active, determine which material is missing and…
- Trace back why that material isn’t being made faster or delivered quickly enough.
- Repeat until you’ve found the root cause.
- Increase the throughput of that material.
- Verify.
- Check to see if the throughput of that science has increased and, if so, whether it has increased enough.
- Repeat the whole process if you want to continue making improvements.
Root causes can be:
- Enough subcomponents being made, but not delivered. Need to improve transport or belting.
- Not enough components made, needs more automation machines, smelters, drills, etc.
- Not enough power.
This workflow is only for improving the throughput of your factory, not for developing new components.
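The tracing workflow above can be sketched in a few lines of Python. This is only a rough model under my own assumptions, not anything the game exposes: the `Area` class, its fields, and `find_root_cause` are all hypothetical names, and each area is assumed to be waiting on at most one upstream area.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Area:
    """A production area. Every field is a simplifying assumption, not game data."""
    name: str
    all_active: bool = True               # every machine in the area is working
    powered: bool = True                  # the area has enough power
    output_backed_up: bool = False        # product is piling up instead of leaving
    waiting_on: Optional["Area"] = None   # upstream area whose product is missing


def find_root_cause(area: Area) -> tuple[list[str], str]:
    """Walk upstream from the constrained science and return (path, fix)."""
    path = [area.name]
    while True:
        if not area.powered:
            return path, f"add power for {area.name}"
        if area.all_active:
            # Nothing is starved here, so capacity itself is the limit.
            return path, f"build more {area.name} machines"
        supplier = area.waiting_on
        if supplier is None:
            # Idle, but we did not record what is missing.
            return path, f"inspect {area.name} by hand"
        if supplier.output_backed_up:
            # Enough is being made upstream; it is just not arriving.
            return path, f"improve transport from {supplier.name} to {area.name}"
        area = supplier
        path.append(area.name)


iron = Area("iron plate")
gears = Area("gear", all_active=False, waiting_on=iron)
red = Area("red science", all_active=False, waiting_on=gears)

path, fix = find_root_cause(red)
print(" <- ".join(path))  # red science <- gear <- iron plate
print(fix)                # build more iron plate machines
```

The three return branches line up with the three kinds of root cause listed above: not enough power, not enough being made, and enough being made but not delivered.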
Applying the Five Whys to Solve More General Issues.
Interestingly enough, you can apply the Five Whys more broadly than this specific workflow. You can also use it to guide what you should do next to solve any problem.
Here’s an example:
- Why don’t we have enough power?
- Because it’s becoming a pain to build more.
- Why is it a pain to build more power?
- Because we need a lot of power.
- More steam at this point requires mining more coal, which we don’t have.
- Solar panels are available, but labor intensive.
- We don’t have nuclear researched.
- At this point, I’ve chosen to go down the solar panel why path. It may be worth going back and checking out the other two later if I’m not satisfied with fixing solar.
- Why is solar labor intensive?
- Because construction robots aren’t really an option.
- Why aren’t construction robots an option?
- Because they take a boatload of power and we don’t have any to spare.
Applying Kata to IT.
Having developed this Kata in Factorio, I found the habit naturally carried over into my work in IT, where it has served me well.
Tracing Technical Issues.
The workflow for tracing throughput issues in Factorio feels very similar to tracing bandwidth issues in networking.
Here’s an example of this:
- Identify which network path has a potential constraint.
- Get a source and destination IP.
- Trace back why bandwidth is being constrained.
- From the source, run a bandwidth test to the destination.
- If bandwidth looks fine, stop here and look into the application itself.
- Otherwise, it looks like it could be a network issue. Move on to the next step.
- Trace back which link is constraining the bandwidth.
- Repeat until you’ve found the root cause.
- Increase the throughput of that link if possible.
- Verify.
- Check to see if the throughput of that network path has increased and, if so, whether it has increased enough.
- Repeat the whole process again if you want to continue making improvements.
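Here is a minimal Python sketch of the decision in the steps above. It assumes you already have throughput numbers from whatever bandwidth test tool you use; the function name `find_constraining_link`, the link labels, and all the numbers are made up for illustration.

```python
from typing import Optional


def find_constraining_link(
    end_to_end_mbps: float,
    required_mbps: float,
    link_mbps: dict[str, float],
) -> Optional[str]:
    """Return the slowest link on the path, or None if the network looks fine."""
    if end_to_end_mbps >= required_mbps:
        # Bandwidth looks fine: stop here and look into the application itself.
        return None
    if not link_mbps:
        raise ValueError("no per-link measurements to compare")
    # Otherwise the path is only as fast as its slowest link.
    return min(link_mbps, key=link_mbps.__getitem__)


# Hypothetical per-link test results along one source-to-destination path.
links = {
    "laptop -> access switch": 940.0,
    "access switch -> core": 95.0,
    "core -> server": 9400.0,
}

print(find_constraining_link(90.0, 500.0, links))  # access switch -> core
```

When the end-to-end result already meets the requirement, the function returns `None`, which is the “stop here and look into the application” branch.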
Now, please don’t flame me for skipping steps or not going into detail. I’m trying to teach general techniques here, and there are better resources available if you’re trying to learn how to troubleshoot network issues.
Applying the Five Whys to IT.
You’ll need to be careful when using this. It’s really easy to start blaming the organization you’re working in for issues since most issues aren’t actually technical.
It is a good tool for taking work off your plate, though. It can also keep you focused on fixing issues at the point of most leverage, where a fix has the greatest impact.
An example of this is working to reduce the amount of after-hours calls I get:
- Why do I get so many calls after-hours?
- Because there are a lot of network disruptions.
- Why are there so many network disruptions?
- Infrastructure is aging to the point where failures are becoming normal, and there’s no redundancy.
- Why is there aging infrastructure with no redundancies?
- I started this job recently and haven’t yet had the time to prioritize working on it.
- Why haven’t I been able to prioritize working on updating the infra and building redundancies?
- I’ve been responding to network disruptions, and I’m the only one who can fix them.
- Why am I the only one who can resolve network disruptions?
- There’s not enough documentation and training for the rest of the team to handle it.
It may look a bit grim at the moment, but at least now we know what we need to do to break the cycle.
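If it helps to see the cycle, the chain of whys above can be written down as a small graph and walked in Python. This is just a sketch: the `whys` dictionary is my own paraphrase of the answers above, splitting the “no time” answer into its two causes is my reading of it, and `analyze` is a hypothetical helper.

```python
def analyze(whys: dict[str, list[str]], problem: str) -> tuple[list[str], list[str]]:
    """Follow every "why?" from problem and return (root causes, loop points)."""
    roots: list[str] = []
    loops: list[str] = []

    def visit(node: str, trail: list[str]) -> None:
        if node in trail:
            # We have come back to something we already asked about: a cycle.
            if node not in loops:
                loops.append(node)
            return
        answers = whys.get(node, [])
        if not answers:
            # No further "why" recorded, so treat this as a root cause.
            if node not in roots:
                roots.append(node)
            return
        for answer in answers:
            visit(answer, trail + [node])

    visit(problem, [])
    return roots, loops


whys = {
    "after-hours calls": ["network disruptions"],
    "network disruptions": ["aging infrastructure without redundancy"],
    "aging infrastructure without redundancy": ["no time to work on upgrades"],
    "no time to work on upgrades": ["network disruptions", "only I can fix disruptions"],
    "only I can fix disruptions": ["not enough documentation and training"],
}

roots, loops = analyze(whys, "after-hours calls")
print(roots)  # ['not enough documentation and training']
print(loops)  # ['network disruptions']
```

The loop point is where the firefighting feeds itself, and the root cause is the one item on the list that can be worked on without waiting for anything else to change.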
Thank you for checking this out. I hope you learned something new or enjoyed reading this. If you have any comments or questions, or just want to share your thoughts on this article, you can contact me at blog@e-mayhem.com
e-Mayhem helps companies successfully deliver business projects. We also help companies avoid losses associated with IT disruptions and security threats. You can learn more about our services at e-mayhem.com or by emailing sales@e-mayhem.com