Root cause analysis – Take 1

One of the processes we were planning to introduce was root-cause analysis for problems that come up along the way. As luck would have it, one of the project managers concluded today that he needed exactly that.

The trigger was a newly discovered bug. The three team members involved were asked to analyze what had happened, why, and how it could be prevented in the future. The task was given to them alone, but since I had already intervened by giving them the draft form I had prepared, one of them asked me to participate.

The first thing was to focus: they weren’t sure how to start. So I started asking questions. I already had the background on the defect, and my goal was to stay as far away from the technical details as possible.

Starting at the beginning: was the requirement clear? All agreed that it was, so we moved on to the design process and communication. All the participants were sincere and open. In the end we arrived at the following points:

  • The feature implementation was based on an engine developed in an earlier release, so the solution had to fit that existing implementation. Ironically, the engine was generic in its design, yet it did not anticipate this usage scenario.
  • The feature did not have a complete design, and the interfaces and content exchanged between the different components were not fully defined. The result was an impedance mismatch between components, which caused the bug.
  • Although parallel development was originally planned, the team members worked at different times. And although there was a feature lead (the one in charge of integrating the feature), there was no integrated team: just the lead, with part-time help from the others, who were busy with other things at the time.
  • To make sure the feature was implemented correctly, one of the team members needed to run a test procedure for it, but ran only parts of it.
  • The current design is not documented anywhere. But we can do brain surgery to extract it from people’s heads.

So, here are a few anti-patterns that came up:

  • Wrong reuse of a component – The engine was not modified to enable this feature; the solution was patched around it instead.
  • The design was not complete at the “last responsible moment”. No BDUF was needed, but when decisions were required, people proceeded on assumptions rather than agreement.
  • No focused development. No “done”-ness.
  • Current knowledge is not being documented.

Coming out of the discussion, I asked the project manager what he expected from this exercise. His answer: to make people stop and think, and to learn from this going forward.

The funny thing is that the conclusions mostly concern management. Apart from running the test procedure, which is the individual’s responsibility, everything else (reuse methodology, planning, focused communication and knowledge capture) is the responsibility of the team manager.

I’m going to fill in the form tomorrow and see how it looks. But apart from the documentation, I cannot see any action items, just suggestions for changing the process. It will be the project manager’s decision whether and how to implement them.
