Pick Learning Over Blaming

What opportunities has your organization missed out on by focusing on blaming rather than learning?

Image by 🆓 Use at your Ease 👌🏼 from Pixabay

In a recent post, I talked about the importance of learning from our mistakes. I recently stumbled upon this Reddit post (thanks to someone crossposting it on LinkedIn).

Here is the title of the Reddit post:

Accidentally destroyed production database on first day of a job, and was told to leave, on top of this i was told by the CTO that they need to get legal involved, how screwed am i?

I responded on LinkedIn, but I thought I would capture that response here for posterity, mostly because it is a real-world example of failing to understand what I was talking about in my most recent post.

My Initial Reaction on LinkedIn

This employee just dodged a bullet. Who wants to work at a company that is more interested in blaming than learning?

From the Reddit thread:

"Unfortunately apparently those values were actually for the production database (why they are documented in the dev setup guide I have no idea)."

This is the real problem. Firing the new employee is missing the real lesson.

Then there is lesson number two, which is probably more important.

"from what I can tell the backups were not restoring"

They should have frequent backups with an easy way to restore them, and they should test those restores (that seems like the missing part). Destroying the production database may be painful but should never be fatal. If it is, they'll have a lot of fun when the next wave of ransomware attacks hits...

Honestly, reading that thread, the CTO is the one who should be fired for his attitude.

"The CTO told me to leave and never come back. He also informed me that apparently legal would need to get involved due to the severity of the data loss. I basically offered and pleaded to let me help in some way to redeem myself and I was told that I 'completely fucked everything up'."

If the CTO were willing to learn from his mistakes, I would consider keeping him on, but as it stands, he should be the one to go. That company should also consider giving whoever replaces him some leadership training, because the existing CTO is showing about zero leadership skills.

Analysis

As you can see, this organization has at least three major flaws (in order from least important to most important):

  1. Their onboarding gives a new hire the ability to wipe out the database. This is totally unnecessary and preventable.
  2. They were not in the habit of testing their backups. The Reddit post doesn't come right out and say this, but it does say the restore failed. This implies that either they haven't been testing restores or their testing is inadequate.
  3. The most serious flaw is that they have a culture of blaming instead of a culture of learning. If they had a culture of learning, there would be plenty of lessons to learn from the first two flaws. As it is, they are doomed to repeat those same mistakes. Even if by some miracle they manage to address those two particular flaws, others will crop up and go unfixed because everyone will be afraid to report them for fear of getting fired.
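
The first two flaws are technical, and both can be guarded against cheaply. Here is a minimal sketch in Python of what that could look like. It is not what the company in the Reddit thread ran: the SAFE_HOSTS allowlist, the function names, and the use of SQLite as a stand-in database are all assumptions made for illustration.

```python
import sqlite3
from urllib.parse import urlparse

# Hypothetical allowlist: the only hosts a dev setup script may wipe.
SAFE_HOSTS = {"localhost", "127.0.0.1"}


def assert_safe_to_wipe(database_url: str) -> None:
    """Raise unless the URL points at a local development database."""
    host = urlparse(database_url).hostname
    if host not in SAFE_HOSTS:
        raise RuntimeError(f"Refusing destructive setup against host {host!r}")


def restore_drill(live: sqlite3.Connection) -> bool:
    """Back up `live`, restore the backup into a scratch database,
    and check that every table has the same row count after the restore."""
    backup = sqlite3.connect(":memory:")
    live.backup(backup)  # take the backup

    restored = sqlite3.connect(":memory:")
    backup.backup(restored)  # restore it somewhere disposable

    def row_counts(conn: sqlite3.Connection) -> dict:
        tables = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        return {
            table: conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
            for table in tables
        }

    return row_counts(live) == row_counts(restored)
```

With a guard like the first function in the setup script, copying production values out of the setup guide would produce an error message instead of an outage. The second function is the shape of a restore test: a backup only counts once it has been restored and checked. Comparing row counts is a deliberately weak check, and a real drill would use the team's actual backup tooling and run on a schedule.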

A Better Way

In the Reddit thread, someone mentioned an incident at Amazon that showed a much better way to handle this situation. I am generally not a fan of Amazon, but in this case, I have to give them credit.

For starters, their initial response after realizing what had happened was: "At this point, the focus shifted to preventing additional service impact and recovering the missing ELB state data." Contrast that with the Reddit post, where "The CTO told me to leave and never come back. He also informed me that apparently legal would need to get involved due to the severity of the data loss." The CTO said this before they had even fixed the symptoms.

Next, Amazon tried to prevent future issues: "We have made a number of changes to protect the ELB service from this sort of disruption in the future." The last sentence of their statement also talks about learning: "We will do everything we can to learn from this event and use it to drive further improvement in the ELB service."

Nowhere in there do they actually talk about blame. They do matter-of-factly mention what happened at the very beginning: "Unfortunately, the developer did not realize the mistake at the time." That's the only mention of the individual developer in the entire document. It is solely focused on learning from what happened and preventing it in the future. This is what we should all aspire to.

Want Help Building a Learning Culture?

We help companies create healthy human-centered development processes. Part of that is a focus on learning over blaming. If you struggle with that, we'd be glad to help. Let's set up a call.