Why Smart Engineers Write Bad Code.

Ever wonder why there is so much bad code out there? You are not alone.

James McNally recommended this book to me a few years ago. It took me a while to get around to reading it, but it was well worth it: an enjoyable and insightful read. The book examines the fundamental question of why we end up with bad code. You can define bad code in various ways: old code that people are afraid to change, code riddled with bugs and security flaws, projects that ran way over schedule and budget, etc. However you define it, it's clear that there is an abundance of bad code out there. How did we get here, and how do we fix things?

The book contains 11 chapters, and I took some notes as I went through each. I'll just share those here.

Chapter 1 - Early Days

This chapter discusses the early days of computing and the author's early forays into writing code. As you might imagine, in the early days there were lots of limitations, many of them imposed by the hardware. There was limited RAM and disk space, so things like long variable names and comments were considered wasteful. Programs were also much smaller, with fewer moving parts. There was also a lot of variation due to the different types of hardware: many of the languages and constructs were hardware-specific, and there wasn't a lot of interoperability. The chapter talks a bit about Assembly, BASIC, and Fortran. It also talks about the wide use of global variables and GOTO statements. The author also describes how difficult it was to write reusable code using functions in early versions of BASIC, due to the way functions were defined and the way variables were passed.

Chapter 2 - The Education of a Programmer

Chapter 2 talks about the author's experience getting a computer science degree from Princeton. The big point in this chapter is the disconnect between academia and industry. This chapter also documents the rise of Pascal and structured programming and the downfall of the GOTO statement. The author also points out the downsides of being self-taught, most notably that it encourages arrogance.

Chapter 3 - Layers

This chapter differentiates between the simple scripts that one might write in academia and the more complicated programs one might write in industry. The author makes a big deal out of APIs. He makes the point that many bugs are the result of a programmer improperly using an API because of miscommunication or misunderstanding between the API developer and the coder using it. He also makes the point that on larger projects, it's not just about your code working but about how your code interacts with other pieces of code and with other coders. Another topic of discussion is maintenance and the need for coders to communicate over time. He also talks about how seemingly small decisions can have rather large consequences.

Chapter 4 - The Thief in the Night

This chapter is all about the rise of the C programming language. The Thief in the Night references C's obsession with performance. In the name of performance, C makes some design decisions that give programmers plenty of rope to hang themselves with. Most of the problems occur due to the way C handles pointers, arrays, and strings. These decisions make it very easy for coders to write code susceptible to buffer overflows. The book does a great job of illustrating how a buffer overflow can be exploited using the Morris Worm as an example.
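
To make the buffer overflow idea concrete, here is a minimal sketch in Python rather than C. It is only a simulation and is not taken from the book: the flat "memory" layout, the flag byte, and the function names are all assumptions made up for illustration. The point is simply that a copy routine that never checks the destination size will happily write past the end of the buffer into whatever sits next to it.

```python
# Simulated flat memory: an 8-byte name buffer followed directly by a 1-byte flag.
# The layout and all names here are hypothetical, for illustration only.
BUF_START, BUF_SIZE = 0, 8
FLAG_ADDR = BUF_START + BUF_SIZE


def new_memory() -> bytearray:
    """Returns zeroed memory: 8 buffer bytes plus the flag byte (initially 0)."""
    return bytearray(BUF_SIZE + 1)


def unchecked_copy(memory: bytearray, data: bytes) -> None:
    """Mimics a C-style copy: writes every byte and never checks the buffer size."""
    for offset, byte in enumerate(data):
        memory[BUF_START + offset] = byte


def checked_copy(memory: bytearray, data: bytes) -> None:
    """Copies at most BUF_SIZE bytes, so neighboring memory stays intact."""
    for offset, byte in enumerate(data[:BUF_SIZE]):
        memory[BUF_START + offset] = byte


# Nine bytes of input aimed at an eight-byte buffer.
payload = b"AAAAAAAA\x01"

overflowed = new_memory()
unchecked_copy(overflowed, payload)  # the ninth byte lands on the flag

safe = new_memory()
checked_copy(safe, payload)  # the ninth byte is discarded

print(overflowed[FLAG_ADDR], safe[FLAG_ADDR])  # prints "1 0"
```

In real C the neighboring memory might hold a return address rather than a flag, which is what turns this kind of mistake into an exploit.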

Chapter 5 - Making It Right

This chapter talks a lot about bugs. It breaks things down into defects, faults, and failures. A defect is a line of code or piece of logic that is incorrect. When it gets executed it leads to some variable having the wrong value - that is the fault. The failure is when, due to the fault, the system misbehaves in a way that the user notices. It also gets into the disconnect between programmers and testers in the age of separate testing departments.
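
Here is a small sketch of how I understand the three terms. The example is my own, not the author's, and the function names and the pass/fail threshold are hypothetical. It shows the same defect producing a fault every time it runs, but a visible failure only some of the time.

```python
def average_score(scores: list[float]) -> float:
    total = 0.0
    # DEFECT: off-by-one in the loop bound, so the last score is never added.
    for i in range(len(scores) - 1):
        total += scores[i]
    # FAULT: for the inputs below, 'total' now holds the wrong value.
    return total / len(scores)


def passed(scores: list[float], threshold: float = 50.0) -> bool:
    """What the user actually sees: a pass/fail result."""
    return average_score(scores) >= threshold


# Fault without failure: the average is wrong (60 instead of 90),
# but the user still sees the correct answer, True.
print(passed([90.0, 90.0, 90.0]))

# Fault with FAILURE: the average is wrong (40 instead of 60),
# and the user sees False where the correct answer is True.
print(passed([60.0, 60.0, 60.0]))
```

The first call is the dangerous kind: the defect is there and has already corrupted a value, but nothing the user sees gives it away.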

Chapter 6 - Objects

This chapter talks about the rise of object-oriented programming. The author is somewhat skeptical. It talks about the rise of Simula, C++, Objective-C, and Smalltalk. Interestingly, the author makes the point that the Unix philosophy of connecting a bunch of disparate programs with pipes actually did a much better job of meeting the goals of OOP than OOP itself did.

Chapter 7 - Design Thinking

This chapter talks about the shift from trying to "test in" quality to trying to "design it in". It expands a lot on OOP and design patterns. It talks about the benefits of encapsulation and separation, but also the drawbacks. The biggest drawback is the communication that needs to happen between the user of a class and its designer. OOP enthusiasts claim that the user doesn't need to know how a class works, and yet sometimes knowing is necessary or useful. The discussion reminds me of Joel Spolsky's Law of Leaky Abstractions. On the benefit side, inheritance and interfaces do make unit testing much easier, which is a step in the right direction. Unit testing doesn't solve all problems, but it at least solves the ones that we can think of. The author also rails against premature optimization. He makes the point that by optimizing you intentionally make your code more complicated, because the essence of optimizing is finding special cases.

Chapter 8 - Your Favorite Language

Chapter 8 talks more about C buffer overflows and worms/viruses. There is some discussion about code pages and UTF. This leads into a discussion of error handling versus exception handling. The key differentiator is that with error codes the coder has to take explicit action to "do the right thing" (i.e. handle the error), whereas exceptions propagate automatically and force the coder to go out of their way to "do the wrong thing" (i.e. ignore the exception). It's all about the default behavior. Unchecked errors just get dropped on the floor. Exceptions get your attention. This all leads into a discussion on choosing and evaluating languages and the fact that this skill isn't really taught anywhere.
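
The difference in default behavior is easy to see in a few lines of code. This is my own sketch, not an example from the book, and the two save functions are hypothetical helpers written just for the comparison. Both try to write to a path whose parent directory does not exist.

```python
import os
import tempfile


def save_with_error_code(path: str, text: str) -> int:
    """C-style reporting: returns 0 on success and -1 on failure."""
    try:
        with open(path, "w") as handle:
            handle.write(text)
        return 0
    except OSError:
        return -1


def save_with_exception(path: str, text: str) -> None:
    """Exception-style reporting: a failure propagates to the caller."""
    with open(path, "w") as handle:
        handle.write(text)


# A path inside a directory that does not exist, so both saves will fail.
bad_path = os.path.join(tempfile.mkdtemp(), "missing", "notes.txt")

# Error code: nothing forces the caller to look at 'status'.
# If the next line were written without the assignment, the failure
# would be silently dropped on the floor.
status = save_with_error_code(bad_path, "hello")

# Exception: the caller has to write extra code just to keep going.
caught = None
try:
    save_with_exception(bad_path, "hello")
except OSError as exc:
    caught = type(exc).__name__

print(status, caught)
```

With the error code, ignoring the problem is the path of least resistance. With the exception, ignoring it takes a deliberate try/except.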

Chapter 9 - Agile

The author is just as skeptical of Agile as he is of OOP. In both cases, his skepticism seems to stem mostly from the fact that the biggest proponents had a conflict of interest: they were trying to sell their languages or consulting services, and that may have led to some bias. Some of his criticisms include that Agile is good for small projects but doesn't scale, and that Scrum is more about project management and relatively silent on technical skills (unlike XP). There is an interesting comparison between Agile and command-and-control style management. The author paints Agile as optimistic. Put the right people in the room. Empower them and get out of the way and good things will happen. Any problems we encounter can be fixed along the way. He contrasts that with the pessimism of command-and-control management. If left to their own devices, people will cause problems, so we need to anticipate every possible problem and try to design it out up front with policies and procedures.

Chapter 10 - The Golden Age

One of the author's major complaints with OOP, Agile, and many other fads is that none of them are really based on good scientific research. This complaint is woven throughout the book. In Chapter 10 he refers to the 70s as the Golden Age because there was a lot of research going on during that decade. He talks about how a lot of that research still applies today. Unfortunately, for a variety of reasons which he outlines, we slowly moved away from that research foundation. He does credit Steve McConnell's Code Complete as one of the few modern software development books to reference academic studies (perhaps I'll have to move that up my list of books to read).

Chapter 11 - Future

In this chapter, the author lays out a plan for "fixing" the profession of software engineering. By "fixing" he means making it more rigorous and more of an engineering discipline. Part of the premise of the book is that this "rigor" is what is missing and leads to all this bad code. His focus is on education.

Here are his top points for how to improve education. I'm just putting the bullet points here - if you want more information, you'll have to go read the book yourself.

  • Force students to learn something new
  • Work to level the playing field
  • Teach students to work with larger pieces of software
  • Emphasize writing readable code
  • Relocate certain well-understood topics
  • Pay attention to empirical studies
  • Set a goal of eventual verification and licensing

Overall Impression

Overall I liked the book. It was entertaining. I agree with most of his conclusions, though I'm not sure about the need for certification or licensing. The biggest disconnect I see between academia and industry is the type of projects students versus engineers work on. Students tend to work individually on small projects. Engineers tend to work collaboratively on much larger projects. The other big distinction is that many (if not all) student projects are greenfield projects that start from a blank slate, whereas most engineers are tasked with maintaining existing legacy applications. Those are really two different skillsets. What works in one arena does not necessarily work in the other.