Episode #3

How To Avoid Dumb Code Mistakes Part III

Continuing on from Part II, this episode focuses on reducing silly mistakes when programming. Do you truth-table logical possibilities to ensure you don't miss a critical branch in your logic? When you discover a bug, do you search the rest of your code for parallels that are caused by similar mechanisms or the same misunderstanding? And, lastly, are you running all those tests and linters you wrote on a regular basis?

August 09, 2020

Screencast.txt

Transcribed by Rugo Obi

In software, mistakes matter.

This is especially true for indie hackers because the buck stops with them... they have to take ultimate responsibility for any problems in their code.

My approach for dealing with this is to create a checklist of common dumb errors, and this is the third part of that series.

Tip 8: Enumerate Logical Possibilities (e.g. With A Truth Table)

Tip 8, have you enumerated all the logical possibilities?

The best way to do this is with a truth table. You’re looking at one right here from Wikipedia.

It's a table that summarizes how we can combine the values of two or more propositions using some sort of logical operator.

Here we have proposition 'P', and proposition 'Q'. And here’s the logical conjunction of them: the 'AND' operator. True & true is true, true & false is false, etc. etc.

Using a truth table acts as a sort of logical checksum that you've considered all the possibilities. If you failed to consider one of these, you probably have some sort of bug in your system.
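As a rough illustration (a Python sketch, not something shown in the screencast), the same AND table can be generated rather than written by hand. The function name `and_truth_table` is made up for this example.

```python
from itertools import product


def and_truth_table():
    """Return one (P, Q, P AND Q) row for every combination of two booleans."""
    return [(p, q, p and q) for p, q in product([True, False], repeat=2)]


# Print the four rows: TT, TF, FT, FF.
for p, q, result in and_truth_table():
    print(f"P={p!s:<5} Q={q!s:<5} P AND Q={result}")
```

Because the rows come from `itertools.product`, it is impossible to forget one of them, which is the whole point of the checksum idea.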

Say you're programming a job board, it has a filter for the degrees of commitment. You have a checkbox for part_time, and a checkbox for full_time.

The non-rigorous way to program this is to just dive right in. This is what I initially did back when I first built a job board like this.

Here's the code I had that was buggy.

The way this works is that it checks whether the part_time checkbox is ticked and, if it is, it filters for all the part_time ones.

Otherwise, if the checkbox for full_time is checked, it filters for all the full_time ones.

At first glance, that seems kind of plausible, but, in fact, it's buggy in a pretty serious way.

The problem is that it only considers half of the logical possibilities.

In this file, I've created a truth table of all the logical possibilities for combining those two booleans. Here are the possibilities.

In the first column is the full_time checkbox, and in the second column is the part_time checkbox.

So first is the possibility where both are checked. This is something I did not consider in my previous code.

Then there are the two possibilities where only one is checked, either the full_time or the part_time. The star * here indicates that I had considered it.

And finally, here's the case where neither is checked. I also did not consider that option.

Let's open up my original bad code to see how it fares when we consider it under the lens of the truth table.

So let's imagine case one where both checkboxes are checked.

In this case, my code will hit this line. Because the part_time checkbox is ticked, the part_time boolean is true and this condition evaluates to true. Therefore the $query is restricted to records where part_time is true. This bit, being in an else branch, is never executed.

The result of this is that records where the full_time attribute is ticked, but not the part_time one, will not be returned when both checkboxes are ticked. I.e., full_time jobs will not be returned from this query.

This is obviously a severe bug.
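The on-screen code is not part of this transcript, so here is a hypothetical reconstruction of the same shape of bug in Python. The job records, the `JOBS` list and the `buggy_filter` function are all invented for illustration; only the if/else structure follows the description above.

```python
# Made-up sample data: each job carries the two boolean attributes.
JOBS = [
    {"title": "Backend developer", "full_time": True, "part_time": False},
    {"title": "Weekend support", "full_time": False, "part_time": True},
]


def buggy_filter(jobs, full_time, part_time):
    """Filter jobs the 'dive right in' way: only two of four cases considered."""
    results = list(jobs)  # default: no filters, so all records
    if part_time:
        results = [job for job in results if job["part_time"]]
    elif full_time:
        # Never reached when part_time is also ticked.
        results = [job for job in results if job["full_time"]]
    return results


# Case one from the truth table: both checkboxes ticked.
both_ticked = buggy_filter(JOBS, full_time=True, part_time=True)
print([job["title"] for job in both_ticked])  # the full-time job is missing
```

With both boxes ticked, only the part-time record comes back, which matches the failure described above.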

In the new and improved code over here on the right, I have addressed this particular logical case by having a line for when full_time and part_time are both ticked.

In that case, I add no filters whatsoever. The idea being that no filters corresponds to all records.

As for the fourth logical case here, by random chance that happens to work with my existing code here, because when there are no filters I should return all records. And that's the default not shown in this particular function. So that works.

However, in my more rigorous code here, I've explicitly documented that case and made it clear to all of the programmers what's happening.
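A minimal Python sketch of what the more rigorous version might look like, assuming the same two boolean inputs. The names `JOBS` and `filter_jobs` and the sample records are hypothetical; the point is that each of the four truth-table rows gets its own explicit, commented branch.

```python
from itertools import product

# Made-up sample data: each job carries the two boolean attributes.
JOBS = [
    {"title": "Backend developer", "full_time": True, "part_time": False},
    {"title": "Weekend support", "full_time": False, "part_time": True},
]


def filter_jobs(jobs, full_time, part_time):
    """Handle all four combinations of the two checkboxes explicitly."""
    if full_time and part_time:
        # Both ticked: add no filter, which corresponds to all records.
        return list(jobs)
    if full_time:
        # Only full_time ticked.
        return [job for job in jobs if job["full_time"]]
    if part_time:
        # Only part_time ticked.
        return [job for job in jobs if job["part_time"]]
    # Neither ticked: also no filter, documented rather than left to chance.
    return list(jobs)


# Walk the whole truth table so that no row goes unexamined.
for full_time, part_time in product([True, False], repeat=2):
    titles = [job["title"] for job in filter_jobs(JOBS, full_time, part_time)]
    print(f"full_time={full_time!s:<5} part_time={part_time!s:<5} -> {titles}")
```

The first and last branches return the same thing, and they could be merged, but keeping them separate makes it obvious to the next reader that both cases were thought about.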

One last tip on truth tabling:

When dealing with true/false predicates, you’ll need to consider two to the power of n (2^n) possibilities to be complete, where n is the number of predicates. I.e., here we have two predicates; two to the power of two is four, so there are four different conditions to consider. Here we have three inputs, three predicates, so there are two to the power of three, or 8, possibilities to consider. And with four, we get two to the power of four, or 16, possibilities.
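To make the 2^n growth concrete, here is a small Python sketch that lists every combination for a set of named predicates. The helper `enumerate_cases` and the extra predicate names `remote` and `contract` are invented for the example.

```python
from itertools import product


def enumerate_cases(predicates):
    """Return one dict per True/False combination of the named predicates."""
    return [
        dict(zip(predicates, values))
        for values in product([True, False], repeat=len(predicates))
    ]


for predicates in (
    ["full_time", "part_time"],
    ["full_time", "part_time", "remote"],
    ["full_time", "part_time", "remote", "contract"],
):
    cases = enumerate_cases(predicates)
    print(len(predicates), "predicates ->", len(cases), "cases")
```

Each returned dict can double as a test input, so the list is also a ready-made checklist of scenarios to cover.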

Tip 9: When You Find A Bug, Proactively Look For Parallel Issues

When you fix a bug, have you searched through your code for parallel instances of that same bug?

Many of my bugs over the years have been due to a misunderstanding of some programming language concept or some API.

Therefore, when I find one mistake in my code, it's darn likely there are others caused by that same misunderstanding.

Therefore, I find it’s worth grepping through my codebase, or even through multiple codebases, for any signs of similar bugs after discovering the initial bug.

I can give you an example of this.

So here's my product page in Oxbridge Notes, and this is the product description here. You can see it has multiple paragraphs at the moment.

Anyway, one of the authors in my system sent me a complaint that whenever he created these descriptions in the dashboard for authors, his new lines between paragraphs got stripped out by my system. And therefore, it was just one big, ugly blob of text.

I traced this bug back to this line of code here (normalize_attributes) being active on the description field. It should not have been active on the description field.

What was happening behind the scenes here is that on this description field here, squish was being called automatically as one of the ways to normalize a field. Let me show you what that actually means in context.

So, here I have some code to set a description with some new lines. And if I print that subject.description, you can see that the new lines are preserved.

However, if I print subject.description.squish, the new lines are no longer there. In fact, let's show you what the description field looks like normally, and then with the squish... You can see they're gone.
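For readers without the screen in front of them, the effect can be approximated in a few lines of Python. This is a stand-in, not the Rails method itself: the function below mimics what squish does to ordinary whitespace (collapse every run of whitespace, newlines included, into a single space and trim the ends), and the sample description is made up.

```python
def squish(text):
    """Collapse all runs of whitespace into single spaces and trim the ends."""
    return " ".join(text.split())


description = "First paragraph.\n\nSecond paragraph."

print(description)          # paragraph break preserved
print(squish(description))  # everything on one line
```

That is fine for a title or a name, but it destroys exactly the structure a multi-paragraph description depends on.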

Removing the description field from the normalize_attributes macro here fixed the proximate problem. Luckily, I also thought to grep through the entire codebase and check every other place where I called normalize_attributes, to see whether I had other instances of this bug.

And sure enough, here in the tutor file, for example, I was also doing it to the tutor :description, which was a very similar system. Therefore I deleted that one too.

In checking for parallels like I did, I got two or three times as much bang out of my debugging time.
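A plain grep on the command line does this job well. Purely as a sketch of the same idea, here is a small Python version that reports every line mentioning a pattern across a source tree. The function names, the `*.rb` glob and the sample snippet in the comments are assumptions made for the example.

```python
import re
from pathlib import Path


def matching_lines(text, pattern):
    """Return (line_number, stripped_line) for each line matching the pattern."""
    regex = re.compile(pattern)
    return [
        (number, line.strip())
        for number, line in enumerate(text.splitlines(), start=1)
        if regex.search(line)
    ]


def search_tree(root, pattern, glob="*.rb"):
    """Search every file under root matching the glob; return (path, line_number, line)."""
    hits = []
    for path in sorted(Path(root).rglob(glob)):
        if not path.is_file():
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in matching_lines(text, pattern):
            hits.append((str(path), number, line))
    return hits


# Example call (commented out so the sketch has no side effects when loaded):
# for path, number, line in search_tree(".", r"normalize_attributes"):
#     print(f"{path}:{number}: {line}")
```

Each hit is a candidate for the same misunderstanding, so it is worth reading every one rather than just counting them.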

Tip 10: Run Your Linters And Tests Consistently

Tip 10, those linters and tests that you wrote, are you actually running them before every one of your deploys?

Tests slow you down. Sometimes they can take 5 minutes, 10 minutes, 20 minutes... God knows how long, in order to fully execute.

Time and time again, I see undisciplined programmers, including me when I'm misbehaving, trying to outrun these constraints, due to hubris and usually with disastrous consequences.

This is cowboy-coding.

Ultimately, you need to convince yourself that waiting those extra 10 minutes or so to deploy is worth it.

Many people, including myself, use little hacks to get around our impatient tendencies.

For example, on the Semicolon&Sons codebase, I'm only able to deploy after running my continuous integration tests in CircleCI.

Here we can see that once those tests pass, I’ll get a job to deploy on Heroku and migrate and all that kind of stuff.

Of course, it's always possible for me to deploy without this, I just have to rely on my own discipline not to find a way to deploy outside of the usual rules.

Another common technique to ensure that unit tests (or at least the fast tests) and linters always get run before every single push to GitHub is a git hook. For example, here's a pre-push git hook I used in one of my codebases.

This code essentially runs a whole lot of checks: rspec runs the tests; brakeman looks for, I think, security holes; Rubocop is a Ruby linter; ESLint is a JavaScript linter; Stylelint is a CSS linter; Shellcheck is a shell linter; and so on and so forth. Here's a scan for DEBUGGER_STATEMENTS that accidentally make it into production - you don’t want that. And then it only pushes if all of these checks pass.
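The hook itself is only visible on screen, so the following is a hedged Python sketch of the general pattern rather than the author's script. Git aborts a push when the pre-push hook exits with a non-zero status, so the sketch runs each check in order and stops at the first failure. The command lines in `CHECKS` are assumptions about how those tools might be invoked in a given project and would need adjusting.

```python
import subprocess
import sys

# Assumed invocations; adjust these to the project's real commands.
CHECKS = [
    ["bundle", "exec", "rspec"],
    ["bundle", "exec", "rubocop"],
    ["brakeman", "--quiet"],
]


def first_failure(checks):
    """Run each command in order; return the first that fails, or None."""
    for command in checks:
        try:
            completed = subprocess.run(command)
        except FileNotFoundError:
            # A missing tool counts as a failed check.
            return command
        if completed.returncode != 0:
            return command
    return None


def main(checks=CHECKS):
    """Return the exit status the hook should use: 0 to allow the push."""
    failed = first_failure(checks)
    if failed is not None:
        print(f"pre-push check failed: {' '.join(failed)}", file=sys.stderr)
        return 1
    return 0


# In an actual hook file, the last line would be: sys.exit(main())
```

To use something like this, the file would be saved as `.git/hooks/pre-push` with a Python shebang line and made executable. Stopping at the first failure keeps the feedback fast, at the cost of not reporting every problem in one run.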

Much to my dismay, I found that sometimes in GitHub, code that failed these tests would appear there. This was because a colleague managed to skip these git hooks.

That's because these git hooks usually get run when you do a standard git push, but there is an option here, --no-verify, that allows you to push without it, and my colleague used to use that, and that was extremely frustrating. It pissed me off. In my opinion, that is total abandon and arrogance to think that you're better than your tests, and it made me lose faith in their professional abilities.

Before concluding, I'd like to point out how these tips build on one another and overlap. For example, if you proofread all your commits, then you'll probably notice that the changes you made were not in the right file. Or if you executed every line of code you changed, you would probably have noticed the bug as well.

However, because we are imperfect humans, we do these overlapping checks in pursuit of some rigor. That's really the best we can hope for.

Thanks for watching.