Teach Programming Test-First

Fun & Innovative / 18 Mar 2022 / Alex Bolboaca

I have been looking throughout my career for ideas and practices that would make software developers happier and more productive. I have come to the conclusion that the main way to achieve this goal is to teach developers much earlier about techniques that improve flow and reduce mistakes. The fundamental technique that we have at this point to achieve these goals is Test Driven Development (TDD).

TDD has a mixed reputation within the industry. A small number of people who have practiced this technique for years swear by its benefits. Yet much of the research has produced mixed results. Moreover, many programmers have not heard about TDD, or actively resist using it.

I believe that the benefits of TDD are quite obvious: fast feedback, a continuous cycle of improvement, and building on top of solid ground because all the code written is covered by tests. The people who have practiced TDD for a long time will readily attest to these benefits and to the positive influence they have on their code and well-being.

But the same people immediately admit the main challenge of adopting TDD: it requires a different mindset than we’re used to as programmers. The way we learn programming has always pushed us to find the solution to a given problem in one go. TDD requires us to analyze a problem, split it into smaller parts, solve them in a cycle, keep improving the code while we’re at it, and trust that we will reach a good solution at the end.

I therefore believe that the research on TDD is inconclusive because it ignores the long time this mindset change takes. Again and again, research results contradict the daily experience of the people who have made TDD their own. And while it makes sense to look at the big picture from a research perspective, I don’t find it a good avenue for my goal of improving software development. I’d rather look at the people who are achieving results with TDD and learn from them.

So here’s a wild idea: if TDD requires a mindset change for people who already know programming, what if we taught programming by using TDD? After all, the TDD cycles are similar to the learning process – partially because TDD is a learning process. Beginners in programming would believe that TDD is part of programming, as many experienced TDDers keep saying it is. Programming constructs could be introduced organically, the way they originally appeared: to solve various problems, most of them related to code duplication. In addition, we could train the design sense that developers so badly need in order to produce software designs that serve their products better.

Well, I have done a few experiments around these ideas and the results were very promising. These experiments were limited in time and scope, and were performed empirically rather than scientifically, but I believe there are enough interesting observations to deserve more study.

I am of course wary of the possible ramifications of this way of teaching programming. One clear consequence relates to the limits in applicability of TDD, which we need to take into consideration. It is clearly the case that TDD is not the best fit for creating new algorithms. I wouldn’t expect a cryptography expert to test-drive a new encryption algorithm, or a data structures expert to test-drive a higher performance sorting algorithm. These domains require other approaches. But the converse is also true: most people who know programming work in software development, writing code that needs to work and be maintained for long periods of time, while using off-the-shelf algorithms, yet we still teach them TDD as a separate mechanic from programming.

I will end with this: it is my hope that more people will repeat these experiments more rigorously and report their results in a proper scientific manner. While I am open to supporting such efforts, I lack the necessary skills and time to perform these experiments on my own. I have decided to open up this wiki so that we can discuss the merits and shortcomings of this learning approach, in the hope that we can all learn from it.

The goal of software development education should be as much to create good habits as it is to teach software development practices. These good habits develop only with practice and time. Early practice will pay off with enormous interest, so I hope you will seriously consider the early practice of TDD for students.

Observations and Experiments

I have run a few experiments teaching programming to beginners by using a test-first / test-driven approach. Each session went as follows:

Sit / stand side by side with the student
Explain the minimal syntax for writing a unit test, and setting up the harness
Pick a simple problem to solve
Write a first test as an example to the student
Ask the student to write the code that would make the test pass in any way they find natural. This code is typically not syntactically correct, since they know too little about the syntax to write it properly – but that’s fine
Discuss the differences between the natural language and the programming language syntax
Run the test after each change in the code. Sometimes the code doesn’t compile, other times the test doesn’t pass; explain that this is fine
When the test passes, celebrate
Add more tests or
Look at the code and see if any improvements can be suggested
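
As an illustration only, here is a minimal sketch of what the code might look like a few rounds into such a session. It assumes Python with plain assert statements as the test harness and a hypothetical exercise, "double a number"; the function and test names are invented for this example and do not come from the actual experiments.

```python
# Hypothetical exercise: "double a number". All names are illustrative.

def double(n):
    # The student's version, once the syntax issues were sorted out.
    return n + n


def test_double_of_two_is_four():
    # First test, written by the teacher as an example.
    assert double(2) == 4


def test_double_of_five_is_ten():
    # Second test: a copy of the first one with the values changed.
    assert double(5) == 10


# Run the tests after every change; a failing assert stops the program.
test_double_of_two_is_four()
test_double_of_five_is_ten()
print("all tests pass")
```

Note that the second test is the first one copied with two values changed, which is about as much syntax as a beginner needs in order to add a test.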

We look for two types of improvements, inspired by the four elements of simple design described by Kent Beck:

improve names: it’s fine to use one-letter names or random names in the beginning, but once the tests pass we can discuss slight improvements to the names. We only strive for a small improvement, not for finding the best names
notice and remove duplication: we look at code patterns that repeat and discuss ways to remove duplication.

Removing duplication leads to a purposeful introduction of programming language constructs to the student. For example:

if a hard-coded number repeats multiple times, we introduce the concept of constant
if a few lines of code repeat multiple times, we introduce loops or functions
if a few functions use the same data, we introduce objects etc.
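
The three cases above could look roughly like the following sketch. It is written in Python as an assumption, and every function, constant, and class name is hypothetical; the "before" versions stand in for code a student might write first, and the assertions at the end play the role of the tests that keep passing while the duplication is removed.

```python
# Illustrative sketch; every name below is hypothetical.

# 1. A hard-coded number repeats -> introduce a constant.
def days_in_before(weeks):
    return weeks * 7

def full_weeks_before(days):
    return days // 7

DAYS_PER_WEEK = 7

def days_in(weeks):
    return weeks * DAYS_PER_WEEK

def full_weeks(days):
    return days // DAYS_PER_WEEK


# 2. A few lines repeat -> introduce a loop.
def total_before(prices):
    # Only works for exactly three prices.
    total = 0
    total = total + prices[0]
    total = total + prices[1]
    total = total + prices[2]
    return total

def total(prices):
    result = 0
    for price in prices:
        result = result + price
    return result


# 3. Several functions use the same data -> introduce an object.
def area_before(width, height):
    return width * height

def perimeter_before(width, height):
    return 2 * (width + height)

class Rectangle:
    def __init__(self, width, height):
        self.width = width
        self.height = height

    def area(self):
        return self.width * self.height

    def perimeter(self):
        return 2 * (self.width + self.height)


# The same checks pass before and after each step: the safety net.
assert days_in(2) == days_in_before(2) == 14
assert full_weeks(15) == full_weeks_before(15) == 2
assert total([1, 2, 3]) == total_before([1, 2, 3]) == 6
assert Rectangle(2, 3).area() == area_before(2, 3) == 6
assert Rectangle(2, 3).perimeter() == perimeter_before(2, 3) == 10
```

In each case the new construct arrives as the answer to a duplication the student has already noticed, rather than as syntax to memorize.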

The results of the experiments have been promising:

students are actively engaged with the learning process
writing the tests and the natural language version reduces the fear of failure
the programming language constructs are introduced naturally, with a clear purpose, and they make sense to the student
the tests provide a great safety net allowing the natural mistakes of learning to be limited in time and scope

Some of the possible challenges have proven to be less difficult than expected:

writing tests requires less knowledge of the language than writing code, so it’s actually quite easy for beginners to copy a test sample and make small adjustments
students find the test-first approach natural, since they don’t know another approach

It is important to clarify that these experiments have been very limited in scope and size (probably around 5 people), and have been performed empirically.

Common Disputes

When discussing this method, a few common disputes appear that we will address next.

Isn’t it difficult for students to do something extra on top of learning programming?

This objection comes from the mindset of people who learned programming first and TDD later, most likely without mastering the technique. Experienced TDDers will tell you that TDD is programming; it is not extraneous to it.

Moreover, from a beginner’s perspective, whatever they learn as programming is programming. Tests are not an extra thing to do, they are part of programming, because they help you solve problems while building on solid ground. In terms of flow, this beats the alternative of writing hundreds of lines of code and only then digging for errors.

Another argument is that tests are easier to write than code. Tests require less syntactic knowledge than writing the production code and are therefore easier to learn.

Finally, there’s an argument by analogy. If you were a prospective accountant, would you find it extra work to start from the fundamental idea of double-entry bookkeeping, or would you simply see it as normal accounting? As a person learning to ride a bicycle, would you complain about the extra weight of the training wheels? The same applies to tests.

What if you meet a problem that can’t be solved with TDD?

This is a valid concern, since we have little evidence on the limits of applicability of TDD. What we can say based on the industry experience with TDD is that most software development problems can be solved with TDD. However, problems of a highly algorithmic nature may require different approaches; notable examples include cryptography, high-performance sorting, or a Sudoku solver. It’s worth noting though that most software developers use off-the-shelf algorithms and rarely build their own. But when they do, they should be aware of the limits of TDD.

Hypotheses

Here are a few hypotheses that deserve further investigation:

The benefits of TDD reported by high performers can be obtained by other developers if TDD is taught early
A test-first approach to programming is more effective than the traditional one due to improved flow, organic introduction of programming constructs, and building of the duplication detection skill early on
Beginners in programming will have no resistance to learning through a test-first approach; in fact they will find it natural and easier

Remarks

The learning method can include more than writing tests first. For example, in the case of languages with manual memory management like C++, we can introduce a memory leak detector into the learning cycle. This allows students to find memory leaks early on instead of hunting them down after writing hundreds of lines of code.