Specification and good specification: What’s the difference?

This is a guest post by Thomas Sundberg

What makes one specification a bad specification and another specification a good specification? What is the fundamental difference between two specifications?

Software is special

One important difference between physical things and programs is that physical things are complicated to test. Physical things usually need special tools to make it possible to test them automatically. Otherwise, they are tested manually.

Manual testing is expensive and a lot of work is done to automate testing. Perhaps even more in the physical world than in the world of computers.

Software is easy to test. It is also cheap to test. Running a test suite doesn’t cost a lot of money.

The biggest problem is that so few developers write testable code. Writing testable code is hard, and it is even harder if you start from the wrong end, adding the tests last. Writing testable code is easier if you work test first. Often really easy. Unfortunately, it is still a skill many developers have to learn.

A good specification, however, will support writing testable code and will act as an acceptance test.

Easy to understand

A good specification is a specification that isn’t ambiguous. It is also inclusive. Anyone who can understand the problem should be able to read the specification and say “Yeah, that’s about right”.

Being inclusive also implies that the reader should not have to learn a special skill, such as programming, to be able to understand the specification.

The language must be as natural as possible. The room for interpretation must be so small that it really doesn’t exist. Concrete examples are nice because if they are valid, you can’t argue with them. A good specification contains good examples that illustrate what should work when we are done implementing something.


A good specification can be executed. If the execution passes, then you can assume that the specified, wanted behavior is implemented and works.

A programming language can be used to specify the behavior of a program in such a way that it is executable. But a programming language is not inclusive. Just because you can read natural language doesn’t mean that you can read the code for a computer program.

Reading and understanding a programming language is a skill most people don’t have. It is difficult to read and understand a program where the core intent may be hidden behind a lot of details. The aim is a format that anyone can read and understand. Most people are able to understand short examples. This helps to achieve an inclusive specification, while still being executable.  


A formal language called Gherkin fulfills both requirements of a good specification. It is:

  • Inclusive – everyone can read and understand examples
  • Executable – it is easy to transform examples specified using Gherkin to a programming language and execute them

Good specifications can be expressed using Gherkin.

Gherkin examples

An example of Gherkin may look like this:

Feature: Refund item

  Scenario: Jeff returns a faulty microwave
    Given Jeff has bought a microwave for $100
    And he has a receipt
    When he returns the microwave
    Then Jeff should be refunded $100

It follows a strict formal format and can be executed, yet it is human-readable. I won’t describe it; you can read it for yourself.

This example has a few nice properties.

  • It exemplifies a rule: Jeff is allowed to return the microwave. The rule as such isn’t expressed, but there is an example of a situation where a return is allowed.
  • It doesn’t give away any implementation details, such as whether this is a web application. You can’t tell from the example.

This example alone is not sufficient for creating a complete system for returning goods. But it exemplifies one thing that should work.
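To make "possible to execute" a little more concrete, here is a minimal sketch of how the refund example could be bound to code. It is a hand-rolled illustration in Python, not the API of Cucumber or any other real tool. The names `step`, `run_scenario` and the `ctx` dictionary are hypothetical, and the refund logic is an assumption that is just large enough to make the example pass.

```python
import re

# Hypothetical registry: pairs a regex for a step's text with the function that performs it.
STEPS = []

def step(pattern):
    """Register a function as the implementation of steps matching `pattern`."""
    def register(func):
        STEPS.append((re.compile(pattern), func))
        return func
    return register

@step(r"(\w+) has bought a (\w+) for \$(\d+)")
def bought(ctx, customer, item, price):
    ctx["purchases"] = {item: int(price)}
    ctx["receipt"] = False
    ctx["refund"] = 0

@step(r"he has a receipt")
def has_receipt(ctx):
    ctx["receipt"] = True

@step(r"he returns the (\w+)")
def returns(ctx, item):
    # Assumed rule for this sketch: a refund is paid only when a receipt is presented.
    if ctx["receipt"]:
        ctx["refund"] = ctx["purchases"].pop(item)

@step(r"(\w+) should be refunded \$(\d+)")
def refunded(ctx, customer, amount):
    assert ctx["refund"] == int(amount), f"expected ${amount}, got ${ctx['refund']}"

def run_scenario(text):
    """Run each Given/When/Then/And line; raise if a step is undefined or fails."""
    ctx = {}
    for line in text.strip().splitlines():
        keyword, _, body = line.strip().partition(" ")
        if keyword not in {"Given", "When", "Then", "And", "But"}:
            continue  # skip Feature/Scenario headers and blank lines
        for pattern, func in STEPS:
            match = pattern.fullmatch(body)
            if match:
                func(ctx, *match.groups())
                break
        else:
            raise LookupError(f"undefined step: {body}")
    return ctx

SCENARIO = """
Feature: Refund item

  Scenario: Jeff returns a faulty microwave
    Given Jeff has bought a microwave for $100
    And he has a receipt
    When he returns the microwave
    Then Jeff should be refunded $100
"""

ctx = run_scenario(SCENARIO)  # raises AssertionError if the behavior is not implemented
```

The readable text stays the specification. The step functions are the only place where a reader needs programming skills, and they can be swapped for a web driver or a direct call into the domain code without touching the example.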

Another example of Gherkin may look like this:

Feature: Search feature for users
  This feature is very important because it will allow users to filter products

  Scenario: When a user searches, without spelling mistake, for a product name present in inventory. All the products with similar name should be displayed

    Given User is on the main page of www.shop.com
    When User searches for phones
    Then should the search page should be updated with the lists of phones

Is this a good or bad example?

In my opinion, it is a bad example because:

  • The scenario headline is long and rambling. It doesn’t even fit my screen. I have to scroll to read it.
  • It is generic: it talks about “User”. It is probably a user that uses the application, but it is much nicer to talk about a person. Someone you can picture in front of you. Or create an image of and post on the team wall.
  • It tells me about the technology. I can tell that this is a web application, which is uninteresting if I want to understand how the system is supposed to work. It is easy to get lost among technical details about a web application when the interesting question is “Is this example valid?”
  • It has, maybe, incidental details. Is it important if the spelling is correct or not? What happens if the search criteria are misspelled? Should no result be returned? Do we need to implement a spell checker to know when the search word is spelled correctly? I don’t think the spelling matters. And yet it is mentioned.
  • I wouldn’t choose the Feature description that states “This feature is very important because it will allow users to filter products”. That the feature is important is obvious to me, since it is being created. It wouldn’t exist unless it was important. This is perhaps not the main reason why I think this is a bad example; it just adds to my opinion.

Good Gherkin and bad Gherkin

Gherkin is a nice, formal language that can be used to create a good specification. But as with any tool, it can be used badly. It is possible to create really bad specifications using Gherkin. The difference between good and bad Gherkin is in the details.

Both examples above follow the same format. They are both relatively short: one is four lines long and the other is three. Clearly, you can’t use the length to determine if an example is good or bad. That is at least the case with short examples. Long examples, say longer than 5–7 lines of Gherkin, are usually not good because they are unfocused and contain many details about the execution rather than the desired behavior.

One difference is that the first example is more concrete than the last example. The last example is a bit generic, and generic is bad in this domain. Examples should be very concrete. Examples that are too generic make bad specifications.

An example of bad Gherkin can be when you specify a web application and the examples you create talk about details seen on the screen rather than the desired behavior. Screen details are important, but talking about a specific button or link does not describe the behavior the user should experience. We don’t see that in the web example above. But it would be easy to add and it would make the example worse.

The user is interested in carrying out a specific task, not navigating a web application. The example must therefore talk about the goal of the user and the business rules that can be applied. An example may be a user who wants to return an item they bought. There are rules regarding item returns that the application must support. The customer must be able to present a receipt. The item may not have been purchased a long time ago. These are examples of rules that are important and will remain the same whether the application is a web application or a manual process in a store. Talking about screen details will obscure the important behavior and shift the focus to implementation details.
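The two return rules mentioned above can be written down without any reference to screens, buttons or links. The sketch below is only an illustration in Python. The names `Purchase` and `can_return` are hypothetical, and the 30-day window is an assumption, since nothing here says how long "a long time ago" is.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Assumption for illustration only: the real limit would come from the business.
RETURN_WINDOW = timedelta(days=30)

@dataclass(frozen=True)
class Purchase:
    item: str
    price: int
    bought_on: date

def can_return(purchase, has_receipt, today):
    """Apply the two rules: a receipt is required, and the purchase must be recent."""
    if not has_receipt:
        return False
    return today - purchase.bought_on <= RETURN_WINDOW

microwave = Purchase("microwave", 100, bought_on=date(2024, 1, 10))

print(can_return(microwave, has_receipt=True, today=date(2024, 1, 20)))   # True
print(can_return(microwave, has_receipt=False, today=date(2024, 1, 20)))  # False
print(can_return(microwave, has_receipt=True, today=date(2024, 3, 1)))    # False
```

Nothing in this function knows whether it is called from a web page or by a clerk at a counter, which is the same property a good example should have.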

Trigger discussions

Examples may also trigger interesting questions such as: “How should the system behave when a customer wants to return an item but doesn’t have a receipt?” Is it possible? It might be possible under some circumstances.

Good examples trigger interesting discussions about the behavior the system should support. It is easy to lose sight of the behavior when navigation details are discussed.

Remember, software development is about learning. One way of learning is discussing a problem. Examples may be the best way to bring down the discussion from a very high altitude where it is easy to talk about generic behavior and therefore be ambiguous.


A good specification is:

  • Easy to understand
  • Executable
  • Acting as acceptance criteria

Expressing examples using Gherkin doesn’t make the examples good specifications. It is possible to be too generic when using Gherkin and therefore miss the opportunity to create a good, executable specification. Therefore, the advice is to look carefully at the way you use Gherkin or any other method of expressing a specification.




I would like to thank Malin Ekholm and Alex Bolboaca for feedback and proofreading.

Interested in learning more about BDD? Have a look at the BDD with Cucumber workshop Thomas is holding in Bucharest. 
