Continuous Integration – Even testing tools have bugs
As part of my work in the first quarter of 2017, I’ve been spending a lot of time on CI/CD (Continuous Integration and Continuous Deployment). (I will be publishing a lot more on it in the near future, so stay tuned.)
If you aren’t familiar with CI/CD, here is a brief rundown. Automated tools run a test suite to validate that new code hasn’t broken any existing functionality. This happens automatically, so the developer doesn’t have to perform basic checks by hand for each new piece of code.
But therein lies the problem. A test suite is only as good as the way it was written. And it might be lying.
Testing the positive and the negative
The first rule of building a test suite is “boundaries”. When building a test suite, a developer should look for the correct response. But they should also test for potential incorrect responses, and make sure that errors are handled as well.
In testing parlance, the boundary concept provides a good starting point. If the application accepts an input, test what happens if that input is zero characters, 1 character, the maximum number of characters, and then the max + 1. Once those boundaries are tested, it’s a reasonable assumption that anything past the boundary will behave in the same way.
A side note: this isn’t always true. In its defense, however, Boundary Value Analysis has shown that testing the boundaries has more value than testing non-boundary conditions. It also provides a framework for testing that covers most conditions, leaving only the less likely non-boundary conditions to be handled case by case.
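To make the boundary idea concrete, here is a minimal sketch in Python. The validator `accepts` and the limit `MAX_LEN` are made up for illustration; they stand in for whatever input rule your application actually enforces.

```python
MAX_LEN = 20  # hypothetical maximum input length


def accepts(value):
    """Hypothetical validator: accepts between 1 and MAX_LEN characters."""
    return 1 <= len(value) <= MAX_LEN


def boundary_inputs(max_len):
    """Build the four boundary cases: zero, one, max, and max + 1 characters."""
    return {
        "empty": "",
        "one": "a",
        "max": "a" * max_len,
        "max_plus_one": "a" * (max_len + 1),
    }


# What we expect the validator to answer at each boundary.
expected = {"empty": False, "one": True, "max": True, "max_plus_one": False}

# What the validator actually answers.
results = {name: accepts(value) for name, value in boundary_inputs(MAX_LEN).items()}

assert results == expected, f"boundary mismatch: {results}"
```

Two of the four cases are expected to be rejected, and that is the point: the rejections are part of the specification, not an afterthought.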
Now, when testing the boundaries, the programmer will have conditions that are supposed to “fail”, and fail correctly. The test checks for that failure and reports OK when it happens.
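A test that passes because something failed can look odd at first, so here is a small sketch of one. The function `parse_age` and its 0 to 150 range are assumptions invented for this example, not part of any real system.

```python
def parse_age(text):
    """Hypothetical function under test: parse an age between 0 and 150."""
    age = int(text)  # raises ValueError for non-numeric input
    if not 0 <= age <= 150:
        raise ValueError(f"age out of range: {age}")
    return age


def rejects(text):
    """Return True when parse_age refuses the input, which is the 'correct failure'."""
    try:
        parse_age(text)
    except ValueError:
        return True
    return False


# The test reports OK precisely because the function failed as expected.
assert rejects("151")      # max + 1
assert rejects("-1")       # below the minimum
assert rejects("abc")      # not a number at all
assert not rejects("150")  # the maximum itself must still be accepted
```

The last assertion matters as much as the first three: a function that rejects everything would pass the failure checks too.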
Testing the tests
That’s the obvious part. But tests are only as good as the testing tools that are being used.
What if those tools are lying to you?
Many of the tools that we use also have integrated test suites. The first line of defense is to use the tools’ own test suites to make sure the tools are working correctly. Of course, those test suites are only as good as their authors, so bugs may still appear.
A comprehensive testing system, therefore, would also provide some dummy data for the tools to test. More importantly, this payload should contain a number of failure conditions that look like real usage and that the tools are expected to catch.
For example, a developer might have a test which checks that all fields on an HTML page are visible and enabled. If the test runs successfully, they are all valid. But what if the test is missing one of the input fields, for some reason? It would still report success even if that field were not in a valid state. The test doesn’t see the field, so it doesn’t test it.
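One way to guard against the invisible-field problem is to tell the test which fields it must see, and to treat a missing one as a failure. The sketch below uses Python’s standard `html.parser`; the field names, the sample page, and the rule that “valid” means not disabled and not hidden are all assumptions for illustration.

```python
from html.parser import HTMLParser


class InputCollector(HTMLParser):
    """Record every named <input> and whether it looks visible and enabled."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        name = attr.get("name")
        if name:
            # Assumption: "valid" means not disabled and not type="hidden".
            self.fields[name] = "disabled" not in attr and attr.get("type") != "hidden"


def check_fields(html, expected_names):
    """Return (missing, invalid): expected fields never seen, and seen fields in a bad state."""
    collector = InputCollector()
    collector.feed(html)
    missing = sorted(set(expected_names) - set(collector.fields))
    invalid = sorted(name for name, ok in collector.fields.items() if not ok)
    return missing, invalid


# Hypothetical page: "phone" is absent and "email" is disabled.
page = '<form><input name="user"><input name="email" disabled></form>'
missing, invalid = check_fields(page, ["user", "email", "phone"])
```

Without the `expected_names` list, the absent `phone` field would never show up in any report. With it, the test can only pass when `missing` is empty.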
Case in point
Our team manages the Developer Tutorials system on SAP.com. It’s a great setup. We use GitHub to host Markdown files and images for lots of tutorials. Our web server polls this system every 15 minutes and updates our website based on the contents.
Our system runs a test suite on each tutorial. Part of the test is a spelling checker, which is run on each tutorial before it is published. We use a testing tool called mdspell, from the project node-markdown-spellcheck. It works great.
We assumed, since it regularly caught spelling errors, that it was working fine.
Imagine our surprise when a bug report came in. Spelling errors in a tutorial. Hm. So we checked the tutorial, and yes, words were misspelled. We ran the Markdown file through the checker, and it passed. So we opened the file, and the spelling errors were still there.
Eventually, it became clear. There is a bug in the spelling checker. The tool ignores certain sections of text under certain specific conditions. And we didn’t realize it.
This was our mistake. We ignored a fundamental rule of testing. We assumed that a passing result meant everything was fine. What we should have done was create a test file (or better yet, several test files) pulled from our own working copies, but filled with known errors. Errors we know should be detected.
So, we are working to fix this, and adding a negative test, one that only succeeds when the checker reports the errors we planted, is part of the solution.
When writing test suites, be sure to include negative test samples. Make sure that your testing system fails when you expect it to.
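Here is a rough sketch of what such a negative test sample could look like. It does not call mdspell. `toy_checker` is a made-up stand-in with a deliberately planted bug (it stops reading at a `---` line), which is an invention for this example and not a description of the actual mdspell bug. The word list and the sample text are equally hypothetical.

```python
import re

# Hypothetical dictionary for the toy checker.
KNOWN_WORDS = {"the", "server", "polls", "every", "minute", "tutorial", "is", "published"}


def toy_checker(text):
    """Stand-in spell checker. Deliberately buggy: it skips everything after a '---' line."""
    flagged = set()
    for line in text.splitlines():
        if line.strip() == "---":
            break  # the planted bug: the rest of the file is silently ignored
        for word in re.findall(r"[a-z]+", line.lower()):
            if word not in KNOWN_WORDS:
                flagged.add(word)
    return flagged


def missed_canaries(checker, text, planted):
    """Return the planted misspellings that the checker failed to report."""
    return sorted(set(planted) - checker(text))


# A sample file seeded with errors we know should be detected.
CANARY = "The servr polls every minute\n---\nThe tutorail is published\n"
PLANTED = ["servr", "tutorail"]

missed = missed_canaries(toy_checker, CANARY, PLANTED)
if missed:
    print("Checker missed planted errors:", missed)
```

In a real pipeline, `checker` would wrap whatever tool you actually run, and the build would stop whenever `missed` is not empty. The useful property is that the check fails loudly when the tool goes quiet.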
TL;DR: Even testing tools have bugs. Make sure you look for them.