Take Advantage of Code Analysis Tools

The value of testing is something that is drummed into software developers from the early stages of their programming journey. In recent years the rise of unit testing, test-driven development, and agile methods has seen a surge of interest in making the most of testing throughout all phases of the development cycle. However, testing is just one of many tools that you can use to improve the quality of code.

Back in the mists of time, when C was still a new phenomenon, CPU time and storage of any kind were at a premium. The first C compilers were mindful of this and so cut down on the number of passes through the code they made by removing some semantic analyses. This meant that the compiler checked for only a small subset of the bugs that could be detected at compile time. To compensate, Stephen Johnson wrote a tool called lint — which removes the fluff from your code — that implemented some of the static analyses that had been removed from its sister C compiler. Static analysis tools, however, gained a reputation for giving large numbers of false-positive warnings and warnings about stylistic conventions that aren't always necessary to follow.

The current landscape of languages, compilers, and static analysis tools is very different. Memory and CPU time are now relatively cheap, so compilers can afford to check for more errors. Almost every language boasts at least one tool that checks for violations of style guides, common gotchas, and sometimes cunning errors that can be difficult to catch, such as potential null pointer dereferences. The more sophisticated tools, such as Splint for C or Pylint for Python, are configurable, meaning that you can choose which errors and warnings the tool emits with a configuration file, via command line switches, or in your IDE. Splint will even let you annotate your code in comments to give it better hints about how your program works.
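
As a sketch of what that configurability looks like in practice, the following hypothetical Python module quietens one Pylint message with an inline pragma; the module, function, and message names are illustrative, and the exact switches vary between Pylint versions.

    # example.py -- an illustrative module, not taken from the text above.
    # The inline pragma asks Pylint to keep quiet about one specific message
    # on that line; a configuration file or a command-line switch such as
    #     pylint --disable=unused-import example.py
    # can do the same thing project-wide.
    import os  # pylint: disable=unused-import

    def read_config(path):
        """Return the raw contents of a configuration file."""
        with open(path) as handle:
            return handle.read()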

If all else fails, and you find yourself looking for simple bugs or standards violations which are not caught by your compiler, IDE, or lint tools, then you can always roll your own static checker. This is not as difficult as it might sound. Most languages, particularly ones branded dynamic, expose their abstract syntax tree and compiler tools as part of their standard library. It is well worth getting to know the dusty corners of standard libraries that are used by the development team of the language you are using, as these often contain hidden gems that are useful for static analysis and dynamic testing. For example, the Python standard library contains a disassembler which tells you the bytecode used to generate some compiled code or code object. This sounds like an obscure tool for compiler writers on the python-dev team, but it is actually surprisingly useful in everyday situations. One thing this library can disassemble is your last stack trace, giving you feedback on exactly which bytecode instruction threw the last uncaught exception.
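
To make the "roll your own" suggestion concrete, here is a minimal sketch of a home-grown checker built on Python's ast module; the rule it enforces (no bare except: clauses) is purely illustrative, and a real checker would grow its own set of rules. The disassembly trick mentioned above is similarly small: calling dis.dis() or dis.distb() with no argument disassembles the most recent traceback.

    # A tiny static checker built on Python's own ast module.
    import ast
    import sys

    class BareExceptChecker(ast.NodeVisitor):
        """Flag 'except:' clauses that name no exception type."""

        def __init__(self, filename):
            self.filename = filename
            self.problems = []

        def visit_ExceptHandler(self, node):
            # A handler with no exception type catches everything, including
            # SystemExit and KeyboardInterrupt, which is rarely intended.
            if node.type is None:
                self.problems.append((node.lineno, "bare 'except:' clause"))
            self.generic_visit(node)

    def check(filename):
        with open(filename) as source:
            tree = ast.parse(source.read(), filename=filename)
        checker = BareExceptChecker(filename)
        checker.visit(tree)
        for lineno, message in checker.problems:
            print(f"{filename}:{lineno}: {message}")
        return len(checker.problems)

    if __name__ == "__main__":
        total = sum(check(name) for name in sys.argv[1:])
        sys.exit(1 if total else 0)

Run over a handful of source files (python checker.py module.py, say), this already gives you a project-specific lint pass in a few dozen lines.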

So, don't let testing be the end of your quality assurance — take advantage of analysis tools and don't be afraid to roll your own.

By Sarah Mount

Test for Required Behavior, not Incidental Behavior

A common pitfall in testing is to assume that exactly what an implementation does is precisely what you want to test for. At first glance this sounds more like a virtue than a pitfall. Phrased another way, however, the issue becomes more obvious: A common pitfall in testing is to hardwire tests to the specifics of an implementation, where those specifics are incidental and have no bearing on the desired functionality.

When tests are hardwired to implementation incidentals, changes to the implementation that are actually compatible with the required behavior may cause tests to fail, leading to false positives. Programmers typically respond either by rewriting the test or by rewriting the code. Assuming that a false positive is actually a true positive is often a consequence of fear, uncertainty, or doubt. It has the effect of raising the status of incidental behavior to required behavior. In rewriting a test, programmers either refocus the test on the required behavior (good) or simply hardwire it to the new implementation (not good). Tests need to be sufficiently precise, but they also need to be accurate.

For example, in a three-way comparison, such as C's strcmp or Java's String.compareTo, the requirements on the result are that it is negative if the left-hand side is less than the right, positive if the left-hand side is greater than the right, and zero if they are considered equal. This style of comparison is used in many APIs, including the comparator for C's qsort function and compareTo in Java's Comparable interface. Although the specific values -1 and +1 are commonly used in implementations to signify less than and greater than, respectively, programmers often mistakenly assume that these values represent the actual requirement and consequently write tests that nail this assumption up in public.
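
The difference is easy to show with a sketch; the compare function below is hypothetical rather than any particular library's, and the point is that the test constrains only the sign of the result, which is all the contract requires.

    # "compare" is a hypothetical three-way comparison used for illustration.
    def compare(left, right):
        return (left > right) - (left < right)

    def test_compare_contract():
        assert compare("apple", "banana") < 0   # less than: any negative value
        assert compare("banana", "apple") > 0   # greater than: any positive value
        assert compare("apple", "apple") == 0   # equal: exactly zero

    # An overspecified test would instead assert compare("apple", "banana") == -1,
    # which fails for a perfectly valid implementation that happens to return -17.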

A similar issue arises with tests that assert spacing, precise wording, and other aspects of textual formatting and presentation that are incidental. Unless you are writing, for example, an XML generator that offers configurable formatting, spacing should not be significant to the outcome. Likewise, hardwiring placement of buttons and labels on UI controls reduces the option to change and refine these incidentals in future. Minor changes in implementation and inconsequential changes in formatting suddenly become build breakers.
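
One way to keep such incidentals out of a test is to assert against parsed structure rather than raw text, as in this sketch, where build_order_xml is a hypothetical generator:

    # Compare parsed structure, not character-for-character output.
    import xml.etree.ElementTree as ET

    def build_order_xml(order_id, quantity):
        # Hypothetical generator; the formatting details are incidental.
        return f"<order id='{order_id}'>\n  <quantity>{quantity}</quantity>\n</order>"

    def test_order_content_not_layout():
        root = ET.fromstring(build_order_xml("42", 3))
        assert root.tag == "order"
        assert root.get("id") == "42"
        assert root.find("quantity").text == "3"
        # No assertion about indentation or line breaks, so reformatting the
        # output does not break the build.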

Overspecified tests are often a problem with whitebox approaches to unit testing. Whitebox tests use the structure of the code to determine the test cases needed. The typical failure mode of whitebox testing is that the tests end up asserting that the code does what the code does. Simply restating what is already obvious from the code adds no value and leads to a false sense of progress and security.

To be effective, tests need to state contractual obligations rather than parrot implementations. They need to take a blackbox view of the units under test, sketching out the interface contracts in executable form. Therefore, align tested behavior with required behavior.

By Kevlin Henney

Test Precisely and Concretely

It is important to test for the desired, essential behavior of a unit of code, rather than test for the incidental behavior of its particular implementation. But this should not be taken or mistaken as an excuse for vague tests. Tests need to be both accurate and precise.

Something of a tried, tested, and testing classic, sorting routines offer an illustrative example. Implementing a sorting algorithm is not necessarily an everyday task for a programmer, but sorting is such a familiar idea that most people believe they know what to expect from it. This casual familiarity, however, can make it harder to see past certain assumptions.

When programmers are asked "What would you test for?" by far and away the most common response is "The result of sorting is a sorted sequence of elements." While this is true, it is not the whole truth. When prompted for a more precise condition, many programmers add that the resulting sequence should be the same length as the original. Although correct, this is still not enough. For example, given the following sequence:

3 1 4 1 5 9

The following sequence satisfies a postcondition of being sorted in non-descending order and having the same length as the original sequence:

3 3 3 3 3 3

Although it satisfies the spec, it is also most certainly not what was meant! This example is based on an error taken from real production code (fortunately caught before it was released), where a simple slip of a keystroke or a momentary lapse of reason led to an elaborate mechanism for populating the whole result with the first element of the given array.

The full postcondition is that the result is sorted and that it holds a permutation of the original values. This appropriately constrains the required behavior. That the result length is the same as the input length comes out in the wash and doesn't need restating.
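
Expressed as an executable check, the full postcondition might look like the following sketch, where sort_items stands in for whatever routine is under test:

    from collections import Counter

    def sort_items(items):
        return sorted(items)   # placeholder implementation for illustration

    def test_sort_postcondition():
        original = [3, 1, 4, 1, 5, 9]
        result = sort_items(original)
        # Sorted in non-descending order...
        assert all(a <= b for a, b in zip(result, result[1:]))
        # ...and a permutation of the input: same values, same multiplicities.
        assert Counter(result) == Counter(original)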

Even stating the postcondition in the way described is not enough to give you a good test. A good test should be readable. It should be comprehensible and simple enough that you can see readily that it is correct (or not). Unless you already have code lying around for checking that a sequence is sorted and that one sequence contains a permutation of values in another, it is quite likely that the test code will be more complex than the code under test. As Tony Hoare observed:

There are two ways of constructing a software design: One way is to make it so simple that there are obviously no deficiencies and the other is to make it so complicated that there are no obvious deficiencies.

Using concrete examples eliminates this accidental complexity and opportunity for accident. For example, given the following sequence:

3 1 4 1 5 9

The result of sorting is the following:

1 1 3 4 5 9

No other answer will do. Accept no substitutes.
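
As a test, the concrete example is a one-liner; sort_items is again a stand-in for the routine under test:

    def sort_items(items):
        return sorted(items)   # placeholder implementation for illustration

    def test_sort_concrete_example():
        assert sort_items([3, 1, 4, 1, 5, 9]) == [1, 1, 3, 4, 5, 9]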

Concrete examples help to illustrate general behavior in an accessible and unambiguous way. The result of adding an item to an empty collection is not simply that it is not empty: It is that the collection now has a single item, and that the single item held is the item added. Two or more items would qualify as not empty. And would also be wrong. A single item of a different value would also be wrong. The result of adding a row to a table is not simply that the table is one row bigger. It also entails that the row's key can be used to recover the row added. And so on.
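
The same concreteness can be expressed directly in tests; in this sketch a plain list and a dict stand in for the collection and the keyed table:

    def test_add_to_empty_collection():
        items = []                      # a list standing in for any collection
        items.append("widget")
        assert items == ["widget"]      # exactly one element, and the one added

    def test_added_row_recoverable_by_key():
        table = {}                      # a dict standing in for a keyed table
        row = {"name": "Ada"}
        table["id-1"] = row
        assert table["id-1"] == row     # the key recovers the row just added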

In specifying behavior, tests should not simply be accurate: They must also be precise.

By Kevlin Henney
