Saturday, September 25, 2010

Are you sure you want to read this? Ok / Cancel


We're moving to soft-delete for many objects in our application. By saving old versions of objects we hope to correlate the effects of changes over time. This is a big deal for us because, with detailed history, we expect to increase our predictive capability by several orders of magnitude. It's also a big deal because it's a radical shift in our data model and we built many features assuming deleted objects go away. Once we decided we wanted the capability, however, it made sense to start sooner rather than later. I'm definitely excited about the analytics possibilities, but I'm also excited about a potential user-facing feature soft-delete enables: undo.
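To make that concrete, here is a minimal sketch of what soft-delete might look like on a single domain object. The class and field names here are hypothetical (our real model is richer); the point is only that "delete" sets a marker instead of removing the data, which is what makes both the history and the undo possible.

```java
import java.util.Date;

// Hypothetical domain object illustrating soft-delete: "deleting" it
// stamps the object instead of removing the record, so old versions
// stay available for analysis and the action can be reversed.
public class Account {

    private Date deletedAt;            // null means the account is live

    public boolean isDeleted() {
        return deletedAt != null;
    }

    public void softDelete() {
        if (deletedAt == null) {
            deletedAt = new Date();    // mark deleted, keep the data
        }
    }

    public void restore() {
        deletedAt = null;              // undo is just clearing the marker
    }
}
```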

I dislike confirmation boxes. You know: "Are you sure you want to do this? Okay or cancel". I find them only incrementally better when they explicitly indicate the action to be performed: "Save these changes or discard these changes?".

Surely it's better to ask a user before making a permanent change than not to ask? Well, yes. But the problem is that there are too many permanent changes presented to the user. So many, I propose, that users learn to approve changes without really thinking about it. Hence, the intended purpose of the confirmation box -- to force the user to think about the change -- is lost in a flurry of okay/cancel dialogs.

The solution is not more confirmation boxes. The solution most certainly is not more annoying confirmation boxes.

The solution is to make the change not permanent. Put another way, if an action is not permanent then there is no need to ask the user for confirmation. Rather, the user is free to revert (i.e., undo) the operation should the user decide the operation was not intended.
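One way to structure that (a sketch with hypothetical names, not a description of our actual design) is to have every operation know how to reverse itself and to keep a history of completed operations, rather than interrogating the user before each one:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Sketch of undoable operations: each action can reverse itself, and
// the history keeps completed actions so the user can revert at will.
interface UndoableAction {
    void perform();
    void revert();
}

class ActionHistory {
    private final Deque<UndoableAction> done = new ArrayDeque<UndoableAction>();

    public void perform(UndoableAction action) {
        action.perform();      // no confirmation box needed...
        done.push(action);     // ...because the action can be undone
    }

    public void undo() {
        if (!done.isEmpty()) {
            done.pop().revert();
        }
    }
}
```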

From a task-accomplishment perspective it seems like a small, or maybe non-existent, difference: the task was performed or it was not. But from the user's perspective it is the difference between an application that is frightening and one that is inviting. Confirmation boxes might as well show a skull and crossbones. Undoable operations encourage the user to explore boldly.

I could go on about the benefits of undo over confirmation, but others have written about it much better than I could. I recommend these two articles.

Follow me on Twitter @jsjacobAtWork.

Thursday, September 9, 2010

Categorizing automated tests — Make it so!


I love Continuous Integration. Just as I can't imagine writing software without continually building the code, I can't imagine committing code without a full test suite automatically running. However, the automated test suite sometimes gets stuck on things we think it shouldn't get stuck on.

We run a Hudson instance as our continuous integration server. Hudson monitors Subversion for check-ins and starts a new build/test cycle within five minutes of a change. At present, a successful full build/test takes just over an hour with most of the time spent in tests.

Ideally we don't want any tests to fail, so any test failure should mean a code change is necessary to fix it. But in practice there are some tests that are okay to fail temporarily. The vast majority of the "okay to fail" tests are ones that use external sandbox systems. External sandbox systems go down. As long as they go down only occasionally and temporarily, and not while we are actively developing code against them, it doesn't bother us much, except that the corresponding tests that try to communicate with them fail.

The problem is that, in the Maven build process, failing tests in one module prevent the tests in dependent modules from running. In our project hierarchy these external systems are used in a project on which many other projects depend. We would like to keep testing those dependent projects, especially since most of their tests do not depend on the external system.

There may be a way to restructure Maven projects to reduce the test dependencies, but I think that would lead either to the external tests moving to a different project from the classes they test or to an explosion of projects causing a source control nightmare. I don't like either of these scenarios.

I'm thinking more along the lines of test levels I learned at Pacific Bell many years ago. At Pacific Bell test levels were arranged according to the extent of the system covered by the test. Applying these test levels to our current application:

  1. Unit: the individual method or class, using test data in the development environment.
  2. One-up/one-down: partner methods and classes, using test data in the development environment.
  3. End-to-end: the full application (backend and UI), using test data in the development environment.
  4. Test environment: the full application (backend and UI), using test data in the test environment.
  5. Production environment: the full application (backend and UI), using production data in the production environment.

I believe our current automated tests fall into the first four levels. If we categorize our existing tests, each should fit reasonably well into one of them. For example, a test that verifies a class in isolation is Level 1. At the other end of the spectrum, a test that verifies connectivity with a sandbox environment is Level 4: while it may only exercise a one-up/one-down interaction, the external system belongs to the "test environment" rather than the "development environment", and it is not available in the lower three levels. A similar test that used a mock in place of the external system could be categorized at Level 2, because using a mock implies not touching the external system at all.
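One way to attach these levels to the tests themselves (a sketch only, assuming JUnit 4.8 or later; the marker interfaces and the test class are hypothetical) is JUnit's @Category annotation with one marker interface per level:

```java
import static org.junit.Assert.assertTrue;

import org.junit.Test;
import org.junit.experimental.categories.Category;

// Marker interfaces, one per test level.
interface Level1 { }    // unit
interface Level2 { }    // one-up/one-down
interface Level3 { }    // end-to-end
interface Level4 { }    // test environment

// A Level 4 test: it talks to an external sandbox system, so it can
// fail temporarily without telling us anything about our own code.
@Category(Level4.class)
public class SandboxConnectivityTest {

    @Test
    public void sandboxIsReachable() {
        // A real test would open a connection to the sandbox here.
        assertTrue(true);
    }
}
```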

I imagine configuring Maven to run four passes of tests across the projects, with each successive pass adding a new level of test. During the first pass only unit tests are run. During the second pass only one-up/one-down tests are run. The third and fourth passes run end-to-end and test-environment tests respectively. At any point a failure prevents the later tests from running. The key, however, is that a higher-level test failure will not prevent a lower-level test from running, no matter how the project dependencies are configured.
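However the passes end up wired into Maven, the selection itself could be one suite per level using the JUnit Categories runner. This is again only a sketch: it reuses the hypothetical names from the previous snippet, and a real suite would list many test classes.

```java
import org.junit.experimental.categories.Categories;
import org.junit.experimental.categories.Categories.IncludeCategory;
import org.junit.runner.RunWith;
import org.junit.runners.Suite.SuiteClasses;

// The fourth pass: run only the Level 4 tests. A sandbox outage can
// break this pass without blocking the level 1-3 passes in any project.
@RunWith(Categories.class)
@IncludeCategory(Level4.class)
@SuiteClasses({ SandboxConnectivityTest.class })
public class Level4Pass {
}
```

Each pass of the build would then run just the suite for its level, in order.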

This would give us more information about where problems lie before digging into the console output. For example, if a release build fails because of tests in level 2 we know something is very wrong and the release must be stopped. On the other hand, a release build failing in level 4 tests might be okay because the lower three levels passed in all projects: maybe the problem is limited to a specific external sandbox system. We could decide to accept the risk and build the release packages.

And hey, we can start saying things like, "Please run a level 3 diagnostic."

Follow me on Twitter @jsjacobAtWork.