Increasing QA by making Developers test

I’m going to outline the single highest-impact activity a project can implement to increase QA quality, reduce the number of system testers (counter-intuitive, I know) and increase end-user satisfaction. This article describes the costs and issues associated with having a distinct and separate system testing phase, and outlines how those risks can be reduced.  For simplicity this article uses the classic Waterfall approach as an example; however, the lessons learned here can equally be abstracted and applied to other SDLC approaches.

The Classic System Testing Problem  

I’ve acted as a consultant on numerous projects and observed something that has contributed to the inefficient running of many – having a separate and large system testing phase and team.  To me, the size of the system testing team is generally a function of how inefficiently the project is being run [Link to Accountant PM’s here]. I’ve been on large projects where the project overruns, the release date remains the same, and the test team is put under an enormous amount of pressure. There has to be a better way…

I’m going to use an analogy – let’s say you take a car for an MOT, and the person checking it is not the mechanic but a dedicated MOT tester.  The MOT tester tests the car and reports a fault; you then have to take it to the mechanic to get it repaired.  He repairs it, but doesn’t test it.  The vehicle is then taken back to the MOT test centre.  By introducing this extra layer into the testing process we have done a number of things:

  1. Introduced an inherent project delay: time for the initial check -> time to repair -> time to re-check.  Let’s call this the turnaround time.
  2. Introduced risk: because the mechanic cannot re-check his fix immediately, the defect may not actually have been fixed, resulting in a repeat failure.  Call this the fixing risk.
  3. Introduced an increase in labour – by separating the testing process into distinct and separate parts we have introduced more labour.  Call this the man cost.
  4. Overall, introduced an increase in project cost (associated labour and time).

Using the above example:

Let’s make the turnaround time 7 days.

Each defect reported adds 0.5 days to the turnaround time.

The associated costs of the mechanic and the MOT tester are £50 each.

I’ve put together the following spreadsheet to try to illustrate the saving:

Mechanic + Tester

Faults Reported   Turnaround Time   Actual Faults   Total Man Cost (Days)   Delivery Delay (Days)   Man Cost ($)
10                7                 11              22                      12                      1650
20                7                 22              44                      17                      3300
30                7                 33              66                      22                      4950
40                7                 44              88                      27                      6600
50                7                 55              110                     32                      8250
60                7                 66              132                     37                      9900
70                7                 77              154                     42                      11550

Now compare this to taking the car into the garage and having it checked, serviced and repaired by the same person – this may take five days at most if something minor is wrong.  We may have increased the workload of the mechanic (by making him test), but overall the cost, labour and delivery time have been reduced dramatically.

Mechanic Only

Faults Reported   Turnaround Time   Actual Faults   Total Man Cost (Days)   Delivery Delay (Days)   Man Cost ($)
10                1                 11              16.5                    6                       1237.5
20                1                 22              33                      11                      2475
30                1                 33              49.5                    16                      3712.5
40                1                 44              66                      21                      4950
50                1                 55              82.5                    26                      6187.5
60                1                 66              99                      31                      7425
70                1                 77              115.5                   36                      8662.5
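To make the arithmetic behind the two tables explicit, here is a small Python sketch. The formulas are inferred by fitting the rows rather than stated in the text (note the man-cost column implies a flat rate of 75 per man-day rather than the £50 figure quoted above, so treat the numbers as purely illustrative):

```python
# Formulas inferred from the tables above (not stated outright):
# roughly 10% more faults surface than are initially reported, and
# the man-cost column implies a flat rate of 75 per man-day.

def cost_model(reported, turnaround_days, man_days_per_fault, rate=75):
    actual = reported * 11 // 10               # 10% extra faults surface during fixing
    man_days = actual * man_days_per_fault     # 2.0 with a separate tester, 1.5 without
    delay = turnaround_days + 0.5 * reported   # each reported fault adds half a day
    return actual, man_days, delay, man_days * rate

# Mechanic + tester: 7-day turnaround, 2 man-days of effort per fault
print(cost_model(10, 7, 2))      # matches the first row of the first table
# Mechanic only: 1-day turnaround, 1.5 man-days per fault
print(cost_model(10, 1, 1.5))    # matches the first row of the second table
```

The key lever is visible in the third term: the delivery delay grows with the fixed turnaround time, so shrinking the gap between building and testing dominates any per-fault saving.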

The above is a very crude illustration (and a bit rubbish – much better example here).  Similarly, on IT projects we can see that as we increase the turnaround time (the gap between building and testing), the associated costs increase dramatically.  These costs rise steeply as more developers are involved in the build process (link to a better cost saving article here).  There is also the supplementary cost of paying for resources while they are not being fully utilised, e.g. testers waiting for delivery of a build.

In real life we can see the logic in having our machines checked by the experts – gas boilers, cars and so on.  The people who check them can also repair them – yet in the IT world, and on many projects I’ve worked on, it isn’t deemed normal for the experts to check their own work.  If developers could test their own code, we could vastly reduce the risk of project delays and the associated cost of a large system testing team.  If we lived in an ideal world and allowed a developer to test the code he had developed instantly, a few things would happen:

I call this “Clean as you GO” 

  1. It becomes easier for the developer to test, and to make the developer responsible for more of the testing (e.g. via JUnit).
  2. Defects would be found and repaired quickly (Turnaround time is reduced)
  3. The number of defects released into system testing would be vastly reduced
  4. Associated overhead cost of logging defects, managing and communicating are reduced
  5. Fewer dedicated system testers would be required
  6. Developers are forced to learn more about the business – validating specifications rather than merely verifying them.
  7. Coding quality and awareness increases as the developer learns what works and what doesn’t
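The article mentions JUnit; as a language-neutral sketch, here is the same “clean as you go” idea using Python’s built-in unittest. The `apply_discount` function and its rule are hypothetical stand-ins for real business logic – the point is that the developer writes and runs these tests in the same sitting as the code, so a defect is caught immediately rather than days later in system test:

```python
import unittest

def apply_discount(price, percent):
    """Hypothetical business rule: a discount must be between 0 and 100 percent."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return round(price * (100 - percent) / 100, 2)

class ApplyDiscountTest(unittest.TestCase):
    """Written by the developer immediately after the code: clean as you go."""

    def test_typical_discount(self):
        self.assertEqual(apply_discount(200.0, 25), 150.0)

    def test_no_discount(self):
        self.assertEqual(apply_discount(99.99, 0), 99.99)

    def test_invalid_percent_rejected(self):
        # Validation, not just verification: the rule itself is exercised
        with self.assertRaises(ValueError):
            apply_discount(100.0, 150)

if __name__ == "__main__":
    unittest.main()
```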

The key here is being able to give the developer the ability to test code quickly.  If the developer cannot do this then we lose the power to involve them in the testing – as the turnaround time increases, the developer becomes more detached from the built system, and this creates a vicious circle.  So we need to put mechanisms in place; these are:

  1. An automated build process – developers should be able to check in their code and get a build instantly
  2. An automated deployment process – this means that the build can be easily deployed to an environment (not just a war file!) with all the associated configuration files, installs, stubs etc
  3. Availability of test environments – I’ve been in plenty of places and am surprised at the false economy of not having a number of test environments available. VMs are ideal for this nowadays
  4. A dedicated build and deployment team – a team responsible for policing source control and pulling together code, configuration, environment and DB knowledge cohesively into a single place.  They are also responsible for writing the scripts that take that source and deploy working builds into different environments.  This is where most PMs struggle to grasp the requirement.
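The four mechanisms above can be sketched as a single scripted pipeline. Everything here is illustrative – the commands, environment name and revision are hypothetical placeholders for whatever build and deployment tooling a project actually uses; the point is that the whole chain from checkout to smoke test runs from one command:

```python
# Illustrative sketch of an automated build-and-deploy pipeline.
# All commands and environment names below are hypothetical placeholders.

PIPELINE = [
    ("checkout", "git checkout {revision}"),
    ("build",    "mvn package"),  # compile and run the developers' unit tests
    ("deploy",   "deploy.sh --env {env} --with-config --with-stubs"),
    ("smoke",    "run_smoke_tests.sh --env {env}"),
]

def plan_pipeline(revision, env):
    """Expand the pipeline into the concrete commands to run.

    Returning the commands (rather than executing them here) keeps the
    sketch side-effect free; a real build team would hand each one to
    the shell or a CI server.
    """
    return [cmd.format(revision=revision, env=env) for _, cmd in PIPELINE]

for command in plan_pipeline("abc123", "test-env-2"):
    print(command)
```

Because the deploy step carries the configuration, installs and stubs with it (not just a war file), any developer can stand up a testable build on a spare environment without the build team in the loop.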

A dedicated build and deployment team is the oil in the machine: it allows the project to run smoothly.  If this role doesn’t exist on large projects, it causes a multitude of downward pressures within the project.  This process of being able to build and deploy quickly and repeatedly is also referred to as Continuous Integration.

I think back to when I coded – I compiled, tested, improved; compiled, tested, improved – all in the same day.  To ‘throw over the fence’ for someone else to test was, and still is, a remarkable and bonkers concept to me.  I was responsible for finding and cleaning up my own mistakes. This also enabled me to learn how to write better code faster – I learnt from my own mistakes.  When I managed teams of developers I made sure we integrated and tested as often as possible – experience had shown that the longer it was left, the more painful the integration issues would be.  Minor issues (that could have been caught and identified early) would become major headaches.  Regular integration also meant issues were evenly spread over the development lifecycle.  Final-stage delivery was more predictable and subject to fewer issues (there are always issues). The golden rule: reduce latency within the SDLC between development, build and test.

When I began to work on very large projects and saw the distinct separation of these activities and responsibilities in action, the turnaround time increased and caused a vicious circle. Test teams became larger, a greater number of defects were found, major architectural issues were found much later than they should have been, and developers found they were spending more time repairing defects than actually delivering code.  The essential feedback loop between end users and the delivered product was also fundamentally broken – resulting in a product that didn’t match expectations.  Any project with a significant lag between development and delivery into test is flawed in many ways.

Agile is a better approach because it forces the practices of regular builds and delivery.  But I’ve seen projects and managers struggle with it – they adopt the process but just don’t get the principles.  I’ve seen many projects attempt to go from Waterfall to Agile, and the Agile process becomes mini-waterfall (“we deliver something every month, therefore we are Agile!”).  It’s only a matter of time before the backlog fills up with defects.  The fundamental issue I observe here is that projects are being managed by PMs who do not understand the principles of software engineering; they have simply become project accountants and manage up, not down (that’s another subject).

So to summarise:

  • If we give developers the ability to test instantly, they can find and fix issues quickly
  • Testing should be factored into the developers time – developers should spend time writing Unit and System tests where possible.
  • Issues are found quickly and fixed, software quality is inherently increased with regular integration
  • The dependency and pressure on system testers is reduced, and fewer system testers are required
  • System testers can spend more time concentrating on finding issues and confirming quality rather than checking the system works.
  • Introducing a regular build and deployment process is the single largest way a project can increase the quality of the released product.

This separation of activities is a habit many projects and mindsets are still stuck in, and one the industry needs to get out of. A large part of this can be attributed to the fact that developers are unable to test their own code quickly.  Most projects do not put in place the correct mechanisms for them to do this.  These mechanisms include:

  • The ability to have instant/overnight builds
  • Easy mechanisms for deployments
  • Physical testing environments
  • Stub logic
  • A dedicated build & deployment team
  • Time factored into the project plan for developers to write tests

Project managers sometimes see the above as overheads – I see not putting them in place as a false economy.  The more the project grows (in terms of developers), the more the above mechanisms have to be put in place to mitigate risk.

Key takeaways:

  • If there is a large gap between development activity and system test this introduces significant project risk
  • “Clean as you go” – developers learn quickly from their own mistakes and make fewer mistakes as they move forward.
  • Integrate as often as possible, work towards continuous integration.  This is a central pillar for QA excellence within a project – it will put in place the mechanisms for a virtuous QA circle.
  • Developers should be made to write tests (Clean as you go), but they can only do this if the correct mechanisms have been put in place.
  • Testing documentation should be reduced considerably.
  • System testers can spend more time finding genuine and complicated defects.  They can confirm the quality of a release rather than spending time finding obvious defects.

So if you have a large team of developers and aren’t integrating regularly, start doing so – the benefits are substantial and obvious. This will bring about the single largest increase in QA performance for your project.

Note: There is an approach called Test Driven Development – I am aware of it and its principles, but it still doesn’t quite make sense to me.  I think it better to write tests after code has been written (not before); it’s easier and you get going faster.  What I would encourage is putting in place a repeatable integration process and requiring developers to write tests against the built system for their individual subsystems. If you haven’t got this then TDD is a long way away anyway.
