A week ago I read a brilliant post by Alex Berg on patterns for Apex testing (you can find it here). We then had a good discussion (along with Andy Ognenoff) on Google+ about our different viewpoints on how testing should be done. I then disappeared for a week on holiday, formulating a rough blog post on the topic in my head. I also want to make it clear that this is a technical discussion, not a commentary on anybody’s work. So here goes.
I am going to start off by generalising the ways in which we can go about setting up our tests with data:
- Creating the data in the test method inline
- Extracting the data creation to methods in the test class
- Creating an external class which is a factory for creating data (I will differentiate between variations on this method later)
We will discuss all of these in an attempt to decide “which is best”. Before we start, however, I would like to clarify that I am looking at this from the viewpoint of a developer performing unit testing. In his brilliant book The Art of Unit Testing, Roy Osherove starts by clearly explaining the difference between integration testing and unit testing. It is a difference that I think is often not understood:
Unit testing is the testing of a single unit of code, where a unit is defined as a function or method. Integration testing is the testing of two or more dependent units together.
When we are discussing most tests in Salesforce we mean unit tests: testing a single method in a class. When we pull many methods together to move an SObject or piece of data through its lifecycle, we are performing integration testing. Unit testing is there so that when another developer changes the functionality of the system’s code or an object’s behaviour, the tests will fail (which is good news), and the system can then be refactored properly so that the existing functionality (if still required) works alongside the new behaviour. These points are key to the discussion.
Creating Data Inline
This is the situation as shown below
The method creates its test data in lines 4 and 5 before inserting it into the database on line 8, where a trigger or validation rule will perform an action we wish to assert upon. The benefit of creating test data this way is that it is quick and easy. We know that any data created by code called during a test method is automatically destroyed, so it is safe to create data like this. However, as soon as we have another test method in this class using the same object, we are likely to have duplicate code, which means every future change to the object or the method under test has to be repeated in each copy. This is exacerbated when we have to create multiple objects (either related objects or records for use in lists etc.), at which point the test methods become unwieldy.
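A minimal sketch of the inline approach might look like the following. It assumes a hypothetical trigger on Account that sets the standard Rating field to 'Hot' when AnnualRevenue exceeds 1,000,000; the trigger, the class name and the method name are illustrative only, and the line numbers cited above do not refer to this sketch.

```apex
@isTest
private class AccountRatingInlineTest {
    // ASSUMPTION: a hypothetical trigger on Account sets Rating to 'Hot'
    // when AnnualRevenue is greater than 1,000,000.
    @isTest
    static void largeAccountIsRatedHot() {
        // Test data is built inline, inside the test method itself.
        Account acc = new Account(Name = 'Inline Test Account');
        acc.AnnualRevenue = 2000000;

        Test.startTest();
        insert acc; // fires the hypothetical trigger
        Test.stopTest();

        // Re-query so that we see the values written during the insert.
        Account saved = [SELECT Rating FROM Account WHERE Id = :acc.Id];
        System.assertEquals('Hot', saved.Rating);
    }
}
```

A second test method in this class would have to repeat the two data creation lines, which is where the duplication starts.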
Extracting to a Method
In this situation we extract the code on lines 3-8 to a method for reuse in the test class multiple times (see below):
Here we have extracted the creation of the object for testing out to a separate method that is still within our test class. If we wish, we can create similar methods for any other objects that need to be set up in this test class, so that each piece of setup lives in exactly one place.
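As a rough sketch of this extraction, again assuming the hypothetical trigger that rates an Account 'Hot' when AnnualRevenue exceeds 1,000,000, the helper method buildAccount below is an illustrative name rather than anything from the original listing:

```apex
@isTest
private class AccountRatingTest {
    // Hypothetical helper: builds (but does not insert) an Account.
    // It stays private to this test class, next to the tests that use it.
    private static Account buildAccount(Decimal annualRevenue) {
        return new Account(Name = 'Test Account', AnnualRevenue = annualRevenue);
    }

    // ASSUMPTION: a hypothetical trigger sets Rating to 'Hot'
    // when AnnualRevenue is greater than 1,000,000.
    @isTest
    static void largeAccountIsRatedHot() {
        Account acc = buildAccount(2000000);

        Test.startTest();
        insert acc;
        Test.stopTest();

        Account saved = [SELECT Rating FROM Account WHERE Id = :acc.Id];
        System.assertEquals('Hot', saved.Rating);
    }

    @isTest
    static void smallAccountIsNotRatedHot() {
        Account acc = buildAccount(5000);

        Test.startTest();
        insert acc;
        Test.stopTest();

        Account saved = [SELECT Rating FROM Account WHERE Id = :acc.Id];
        System.assertNotEquals('Hot', saved.Rating);
    }
}
```

Both tests share one definition of the record, yet each test still states the one value it cares about at the call site.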
I believe this to be the ideal situation for unit testing. The data required for any units of code under test is visible near to the code. It is clear as to what the actual test is doing and if the system is edited to alter the behaviour such that the tests fail, then the data is in a central place for updating this area of the codebase to get it working. It also allows any developers that are adding or updating code to the class under test to add new tests quickly and easily with the object needed. You are also adding the minimum amount of required data at all times which is important for speed.
Many will argue that having a centralised set of classes that handles all data creation is better (for integration testing I agree). Let’s discuss this.
Test Data Factory
In this situation we will have a test data factory class that we can call to create SObject instances for use in the tests. This system is the one described by Alex on his blog post.
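A factory of this kind could be sketched roughly as below. This is not Alex's implementation; the class name TestDataFactory and its methods are assumptions made for illustration, using only standard Account and Contact fields.

```apex
// Hypothetical shared factory. Annotating it with @isTest means it can only be
// called from test code; a factory that should also create demo data outside
// of tests would have to drop the annotation.
@isTest
public class TestDataFactory {
    // Inserts and returns a single Account with the given name.
    public static Account createAccount(String name) {
        Account acc = new Account(Name = name);
        insert acc;
        return acc;
    }

    // Inserts and returns the requested number of Contacts under the Account.
    public static List<Contact> createContacts(Account acc, Integer count) {
        List<Contact> contacts = new List<Contact>();
        for (Integer i = 0; i < count; i++) {
            contacts.add(new Contact(LastName = 'Contact ' + i, AccountId = acc.Id));
        }
        insert contacts;
        return contacts;
    }

    // Convenience method: an Account together with its related Contacts.
    public static Account createAccountWithContacts(Integer contactCount) {
        Account acc = createAccount('Factory Account');
        createContacts(acc, contactCount);
        return acc;
    }
}
```

Any test class in the org could then call something like TestDataFactory.createAccountWithContacts(3) instead of building the records itself.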
On its positive side, this means that we have a single class (or set of classes) that we call in order to create some test data. If the system changes then we merely change the data once and this should update every test method where the data is used. It means that the same data is being used consistently across the entire test system as well which is helpful, and if written in a correct way, the data creation methods can be called to provide demo data for salespeople by calling the methods in the system.
The problems arise when we consider the following. Firstly, what happens when someone has two test cases that need the data to be slightly different in each situation? We could set up the classes so that the factory methods take in a series of parameters to create a personalised object, but this removes the power of having consistent data and also leaves us with hard-to-read code. This data must now also be closely regulated by the development team, as any change to it could have repercussions throughout the system’s unit tests. As the system grows in size and complexity, the factory can become very entangled and difficult to maintain for a series of small tests, and it can also increase the time it takes to run tests. For example, suppose we are writing a unit test that asserts whether a particular field value has been set properly. We do not need any related objects or other data to perform this test, but it is more than likely that our test data creation methods will create surrounding data by default, because it is used elsewhere in the system.
Anyone wanting to go into the system and quickly create a test or update/add some functionality must then check that the test factory is updated correctly and that this does not impact other areas of the unit tests: developer flexibility is hampered as the data creation class becomes locked down and protected. In such large systems, any breakage in the test data class can also cause all the unit tests to fail, whether they are related or not, because background data that they do not even need fails to get created.
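To illustrate the readability and overhead concerns, here is a hedged sketch of what a parameterised factory call can look like from inside a test. The createAccount method and its parameters are hypothetical; in a real project it would live in the shared factory class, and it is inlined here only to keep the sketch self-contained.

```apex
@isTest
private class FactoryCallSiteExample {
    // Hypothetical parameterised factory method (inlined for illustration).
    private static Account createAccount(String name, Decimal annualRevenue,
            Integer contactCount, Boolean doInsert) {
        Account acc = new Account(Name = name, AnnualRevenue = annualRevenue);
        if (doInsert) {
            insert acc;
            // Related records are created whether or not the caller reads them.
            List<Contact> contacts = new List<Contact>();
            for (Integer i = 0; i < contactCount; i++) {
                contacts.add(new Contact(LastName = 'Contact ' + i, AccountId = acc.Id));
            }
            insert contacts;
        }
        return acc;
    }

    @isTest
    static void nameIsStoredOnInsert() {
        // Factory version: the meaning of 0, 5 and true is only clear after
        // opening the factory, and five Contacts are inserted for no reason.
        Account viaFactory = createAccount('Acme Factory', 0, 5, true);
        System.assertEquals(5,
            [SELECT COUNT() FROM Contact WHERE AccountId = :viaFactory.Id]);

        // Local version: only the data this test actually needs.
        Account local = new Account(Name = 'Acme Local');
        insert local;
        System.assertEquals('Acme Local',
            [SELECT Name FROM Account WHERE Id = :local.Id].Name);
    }
}
```

The first half does extra DML that the assertion on the name never depends on, which is the hidden cost described above.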
Having used this method of doing testing as well, you can find it leads to people writing tests without really understanding what they are doing and why they need it. It is good for a developer to write a unit test where they must set their data up for the test. That way they can know for definite what they have tested and be sure they are happy it works.
I personally employ this method, but use it for integration testing. If a change to the data can cause more than just the unit tests associated with that object to fail, you are not doing unit testing. Unit testing is about creating small pieces of data to test small pieces of functionality, so that if a change is made to a method we can check it still works and acts as we expect it to. Having the test data created in the test class (albeit in neatly extracted methods) allows us to know exactly what we are testing (and why). A full data creation factory is much more useful for integration testing, as it can create a single set of data for use in testing the system or in demonstrations, without interlinking the unit tests too tightly.
Closing Thoughts
I would love to hear what people think. As I have said, these are personal preferences based upon many hours of writing and fixing tests and also having done testing in more mature languages such as C#.
An aside:
I had a friend who told me the biggest risk Salesforce had was that it is so accessible that non-developers often start writing code. It reminded him of the MS Access boom, when many poorly written systems started to fall over because they had been written by people who were not thinking far enough ahead. I am not suggesting that anybody reading this post is like that, or that Salesforce should become less accessible. I do however wonder whether we would still have a Billion Lines of Apex if it had all been written by people who were pure developers. For another post however.