Whether or not to A/A test is a question that invites conflicting opinions. Enterprises deciding whether to implement an A/B testing tool often lack the context to judge whether they should A/A test. Knowing the benefits and pitfalls of A/A testing can help organizations make better decisions.
In this blog post, we explore why some organizations practice A/A testing and the things they need to keep in mind while A/A testing. We also discuss other methods that can help enterprises decide whether or not to invest in a certain A/B testing tool.
Why Some Organizations Practice A/A Testing
A/A testing is done when organizations are taking up a new implementation of an A/B testing tool. Running an A/A test at that time can help them with:
- Checking the accuracy of an A/B Testing tool
- Setting a baseline conversion rate for future A/B tests
- Deciding a minimum sample size
Checking the Accuracy of an A/B Testing Tool
Organizations that are about to purchase an A/B testing tool or want to switch to new testing software may run an A/A test to ensure that the new software works correctly and has been set up properly.
In an A/A test, a web page is A/B tested against an identical variation. When there is absolutely no difference between the control and the variation, the result is expected to be inconclusive. However, if an A/A test declares a winner between two identical variations, something may be wrong. The reasons could be any of the following:
- The tool has not been set up properly.
- The test hasn’t been conducted correctly.
- The testing tool is inefficient.
Determining the Baseline Conversion Rate
Before running any A/B test, you need to know the conversion rate that you will be benchmarking the performance results against. This benchmark is your baseline conversion rate.
An A/A test can help you set the baseline conversion rate for your website. Let’s explain this with the help of an example. Suppose you are running an A/A test where the control gives 303 conversions out of 10,000 visitors and the identical variation B gives 307 conversions out of 10,000 visitors. The conversion rate for A is 3.03%, and that for B is 3.07%, even though there is no difference between the two variations. Therefore, the range 3.03–3.07% can be set as the benchmark conversion rate for future A/B tests. If you run an A/B test later and the variation’s conversion rate falls within this range, the result might not be significant.
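To see why a gap like 3.03% versus 3.07% counts as noise, the example figures can be run through a standard two-proportion z-test. This is a minimal sketch using only the Python standard library. The function name is illustrative, and the test your A/B testing tool applies internally may differ.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the gap between two conversion rates."""
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both arms convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_error = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Figures from the example: 303 and 307 conversions out of 10,000 visitors each.
z, p_value = two_proportion_z_test(303, 10_000, 307, 10_000)
print(f"z = {z:.3f}, p-value = {p_value:.3f}")
```

The p-value here is far above the conventional 0.05 threshold, which is what you would hope to see when the two variations are identical.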
Deciding a Minimum Sample Size
A/A testing can also help you get an idea about the minimum sample size from your website traffic. A small sample size would not include sufficient traffic from multiple segments. You might miss out on a few segments which can potentially impact your test results. With a larger sample size, you have a greater chance of taking into account all segments that impact the test.
“A/A testing can be used to make a client understand the importance of getting enough people through a test before assuming that a variation is outperforming the original.”
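Once a baseline conversion rate is known, a rough minimum sample size can be estimated with the textbook formula for comparing two proportions. The sketch below assumes a two-sided test at a 5% significance level with 80% power. Those defaults, the function name, and the uplift values are assumptions chosen for illustration, not settings taken from any particular tool.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_per_variation(baseline_rate, relative_uplift, alpha=0.05, power=0.80):
    """Approximate visitors needed per variation to detect a relative uplift."""
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_uplift)
    p_bar = (p1 + p2) / 2
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the two-sided test
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return ceil(numerator / (p2 - p1) ** 2)


# Baseline of 3.03% from the A/A example; the uplifts are hypothetical.
n_small_uplift = sample_size_per_variation(0.0303, 0.10)
n_large_uplift = sample_size_per_variation(0.0303, 0.20)
print(n_small_uplift, n_large_uplift)
```

With a baseline around 3%, detecting a 10% relative uplift takes tens of thousands of visitors per variation, and the requirement grows quickly as the uplift you want to detect shrinks.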
Problems with A/A Testing
In a nutshell, the two main problems inherent in A/A testing are:
- Ever-present element of randomness in any experimental setup
- Requirement of a large sample size
We will consider these one by one:
Element of Randomness
As pointed out earlier in the post, checking the accuracy of a testing tool is the main reason for running an A/A test. However, what if you find a difference between the conversions of the control and an identical variation? Should you always treat it as a bug in the A/B testing tool?
The problem (for lack of a better word) with A/A testing is that there is always an element of randomness involved. In some cases, the experiment reaches statistical significance purely by chance. Statistical significance is a statement of probability, not of absolute certainty, so a difference in conversion rate between A and its identical version can occasionally appear significant even when nothing is wrong.
“Suppose you set up two absolutely identical stores in the same vicinity. It is likely, purely by chance or randomness, that there is a difference in results reported by the two. And it doesn’t always mean that the A/B testing platform is inefficient.”
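The effect of randomness can be illustrated with a small simulation: run many A/A tests in which both arms share the same true conversion rate, and count how often a two-proportion z-test still declares a winner. This is a sketch under assumed parameters (a 3% true rate, 2,000 visitors per arm, 1,000 simulated tests, a 0.05 significance threshold). All names are illustrative.

```python
import random
from math import sqrt
from statistics import NormalDist


def p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value of a pooled two-proportion z-test."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_error = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_error == 0:
        return 1.0  # no conversions (or all conversions) in both arms
    z = (conv_b / n_b - conv_a / n_a) / std_error
    return 2 * (1 - NormalDist().cdf(abs(z)))


def simulate_aa_tests(true_rate, visitors_per_arm, runs, alpha=0.05, seed=42):
    """Return the share of simulated A/A tests that report a 'winner'."""
    rng = random.Random(seed)
    false_winners = 0
    for _ in range(runs):
        # Both arms draw from the same true conversion rate.
        conv_a = sum(rng.random() < true_rate for _ in range(visitors_per_arm))
        conv_b = sum(rng.random() < true_rate for _ in range(visitors_per_arm))
        if p_value(conv_a, visitors_per_arm, conv_b, visitors_per_arm) < alpha:
            false_winners += 1
    return false_winners / runs


false_positive_rate = simulate_aa_tests(true_rate=0.03, visitors_per_arm=2_000, runs=1_000)
print(f"Share of A/A tests with a false winner: {false_positive_rate:.1%}")
```

With a 0.05 threshold, roughly one simulated A/A test in twenty is expected to show a significant difference, even though the simulated setup is working exactly as intended.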
Requirement of a Large Sample Size
Another problem with A/A testing is that it can be time-consuming. When testing identical versions, you need a large sample size to establish whether A really performs the same as its identical version, and collecting that much traffic takes a long time.
“The amount of sample and data you need to prove that there is no significant bias is huge by comparison with an A/B test. How many people would you need in a blind taste testing of Coca-Cola (against Coca-Cola) to conclude that people liked both equally? 500 people, 5000 people?”
The entire purpose of an optimization program is to reduce wastage of time, resources, and money. Critics of A/A testing hold that even though running an A/A test is not wrong, there are better ways to use your time when testing. As one such post puts it, “The volume of tests you start is important but even more so is how many you *finish* every month and from how many of those you *learn* something useful from. Running A/A tests can eat into the “real” testing time.”
Other Methods and Alternatives to A/A Testing
A few experts believe that A/A testing is inefficient as it consumes a lot of time that could otherwise be used in running actual A/B tests. However, there are others who say that it is essential to run a health check on your A/B testing tool. That said, A/A testing alone is not sufficient to establish whether one testing tool should be preferred over another. When making a critical business decision such as buying a new tool/software application for A/B testing, there are a number of other things that should be considered.
Some experts prefer alternatives to an A/A test, such as triangulating data. Triangulation means collecting the same performance data on two platforms so that you have two sets of numbers to cross-check against each other. Use one analytics platform as the base and compare all other outcomes against it to check whether something is wrong or needs fixing.
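A minimal sketch of such a cross-check is shown below. It treats the analytics platform as the base and flags any metric where the testing tool's figure differs by more than a chosen tolerance. The counts, the metric names, and the 5% tolerance are all hypothetical values made up for illustration.

```python
def relative_discrepancy(base_value, other_value):
    """Gap between two counts, as a fraction of the base count."""
    if base_value == 0:
        raise ValueError("base count must be non-zero")
    return abs(other_value - base_value) / base_value


def triangulate(base, other, tolerance=0.05):
    """Return the metrics whose relative gap from the base exceeds the tolerance."""
    flagged = {}
    for metric, base_value in base.items():
        gap = relative_discrepancy(base_value, other[metric])
        if gap > tolerance:
            flagged[metric] = gap
    return flagged


# Hypothetical figures for the same page over the same period.
analytics_platform = {"visitors": 20_000, "conversions": 610}
testing_tool = {"visitors": 19_800, "conversions": 540}

flagged = triangulate(analytics_platform, testing_tool)
for metric, gap in flagged.items():
    print(f"{metric}: differs from the base by {gap:.1%}")
```

Two platforms rarely agree exactly, so the tolerance is a judgment call. A gap that is large or that grows over time is the kind of signal worth investigating.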
And then there is the argument: why run just an A/A test when you can get more meaningful insights by running an A/A/B test? Doing this, you can still compare two identical versions while also testing some changes in the B variant.
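One way to read an A/A/B test is sketched below: first compare the two identical arms as a sanity check, then compare B against the combined A traffic. The A figures reuse the earlier example. The B figures are hypothetical, and pooling the two A arms is an assumption of this sketch that only makes sense when the A/A comparison looks clean.

```python
from math import sqrt
from statistics import NormalDist


def p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value of a pooled two-proportion z-test."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_error = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (conv_b / n_b - conv_a / n_a) / std_error
    return 2 * (1 - NormalDist().cdf(abs(z)))


# (conversions, visitors) per arm; the B arm is made up for illustration.
arm_a1 = (303, 10_000)
arm_a2 = (307, 10_000)
arm_b = (370, 10_000)

# Step 1: the two identical arms should not differ significantly.
aa_p_value = p_value(*arm_a1, *arm_a2)

# Step 2: compare B against the combined traffic of both A arms.
pooled_a = (arm_a1[0] + arm_a2[0], arm_a1[1] + arm_a2[1])
ab_p_value = p_value(*pooled_a, *arm_b)

print(f"A1 vs A2 p-value: {aa_p_value:.3f}")
print(f"A vs B p-value:   {ab_p_value:.4f}")
```

The trade-off is traffic: splitting visitors three ways leaves fewer for each arm, so an A/A/B test generally needs to run longer than a plain A/B test to reach the same sample size per variation.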
When businesses face the decision of implementing a new testing software application, they need to run a thorough check on the tool. A/A testing is one method that some organizations use to check the reliability of the tool. Combined with an evaluation of the tool’s personalization and segmentation capabilities and the other checks mentioned in this post, this technique can help determine whether the software application is suitable for implementation.