Which Type of Optimization Test Should I Run?

Ecommerce managers and agencies are always looking for a quick win during testing. But a poorly constructed test can become a costly lesson in what not to do. Knowing which test to use helps you avoid problems like low confidence in your findings. It also ensures you are testing your experience thoroughly by examining which elements work well together, and which are catastrophic to the user experience.

A/B

Also known as split tests, A/B tests are simple, logical, and easy to read, and they can quickly evaluate which of two experiences will perform better. It is usually best to split traffic evenly between the two experiences to get as clear a read as possible on the final results. There is more information on this in our previous A/B test vs. multivariate test blog post.
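As a rough illustration of how an A/B result gets read, here is a minimal two-proportion z-test sketch. The conversion counts are hypothetical, and real testing platforms apply their own (often more sophisticated) statistics:

```python
from math import sqrt, erf

def two_proportion_z(conv_a, n_a, conv_b, n_b):
    """Two-proportion z-test for an even A/B split (two-sided p-value)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    # Pool the two arms to estimate the shared conversion rate under H0
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal CDF
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value

# Hypothetical results: 10,000 visitors per arm, 2.0% vs 2.5% conversion
z, p = two_proportion_z(conv_a=200, n_a=10_000, conv_b=250, n_b=10_000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

With these made-up numbers the lift clears the conventional 95% confidence bar; halve the traffic and it would not, which is why an even split and adequate sample size matter.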

A/B/n

Similar to A/B testing, but with more possible outcomes: instead of only two possibilities, traffic is split between three or more experiences. These tests do not necessarily consider how the tested elements interact; all of the experiences in an A/B/n test may be completely different from one another.
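One common way to split traffic across n experiences is deterministic bucketing, so a returning visitor always sees the same variation. This is a simplified sketch using a hash of a hypothetical visitor ID, not any particular vendor's implementation:

```python
import hashlib

def assign_variation(user_id, variations):
    """Deterministically bucket a visitor into one of n experiences,
    so repeat visits by the same user always land in the same arm."""
    digest = hashlib.sha256(user_id.encode()).hexdigest()
    return variations[int(digest, 16) % len(variations)]

# Hypothetical A/B/n test with four unrelated experiences
variations = ["control", "variant_b", "variant_c", "variant_d"]
print(assign_variation("visitor-12345", variations))

# The assignment is stable across repeat visits
assert assign_variation("visitor-12345", variations) == \
       assign_variation("visitor-12345", variations)
```

Because the hash is uniform, traffic spreads roughly evenly across the arms as visitor IDs accumulate.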

Multivariate Full Factorial

This method is modular: you build several variations of several elements and combine them in every possible permutation. As new variables are added, the total number of possible combinations grows exponentially. This is essential for determining how well your experiences work together.
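The combinatorial growth is easy to see with a quick sketch. The elements and variants below are hypothetical; the point is only how the full factorial count multiplies:

```python
from itertools import product

# Hypothetical test elements, each with its own variants
elements = {
    "headline":   ["control", "urgency"],
    "hero_image": ["product", "lifestyle", "none"],
    "cta_color":  ["green", "orange"],
}

# Full factorial: every possible combination of every element
combinations = list(product(*elements.values()))
print(len(combinations))  # 2 * 3 * 2 = 12 experiences

# Adding one more two-variant element doubles the total
elements["free_shipping_banner"] = ["on", "off"]
print(len(list(product(*elements.values()))))  # 24 experiences
```

Each added variable multiplies the number of experiences to test, which is why traffic requirements escalate so quickly for full factorial designs.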

Algorithmic

Multivariate testing can create hundreds of possible variations. For large ecommerce platforms with enough traffic to manage this, algorithmic testing may be advantageous. Allowing a platform to adjust traffic allocation between experiences based on the results it observes means you can reduce the cost of testing (which we will go over in a future post). But just because an algorithm can reduce costs does not mean it can create value where there is none: if the variables are bad, the experiment will still fail. The primary argument against popular algorithms (like Google's multi-armed bandit) is that they follow a less rigorous path to statistical significance. In short, you reduce costs, but you are less certain of your results.

A/A

A/A tests are an interesting exercise in applied statistics. You split the traffic down the middle between two identical experiences and watch the variance in your final results. A/A tests help users become comfortable with the level of uncertainty that is commonplace in website optimization. They help you calibrate your testing program, and they also provide insight into the statistical anomalies present on your website.
Some of the reasons that A/A data doesn't align with typical expectations include:

• Small sample sizes lead to skewed data.
• The test is incorrectly configured.
• For RPV (revenue per visit) tests, outliers have occurred (typically an order more than three standard deviations above the average order value).
• Sometimes there is a small preference for the baseline. This can depend on how your test provider fires its tag: by pushing out the page load time, a testing vendor can bring the overall conversion rate (CR) or average order value (AOV) down by a small amount. This delay is both unavoidable and benign in most cases.
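The first bullet is easy to demonstrate with a simulation. Both arms below share the same hypothetical 3% true conversion rate, yet small samples routinely show a large apparent "lift":

```python
import random

def simulate_aa(n_per_arm, true_rate=0.03, runs=1000, seed=7):
    """Run many simulated A/A tests where both arms are identical,
    and count how often the observed 'lift' exceeds 20%."""
    rng = random.Random(seed)
    big_gaps = 0
    for _ in range(runs):
        conv_a = sum(rng.random() < true_rate for _ in range(n_per_arm))
        conv_b = sum(rng.random() < true_rate for _ in range(n_per_arm))
        if conv_a and abs(conv_b - conv_a) / conv_a > 0.20:
            big_gaps += 1
    return big_gaps / runs

# Identical experiences, different sample sizes
print(simulate_aa(n_per_arm=500))              # large apparent lifts are common
print(simulate_aa(n_per_arm=10_000, runs=300)) # large apparent lifts become rare
```

If an A/A test on your site shows gaps bigger than this simulation predicts, that points to one of the other causes above: a configuration problem, outlier orders, or tag-firing delay.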

Keep in mind that you do not have to maintain the same test type throughout your campaign. Like the iterative design methodology used in human-computer interaction, testing can be reworked through multiple cycles to minimize loss and maximize gain. A typical approach is to start with a full factorial multivariate test, eliminate the weak performers and roll the remaining variables into an A/B/n test, then validate the best performer with a final A/B test.