
Posts Tagged ‘online testing’

Marketing Optimization: How to design split tests and multi-factorial tests

January 23rd, 2012 No comments

I’ve got a research question. Now what do I do with it?

A few weeks ago, Daniel Burstein wrote a blog post about writing research questions. That post emphasized the importance of asking “which” rather than “what” questions because a “which” question is clearly testable.

You might ask, “Which page format results in the most lead submissions?” or “Which price point generates the most revenue?” Both questions are clearly stated and include two key pieces of information:

  • An independent variable you are going to test
  • The dependent variable you will use to measure your results

 

To know if something is better, first you must know if it is different

With the research question on paper, we can easily create a hypothesis. For the former question: “All page formats will result in the same number of lead submissions.” This type of hypothesis is so famous in research circles that it has a name: “The Null Hypothesis.”

In general terms, the null hypothesis states that varying the independent variable will result in no change to the dependent variable.

In other words, you’re testing to see if changing the page (the independent variable) will change the number of leads (the dependent variable). After all, if there is no change, one cannot be any better than the other.

Why not “The new layout will result in the most lead submissions,” you ask? Because there is no concrete reason to expect that there will be a change at all. Besides, if you already knew the effect of A on B, why would you need to test it?
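To make the null hypothesis concrete, here is a minimal sketch of how you might check it once the data is in. It uses a two-proportion z-test, which is one common way to compare two conversion rates; that choice is an assumption for illustration, not necessarily the method MECLABS analysts use. The function name and the visitor and lead counts are hypothetical.

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for H0: both pages convert at the same rate."""
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Under the null hypothesis both pages share one conversion rate, so pool the data.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical counts, for illustration only (not real test data).
z, p = two_proportion_z_test(conv_a=140, n_a=2000, conv_b=175, n_b=2000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value suggests the pages really do differ, so the null hypothesis can be rejected. Only then does it make sense to ask which page is better.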

 

Control vs. Treatment(s)

In most cases, there will be an existing page that all new versions will be compared to. This page is termed the “Control,” and all new pages are dubbed “Treatments” to guide comparisons later.

The next step in testing your research question is to decide on the most appropriate test structure. This will depend on the number of variations you will be testing, and on the amount of traffic your site receives. At MECLABS, our research analysts do this visually using a small flowchart to represent the flow of traffic to the control and treatment pages.

Take your latest research question and write it down. Below it, write out the following until you have listed all the variations to be tested.

 

[Figure: list of the Control and Treatment pages]

 

At the right-hand side of the page, write “All Traffic.” At this point, you need to determine whether your traffic should be split evenly between all the pages or whether you will pull only a small portion of traffic into the treatment pages and maintain most of the flow to the existing Control page.

At MECLABS, our analysts use the Test Protocol document to determine how many site visits are required to achieve valid results given a set of treatments and typical conversion rates on the existing page. This process is covered in our Online Testing Course.

 

Split tests

Draw lines between “All Traffic” and the pages to the left showing the split, and mark each with the percentage of traffic to be sent along that path (see below). This design is called a split test. It is very important that traffic is randomly split between the treatments and control. On a high-traffic site, the percentage sent to the control can be higher than what is sent to the treatments, as long as you will easily meet the required minimum sample size.
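As a rough sketch of what “randomly split” can look like in practice, the snippet below assigns each visit to a page in proportion to its share of traffic. The page names, the 50/25/25 split, and the `assign_page` helper are all hypothetical; most testing tools handle this step for you.

```python
import random

def assign_page(weights, rng=random):
    """Pick one page at random, in proportion to its share of traffic."""
    pages = list(weights)
    shares = [weights[page] for page in pages]
    return rng.choices(pages, weights=shares, k=1)[0]

# Hypothetical split: most traffic stays on the control.
split = {"control": 0.50, "treatment_1": 0.25, "treatment_2": 0.25}

rng = random.Random(42)  # seeded only so the sketch is repeatable
counts = {page: 0 for page in split}
for _ in range(10_000):
    counts[assign_page(split, rng)] += 1
print(counts)
```

The counts will not match the percentages exactly, and that is the point: the assignment is left to chance rather than to time of day, traffic source, or anything else that could bias the comparison.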

 

[Figure: split test design, with the percentage of traffic marked on each path]

 

Multi-factorial tests

The split test design works for tests of only one step, but sometimes we need to test more than one step in a process. In that case, we have two independent variables that we will manipulate separately. For example, if your research question is, “Which checkout process generates the most revenue?” you might want to test several variations of cart layout and payment page layout at the same time.

If you were to test [Cart and Payment Treatment 1] against [Cart and Payment Treatment 2], your results might tell you that [CT and PT 1] produced 15% more revenue than [CT and PT 2], but you would never learn that Cart Treatment 1 paired with Payment Treatment 2 would have yielded an even higher lift!

Essentially, you have two research questions: “Which cart design will generate the most revenue?” and “Which payment design will generate the most revenue?” This means you have two independent variables and one dependent variable.

 

To test multi-step processes, researchers use a research design called a factorial test. Every variation of one independent variable is paired with every variation of the other, so that all combinations are tested. A typical factorial design is represented below.

 

[Figure: a typical factorial design]

 

Because the traffic is sent evenly to each pairing, the factorial research design accounts for the natural dependency between steps 1 and 2. If a viewer does not like Cart Treatment 1, they will not proceed to the Payment step, but since you have also tested other combinations of Cart and Payment, you can assume the effect is balanced out.
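The pairing step can be sketched in a few lines. The variation names below are hypothetical; the 3 cart designs x 2 payment designs layout is taken from the example later in this post.

```python
from itertools import product

# Hypothetical variation names: 3 cart designs x 2 payment designs.
cart_designs = ["cart_control", "cart_treatment_1", "cart_treatment_2"]
payment_designs = ["payment_control", "payment_treatment_1"]

# A full factorial design pairs every cart design with every payment design.
pairings = list(product(cart_designs, payment_designs))

# Send traffic evenly to each pairing.
share = 1 / len(pairings)
for cart, payment in pairings:
    print(f"{cart} + {payment}: {share:.1%} of traffic")
```

Notice how quickly the number of pairings grows: adding one more payment design would take this test from six pairings to nine, each needing its own share of traffic.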

A factorial test requires a lot more traffic than a split test to achieve validity, but it also gathers a lot more insight. From the results of a factorial test, you can infer not only the winning combination but also which treatment of each step was most successful. This subtle distinction comes in handy if you then want to test further refinements of the process.
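Here is a minimal sketch of that distinction. The numbers are invented purely for illustration, and the sketch uses order conversion rate as the dependent variable (an assumption made to keep it short; the checkout example in this post measures revenue). The `rate_by_level` helper is hypothetical.

```python
from collections import defaultdict

# Hypothetical results: (cart, payment) -> (visits, orders). Illustrative numbers only.
results = {
    ("cart_1", "payment_1"): (1000, 70),
    ("cart_1", "payment_2"): (1000, 82),
    ("cart_2", "payment_1"): (1000, 61),
    ("cart_2", "payment_2"): (1000, 66),
}

def rate_by_level(results, factor_index):
    """Pool visits and orders across the other factor, then compute a rate per level."""
    totals = defaultdict(lambda: [0, 0])
    for pairing, (visits, orders) in results.items():
        level = pairing[factor_index]
        totals[level][0] += visits
        totals[level][1] += orders
    return {level: orders / visits for level, (visits, orders) in totals.items()}

# The winning combination is the pairing with the highest conversion rate.
best_pairing = max(results, key=lambda pairing: results[pairing][1] / results[pairing][0])
cart_rates = rate_by_level(results, 0)        # which cart design did best overall
payment_rates = rate_by_level(results, 1)     # which payment design did best overall
print("Winning combination:", best_pairing)
print("Cart rates:", cart_rates)
print("Payment rates:", payment_rates)
```

The per-step rates are only a starting point. Each difference still needs to be checked for statistical significance before you act on it.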

 


 


There are some situations that cause problems with research design. It may not always make sense to pair all the possible combinations together, in which case a factorial design is not possible and a split test should be used instead.

Don’t make the mistake of forming all but one or two pairs of the factorial design. An asymmetrical design does not neutralize the dependency of the second step on the first. In other words, if every factor isn’t matched with every possible other factor, you could overlook a potentially big lift.

 

Traffic volume is crucial for factorial tests

One common reason some marketers don’t run multi-factorial tests is a low-traffic page. For example, with only 3,000 hits a month, a 7% historical conversion rate, and six treatment pairs (2 payment designs x 3 cart designs), it could take as long as three years to validate the factorial design shown above!
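To see why the timeline stretches so far, here is a back-of-the-envelope sketch using a standard sample size approximation for comparing two conversion rates. The traffic, conversion rate, and number of pairings come from the example above. Everything else is an assumption and not taken from the MECLABS Test Protocol: the 10% relative lift we hope to detect, the 95% confidence level, the 80% power, and the `visits_per_page` helper itself.

```python
import math
from statistics import NormalDist

def visits_per_page(base_rate, relative_lift, confidence=0.95, power=0.80):
    """Approximate visits needed on each page to detect the given relative lift."""
    p1 = base_rate
    p2 = base_rate * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided test
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)

# From the example above: 3,000 visits a month, 7% conversion, six pairings.
monthly_visits = 3000
pairings = 6
# Assumption (not from the article): we want to detect a 10% relative lift.
needed = visits_per_page(base_rate=0.07, relative_lift=0.10)
months = needed * pairings / monthly_visits
print(f"{needed} visits per pairing, about {months:.0f} months in total")
```

The estimate is very sensitive to the lift you expect. Smaller lifts need far more visits, which is why a low-traffic page and a six-way split make for a test measured in years rather than weeks.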

When faced with an unreasonable completion time, you have a few choices to make. You can test fewer treatments, resulting in quicker accumulation of hits on each treatment, or you can test one step of the checkout process at a time.

You also have the option to test pairs of pages in a split test, losing the additional insights given by the factorial design. All of those options will reduce the time needed to validate the test.

 

Sequential tests

Some marketers try to learn about which treatment works best through sequential tests. Essentially, one page was live, or one email was sent, and then the page was changed, or another email was sent. One treatment is left online for a set period, followed by the next treatment, and so forth. This is usually because there was no test design to begin with, and marketers are comparing results after the fact.

This could also be because marketers do have a test design but are unable to split traffic. After all, if you can only direct traffic to a single page design at a time, you can only test pages sequentially. (However, with the wide availability of both free and paid optimization tools, this situation has become quite rare.)

Sequential tests are extremely prone to history effects, where an outside event or phenomenon affects the viewers’ behaviors on the site from one moment in time to another (see our Online Testing Course for more information on History Effects).

For example, an email sent out to the mailing list will increase traffic to whatever homepage treatment is currently online, distorting the actual effect of the design changes. This effect is usually noticeable as a sudden rise on an analytics traffic or conversion chart. Although it is not an optimal research design, this type of study can distinguish between a control and a treatment page. Results should only be interpreted if the possibility of a history effect has been considered and found insignificant.

 

Related Resources:

Marketing Optimization: You can’t find the true answer without the right question

Artificial Optimization: Why at least 40% of marketers shouldn’t test

Marketing Optimization: How to determine the proper sample size

 

 

Email Testing: More specific subject line improves open rate by more than 35%

July 1st, 2011 1 comment

“She was here on earth to make sense of its wild enchantment and to call each thing by its right name.” – Boris Pasternak (Doctor Zhivago)

Sometimes, as marketers, this is one of our biggest challenges. We must make sense of the “wild enchantment” inherent in our audience so we can call each offering we have by its right name. After all, the way we think about our products can be vastly different from the way our audience thinks about them. This is why specific words matter, and for more than just SEO.

Let’s take a look at a recent test conducted by MarketingSherpa (sister company to MarketingExperiments), to determine which words best tap into the audience’s motivations. Read more…

Live Experiment (Part 2): Real testing is messy

June 10th, 2011 3 comments

In my last blog post, we reviewed how the combined force of 200 marketers at Optimization Summit was used to design the following test:

[Images: Control and Treatment pages]

You can read the experiment details, how we got 200 marketers to agree, and a few insightful reader comments in Wednesday’s post. In this second post, as promised, we will look at the results and what insights might be gained from them. Read more…

Landing Page Optimization: Is it actually possible to optimize a landing page?

June 6th, 2011 3 comments

Last week I spoke on the “Overcoming Operational Barriers to Optimization Implementation” panel at Optimization Summit. The buzz was loud there: the birds were flapping their wings, the bees were swarming, and everyone was talking about Landing Page Optimization.

Aside from this event, there are start-up companies with a central focus around this one discipline. There are courses, webinars, books, and even theories about Landing Page Optimization. So here’s the million-dollar question, “Is it possible to optimize a landing page?”

I could sit down with you and tell you a number of ways that you could improve your page:

  • Greet the visitor with a clear headline
  • Eliminate multiple, equally weighted objectives
  • Reduce the number of fields you have
  • Be more transparent about your shipping rates
  • Add or remove copy

And on, and on, and on…but none of this is landing page optimization.

Even if you tripled your sales by taking my advice, implementing what you saw in a webinar, or going live with a new design provided to you by an Internet marketing company, you still have NOT optimized a page.

To delve into what optimization really is, let’s take a look at the three types of marketing managers I most often come across… Read more…

Evidence-based Marketing: Marketers should channel their inner math wiz…not cheerleader

June 1st, 2011 2 comments

Some of my favorite tweets on #SherpaLPO (the hashtag for Optimization Summit in Atlanta) reflect the stark difference between evidence-based marketing and “song and dance” marketing…

Landed safe and sound in Atlanta, ready to nerd it up tomorrow with fellow website optimizers #SherpaLPO http://ow.ly/57ghH

@DesignerMeg


Getting ready to geek out with @MarkKilens and @mgieva at #SherpaLPO

@mcdmiller

To use a high school analogy, marketers are often thought of as the popular people – the Student Government president, the captain of the football team (or perhaps curling team for our Canadian friends).

But the 139 marketers listening to Dr. Flint McGlaughlin teach right now in our pre-Optimization Summit Landing Page Optimization Workshop in Atlanta (the next stops of this workshop will be in New York and San Francisco) are not seeking to learn about better ways to add a winning smile or flashy move to their marketing campaigns.

Evidence-based marketers are a little different. They are the chess club president or captain of the academic team (don’t worry, popularity comes when you start marketing based on business intelligence, instead of just intuition, and your campaigns produce results). Read more…

Online Testing: 3 takeaways to get the most out of your results

May 13th, 2011 No comments

Professional infiltrator should seriously be added to my job description. Because once again, I gathered some marketing “intel” from somewhere that I wasn’t quite, how do I put it…invited to?

This time, I stepped out of the MECLABS classroom and into a beautiful, oceanfront hotel in Jacksonville Beach for a MarketingExperiments Landing Page Optimization Workshop. There, I blended into a crowd of about 70 marketers, listening to the presenters: Director of Training Chuck Coker; Senior Optimization Manager Adam Lapp (Mr. Lapp when class is in session); and MECLABS Managing Director (CEO) Dr. Flint McGlaughlin.

As I sat there taking notes, one subject really stood out to me, and that was testing. It seemed as though many questions were geared towards that. “How big does your sample size have to be in order to know your test results really worked?” “How long should you test for?” “What should you test?” etc… Read more…