Real Designers Test

Usability testing is often considered the purview of "the research team," but if you're designing things you should be testing them too. I'm not saying that usability testing isn't in the domain of a user research team; it is evaluative research, after all. However, I have heard many user researchers complain that they are so bogged down with testing that they rarely get to do anything else. While I was running the Mozilla Open Web Apps Ecosystem UX team, I worked with our fantastic Head of User Research to put a system in place that let the designers on my team take the reins of usability testing as an unobtrusive addition to their process. Later we rolled it out to other design teams at Mozilla. Here's how we did it, how it helped us do better work, and how it helped us sell our work to our stakeholders.

A quick note

We chose to use UserTesting.com to help us get a lot of this done without having to spend much time on it. I think their service is amazing for this kind of testing, but you could certainly set up a similar program with a different tool. We went with UserTesting for a variety of reasons, primarily the ability to create templates in the tool and their system for sourcing participants. I bring it up so you can keep it in mind: everything here can be done with other tools if you would rather use something different.

Understanding the goal (and limitations) of running these tests

It is very important to be clear about why we are doing this. Everyone needs to understand what these tests are and, most importantly, what they are not. We are NOT validating big ideas (e.g., should this feature exist?). We are NOT using these tests to do any kind of generative research. These are very traditional, narrow usability tests, and we should have already answered those bigger questions before running them.

The questions we asked were typically along these lines:

  • Can people use this as we intended?
  • Have we organized the information here in a way that makes sense?
  • Does this feel overly long or difficult?
  • Is any part of this delightful?

Beyond the obvious value of answering these questions, running these tests makes sure we have answers when personal opinions about design start showing up, as they inevitably do. We could counter "I don't know if the way you have this laid out is going to make sense to users…" with our test results, including videos of real users telling us why it made sense to them. If you want your design process to move faster, keeping these kinds of speculations to a minimum is essential.

The hardest part of a usability study is…consistency

Most researchers will tell you the hardest part of running a study is finding participants. That's true for a team with formal training in running studies, but it wasn't a given that any of our designers had ever designed or run a usability study before; only about half of them had any experience with usability tests. I couldn't send half my team off to training to make our testing goals a reality. Instead, I did what anyone looking to embed knowledge does: I embedded it in the tool. I made templates.

The most important part of any data gathering activity is consistency. If you don't gather your data using a consistent process, you don't have data. Templatizing brings consistency to the process even for testers with minimal up-front training. It also has the side benefit of being a good intro to designing a study.

The specific template you create isn't as important as the act of creating it, as long as you adhere to the general best practices of designing a usability test. The next most important thing is making sure everyone always uses the templates. We created our templates in UserTesting, but you can just as easily do it in a Google Doc, on a wiki, or whatever else works for your team.
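To make this concrete, here is a minimal sketch of what such a template might contain, written as a Python data structure so it is easy to copy and fill in per study. The section names and wording are illustrative assumptions on my part, not our actual Mozilla template; they just follow the standard shape of a short test script and echo the questions listed above.

```python
# A minimal sketch of a reusable test-plan template, assuming a 15-minute
# study like the ones described in this post. Structure and wording are
# illustrative, not the actual Mozilla/UserTesting template.
TEST_PLAN_TEMPLATE = {
    "intro_script": (
        "Thanks for participating. You'll be using an early prototype and "
        "thinking aloud as you go. We're testing the design, not you."
    ),
    "scenario": "<one or two sentences of realistic context for the tasks>",
    "tasks": [
        "<task 1: a concrete goal, stated without UI terms that give away the answer>",
        "<task 2>",
        "<task 3>",
    ],
    "post_task_questions": [
        "How difficult or easy was that task, on a scale of 1 to 5?",
        "Was anything confusing? What would you change?",
    ],
    "wrap_up_questions": [
        "Did any part of this feel too long or difficult?",
        "Was there anything you particularly enjoyed?",
    ],
}
```

Keeping the structure fixed and only swapping out the scenario and tasks per study is exactly what makes the results comparable from one test to the next.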

Running the test

We ended up mostly running studies with four to six participants. Four was ideal for us because each test was limited to 15 minutes: four sessions were enough for patterns to start emerging, and we only had to watch an hour of video. It's important to keep in mind that this is not widely generalizable research. It isn't going to tell you fundamental truths about humanity, just whether your prototype is on the right track. When we occasionally ran into skepticism about the sample size, we would point out that we were not running any kind of predictive statistics requiring a large sample for adequate power. If the objector was willing to watch all the videos with us, we would run six or eight tests total. This strategy worked amazingly well for getting buy-in from skeptical stakeholders by giving them some ownership over the results.

A quick aside: Large sample sizes

Once, I ran a test with 50 participants because we had unlimited testing credits and I wanted to put Nielsen's work on small sample sizes to the test. The main thing I learned was, "Don't run a test with 50 people, because no one wants to watch that many videos." When we compared it to the test we ran on the same prototype with six participants, we didn't find anything new from the larger group. This won't always be the case; you will probably miss some things testing with smaller groups, so you will need to weigh that risk against the cost of running a much larger population through your test. But if you stick to the narrow questions noted above, these large studies are likely overkill.
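If you want to see why small samples hold up, Nielsen and Landauer's problem-discovery model is the usual reference: the share of problems found by n participants is roughly 1 − (1 − p)^n, where p is the chance a single participant hits a given problem. Here's a quick sketch of that curve; p = 0.31 is the average they reported across projects, so treat it as an assumption rather than a constant of nature.

```python
# Nielsen & Landauer's problem-discovery model: the expected share of
# usability problems found by n participants is 1 - (1 - p)^n, where p is
# the probability that one participant encounters a given problem.
# p = 0.31 is their reported average across projects (an assumption here).
p = 0.31

for n in (1, 4, 6, 8, 15, 50):
    found = 1 - (1 - p) ** n
    print(f"{n:2d} participants -> ~{found:.0%} of problems found")
```

With p = 0.31, four participants already surface roughly three quarters of the problems and six surface almost 90%, which matches our experience: the 50-person run added videos, not findings.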

Once you have all that data, you have to do something with it

Group analysis is the best way to use this newly acquired data. We could reliably launch a test and have it close 30 minutes later, so I liked to run tests over lunch and schedule an analysis session with interested stakeholders right afterward. Another great strategy is to run the tests earlier in the day and make the analysis session a working lunch; people love to come to meetings with free food. Your tools and team will largely dictate whether this kind of scheduling can work for you. However you do it, getting everyone to participate in the analysis phase is a must. Giving stakeholders shared ownership of the output of the test leads to significantly less push-back against the results. Group analysis with stakeholders is awesome, and I can't recommend it enough, but it comes with a familiar problem: consistency.

Templates to the rescue again. Just as with the test design, the important things about your analysis template are that you make one and that everyone uses it for the analysis. Worksheets are my go-to for these workshop-style sessions. For us, these were a make-it-once, use-it-every-time artifact, which lowers the pressure on the designers running the sessions: they just print out enough worksheets and go.

After you have your consistent analysis tool, going over the ground rules is next. This will make or break your session. Make sure everyone agrees to the rules before you start, and write them down somewhere everyone can see them. As we watched the videos, everyone took notes on a worksheet, one sheet per video session. After we watched all the videos, we discussed the patterns we had seen, the questions the test raised, and the ideas for changes we needed to make. All of these topics should be part of the worksheet so participants can write them down as the thoughts occur to them.
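For reference, here is a minimal sketch of a per-session worksheet rendered as printable plain text. The sections mirror the discussion topics above; the exact field names are my own assumption, not the worksheet we used at Mozilla.

```python
# A minimal sketch of a per-session analysis worksheet, rendered as plain
# text so it can be printed for a group session. Section names are
# illustrative, not the exact Mozilla worksheet.
SECTIONS = [
    ("Session", "Participant ID, date, prototype version"),
    ("Task notes", "What happened on each task: where they hesitated or sailed through"),
    ("Patterns", "Anything you have now seen in more than one session"),
    ("Questions raised", "Things the test surfaced that it cannot answer"),
    ("Change ideas", "Concrete design changes to discuss with the group"),
]

def print_worksheet():
    print("Usability Session Worksheet")
    print("=" * 27)
    for title, prompt in SECTIONS:
        print(f"\n{title}: {prompt}")
        print("_" * 60)

if __name__ == "__main__":
    print_worksheet()
```

One sheet per video keeps every stakeholder's observations in the same shape, which is what makes the group discussion afterward go quickly.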

That's it. You're testing regularly now, just like the diagrams in all those human-centered design books say you should!

Test templates and analysis worksheets are really the key here. They bring the consistency that is oh-so-important to a successful test, and with those tools in hand it doesn't matter what testing platform you use. For us, the biggest barriers to testing were a lack of training and finding the time to do it. This process makes running simple usability tests something any member of the product development team can do in almost no time at all.

Still have questions? I'd love to discuss with you and help you get your team testing regularly.