The Remote Possibilities Are Endless: Part Two
Methods
Now that we've looked at remote testing versus lab testing, we can delve deeper into remote testing for a more comprehensive analysis. There are two methods of remote usability testing—synchronous (moderated) and asynchronous (automated)—each with its own benefits and drawbacks. The major difference between the two lies in whether the test team is separated from participants in time as well as in space: synchronous testing allows a facilitator to administer the test and ask follow-up questions of the participants in real time, whereas asynchronous testing consists of a series of predefined, automated tasks that the participant completes at will, with no facilitator present [4].
Synch Perks
For most purposes, synchronous testing seems to be the superior method, as it combines a one-on-one moderated approach with a natural setting; however, for tests that require quantity over quality, or that are operating on a very tight schedule, asynchronous tests are better than no tests at all.
When it comes to the quality of the data gathered, the research suggests there is little contest. Synchronous testing is more appropriate than asynchronous testing when researchers need or want qualitative data [4]; because self-reporting by participants tends to be problematic, qualitative data is much more valid when the test administrator is on hand [2]. In synchronous testing, the moderator has the opportunity to ask follow-up questions about participants’ actions in real time, as well as to ask participants to elaborate on comments they’ve made throughout the test. As Baravalle and Lanfranchi point out, quantitative data can “reveal the presence and magnitude of the [usability] problem but not its nature” [3].
For example, imagine a participant who is having trouble completing a task on a website, such as locating a person’s name without any context as to who that person is in relation to the site. The participant might become frustrated. In an asynchronous test, that frustration will not be evident; the only indication that the participant struggled to find the information will be the length of time the task took compared with other participants. In a synchronous test, however, the moderator can note the participant’s frustration and address it, either during the session by altering the task or afterward in an informal interview.
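To make that limitation concrete, here is a minimal sketch in Python of the kind of signal an asynchronous test typically yields. The task timings and the flag_slow_participants helper are invented for illustration, not taken from any particular tool: the analysis can flag that some participants took unusually long (the magnitude of the problem), but it says nothing about why.

from statistics import median

# Invented data: seconds each participant spent on the "find the person's name" task.
task_times = {"P1": 48, "P2": 52, "P3": 61, "P4": 210, "P5": 55, "P6": 198}

def flag_slow_participants(times, factor=2.0):
    """Flag participants whose time-on-task exceeds `factor` times the median."""
    cutoff = factor * median(times.values())
    return sorted(p for p, t in times.items() if t > cutoff)

print("Unusually slow:", flag_slow_participants(task_times))
# -> Unusually slow: ['P4', 'P6']
# The numbers show *that* P4 and P6 struggled; the frustration and its cause
# are only visible to a live moderator.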
Synch Hurdles
Establishing a good rapport and building trust with participants can be difficult across the geographical barrier of remote synchronous testing [2], [6], but asynchronous testing allows for no interaction with the participants at all [4]. All other things being equal, interacting with participants has been shown to elicit more qualitative feedback about usability concerns. Additionally, with remote synchronous testing, test administrators may miss contextual information and nonverbal cues—such as body language, emotional state, or level of focus—that they are privy to with traditional lab testing [5], [6], but again, any live interaction provides more qualitative feedback than none at all.
The Case for Synch
Perhaps the most striking difference between synchronous and asynchronous testing lies in the results of the tests themselves. According to studies, asynchronous tests uncover fewer usability problems than lab testing, while synchronous tests have been shown to yield results comparable to those of lab testing [4], [9]. Combine these findings with the reduced cost, more representative user base, and scheduling flexibility associated with remote testing, and it becomes clear that remote testing is not only a viable solution but perhaps even a better one than traditional lab testing. If lab tests and remote synchronous tests produce functionally similar results, then performing lab tests may, in the future, come to be seen as a primitive option rather than the industry standard.
Best Practices: RST
In preparing for remote synchronous usability testing, a number of issues need to be taken into account and addressed, many of a technical nature. Test administrators should send instructions to participants beforehand, so that they are aware of any downloads or system requirements, as well as to introduce them to the context of the test [7] while taking care not to bias them in any way [6].
Bias can creep in especially through the language used in the instructions, so be sure to remain as neutral as possible. For example, if a participant is going to be testing an application that needs to be downloaded, do not include in the instructions that “This download will take forever. Go get yourself a snack.” Along the same lines, participants should be required to download as few applications as possible, to avoid system malfunction and user frustration [10].
Testers should have contingency plans in place in case of malfunctions or problems with systems [7], and should assess security and firewall issues ahead of time [10]. Pilot tests should be conducted to eliminate any kinks in the test process [6].
For example, imagine a participant taking part in a usability test during his lunch break at work. He has been given no context or guidelines about the test; he just signed up one day for a scheduled time slot and now he’s going to take the test. The moderator calls him on the phone and asks him to connect to WebEx. The participant says, “I don’t have an account with WebEx.” So the moderator waits while the participant signs up for a free trial of WebEx. Then the moderator says, “Please turn your camera on.” The participant doesn’t have a camera. “Okay,” the moderator says. “Forget the camera. Share your screen with me.” The participant does not know how to go about sharing a screen, so he takes several minutes to figure it out. Finally the screen is shared and the test is ready to begin. The moderator says, “Can you please navigate to Facebook?” The participant says, “We can’t access Facebook at work.”
All of these issues can be addressed through pretest screening, pretest guidelines, and pilot testing. While our research does not indicate on whom to conduct the pilot test—a participant or a fellow tester—a case can be made for both. If the budget permits, it is more realistic, and introduces less professional bias, to treat the first participant session as the pilot study and discard its data. If time and cost prohibit this, however, conducting a pilot test with a fellow tester will still likely surface any major technical hurdles.
There is almost universal agreement on what constitutes the best practices for deploying remote synchronous usability tests. Because there are currently no software tools or applications specifically tailored for remote synchronous testing, moderators have had to make do with the tools available. The most cited method of conducting usability evaluations in separate locations but at the same time involves utilizing the following: a screen-sharing tool such as WebEx; usability testing software with video and mouse capture options like Morae; and a telephone connection, including Skype or a simple cell phone with speakerphone enabled [2], [10].
Deploy!
The ideal deployment of the remote synchronous test is as follows:
Before the test, the moderators give the participants a guide outlining the steps they’ll be asked to take. This includes accessing a free screen-sharing tool like WebEx; it should be made very clear to participants that their screen will be shared with the testers. Also, if the moderators would like video of both the mouse movement and the participant’s face, testers must make sure that the participant’s computer has a camera.
Moderators should suggest to participants that they close all other applications and browsers before beginning the test, particularly to avoid security or confidentiality issues with email or social networking sites.
This guide can also go into some detail about the nature of the test and the context in which the participant is expected to be tested, at the moderators’ discretion. Any applications, apart from a Web browser, that the participant will need should be delineated ahead of time, particularly if they require a download. For example, if moderators will be using Skype to communicate in real time, participants should be made aware of this. These steps should help reduce technical hurdles during testing.
When it’s time for testing, the moderator should have everything set up before contacting the participant: the screen-sharing tool such as WebEx (with or without the camera enabled), a recording tool such as Morae, and a telephone or other audio channel such as Skype or WebEx's audio conferencing. Once all systems are enabled, the moderator should contact the participant at the scheduled time. The participant then shares his or her screen with the moderator (with or without live video) using WebEx or a similar tool, at which point the moderator can begin recording with Morae or a similar tool. Even with this preparation, moderators should expect to spend about ten minutes getting the participant set up before actual testing begins.
This is the best process for remote synchronous usability testing as Design For Use understands it. If any readers have experiences or best practices that either contradict or expand upon this information, please share them with us!
References
[1] Andreasen, M. S., Nielsen, H. V., Schroder, S. O., & Stage, J. (2006). Usability in open source software development: Opinions and practice. Information Technology and Control, 35(3a), 303-312.
[2] Andreasen, M. S., Nielsen, H. V., Schroder, S. O., & Stage, J. (2007). What happened to remote usability testing? An empirical study of three methods. Proceedings from CHI 2007. San Jose, CA.
[3] Baravalle, A., & Lanfranchi, V. (2003). Remote Web usability testing. Behavior Research Methods, Instruments, & Computers, 35(3), 364-368.
[4] Bastien, J. M. C. (2010). Usability testing: A review of some methodological and technical aspects of the method. International Journal of Medical Informatics, 79, e18-e23.
[5] Brush, A. J. B., Ames, M., & Davis, J. (2004). A comparison of synchronous remote and local usability studies for an expert interface. Proceedings from CHI 2004. Vienna, Austria.
[6] Dray, S., & Siegel, D. (2004). Remote possibilities? International usability testing at a distance. Interactions, March + April, 10-17.
[7] Gardner, J. (2007). Remote Web site usability testing: Benefits over traditional methods. International Journal of Public Information Systems, 2007(2), 63-72.
[8] Gough, D., & Phillips, H. (2003, June 9). Remote online usability testing: Why, how, and when to use it. Retrieved from http://www.boxesandarrows.com/view/remote_online_usability_testing_why_how_and_when_to_use_it.
[9] McFadden, E., Hager, D. R., Elie, C. J., & Blackwell, J. M. (2002). Remote usability evaluation: Overview and case studies. International Journal of Human-Computer Interaction, 14(3&4), 489-502.
[10] Seffah, A., & Habieb-Mammar, H. (2009). Usability engineering laboratories: Limitations and challenges toward a unifying tools/practices environment. Behaviour & Information Technology, 28(3), 281-291.
[11] Tullis, T., Fleischman, S., McNulty, M., Cianchette, C., & Bergel, M. (2002). An empirical comparison of lab and remote usability testing of Web sites. Proceedings from Usability Professionals Association Conference, 2002. Orlando, FL.
Published by Design For Use.