Online, unmoderated user testing tool

8 Apr 2009 - 1:04am
Toby Biddle
2009

A new online, unmoderated user testing tool has recently launched -
www.loop11.com. Has anyone used it? Any thoughts?

Comments

8 Apr 2009 - 4:44am
Harry Brignull
2004

Looks interesting - I've registered for the beta.

The list of sites tested is impressive (eBay, Yahoo, Apple...) but I have a
feeling these aren't actually clients but sites they've tested
independently.

Also I'd love to know who's behind this. Loop11 people - care to speak up?

Harry

--
http://www.90percentofeverything.com

8 Apr 2009 - 6:39am
Jared M. Spool
2003

On Apr 7, 2009, at 11:04 PM, Toby Biddle wrote:

> A new online, unmoderated user testing tool has recently launched -
> www.loop11.com. Has anyone used it? Any thoughts?

We haven't used this tool in particular, but, from their site, it
looks similar to a slew of other tools on the market.

These tools are limited in value because of four key factors:

1) The pool of invited participants is critically important. In
Loop11, it seems you have to invite your own pool, which means you
have to use standard recruitment techniques to source, schedule, and
incent participants in the study. This will probably triple (or more)
the costs. (Many unmoderated tools offer their own pre-recruited
pools, which keeps costs down, but the participants are often low
quality, such as people who only participate to get the incentive and
don't really use the design.)

2) You are limited in the tasks your participants can perform. For the
software to work, the site has to know when a task is completed. For
example, when evaluating a travel site, you have to know what page the
user will end up on. If the confirmation page for a trip booking is
computer generated, this might not be possible. Even if it is, can the
system tell if all the values were properly entered? (See the sketch
after this list.)

3) We know from our research at UIE that participants who are actually
interested in the task (for example, currently planning a vacation in
Paris) will behave substantially differently than those who are asked
to pretend to do a task. They take more time, are more discriminating
about the results, are more likely to be frustrated when key information
is missing, and are more likely to be delighted when the design meets
their needs. Yet, these systems usually require that every user take
the same path through the system, which means recruiting people with
identical interests (every participant has to be actively planning
their vacation to Paris and desiring the same dates & hotel
requirements).

4) The site reports standard analytic measures: time on task, "fail
pages", common navigation paths. But it's extremely difficult to come
to the correct inference based on these measures. For example, does
longer time-on-task or time-on-page imply frustration or interest?
Does a deviation from the common navigation path imply clicking on the
wrong element or curious exploration of additional features? Without
talking to the individual, it's hard to even know if a reported
measure is good or bad, let alone the action the team should take
based on the reported result.
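
To make point 2 concrete, here is a minimal sketch of how an unmoderated
tool might decide that a task has been completed, by matching the visited
URL against pre-declared success patterns. The names (TaskDefinition,
isTaskComplete) and the URLs are my own illustrative assumptions, not
Loop11's actual API:

// Hypothetical sketch of URL-based task completion detection.
interface TaskDefinition {
  id: string;
  successUrlPatterns: RegExp[]; // pages that count as "task completed"
}

function isTaskComplete(task: TaskDefinition, visitedUrl: string): boolean {
  // The tool can only confirm that the participant reached a success page.
  // It cannot tell whether the dates, hotel, or traveller details were
  // entered correctly, which is exactly the limitation described above.
  return task.successUrlPatterns.some((pattern) => pattern.test(visitedUrl));
}

const bookingTask: TaskDefinition = {
  id: "book-paris-trip",
  successUrlPatterns: [/\/booking\/confirmation/],
};

// A dynamically generated confirmation URL may not match any pre-declared
// pattern, so a genuinely successful participant is scored as a failure.
console.log(isTaskComplete(bookingTask, "https://example.com/booking/confirmation?id=123")); // true
console.log(isTaskComplete(bookingTask, "https://example.com/b/9f2e71/receipt"));            // false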

In the ten years since I first started seeing these tools on the
market, I've never seen results from a study that the team could
actually interpret and act on. In one study a few years back with a
major electronics retailer, we conducted an in-lab study with 18
highly qualified participants that was comparable to a 60-participant
study run with Netraker (a Loop11 competitor from the past). The task
was to find the laptop computer of your dreams and put it in the cart.

In our study, all 18 participants were in the market to buy laptops,
had spent at least a week thinking about the laptop they wanted and
its requirements, and were given the cash to make the purchase (they
would keep the laptop after the study). In the Netraker study, the 60
randomly selected participants came from a panel of thousands,
reportedly were in the demographic groups of the site (unverifiable),
and hadn't thought about laptop purchases until the instructions for
the test popped up.

In the Netraker results, 94% of the participants completed the tasks
and the average time was 1m 18s. In our study, only 33% of the
participants completed the task and the average time was 18 minutes.

Why do you think there were such striking differences? Which study
would you pay more attention to?

Beware of VooDoo measurement techniques.

Hope that helps,

Jared

Jared M. Spool
User Interface Engineering
510 Turnpike St., Suite 102, North Andover, MA 01845
e: jspool at uie.com p: +1 978 327 5561
http://uie.com Blog: http://uie.com/brainsparks Twitter: jmspool
UIE Web App Summit, 4/19-4/22: http://webappsummit.com

8 Apr 2009 - 8:29am
James Page
2008

Jared,
As one of the people developing one of the other tools on the market,
webnographer <http://webnographer.com> , I will point out some corrections
to the points that you raise.

The first point is that there is a difference between a remote data
collection tool and a remote method. Collecting the data is only the first
part: the data collection part is easy, but getting the method right is
hard, and it has taken us a long time to get right.

For the data gathering part, there is a long history of academic projects
that use the approach of proxying the site and injecting JavaScript. It was
first done with the WebQuilt
<http://portal.acm.org/citation.cfm?doid=502115.502118> application at
Berkeley and then by UsaProxy <http://fnuked.de/usaproxy/publications.htm>
from the University of Munich.
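
For what it's worth, a rough sketch of that proxy-and-injection approach
(in the spirit of WebQuilt and UsaProxy, not Webnographer's actual code;
the target site and tracker URL are placeholders) might look like this:

// Logging proxy sketch: forwards requests to the site under test and injects
// a small JavaScript logger into every HTML page it returns.
import * as http from "node:http";

const TARGET = "http://example.com"; // site under test (placeholder)
const TRACKER = '<script src="https://tracker.example.org/log.js"></script>'; // hypothetical logger

const proxy = http.createServer((clientReq, clientRes) => {
  const upstream = http.request(
    TARGET + (clientReq.url ?? "/"),
    {
      method: clientReq.method,
      // Ask for an uncompressed response so the HTML can be rewritten,
      // and point the Host header at the real site.
      headers: { ...clientReq.headers, host: new URL(TARGET).host, "accept-encoding": "identity" },
    },
    (upstreamRes) => {
      const chunks: Buffer[] = [];
      upstreamRes.on("data", (chunk) => chunks.push(chunk));
      upstreamRes.on("end", () => {
        let body = Buffer.concat(chunks);
        const contentType = String(upstreamRes.headers["content-type"] ?? "");
        if (contentType.includes("text/html")) {
          // Inject the logger just before </body> so every proxied page can
          // report clicks, timings, and navigation back to the research server.
          body = Buffer.from(body.toString("utf8").replace("</body>", TRACKER + "</body>"));
        }
        const headers = { ...upstreamRes.headers };
        delete headers["content-length"];
        delete headers["transfer-encoding"];
        clientRes.writeHead(upstreamRes.statusCode ?? 200, {
          ...headers,
          "content-length": Buffer.byteLength(body),
        });
        clientRes.end(body);
      });
    }
  );
  clientReq.pipe(upstream);
});

proxy.listen(8080, () => console.log("Logging proxy running on http://localhost:8080"));

The participant browses the site through the proxy and the injected script
does the actual logging; the hard part, as I said, is the method around it,
not this plumbing.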

I think it may be a bit voodoo-like to say "I have seen one, I know
them all."

When we have run remote tests against lab tests we have managed to get
similar findings from both, and our clients tell us that the cost runs at
about a third of a traditional lab-based test.

For the example you gave, the difference in findings is due to the
difference in the participants' knowledge and intention, not the difference
in methods used! With our tool we often carry out tests using real users
who are about to make a purchase. You cannot compare two methods if the
participants don't have the same scenario and demographics, and in your
example the key difference was the participants' knowledge and intention:
the lab users in your test had been thinking about buying a laptop for at
least a week, while the remote users had not. This explains the difference
in your findings.

You mention the challenge of participant recruitment. Remote is easier:
the participant can do the study from home, so they don't need as high an
incentive as they would for a test in a lab. But getting the right
participants is one of the most overlooked areas for both lab and remote.

We find that many issues are down to the configuration of the user's
machine. These are issues you cannot discover in a lab, where the machine
is normally the same. The sorts of issues we have found come down to screen
size, whether a PDF reader is installed, and browser differences. We also
find many issues that are cultural. How many websites have an international
scope but are only tested with participants in one town, let alone across a
whole country?
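
To illustrate, the kind of client-environment data a remote test can
capture looks roughly like this (a sketch of my own, not Webnographer's
code; the logging endpoint is a placeholder):

// Runs in the participant's browser alongside the test.
interface ClientEnvironment {
  screenWidth: number;
  screenHeight: number;
  userAgent: string;
  language: string;
  hasPdfViewer: boolean;
}

function captureEnvironment(): ClientEnvironment {
  return {
    screenWidth: window.screen.width,
    screenHeight: window.screen.height,
    userAgent: navigator.userAgent,
    language: navigator.language,
    // Plugin sniffing was the usual check in 2009-era browsers; it is only
    // a rough signal for whether a PDF reader is installed.
    hasPdfViewer: Array.from(navigator.plugins).some((p) => /pdf/i.test(p.name)),
  };
}

// Send the configuration alongside the task log so findings can be segmented
// by screen size, browser, and so on. The endpoint is hypothetical.
fetch("https://tracker.example.org/environment", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify(captureEnvironment()),
});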

You mention that you are limited in the tasks your participants can perform
remotely. This is not the case with Webnographer. Our system does not
require that every user take the same path through the system.

James
http://blog.feralabs.com

PS: If you want to debate which method is more voodoo I am up for a public
Oxford style debate in Brighton when you are in the UK in July.

8 Apr 2009 - 8:57am
Caroline Jarrett
2007

> From Jared Spool
>
> In the ten years since I first started seeing
> these tools on the market, I've never seen
> results from a study that the team could
> actually interpret and act on.

Ironically, your email arrived in my in-box just as I was taking a break
from analysing the results of an online, un-moderated usability test.

I don't use them all that often, but I do so routinely when I'm doing a
measurement evaluation of an established web presence. Here in the UK, we
have www.usabilityexchange.com. They offer a large panel of users with
disabilities. I find it handy to get a chunk of disabled participants doing
the same tasks (or as similar as I can make them) as I'm running with the
face-to-face participants.

The remote participants are using their own technologies in their usual
environment, which gives a helpful further insight compared to any disabled
participants I can recruit for my face-to-face sessions, who then have to
use my technology in my environment.

As with the last time I tried this, I find myself turning to the comments
made by the remote participants quite often as I write the report. It
appeals to me that it's the actual words typed by the participant in their
own time.

I wouldn't want to run the remote test on its own, as indeed it can be a
bit misleading. For example, the remote participant might find the correct
page on the web site, but that doesn't mean they understood it.

I also take your point that when using a panel like this, you don't know
how much direct interest they have in the site you're testing. As it
happens, this current job is a large government web site, and the sort of
tasks I have designed are things that might spring at you out of a blue sky
whether you're interested in them or not.

For example, one task was about what you had to do with respect to this
particular bit of government after a bereavement. Obviously we were careful
to ensure that anyone who felt distress about the task didn't have to do it
(no one refused). One face-to-face participant and one remote participant
had in fact each been bereaved fairly recently, within the last year. Their
task experience was remarkably similar to the participants who had no
immediate personal interest in the task - my interpretation is that everyone
was able to empathise somewhat with the task.

For this evaluation, the remote panel has definitely added useful extra
insight (and some helpful extra numbers) for not much extra cost:
Usability Exchange charged me GBP 2500 (around USD 3500).

One caveat: I'd be very wary of doing the remote unmoderated testing without
some face-to-face alongside it.

Best
Caroline Jarrett
www.formsthatwork.com
"Forms that work: Designing web form for usability"

8 Apr 2009 - 11:28am
Kathy Neuss
2009

Hi Toby,

I see unmoderated remote testing tools as a useful addition to the
tools available within the user research field. However, in no way
should they replace existing and established methodologies.

I recently used Chalkmark (for some internal testing, as much to see how it
works as to see whether I could add the data from the tool into the data
from a larger program of research). I have also been investigating
automated tools to be used within benchmark usability testing.

However, I am in complete disagreement with any tool that thinks it can
replace experience, and I am put off from using Chalkmark again because of
a recent press release...
"A New Zealand company called Optimal Workshop is trying to disrupt
the usability space by offering free software that replaces
consultants. Instead of hiring someone, you can use Optimal's
web-based products to test mockups, usability, navigation, and site
architecture."

Full article here:
http://www.thestandard.com/news/2009/03/31/changing-face-usability-testing-chalkmark-releases-free-service-called-treejack

A case of biting the hand that feeds it...


8 Apr 2009 - 12:24pm
James Page
2008

@Kathy
I would love to disagree with you and say that remote can totally replace
other methods. But as one of the ways we test our own unmoderated remote
testing method is in the lab, I would be a hypocrite if I did. Each method
has its advantages and disadvantages.

Many people are pointing out different types of unmoderated remote testing
methods and trying to compare them, which is like comparing apples and
oranges. My partner Sabrina Mach has posted a blog post explaining the
different types:
http://blog.feralabs.com/2009/02/is-all-remote-usablity-testing-the-same/

For example, the tool that @caroline points to is a task-based online
questionnaire, Chalkmark tests where users click on a static page, and
Webnographer uses the Combined method.

Back to your statement about whether remote replaces lab: most of our
clients to date are using Webnographer <http://webnographer.com> where no
testing has been done before. For example, we are testing a site in about
nine countries, where the expense of a traditional method would be more
than the cost of developing the software. Another test we are running is a
weekly test for a client as they iterate through different designs. Again,
traditional methods would be uneconomic.

All the best

James
http://blog.feralabs.com

8 Apr 2009 - 1:20pm
Caroline Jarrett
2007

> For example the tool that @caroline points to
> is a task-based online questionnaire.

Sort of. Users are given a task description, and they then interact
naturally with the site as they ordinarily would. When they're finished,
they do indeed get some questions to answer. It's better than just a
questionnaire because you also get the logs of which pages they were on and
for how long.
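
As a rough illustration of why that combination is useful, each
participant's record pairs their typed answers with the page log, something
like this (my own sketch, not usabilityexchange.com's actual data format):

// Illustrative shape of one remote participant's session record.
interface PageVisit {
  url: string;
  secondsOnPage: number;
}

interface RemoteSession {
  participantId: string;
  task: string;
  pageLog: PageVisit[]; // which pages they were on, and for how long
  questionnaireAnswers: Record<string, string>; // their own words, typed in their own time
}

const example: RemoteSession = {
  participantId: "p-017",
  task: "Find out what you need to do after a bereavement",
  pageLog: [
    { url: "/", secondsOnPage: 42 },
    { url: "/bereavement/what-to-do", secondsOnPage: 190 },
  ],
  questionnaireAnswers: {
    "Did you find what you were looking for?": "Yes, but the page was hard to understand.",
  },
};

const totalSeconds = example.pageLog.reduce((t, v) => t + v.secondsOnPage, 0);
console.log(`${example.participantId} spent ${totalSeconds}s across ${example.pageLog.length} pages`);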

My experience of www.usabilityexchange.com is that their testers are
amazingly diligent, and often very perceptive, in their commenting during
the questionnaire part.

Cheers
Caroline Jarrett
www.formsthatwork.com
"Forms that work: Designing web forms for usability" foreword by Steve Krug

9 Apr 2009 - 8:53am
AJ Kock
2007

Sorry if this is slightly off topic, but am I the only person who is
completely thrown for a loop when it comes to the Webnographer website?
There are no examples of what they do, and if you want to find out more,
it takes you back to the home page.

9 Apr 2009 - 6:00pm
James Page
2008

AJ,

Thanks for the feedback; we are redesigning the website as I write this.
Most of our focus has been on the product until now, and the only page
there at the moment is a holding page. But so I can put up a quick fix,
could you clarify what you did? All the links should point back to either
our blog at http://blog.feralabs.com or our company's main pages at
http://www.feralabs.com

If you want to find out more before we put the new site up, please get in
touch.

All the best

James
