Testing "Advanced" Users on New Products

21 Sep 2004 - 11:25am
19 replies
505 reads
H Taylor

Hi group.

I have sort of a general question about UCD and usability testing. I've
done some testing with users myself, and am reasonably comfortable
running tests.

My question is this:

In the context of the distinction between learnability (usability for new
users) and efficiency/usability for experienced users, how do you test
for the latter? It's no problem to find inexperienced users for a new
product, but how do you test how well a design works for experienced
users, when the design itself is new?

- Hal

Comments

21 Sep 2004 - 12:01pm
Gerard Torenvliet

Hal,

To my knowledge, the only way of achieving this is to do what statisticians call a 'longitudinal' test: you get multiple people to do some software tasks repeatedly over a long period of time (say 1 hour per day over six months).

This is hugely expensive, both in terms of time and product schedules. Even in academic research, I've been at a conference presentation where the room hushed when we presented results for an experiment/test that had 20 one-hour sessions over a month.

I don't know of any techniques other than just rolling up your sleeves.

Regards,
-Gerard

:: Gerard Torenvliet / gerard.torenvliet at cmcelectronics.ca
:: Human Factors Engineering Design Specialist
:: CMC Electronics Inc.
::
:: Ph - 613 592 7400 x 2613
:: Fx - 613 592 7432
::
:: 415 Legget Drive, P.O. Box 13330
:: Ottawa, Ontario, CANADA, K2K 2B2
:: http://www.cmcelectronics.ca
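
A minimal sketch, in Python, of the kind of analysis such a longitudinal test might feed: fitting the power law of practice to per-session task times to see whether performance is levelling off. The session times, the choice of model, and the extrapolation target are illustrative assumptions, not anything Gerard prescribes.

import math

# Mean task time (seconds) per session for one participant; hypothetical data.
session_times = [184.0, 141.0, 118.0, 102.0, 95.0, 88.0, 84.0, 81.0]

# Power law of practice: T(n) = a * n^(-b). Taking logs gives a straight
# line, log T = log a - b * log n, so ordinary least squares suffices.
xs = [math.log(n) for n in range(1, len(session_times) + 1)]
ys = [math.log(t) for t in session_times]
count = len(xs)
mx = sum(xs) / count
my = sum(ys) / count
slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
a = math.exp(my - slope * mx)
b = -slope

print("T(n) ~= %.0f * n^(-%.2f)" % (a, b))
# Extrapolate to a later session to estimate settled "expert" performance:
print("predicted time at session 50: %.0f s" % (a * 50 ** -b))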


21 Sep 2004 - 1:33pm
Dan Saffer

This question (or something similar) came up at DIS this year in
Hillary Smith's talk on Eliciting Reactive and Reflexive Feedback:

"Findings from data collected across four sessions show that using
personalised task scenarios and giving users longer exposure to an
early interactive prototype, combined with peer-to-peer discussion,
enables participants to move beyond initial reactions to develop more
reflective opinions. Participants were able to overcome first
impressions and learning effects, develop deeper understanding of new
conceptual models underpinning the system, integrate their
understanding of piecemeal components and reflect on own use and use by
others in deeper ways."

In a nutshell, her solution was to expose users to a product, then
retest the same product on them once a week for four weeks. During the
final session, the test subject had to explain the product to a friend.

This helped get through the initial reactions to a system and let the
designers know what the learning curve was like and what the users
bought into over time.

Dan

22 Sep 2004 - 2:43am
H Taylor

Hi, Gerard.

Thanks for your reply.

Yeah, that's what I was afraid of: that the only real way to test
designs (or aspects of designs) geared toward "expert users" is to
create some expert users over (relatively) prolonged periods. And if
one wanted to test multiple users (even two or three) on multiple
design candidates, it would seem to quickly go from "hugely expensive"
to simply absurd.

Thanks for the feedback.

I know that there are some folks on the list who have created designs
(or aspects of designs) for experienced users. It would be interesting
to hear how they went about attempting to validate design proposals, or
if they just had to "wing it".

- Hal


22 Sep 2004 - 2:50am
H Taylor

Thanks for the reply, Dan.

Basically, everyone seems to be saying that the only way to test
aspects of interaction design on experienced users of a product is to
invest the time and effort into creating experienced users. Even the
"once a week for four weeks" method seems a little light, if you have a
relatively complex application (and/or problem domain) that you expect
people to be using daily.

So, ummm, no magic bullets, huh?

- Hal


22 Sep 2004 - 3:07am
Listera

H Taylor:

> So, ummm, no magic bullets, huh?

Sorta. It's called an "experienced designer." :-)

Ziya
Nullius in Verba

22 Sep 2004 - 3:19am
sudhindra

Hi Hal,
Focusing on the problem you have mentioned of getting experienced users for
a new design, I feel it would help if the participants in user testing are
"experienced computer users", such as IT professionals, so that it would be
easier for them to adjust to the new system. That way, they would become
"experienced users" faster than a typical user, and hence save you some
time and money.

Also, if your company has developed similar systems earlier, the users of
those systems might become experienced users of the new system much faster
than others.

Thanks and Regards
Sudhindra

22 Sep 2004 - 3:55am
H Taylor

Hello Sudhindra.

Interesting thought. I see two potential problems, however.

One is that IT professionals may have significant differences from the
target group for the product (which might be investment bankers, for
example), even when you're looking at interaction for experienced
users. Obviously, if your product is geared towards IT professionals,
that changes things.

Another is that the product may require significant domain knowledge,
in which case you're limited to working with candidates within that
domain.

But certainly, starting with people who use a computer regularly would
be absolutely necessary.

And yes, if the product is a revision to a previous version, then
testing on users of the existing/previous product is an option. You may
get some initial negative effect, though, if they are habituated to a
similar but different interface or interaction model. Then all of their
interactions which have become automated and unconscious may get in
their way, possibly causing them to perform worse than users new to the
system might.

- Hal


22 Sep 2004 - 5:45am
Lada Gorlenko

>> So, ummm, no magic bullets, huh?

> Sorta. It's called an "experienced designer." :-)

Not necessarily. It might as well be called "iterative design evaluation",
which includes formative-type evaluation and summative-type evaluation.
Formative is shaping the design concept and defining the design direction.
Summative is testing against the defined performance targets.

It seems to me that Hal is looking not only for a magic bullet, but for a
*free* magic bullet. Nope, there is no free lunch. You either invest in
training or you invest in iterative design. Iterative evaluation has its
advantages: if you need to choose between different designs, make your
choice at relatively early stages. Do cognitive walkthroughs with the
target users, make quick changes, evaluate again, eliminate the choices,
concentrate on the winning design. Bringing 3 or 4 designs to a summative
test is costly and, in my view, totally unnecessary.

So, it's back to basics: test concepts, not finished designs. If you do
formative evaluation right, you'll be left with one winning design. If you
do iterative evaluation on the same users throughout the project, you'll
have users already experienced in the new design by the time of your
summative test. Bingo :-)

Lada
http://www.ibm.com/easy
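
For the summative side - "testing against the defined performance targets" - the check can be as simple as a one-sample t-test of measured task times against the target. A minimal Python sketch; the times, the 60-second target, and the table-lookup critical value for this sample size are all illustrative assumptions.

import math
import statistics

# Measured task times (seconds) for experienced participants; hypothetical.
times = [52.1, 58.4, 49.0, 61.2, 55.3, 47.8, 59.9, 53.6, 50.2, 56.7]
target = 60.0  # the defined performance target, in seconds

mean = statistics.mean(times)
sd = statistics.stdev(times)
t_stat = (mean - target) / (sd / math.sqrt(len(times)))

# One-sided critical value for alpha = 0.05 at df = 9, from t tables.
T_CRIT = -1.833

print("mean = %.1f s, t = %.2f" % (mean, t_stat))
if t_stat < T_CRIT:
    print("Target met: mean time is significantly below the target.")
else:
    print("Cannot conclude that the target is met.")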

21 Sep 2004 - 12:25pm
panu.korhonen a...

Hello

One method that can be tried in some very restricted cases is theoretical modelling, using GOMS and its derivatives. We have used that in modelling expert performance on mobile phone keypads for text entry. It gives a good indication of the actual performance when the majority of the user operations are mainly motoric, and therefore can be modelled with keystroke-level models, or Fitts's law for pointing or finger movements. The results are of course very optimistic, but consistently so for different text entry methods.

We haven't tried modelling anything that requires lots of mental operations. Could be tricky, and probably less accurate.

Regards,
Panu
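
A minimal Python sketch of the keystroke-level modelling Panu describes, using the classic Card, Moran & Newell operator times. The operator sequence and the Fitts's-law constants are illustrative assumptions; real models calibrate them per device.

import math

# Classic KLM operator times (Card, Moran & Newell), in seconds.
KLM = {
    "K": 0.28,  # keystroke, average skilled typist
    "P": 1.10,  # point with a pointing device
    "H": 0.40,  # home hands between keyboard and pointing device
    "M": 1.35,  # mental preparation
}

def klm_time(ops):
    """Sum operator times for a sequence such as 'MKKKK'."""
    return sum(KLM[op] for op in ops)

def fitts_time(distance, width, a=0.1, b=0.15):
    """Fitts's law: T = a + b * log2(D/W + 1). a and b are assumed
    device-specific constants."""
    return a + b * math.log2(distance / width + 1)

# One mental preparation followed by a four-keystroke word:
print("KLM estimate: %.2f s" % klm_time("MKKKK"))
# An 80 mm finger movement to a 5 mm key:
print("Fitts estimate: %.2f s" % fitts_time(80, 5))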


22 Sep 2004 - 10:18am
Pradyot Rai

Lada Gorlenko <gorlenko at uk.ibm.com> wrote:
> <snip> It might as well be called "iterative design evaluation",
> which includes formative-type evaluation and summative-type evaluation.
> Formative is shaping the design concept and defining the design direction.
> Summative is testing against the defined performance targets. <snip>

> <snip> You either invest in
> training or you invest in iterative design. Iterative evaluation has its
> advantages: if you look to choose between different designs, make your
> choice at relatively early stages. Do cognitive walkthroughs with the
> target users, make quick changes, evaluate again, eliminate the choices,
> concentrate on the winning design. Bringing 3 or 4 designs to a summative
> test is costly and, in my view, totally unnecessary. <snip>

> <snip> So, it's back to basics: test concepts, not finished designs. If you do
> formative evaluation right, you'll be left with one winning design.<snip>

Excellent points!

I call this testing the user's mental model against every user task
that the design proposes. Many a time, even when you have learned
something, it does not fit with how you cognitively expect it to work.
Alan Cooper coined a term for it: "cognitive friction".

If one thinks that iterative design is *costly*, then what in design
isn't?

I always get scared when I hear people use words such as usability,
expert testing, 3-5 users, six months... This can never win you the
case for developing a good product. It has to be simply two words,
*iterative* and *design* -- iterative design.

The whole software industry works like this.

Prady

22 Sep 2004 - 11:29pm
Anirudha Joshi

We have tried a 'competition' on our campus to test the effect of
motivation and practice on the usage of a new (non-QWERTY) keyboard for
entering text
(http://www.idc.iitb.ac.in/~anirudha/papers/ex06-joshi.pdf). In this
paper we talk about some lessons learnt about products 'like a keyboard'
that need experienced users, which you might find useful. We had a lot
of participation, and enthusiasm was quite high.

That was cool, but even then the participants had put in only 2-8 hours
of effort - not enough to learn a new keyboard. Hence, we are working on
a 'longitudinal' comparative study between three keyboards. It has been
a year, and we still have not done enough studies that are repeatable
and statistically valid. It has been a good learning experience and we
have had useful feedback - but not something that is repeatable. In that
sense, they have been formative evaluations, but by no means summative.
No silver bullets, sorry...

But in our case, the issue is not only one of conceptual understanding
of an activity, but also of hand skill, which is affected by things like
the clunkiness of a prototype (a sticky key, perhaps), knowledge of the
script, and an extra one-day gap between two trials. Also, our
experience has been that evaluating new hardware poses more problems
than evaluating software. But I would like to know if someone has a
different view on this...

Hope this helps.
Anirudha
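
A minimal Python sketch of the bookkeeping behind such a comparative study: computing entry speed per keyboard per session from trial logs, using the usual one-word-equals-five-characters convention. The log format, the layout names, and the numbers are invented for illustration.

from collections import defaultdict

# Hypothetical trial log: (keyboard, session, characters entered, seconds).
trials = [
    ("layout_A", 1, 240, 310.0), ("layout_A", 2, 260, 280.0),
    ("layout_B", 1, 220, 350.0), ("layout_B", 2, 255, 300.0),
    ("layout_C", 1, 200, 365.0), ("layout_C", 2, 230, 330.0),
]

speeds = defaultdict(dict)
for keyboard, session, chars, secs in trials:
    # Words per minute, with one "word" = 5 characters.
    speeds[keyboard][session] = (chars / 5) / (secs / 60)

for keyboard in sorted(speeds):
    row = ", ".join("session %d: %.1f wpm" % (s, w)
                    for s, w in sorted(speeds[keyboard].items()))
    print("%s: %s" % (keyboard, row))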

22 Sep 2004 - 4:42pm
H Taylor

Oops...

In a prior message, I wrote:

> Zayera suggests that the substitute for validation, when validation
> becomes impractical[...]

and later:

> So in the end, maybe Zayera's answer is the most realistic[...]

That should have been "Ziya" and not "Zayera", in both cases. I
apologize to both parties for the slip-up; it's late in my time zone.
What would Norman call that? A displacement error, or something?

- Hal

22 Sep 2004 - 4:22pm
H Taylor

> It seems to me that Hal is looking not only for a magic bullet, but
> for a
> *free* magic bullet. Nope, there is no free lunch.

I suppose it's my own fault for introducing the term "magic bullet" (if
somewhat jokingly), but no, my purpose was to bring up what must be a
relatively common challenge in validating design decisions, and see how
experienced professionals had been addressing the issue.

I'm not (currently) facing such a challenge, and am not in need of a
free lunch. But Jef Raskin's recent comments got me thinking about
validation of design assumptions, and, well, piqued my curiosity about
the issue I raised. Testing initial learnability and what Jef would
call "familiarity" is relatively straightforward. Validating
assumptions about user behavior after the users have spent a
significant amount of time with a product can be challenging,
particularly if the product is new or attempts to change significantly
from a previous design.

Zayera suggests that the substitute for validation, when validation
becomes impractical, is experience. I find this to be a credible
assertion, since it is often possible to get feedback from experienced
users after you've accepted and deployed a design, and these lessons
can help you to better anticipate potential problems. But that's a
little different from quantitative analysis to support deciding between
competing design alternatives.

Lada counters this with "iterative design evaluation". As someone who
studied architecture, I'm quite familiar with the advantages of
iterative design, but the "evaluation" is the sticky part in the
context I described, which is exactly why I brought the point up.

While cognitive walkthroughs and such will validate higher-level
decisions, it seems like it would be hard to address finer details that
way. And as for testing with the same users to have "experts" by the
time of a "summative test", I again assert that becoming expert in
sophisticated products may not happen in some number of hour-a-week
sessions. What if you were trying to evaluate how someone uses Excel or
Filemaker after 6 months of using it day-in, day-out, for example? But
maybe something like that necessitates iteration in the form of product
releases and revisions over time.

I appreciate Panu's comments about both the potential benefits and
limitations of GOMS-based assessments, which attempt to address the
possibility of some sort of quantitative validation, however
theoretical. And I think Anirudha's study really conveys the scope of
the challenge.

So in the end, maybe Zayera's answer is the most realistic, in that it
implies that the problem is essentially intractable, and your best hope
is to develop good instincts through feedback on previous design
choices.

I haven't addressed all comments individually, but I very much
appreciate the thoughtful opinions of all who've thus far responded.

Thank you for your time and effort.

- Hal


23 Sep 2004 - 2:19am
Listera

H Taylor:

> So in the end, maybe Zayera's answer is the most realistic, in that it
> implies that the problem is essentially intractable, and your best hope
> is to develop good instincts through feedback on previous design
> choices.

That's darn good advice, whoever he is. :-)

I often use the sig below to remind people that there's more art to design
than science. This is one of those occasions.

----
Ziya

When 2+2=4, it's development,
When 2+2>4, it's design.

30 Sep 2004 - 2:38pm
Ulla Tønner

> Testing initial learnability and what Jef would
> call "familiarity" is relatively straightforward. Validating
> assumptions about user behavior after the users have spent a
> significant amount of time with a product can be challenging,
> particularly if the product is new or attempts to change significantly
> from a previous design.

One issue that has not been mentioned in this thread is whether you want to
investigate how users change their behavior over time (going from
inexperienced to experienced users). If that is the case, I understand your
worries about validity.

But if your goal is 'simply' to create a solution that is effective in use
for experienced users, you could let them work by themselves on a prototype,
in their own context, for a specific period of time, and then do a usability
test (preferably on their own computer, in their own context) to see how the
product works. This would be a usability test like any other usability test,
except that the users would be experienced before you conduct the test.

Of course you could (given time, resources, money) combine this with other
methods to get more differentiated answers from different perspectives:

I think the idea of letting the experienced user teach the system to another
user sounds very effective: this must give you some indication of what
parts, elements and tasks the users found difficult or inefficient - and
they might even (through their explanations to others) give you some good
hints about how you could change the feature.

Or you could invite these pilot users to participate in a workshop to
discuss their experiences with each other and brainstorm ideas to solve
their troubles working with the product.

These thoughts are for now theoretical on my side, as I'm at the moment only
at the stage of putting together a usability plan for such a pilot project.
But I think it will give me some of the answers I'm looking for. Feel free
to criticize - I can still make changes to the plan ;-)

Best regards
(and thanks for an interesting discussion)

/Ulla

30 Sep 2004 - 3:04pm
Listera
2004

Ulla Tonner:

> These thoughts are for now theoretical...

To say the least :-)

Think of a driver in his 10th year of daily driving. Now regress back
to his first day in a car and ponder the difficulty of getting him to
pretend to understand what it would be like to drive effectively and
efficiently in his 10th year. And imagine this driver is not an academic.
:-)

Ziya
Nullius in Verba

1 Oct 2004 - 2:50am
Ulla Tønner
2004

Ulla Tonner:

> These thoughts are for now theoretical...

Ziya wrote:

> To say the least :-)

> Think of a driver in his 10th year of daily driving. Now regress back
> to his first day in a car and ponder the difficulty of getting him to
> pretend to understand what it would be like to drive effectively and
> efficiently in his 10th year. And imagine this driver is not an academic.
> :-)

I'm not sure that I understand your point. If your argument is that you
can't find experienced users for the tests, as it takes ten years for them
to become experienced, I rest my case. And that wasn't the issue I addressed
at all.

If you are actually addressing my post, and you are saying that it is
difficult to get the test participants to look back on how they experienced
learning to drive the car, I say: thank you, Ziya. You hit the point of
my argument there.

Let me try to explain myself:

Sometimes one can get confused about what issue a usability investigation
should actually address. And when it comes to the question of advanced users,
I believe it is easy to get the issues mixed. Therefore, I tried to
split the discussion about advanced users into two themes:

* The case where you want to know how a person develops from novice to
expert user
* and the other case, where you want to know how well the product works for
an experienced user.

This split, I believed, could help clear up the question of to what
extent you should 'stay close to the user' during a longer period of time
... And my point is that maybe you don't have to do that at all. It depends
on what you want to investigate:

If you want to capture the *process* of learning how to 'drive' this
product, it is a matter of measuring a learning process. I won't go into that
here, but for sure you will need a method that lets you observe the learner
closely over a longer period of time.

On the other hand: if you 'only' want to test whether the product is usable
for experienced users, you really don't need to be sitting on their lap
during their learning process. You will have to test the product with these
people after they have become experts.
So the 'driver' doesn't have to pretend anything. But you have to be a
skilled observer and a good interviewer (as always in usability tests).

Regards
Ulla

1 Oct 2004 - 5:14pm
Listera

Ulla Tonner:

> You will have to test the product with these people after they have
> become experts.

The operative word: "after."

If you're into "design by testing" then, as you suggest, you can't a priori
test something that hasn't yet happened. This was my point, in a nutshell.

You can concoct "simulations" and try to convince yourself of the validity of
their significance. But, at the end of the day, they are still gross
approximations of what can be very complex interaction patterns, which are in
turn interdependent on variables well beyond the particular app you are
working on.

Anticipating experienced-user needs (and/or repetitive usage) remains a
fundamental problem not restricted to software design; it can also be
observed in movies, board games, tools, etc.

Ziya
Nullius in Verba

1 Oct 2004 - 7:02pm
Petteri Hiisilä

> how do you test how well a design works for experienced
> users, when the design itself is new?

Testing a new design is problematic: slow, expensive and unreliable.

With almost every successful consumer & business product (except some
lobby kiosks) we're mostly going to have "advanced" users anyway. Why?

Good business == lots of loyal users == lots of intermediate and
experienced users. Even gurus, if the product is really popular.

We still don't have a good word processor, for instance. If we had one, it
would quickly become popular. Most users wouldn't remain beginners with it
for long.

We can't really test a *new* product with loyal users, because there are
none. No chicken, no egg. It's a dead end.

Once the product isn't new anymore, testing becomes possible. But that's
version two. Making money with version one is the tough part.

Hence, we'd better invest in research, modeling & design if we want to
do good business with a new product. It's not easy either, but at
least it's possible.

Best,
Petteri

--
Petteri Hiisilä
Palveluarkkitehti / Interaction Designer /
Alma Media Interactive Oy / NWS /
+358505050123 / petteri.hiisila at almamedia.fi

"The unbroken spirit
Obscured and disquiet
Finds clearness this trial demands"
- Dream Theater
