eHealth Observatory

Usability Engineering

Heuristic evaluation

Heuristic evaluation is a usability inspection methodology in which multiple expert analysts independently evaluate the usability of a system against a recognized set of heuristics and then aggregate their findings into a single list of usability problems, which the system’s designers can use in an iterative system improvement process. Heuristic Evaluation can be classified as a “discount usability engineering” (Nielsen J., 1994) approach: although it may not catch every usability problem, it is very good at catching many of them within limited resources (time and money). Discount usability engineering grew out of the development community’s need for a methodology that could evaluate and improve the usability of a system without being overly complex, time-consuming, or expensive (Bellotti, 1988). Heuristic Evaluation fits this bill in that it involves only a small set of expert usability analysts (3-10), who evaluate the system under study largely on their own terms.

Heuristic Evaluation is an ideal methodology for conducting rapid usability evaluations of non-mission-critical systems. If it is not imperative that all usability problems are resolved (e.g. if the system does not affect patient safety, or if a usability error would not cost the system user considerable time or money), then this next-best approach may be ideal. Another use case for the Heuristic Evaluation methodology is conducting first-contact, preliminary usability evaluations on systems to resolve the bulk of their usability problems before testing them with actual system users (usability testing). Since usability testing studies are more costly to conduct than usability inspection studies, it can be beneficial to resolve the majority of a system’s usability problems before performing such user tests, so that those users do not get hung up on obvious problems that could have been resolved with a preliminary system inspection.

An additional advantage of conducting Heuristic Evaluations is that they do not have to be performed on fully operational systems. Heuristic evaluators do not need a fully operational system in order to test such design aspects as interface layout, design metaphor, consistency, etc. This opens the door to using Heuristic Evaluation to test prototype systems (e.g. paper prototypes) in the early stages of system development. Conducting usability tests earlier in a system’s lifecycle also makes the usability problems much easier to fix, as less work has to be undone.

Heuristic evaluation strengths, weaknesses and PDSA cycles


Strengths and weaknesses

Strengths

  • Low cost – a Heuristic Evaluation generally costs around $4,000 + n($600), where n = the number of system evaluators (Nielsen J., How to Conduct a Heuristic Evaluation, 2001)
  • Easy to learn – the only learning curve for conducting a Heuristic Evaluation is learning and understanding the usability heuristics
  • Easy to conduct – no formal methodology needs to be followed for conducting a heuristic evaluation. The system inspectors may freely use the system under study as they see fit during their tests.
  • Doesn’t require advanced planning – very little planning is required before conducting a Heuristic Evaluation (just selecting the heuristics and the evaluators)
  • Can be used on prototypes (including paper prototypes) – because it is not actual system users putting the system to practical use, it is possible to test prototype systems that are not yet fully functional (e.g. paper prototypes)

Weaknesses

  • Usability analysts are not representative of real system users – the usability problems brought out during the contextual use of a system (e.g. a physician ordering a prescription) may not be the same types of usability problems found by an analyst in a Heuristic Evaluation, since the analyst is not using the system the way an actual system user would
  • Doesn’t systematically create improvement strategies – the outcome of a Heuristic Evaluation is a list of found usability problems for a given system (related to an established set of usability heuristics). It does not, however, provide much guidance to the system’s developers on how to fix these issues.
  • Rarely captures all system usability problems – Heuristic Evaluations aren’t systematic, so there is no way to guarantee that all usability problems are found. They may also miss many usability problems that only emerge from the actual, contextual use of a system.
  • Biased by the current mindset of the evaluator – an evaluator in a pleasant mood may not find as much wrong with a system as an evaluator in an unpleasant mood


Plan

Determine the number of evaluators

Determining the correct number of evaluators to use in a Heuristic Evaluation is critical to the success of the study. Having too few evaluators can lead to missed usability problems, while using too many can lead to time delays and increased costs. Individual heuristic evaluators generally find between 20% and 51% of a given system’s usability problems (Nielsen & Molich, Heuristic Evaluation of User Interfaces, 1990). It has been shown, however, that different heuristic evaluators (working independently) tend to find different usability problems. This means that the evaluation results from several evaluators can be aggregated to increase the total percentage of usability problems found.

It is recommended that three to ten evaluators be used in a Heuristic Evaluation. A cost-benefit analysis should be conducted to determine how many evaluators (from three to ten) to use in a specific study (the cost of the Heuristic Evaluation vs. the savings from resolving the usability problems it finds). Heuristic Evaluations have both fixed costs (planning, documentation, etc.), which are incurred regardless of the number of evaluators used, and variable costs (e.g. evaluators’ salaries, additional time to compile the study results), which increase with each additional evaluator. The fixed costs of a Heuristic Evaluation range from $3,700 to $4,800, regardless of the number of evaluators, while the variable costs range from $410 to $900 per evaluator used in the study (Nielsen J., How to Conduct a Heuristic Evaluation, 2001). For a Heuristic Evaluation to be beneficial, these study costs need to be balanced against the cost savings of removing usability problems from the system under study, or against other quality factors (e.g. system efficiency, patient safety). More evaluators (closer to ten) should be used for mission-critical systems (i.e. systems that can affect human safety or the stability of an organization), or for systems where usability problems encountered by the system users would result in high costs to the system’s developers. If, on the other hand, it is less critical to remove the usability problems from a system (e.g. a simple system that is rarely used), or if other usability engineering methodologies will be applied later, then fewer evaluators (closer to three) can be used.
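To make this cost-benefit trade-off concrete, the short Python sketch below estimates the study cost for a given number of evaluators. It is only an illustration: the figures are midpoints of the cost ranges quoted above, and the function and variable names are made up for this example.

FIXED_COST = (3700 + 4800) / 2        # planning, documentation, etc. (flat fee)
COST_PER_EVALUATOR = (410 + 900) / 2  # salary, extra result-compilation time, etc.

def study_cost(n_evaluators: int) -> float:
    """Estimated total cost of a Heuristic Evaluation with n_evaluators evaluators."""
    return FIXED_COST + n_evaluators * COST_PER_EVALUATOR

# A study only pays off if its cost is below the expected savings from the
# usability problems it removes (or the value of other quality factors).
for n in (3, 5, 10):
    print(f"{n} evaluators: ~${study_cost(n):,.0f}")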

Nielsen and Landauer (A mathematical model of the finding of usability problems, 1993) propose a mathematical model for estimating the number of usability problems that a given number of evaluators will find in a Heuristic Evaluation:

Problems_Found(i) = N(1 - (1 - L)^i)

Where:

  • Problems_Found(i) = the number of problems found with i evaluators
  • N = the total number of usability problems in the user interface
  • L = the proportion of usability problems found by a single evaluator, which was found to range from 19-51% with a mean of 34%
  • i = the number of evaluators used in the study

For example, suppose it was estimated that a system had N = 100 usability problems and that each evaluator would be able to find approximately L = 35% of them. A Heuristic Evaluation that used i = 5 evaluators would then find:

Problems_Found(5) = 100(1 - (1 - 0.35)^5)
= 100(1 - (0.65)^5)
= 100(1 - 0.116)
= 100(0.884)
= 88.4

∴ Approximately 88 of the system’s 100 usability problems would be found using 5 evaluators (assuming each evaluator individually finds 35% of the usability problems).
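The same calculation can be scripted so that other values of N, L, and i can be explored. The Python sketch below is only an illustration (the function name is made up for this example); it reproduces the worked example above and then inverts the model to ask how many evaluators would be needed to find at least 90% of the problems:

def problems_found(n_problems: float, l_rate: float, i_evaluators: int) -> float:
    """Nielsen and Landauer's model: Problems_Found(i) = N(1 - (1 - L)^i)."""
    return n_problems * (1 - (1 - l_rate) ** i_evaluators)

# Reproduce the worked example: N = 100, L = 0.35, i = 5  ->  ~88.4 problems.
print(round(problems_found(100, 0.35, 5), 1))

# Invert the model: how many evaluators are needed to find at least 90%
# of the problems under the same assumptions?
evaluators = 1
while problems_found(100, 0.35, evaluators) < 90:
    evaluators += 1
print(evaluators)  # prints 6 for this scenario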


Select the evaluators

Once the number of evaluators has been determined, the evaluators can be selected. Typically, heuristic evaluators have backgrounds in Human Factors or Human-Computer Interaction (HCI); however, non-expert analysts have been shown to be useful in Heuristic Evaluations as well (Seffah & Metzker, 2009). It can be advantageous to select evaluators from different backgrounds or demographics, as this increases the likelihood of the evaluators finding different types of problems from one another. It is also recommended that the system’s designers not be used as its heuristic evaluators, as it is often difficult for people to find usability problems that they themselves introduced.


Select the evaluation heuristics

The set of usability heuristics chosen for a Heuristic Evaluation should be easy to use and well defined. There are several previously established and validated usability heuristic sets that can be readily applied to almost any usability study. The most popular usability heuristics used in practice are Jakob Nielsen’s ‘Ten Usability Heuristics’. These heuristics are defined in (Nielsen & Mack, Usability Inspection Methods, 1994) and (Nielsen J., Ten Usability Heuristics, 2005) as:

  1. “Visibility of system status - The system should always keep users informed about what is going on, through appropriate feedback within reasonable time.
  2. Match between system and the real world - The system should speak the users' language, with words, phrases and concepts familiar to the user, rather than system-oriented terms. Follow real-world conventions, making information appear in a natural and logical order.
  3. User control and freedom - Users often choose system functions by mistake and will need a clearly marked "emergency exit" to leave the unwanted state without having to go through an extended dialogue. Support undo and redo.
  4. Consistency and standards - Users should not have to wonder whether different words, situations, or actions mean the same thing. Follow platform conventions.
  5. Error prevention - Even better than good error messages is a careful design which prevents a problem from occurring in the first place. Either eliminate error-prone conditions or check for them and present users with a confirmation option before they commit to the action.
  6. Recognition rather than recall - Minimize the user's memory load by making objects, actions, and options visible. The user should not have to remember information from one part of the dialogue to another. Instructions for use of the system should be visible or easily retrievable whenever appropriate.
  7. Flexibility and efficiency of use - Accelerators -- unseen by the novice user -- may often speed up the interaction for the expert user such that the system can cater to both inexperienced and experienced users. Allow users to tailor frequent actions.
  8. Aesthetic and minimalist design - Dialogues should not contain information which is irrelevant or rarely needed. Every extra unit of information in a dialogue competes with the relevant units of information and diminishes their relative visibility.
  9. Help users recognize, diagnose, and recover from errors - Error messages should be expressed in plain language (no codes), precisely indicate the problem, and constructively suggest a solution.
  10. Help and documentation - Even though it is better if the system can be used without documentation, it may be necessary to provide help and documentation. Any such information should be easy to search, focused on the user's task, list concrete steps to be carried out, and not be too large.”

More detailed heuristic sets, such as (Smith & Mosier, 1986), which contains on the order of one thousand usability heuristics, can often be too difficult to use in practice, as they require much more time to learn before conducting the evaluation; it is therefore normally recommended to use a more generalized usability heuristic set. Supplementary, context-specific usability heuristics can also be added to a pre-existing, general heuristic set (such as Nielsen’s ‘Ten Usability Heuristics’). Such category-specific usability heuristics could be generated via a competitive analysis of systems similar to the one under study (Dykstra, 1993).
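As an illustration of how a chosen heuristic set could be kept alongside the study materials, the hypothetical Python snippet below lists Nielsen’s ten heuristics by short name and appends an example of a context-specific addition (the added heuristic is invented for this example):

# Nielsen's 'Ten Usability Heuristics' (Nielsen, 2005), by short name.
NIELSEN_HEURISTICS = [
    "Visibility of system status",
    "Match between system and the real world",
    "User control and freedom",
    "Consistency and standards",
    "Error prevention",
    "Recognition rather than recall",
    "Flexibility and efficiency of use",
    "Aesthetic and minimalist design",
    "Help users recognize, diagnose, and recover from errors",
    "Help and documentation",
]

# Supplementary, context-specific heuristics (this one is invented for the
# example) simply extend the general set chosen for the study.
evaluation_heuristics = NIELSEN_HEURISTICS + [
    "Clinical terminology matches local practice",
]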


Select the evaluation strategy

There is generally no formal evaluation strategy needed for conducting a Heuristic Evaluation; the evaluators can decide how to use the system under study during the evaluation. Sometimes, however, it can be helpful for the evaluator to have a use-case scenario to follow while performing the evaluation. A use-case scenario is a script, or set of instructions, that guides the evaluator through a situation that a typical system user would face when actually using the system. Because the heuristic evaluators aren’t necessarily domain experts in the system under study (e.g. the evaluator may be an HCI expert while the system is a physician-tailored EMR, so the evaluator may not know how to use the system), they may want some background guidance on how the actual system users would use it. Most of the time, however, this is unnecessary, as the evaluators are simply looking at each interface screen and evaluating it against the given set of usability heuristics.

Another evaluation strategy consideration to make before conducting the tests is whether or not to use an observer to record the experts’ evaluation results. Although more expensive than having the evaluators record their own results (since an additional staff member has to be paid), the use of an observer lets the evaluator spend more time finding usability problems and less time documenting them. An observer can also help answer any questions the evaluator may have about the system while testing it (e.g. “How do I exit this screen?”, “Where is the search function?”). The observer should not, however, volunteer any system use information that the evaluator has not requested, as this may interfere with the test results (e.g. describing in detail how to find the search function before the evaluator asks for it may lead to the evaluator not commenting on how difficult the search function is to find). Having one observer for all evaluations can also be beneficial in that the evaluation documentation will be standardized across evaluators.


Do

Conduct the independent evaluations

Each evaluation is conducted individually by the evaluators (with the option of having an observer record the evaluation results, as mentioned in the Select the evaluation strategy section). As the evaluators use the system (either freely or with the aid of a use-case scenario), they must record the usability heuristic violations as they occur. Each violation recording should include:

  • the name of the evaluator;
  • the name of the heuristic that was violated;
  • a description of the problem (what happened, when the problem occurred, where the problem can be found in the system, etc.);
  • a severity ranking of the problem (e.g. from 1-5, where 1 = a minor nuisance and 5 = a critical, ‘show-stopping’ error);
  • and (optionally) a personal recommendation about how this usability problem could be prevented/fixed.

The evaluator should inspect all of the system components at least twice: once to become comfortable with the system’s use and to understand its scope, and then at least once more to find the usability problems.
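A minimal sketch of what one such recording could look like as a structured record is shown below (Python; the field names and the example violation are illustrative, not prescribed by the methodology, and simply mirror the fields listed above):

from dataclasses import dataclass

@dataclass
class HeuristicViolation:
    evaluator: str            # name of the evaluator who found the problem
    heuristic: str            # name of the heuristic that was violated
    description: str          # what happened, when, and where in the system
    severity: int             # e.g. 1 = minor nuisance ... 5 = critical, 'show-stopping'
    recommendation: str = ""  # optional suggestion for preventing/fixing the problem

# Hypothetical example of one recorded violation:
example = HeuristicViolation(
    evaluator="Evaluator A",
    heuristic="Visibility of system status",
    description="No feedback is shown while the medication list loads "
                "on the patient summary screen.",
    severity=3,
    recommendation="Display a progress indicator during the load.",
)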


Compile the evaluation results

After each individual evaluator has completed their own evaluation, the results of the evaluations need to be compiled into a single document. Using spreadsheets for the compiled evaluation results can be beneficial for the later analysis phase of the evaluation, as they can enable automated sorting of the found heuristic violations by error name, heuristic type, evaluator name, usability violation severity, etc.
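For example, if each evaluator saved their findings as a CSV file with the fields listed in the previous step, a short script along the following lines could merge them into one sortable table (the file-naming scheme and column headings are assumptions for this sketch):

import csv
import glob

# Merge each evaluator's findings (assumed to be saved as evaluation_*.csv,
# one file per evaluator, with the columns listed in the previous step).
rows = []
for path in sorted(glob.glob("evaluation_*.csv")):
    with open(path, newline="") as f:
        rows.extend(csv.DictReader(f))

# Sort the combined findings by severity (most severe first), then heuristic.
rows.sort(key=lambda r: (-int(r["severity"]), r["heuristic"]))

# Write the compiled results to a single spreadsheet-friendly file.
with open("compiled_results.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=[
        "evaluator", "heuristic", "description", "severity", "recommendation"])
    writer.writeheader()
    writer.writerows(rows)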


Study

Conduct a results debriefing session

The next stage in a Heuristic Evaluation involves gathering the system evaluators and developers, as well as any other relevant system project stakeholders together in a meeting to discuss the evaluation results. This can help the evaluators clarify their system inspection findings, and the meeting can also be used for collaborative brainstorming about how to resolve the found usability issues.

Create a results and recommendations report

The debriefing session should have created a shared understanding of the usability problems found in the system, and it may additionally have produced some strategies for addressing them. These findings should be written up as a formal report so that they can later be referenced by the system developers in the next phase.


Act

Iterative input into design

With the understanding of the Heuristic Evaluation results gained from the debriefing session, together with the formal results report, the system designers can iteratively improve their system by removing the found usability violations. After the system revisions are made, the Heuristic Evaluation can be repeated.


PDSA Summary

Plan

The following considerations need to be made when planning a Heuristic Evaluation:

  • How many evaluators should be used?
    • 3-10 evaluators is usually an appropriate number (depending on the cost-benefit analysis of the study vs. having usability errors in the system)
    • Problems_Found(i) = N(1 - (1 - L)^i), where: Problems_Found(i) = the number of problems found with i evaluators; N = the total number of usability problems in the user interface; L = the proportion of usability problems found by a single evaluator, which was found to range from 19-51% with a mean of 34%; and i = the number of evaluators used in the study
  • Who should be chosen as the system evaluators?
    • Evaluators from different backgrounds or demographics should be chosen, as they tend to find different types of problems
    • Heuristic evaluators do not have to be actual target end-system users (they just have to have some expertise in Human Computer Interaction or Usability Engineering)
  • Which usability heuristics should be used in the study?
    • The usability heuristic set should be clear, concise, and usually previously established and validated
    • The most popular usability heuristic set is defined in (Nielsen J., Ten Usability Heuristics, 2005)
  • What should the evaluation strategy be?
    • Should the evaluators freely use the system, or should they follow use-case scenarios?
    • Should an observer be used to record the evaluation findings and aid the system evaluator?

Do

The following considerations need to be made when conducting Heuristic Evaluations:

  • Performing the evaluation
    • Each evaluator should evaluate the system independently
    • The following information should be recorded during the evaluations:
      • the name of the evaluator
      • the name of the heuristic that was violated
      • a description of the problem (what happened, when did the problem occur, where can the problem be found in the system, etc.)
      • a severity ranking of the problem (i.e. from 1-5, where 1 = a minor nuisance and 5 = a critical, ‘show-stopping’ error)
      • (optionally) a personal recommendation about how this usability problem could be prevented/fixed
  • Compiling the evaluation results
    • The individual evaluation results should be compiled into a single document that enables sorting by result category (e.g. an MS Excel spreadsheet that contains fields for each of the above-mentioned evaluation recording categories)

Study

The following activities need to be performed when analyzing Heuristic Evaluations:

  • Conduct a results debriefing session
    • The evaluators, system designers, and other relevant system project stakeholders need to discuss the evaluation findings as well as discuss possible solution strategies
  • Create a results and recommendations report
    • A formalized document should be written to describe the aggregated usability problem findings, as well as any solution strategies

Act

  • The results from the Heuristic Evaluation can be provided to the system developers as a list of improvements to be made to their system.


References

  • Bellotti, V. (1988). Implications of current design practice for the use of HCI techniques. In D. M. Jones, & R. Winder, People and Computers IV (pp. 13-34). Cambridge: Cambridge University Press.
  • Dykstra, D. J. (1993). A Comparison of Heuristic Evaluation and Usability Testing: The Efficacy of a Domain-Specific Heuristic Checklist. Ph.D. dissertation. Texas, United States of America: Department of Industrial Engineering, Texas A&M University.
  • Nielsen, J. (1994). Guerrilla HCI: Using Discount Usability Engineering to Penetrate the Intimidation Barrier. In D. Mayhew, Cost-Justifying Usability (pp. 245-272). Orlando: Academic Press, Inc.
  • Nielsen, J. (2001). How to Conduct a Heuristic Evaluation. Retrieved March 03, 2009, from useit.com: http://www.useit.com/papers/heuristic/heuristic_evaluation.html
  • Nielsen, J. (2005). Ten Usability Heuristics. Retrieved March 5, 2009, from Useit.com: http://www.useit.com/papers/heuristic/heuristic_list.html
  • Nielsen, J., & Landauer, T. K. (1993). A mathematical model of the finding of usability problems. Proceedings of the ACM/IFIP INTERCHI'93 Conference (pp. 206-213). Amsterdam.
  • Nielsen, J., & Mack, R. L. (1994). Usability Inspection Methods. New York: John Wiley & Sons.
  • Nielsen, J., & Molich, R. (1990). Heuristic evaluation of user interfaces. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems: Empowering People (pp. 249-256). New York: ACM.
  • Seffah, A., & Metzker, E. (2009). Usability Engineering Methods Plethora. In A. Seffah, & E. Metzker, Adoption-centric Usability Engineering (pp. 15-33). London: Springer-Verlag London Limited.
  • Smith, S., & Mosier, J. N. (1986). Guidelines for Designing User Interface Software - MTR-10090. Bedford: The MITRE Corp.
