Many frameworks and methodologies deal with this issue, but very few of them emphasize assessing the quality and usability of Web 2.0 applications. This paper contains a critical review of previous research in the field of Web quality assessment. It provides the theoretical basis for developing a set of attributes that should be considered when measuring the quality of Web 2.0 applications. One of the crucial questions in software development is to what extent users are satisfied with the interfaces and functions provided.

definition of software usability measurement inventory

At the end of the discussion, the entire research team decided to reduce the number of statements for interactive mHealth apps to 21 and the number of statements for standalone mHealth apps to 18. In this study, the goal was to create a short, reliable, and customizable questionnaire for assessing the usability of mHealth apps. A user interface comprises all components of a system that provide information and controls for the user to accomplish specific tasks with the system. Maintenance is the modification of a software product after delivery to correct defects, to improve performance or other attributes, or to adapt the product to a modified environment. Interoperability is the capability of the software product to interact with one or more specified components or systems. Accessibility is the degree to which a product or system can be used by people with the widest range of characteristics and capabilities to achieve a specified goal in a specified context of use.

Standardized questionnaires are also available for the assessment of website usability (e.g., WAMMI and SUPR-Q) and for a variety of related constructs. Almost all of these questionnaires have undergone some type of psychometric qualification, including assessment of reliability, validity, and sensitivity, making them valuable tools for usability practitioners. In this study, we required that the study participants had certain characteristics, such as a high school or higher education, age between 18 and 65 years, and some experience using mHealth apps. This excluded some potential participants, for instance, people older than 65 years or people with a low education level.

What Sample Sizes Do We Need?

Because the SUS yields a single score on a scale of 0–100, it can be used to compare even systems that are outwardly dissimilar. This one-dimensional aspect of the SUS is both a benefit and a drawback, because the questionnaire is necessarily quite general. Another measurement involves counting the number of errors the participant makes when attempting to complete a task. Errors can be unintended actions, slips, mistakes or omissions that a user makes while attempting a task.

  • Accessibility testing determines the ease with which users with disabilities can use a component or system.
  • The former correlation coefficients were to be used to determine the criterion validity of the MAUQ, while the latter were to be used to determine the construct validity of the MAUQ.
  • It is common practice to substitute website, product, or interface for system without affecting the results.
  • If any one of the usability experts rated the clarity of a statement 1 or 2, the wording of the statement was adjusted.

One recently published work adjusted an existing usability questionnaire to apply it to mHealth apps. In that study, the authors evaluated the psychometric properties of the existing usability questionnaire designed for health information technology systems by analyzing usability study data obtained from a group of patients with HIV. The authors indicated that this customizable health information technology usability questionnaire, the Health IT Usability Evaluation Scale, worked well in an mHealth app usability study. However, it is tricky to customize the statements in a usability questionnaire, since the change may impact the responses of the study participants.


Look for questionnaires providing normative data (e.g., SUPR-Q, SUMI). In the SUS, the odd-numbered items are positive statements, and the even-numbered items are negative statements. As you can see, the statements use the word system, which reflects its original use for software evaluation.

For the MAUQ designed for standalone apps, a similar analysis was performed and again, three factors were found. We noticed that there were a few cross-loading items when 0.32 was used as the cut-off value of factor loadings. We chose not to remove these items, since our experience and numerous other usability studies indicate the importance of measuring overall satisfaction, information organization, time spent on the app, and usefulness of the app. In the future, we will conduct studies with larger samples to further evaluate these statements and determine whether they should be kept in the standalone mHealth app usability evaluation. In the MAUQ, we provide four versions for two types of target users and two major types of mHealth apps, which allows MAUQ users to choose the version that fits their needs.
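
To make the cut-off rule concrete, the following Python sketch flags cross-loading items, meaning items whose absolute loading reaches 0.32 on two or more factors. The function name, the item labels, and the loading values are hypothetical and invented purely for illustration; they are not data from the MAUQ study.

```python
# Sketch: flag items that load on more than one factor at or above a cut-off.
# All item names and loading values below are made up for illustration.

CUTOFF = 0.32  # cut-off value of factor loadings mentioned in the text


def cross_loading_items(loadings, cutoff=CUTOFF):
    """Return the items whose |loading| >= cutoff on two or more factors.

    `loadings` maps an item label to its list of loadings, one per factor.
    """
    flagged = []
    for item, row in loadings.items():
        salient = [value for value in row if abs(value) >= cutoff]
        if len(salient) >= 2:
            flagged.append(item)
    return flagged


# Hypothetical three-factor loading matrix.
example_loadings = {
    "item_a": [0.71, 0.10, 0.05],
    "item_b": [0.45, 0.38, 0.02],  # salient on factors 1 and 2
    "item_c": [0.20, 0.15, 0.66],
}

print(cross_loading_items(example_loadings))  # ['item_b']
```

Flagging an item this way does not by itself mean it should be dropped; as the text notes, such items may be kept when they measure something important.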

Traditional Usability

From the above list, Sauro recommends using the SEQ since it is short and easy to respond to, administer and score. Naturally, the questions that come to mind are: “Which metric shall I use?” and “Is this metric reliable enough to give a realistic picture of the degree to which my system is usable?”

Psychometric analysis indicated that the MAUQ has three subscales and their internal consistency reliability is high. The relevant subscales correlated well with the subscales of the PSSUQ. Four versions of the MAUQ were created in relation to the type of app and target user of the app.
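
Internal consistency reliability is commonly summarized with Cronbach's alpha. The source does not say which statistic was used for the MAUQ, so the Python sketch below is only an assumption about how such a value could be computed for one subscale. The function name and the response data are hypothetical.

```python
# Sketch: Cronbach's alpha for one subscale, using only the standard library.
# Assumption: alpha is the reliability statistic of interest; the source does
# not name the statistic. The responses below are invented for illustration.
from statistics import variance


def cronbach_alpha(rows):
    """Compute Cronbach's alpha.

    `rows` holds one list of item responses per respondent; every list
    must have the same length (the number of items, k >= 2).
    """
    k = len(rows[0])
    if k < 2 or any(len(row) != k for row in rows):
        raise ValueError("need at least 2 items and equal-length rows")
    item_columns = list(zip(*rows))  # transpose: one tuple per item
    sum_item_variances = sum(variance(col) for col in item_columns)
    total_variance = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum_item_variances / total_variance)


# Hypothetical responses: 5 respondents x 4 items on a 1-7 scale.
responses = [
    [6, 7, 6, 7],
    [5, 5, 6, 5],
    [3, 4, 3, 3],
    [7, 6, 7, 7],
    [4, 4, 5, 4],
]

print(round(cronbach_alpha(responses), 3))
```

Values closer to 1 indicate that the items in the subscale vary together; what counts as "high" depends on the conventions of the field.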

The reason for this calculation is that the SUS is scored on a 100-point scale and has 10 questions, so each question accounts for up to 10 points: a maximum of 10 and a minimum of 0. User experience is a person’s perceptions and responses resulting from the use or anticipated use of a software product. A usability test script is a document specifying a sequence of actions for the execution of a usability test. It is used by the moderator to keep track of briefing and pre-session interview questions, usability test tasks, and post-session interview questions. The System Usability Scale (SUS) is a simple, ten-item attitude scale giving a global view of subjective assessments of usability.
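
The scoring logic can be sketched in a few lines of Python. This follows the standard SUS scoring rule (odd-numbered items contribute the response minus 1, even-numbered items contribute 5 minus the response, and the sum is multiplied by 2.5), so each item contributes between 0 and 10 points to the final score. The function name `sus_score` is hypothetical.

```python
# Sketch: standard SUS scoring for one respondent.
# Responses are on a 1-5 scale, listed in item order (item 1 first).


def sus_score(responses):
    """Return the 0-100 SUS score for ten 1-5 responses."""
    if len(responses) != 10:
        raise ValueError("the SUS has exactly 10 items")
    total = 0
    for item_number, response in enumerate(responses, start=1):
        if not 1 <= response <= 5:
            raise ValueError("responses must be between 1 and 5")
        if item_number % 2 == 1:
            total += response - 1  # positively worded item
        else:
            total += 5 - response  # negatively worded item
    # Each item now contributes 0-4; scaling by 2.5 gives 0-10 per item.
    return total * 2.5


print(sus_score([5, 1, 5, 1, 5, 1, 5, 1, 5, 1]))  # 100.0
print(sus_score([3] * 10))                        # 50.0
```

A respondent who answers 3 to everything lands exactly in the middle of the scale, which is a quick sanity check for any implementation.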

Many metrics, such as completion time and error counts, each reflect some aspect of a product. For an overall score of how your product performs, you may find it useful to try standardized usability tests. The Website Analysis and Measurement Inventory (WAMMI) is a questionnaire-based usability test technique for measuring web site software quality from the end user’s point of view. A user interface guideline is a low-level, specific rule or recommendation for user interface design that leaves little room for interpretation so designers implement it similarly. It is often used to ensure consistency in the appearance and behavior of the user interface of the systems produced by an organization. The Software Usability Measurement Inventory (SUMI) is a questionnaire-based usability test technique for measuring software quality from the end user’s point of view.

Moreover, the website created for the MAUQ makes the administration of the usability questionnaire easy. In addition, all the collected data from these usability studies are stored securely on the website for viewing and downloading. An exploratory factor analysis was performed on the data collected from all study participants using the MAUQ. We expected multiple factors for the MAUQ and that these factors would not be totally independent.

Imagine you want to redesign a page or product: you run a SUS survey to test usability before redesigning. The SUPR-Q (Standardized User Experience Percentile Rank Questionnaire) helps measure the usability, credibility/trust, loyalty, and appearance of a website or an application (SUPR-Qm). It consists of 8 questions: 7 are answered on a 5-level scale, while the 8th is the Net Promoter Score (NPS), which relates to the likelihood that a user will recommend the app to a friend. The V-model is a framework to describe the software development lifecycle activities from requirements specification to maintenance. It illustrates how testing activities can be integrated into each phase of the software development lifecycle.

System usability scale

Ideally, you should give each error a short description and a severity rating, and classify it under the appropriate category. Although it can be time consuming, counting the number of errors does provide excellent diagnostic information. The correlation coefficients among the scores obtained using the MAUQ, PSSUQ, and SUS were calculated, including the correlation coefficients of their subscales, if applicable, and the intersubscale correlation coefficient within the MAUQ.
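
As a rough sketch of how such an error log could be kept, the Python snippet below records each error with a description, a severity rating, and a category, and then tallies errors per category. The `UsabilityError` class, the 1 to 4 severity scale, and the sample entries are assumptions made for illustration; they do not come from the source.

```python
# Sketch: a minimal error log for a usability test session.
# The class, the severity scale, and the entries are hypothetical.
from collections import Counter
from dataclasses import dataclass


@dataclass
class UsabilityError:
    description: str
    severity: int  # assumed scale: 1 (minor) to 4 (critical)
    category: str  # e.g. "slip", "mistake", "omission"


def errors_by_category(errors):
    """Count how many logged errors fall under each category."""
    return Counter(error.category for error in errors)


# Invented entries for a single task.
session_errors = [
    UsabilityError("Tapped Back instead of Save", 2, "slip"),
    UsabilityError("Entered the date in the wrong field", 3, "mistake"),
    UsabilityError("Skipped the confirmation step", 3, "omission"),
    UsabilityError("Tapped the wrong tab", 1, "slip"),
]

counts = errors_by_category(session_errors)
print(counts["slip"])  # 2
```

Keeping severity alongside the category makes it possible to sort or filter the log later, for example to review only the most serious problems first.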

Are any of the sub-scores for either questionnaire especially interesting or relevant for your research? For example, if you are interested in the learnability of a product, then the SUS is a good choice. Determining which questionnaire to use depends on various factors such as the nature of the project, the stage of the research, the goal of the study, and the budget. For example, a survey designed to explore learnability that actually measures system capabilities would not be considered valid. The questions in a questionnaire are usually closed-ended and presented as multiple-choice.

In an earlier article, we provided a comprehensive guide to task-based metrics. Tasks can be included as part of usability tests or UX benchmark studies. They involve having a representative set of users attempt to accomplish a realistic goal, such as finding a movie to stream, selecting a product to purchase, or reserving a hotel room.

Standardized usability questionnaires

However, since the purpose of this study was to evaluate the newly created usability questionnaire, the participants selected were representative of the majority of mHealth app users and ones who could provide the most reliable assessment on the questionnaire. A different usability study method may be used for populations not included in this study. To make it convenient for others to utilize the MAUQ in their mHealth app usability studies, we created a website that includes the four versions of the MAUQ, and some optional demographic questions and open-ended questions typically used in usability studies. The data collected in a usability study are stored on a secure Web server. Users can view a brief summary of the data collected in their usability study on the website and download the collected dataset to a local computer for further analysis.

If you can only recruit a small number of users, it is best to choose a measure that can provide valid results with smaller sample sizes (e.g., SUS, PSSUQ). Responses use a Likert scale, where 1 represents “Strongly disagree” and 5 represents “Strongly agree”. The final item is the Net Promoter Score, a single question often used as a standalone survey for measuring users’ loyalty. This question uses a scale from 0 (“Not at all likely”) to 10 (“Extremely likely”). Scores for individual questions can also be calculated to give more insight into usability issues. This is achieved by taking the normalized score of each question and multiplying it by 25 to align with the 0–100 scale used for the overall SUS score.
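
A minimal Python sketch of the per-question calculation follows. It assumes the standard SUS normalization (response minus 1 for odd-numbered items, 5 minus response for even-numbered items), which yields a 0 to 4 value that is then multiplied by 25. The function name and the sample responses are hypothetical.

```python
# Sketch: per-item SUS scores on a 0-100 scale for one respondent.
# Assumes standard SUS normalization; the sample responses are invented.


def sus_item_scores(responses):
    """Return one 0-100 score per SUS item for ten 1-5 responses."""
    if len(responses) != 10:
        raise ValueError("the SUS has exactly 10 items")
    scores = []
    for item_number, response in enumerate(responses, start=1):
        if item_number % 2 == 1:
            normalized = response - 1  # positively worded item, 0-4
        else:
            normalized = 5 - response  # negatively worded item, 0-4
        scores.append(normalized * 25)
    return scores


item_scores = sus_item_scores([5, 1, 3, 3, 4, 2, 1, 5, 5, 1])
print(item_scores)  # [100, 100, 50, 50, 75, 75, 0, 0, 100, 100]
print(sum(item_scores) / len(item_scores))  # 65.0
```

Under this normalization, the mean of the ten item scores equals the overall SUS score, so low-scoring items (here items 7 and 8) point directly at the statements dragging the total down.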

UX Psychology

Consider a task attempted by four users. Three users complete it successfully, taking 1, 2 and 3 seconds respectively. The fourth user takes 6 seconds and then gives up without completing the task. Although one should always aim for a completion rate of 100%, according to a study carried out by Jeff Sauro, the average task completion rate is 78%. In the same study, it was also observed that the completion rate is highly dependent on the context of the task being evaluated. Referred to as the fundamental usability metric, the completion rate is calculated by assigning a binary value of ‘1’ if the test participant manages to complete a task and ‘0’ if they do not, and then averaging these values across participants.
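
Applied to the four-user example, the calculation can be sketched in Python as follows. The function name `completion_rate` is hypothetical; the outcomes encode three successes and one failure.

```python
# Sketch: task completion rate from binary outcomes.
# 1 = participant completed the task, 0 = participant did not.


def completion_rate(outcomes):
    """Return the share of participants who completed the task (0.0-1.0)."""
    if not outcomes:
        raise ValueError("at least one participant is required")
    if any(outcome not in (0, 1) for outcome in outcomes):
        raise ValueError("outcomes must be 0 or 1")
    return sum(outcomes) / len(outcomes)


# Three users succeed; the fourth gives up.
outcomes = [1, 1, 1, 0]
print(f"{completion_rate(outcomes):.0%}")  # 75%
```

With only four participants, a single failure moves the rate by 25 percentage points, which is one reason small-sample completion rates should be read with caution.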

In current practice, study-based UX metrics include measures of satisfaction, loyalty, brand perception, usability/usefulness, delight, trust, visual design, and special purpose questionnaires. We’ve provided key examples within these categories, but this is not an exhaustive list. In addition to more fundamental standardized UX questionnaires, many special-purpose questionnaires are available to UX researchers.

Quantify your product’s usability with some standardized usability tests. The Web Content Accessibility Guidelines (WCAG) are part of a series of web accessibility guidelines published by the Web Accessibility Initiative of the World Wide Web Consortium, the main international standards organization for the internet. They consist of a set of guidelines for making content accessible, primarily for people with disabilities. A usability test task is a usability test execution activity specified by the moderator that needs to be accomplished by a usability test participant within a given period of time. Usability evaluation is a process through which information about the usability of a system is gathered in order to improve the system or to assess the merit or worth of a system. Test reporting is collecting and analyzing data from testing activities and subsequently consolidating the data in a report to inform stakeholders.

SUPR-Q — Standardized User Experience Percentile Rank Questionnaire

The SUMI was developed by the Human Factors Research Group at University College Cork in Ireland, led by Jurek Kirakowski. It is a 50-item questionnaire with a Global scale based on 25 of the items and five subscales for Efficiency, Affect, Helpfulness, Control, and Learnability. As shown in the figure below, users can choose one of three options. The SUMI contains a mixture of positive and negative statements (e.g., “The instructions and prompts are helpful”; “I sometimes don’t know what to do next with this system”).

The demographic information of these participants is summarized in Table 1. You would probably want to compare your product’s SUS score with a competitor’s to make sure your product is competitive. A typical SUS-style statement is “I would imagine that most people would learn to use this product very quickly.” The concept of usability as assessed by SUMI draws on the definition in ISO 9241. The test consists of 50 questions and should be completed by a minimum of 10 respondents.