Elaine Gerber, Ph.D.
Senior Research Associate
Policy Research and Program Evaluation
American Foundation for the Blind
11 Penn Plaza, Suite 300
New York, NY 10001
Paper presented at the 17th Annual International Conference of California State University Northridge (CSUN) "Technology and Persons with Disabilities", March 18-23, 2002
Table of Contents
Part I: Background - Usability and User Experience Testing
Part II: Adapting Methods and Unique Considerations
Part III: Generalizable Findings
1. Time spent on tasks
2. Need for clarity and consistency
3. Patterns of information-seeking
4. Issue of separate screen reader (SR) or "text-only" versions vs. accessibility of default site
5. Repetitiveness of main navigation links
6. Search functions are too complicated and don't always work well
7. Difficulty accessing publications
8. Difficulty with the interactivity of the site
9. Items not included that users would like to see added
Appendix: Methodological Considerations
Additional Rationale for Testing in "Real-Life" Settings
There are three main points to this paper:
To illustrate the frustrations of using accessible web sites. That is, even technically compliant sites can be inaccessible to the user, because they are so difficult to use. Compliance with accessibility laws and guidelines, in other words, is necessary but not sufficient for users to access what they need. I hope these illustrations highlight the importance of usability studies as an integral part of web design.
To give some background and information about how to conduct usability and user experience studies. I am particularly interested in how these methods can be adapted for, and what may be unique about, computer users who are blind or visually impaired. I will include a discussion of the added value that the web offers to this community.
To present generalizable findings, the results of research conducted by the American Foundation for the Blind (AFB), which we hope can be applied more broadly.
Part I: Background - Usability and User Experience Testing

Section 508 of the Rehabilitation Act now requires, among other things, that all web sites used by federal employees and members of the public seeking information and services from the federal government be accessible. In addition to the legal requirements, the World Wide Web Consortium (W3C) has published guidelines through its Web Accessibility Initiative (WAI). Furthermore, there are now a number of automated tools (such as Bobby and WAVE) which detect compliance with these standards, and the field of accessible web design is growing. However, as any computer user knows, in order to be truly accessible, a site also has to be usable. There has been very little focus thus far on measuring whether what is technically "accessible" for individuals with visual impairments is reasonably usable. In fact, our research suggests that it often is not: technical compliance with accessibility standards is necessary but not sufficient for building truly usable sites for people who are blind or visually impaired.
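To make concrete what automated tools such as Bobby and WAVE check for, here is a minimal sketch of one such checkpoint: flagging images that lack alternative text, the kind of markup-level test a compliance checker can automate but that says nothing about whether the page is usable. The markup and class name are illustrative, not taken from any real tool.

```python
from html.parser import HTMLParser

class AltTextChecker(HTMLParser):
    """Collect the src of every <img> tag that has no alt attribute --
    a simplified version of one checkpoint automated tools test."""
    def __init__(self):
        super().__init__()
        self.missing_alt = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "img" and "alt" not in attrs:
            self.missing_alt.append(attrs.get("src", "(no src)"))

# Illustrative page: one labeled image, one unlabeled spacer graphic.
page = '<p><img src="logo.gif" alt="AFB logo"><img src="spacer.gif"></p>'
checker = AltTextChecker()
checker.feed(page)
print(checker.missing_alt)  # images a screen reader cannot describe
```

Passing such a check is easy to automate; whether the alt text is actually meaningful to a screen reader user is exactly the kind of question only usability testing can answer.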
This paper presents findings from extensive usability and user experience research conducted with computer users who are blind or visually impaired. I review previous research conducted in this arena, illustrate how it can be adapted for this population, and present generalizable lessons drawn from three different web site tests conducted by the Policy Research and Program Evaluation Department of AFB, involving over 100 research participants. I hope this paper will help web designers, software and assistive technology developers, researchers, and anyone else interested in conducting these tests to build more usable, and therefore more accessible, web sites.
The field of mainstream web design has incorporated usability testing as a mainstay; research methodologies have been tested and metrics developed in large part due to the efforts of Jakob Nielsen. Nielsen has found that five users is generally a sufficient sample size to uncover about 85% of a site's usability problems (see <https://www.nngroup.com/articles/why-you-only-need-to-test-with-5-users/>). There are, however, some exceptions to this "rule," and I would add to his preexisting list some additional considerations based on my research among individuals with visual impairments: in particular, testing with people with low vision, older persons, and beginning users, since so little is known about computer use among these overlapping populations.
Previous usability and user experience research has, almost without exception, been conducted with an able-bodied population, or at least with people for whom the use of assistive technologies was not noted. Some of the work of the Nielsen Norman Group has begun to look at people with disabilities, and those findings are discussed below. The only other research to date conducted with people who are blind or visually impaired was a pilot study (see <http://www.csun.edu/cod/conf2000/proceedings/0073Barnicle.html>). In that work, Barnicle identified a number of pertinent questions, such as: How must testing techniques be adapted to accommodate the needs of participants? Will the study yield useful (i.e., generalizable) data? And how can one tell whether the obstacles encountered were due to the mainstream software application, the assistive technology, or the unique characteristics of an individual user? I hope to identify and answer some of those questions in this paper as well.
The research on which this paper is based consists of three rounds of web usability and user experience testing: two conducted for the purpose of revising the American Foundation for the Blind's web site (<www.afb.org>), and one round of testing on the Centers for Medicare and Medicaid Services' (CMS, formerly HCFA) web site (<www.medicare.gov>). People in the usability field tend to refer to objective, quantitative tests as "usability studies": these gather objective metrics, such as the time a task requires, the error rate, which keystrokes were used for navigation, etc. Studies that expand the concept of usability more broadly are often said to gather "user experience" data: these include measuring users' subjective satisfaction, why they would visit one site as opposed to others, and, for our purposes, how people with visual impairments conceptualize the web. Generally speaking, gathering user experience data is more ethnographic in its approach; my emphasis was on understanding how users approach and use a site, what they like and do not like about it qualitatively, and mainly, whether they perceive it as "accessible."
The findings presented below come from tests, all of which included individuals who accessed the web using screen reading and screen magnification technologies. Testing AFB's web site involved 27 and 29 individual interviews in different rounds; CMS testing included seven at-home individual interviews and eight focus groups involving 43 participants. A total of 106 people have participated in the testing. Greater detail about our selection process, justification for sampling, and obstacles to recruitment is available upon request (see Gerber and Kirchner 2001).
Part II: Adapting Methods and Unique Considerations

The second main point has to do with methods. (See the additional methodological considerations in the Appendix below.) The Policy Research and Program Evaluation Department at AFB is convinced that a combination of methods, in particular focus groups and individual interviews, works best with this population.
In-depth, in-person, individual interviews (ideally conducted at the subject's natural workstation) take advantage of what is known in the field as the "thinking aloud" method. Because the data are gathered through observation, subjects are literally asked to "think aloud," telling the researcher what they are doing and why, as they perform a variety of predetermined tasks. Tasks should be based on one's research needs as well as geared toward the research participant's interests; better data are collected when the individual involved is more highly motivated. The benefits of in-depth interviews (or IDIs) are not necessarily different for a visually impaired sample than for a sighted one. That is, the researcher can observe what errors are being made, whether the subject is "lost" (thinking they are somewhere they are not), and numerous other scenarios where a difference in perspective between user and observer may arise. Additionally, clarification can be sought for vague descriptions used by participants, such as "over here" or "I like that part there." Similarly, being present while someone is working gives the opportunity to probe any new, unexpected issues that arise while they are actively engaged on a site. There is a great need, as identified by Barnicle, for further research to take place in "real-life" settings.
The second strategy, and one that deviates from the standard usability literature, is to test using telephone focus groups. Although telephone focus groups have limitations, efforts should be made to minimize their effects, because such groups make obtaining data from individuals with low-prevalence conditions much easier and more affordable. Nielsen warns against using focus groups (he refers to in-person groups, rather than ones conducted by phone) in part because the results can be misleading: individuals tend to focus on the hypothetical (see, for example, <https://www.nngroup.com/articles/focus-groups/>). We circumvented this difficulty by assigning practical "tasks" in advance of the focus groups. Individuals were asked to complete two tasks and to spend about 10 minutes just "surfing." The concern that people bend the truth toward what they think you want to hear, or what is socially acceptable, was also addressed in this design, as only two people per group were assigned the same task. Positioning the participants as experts by soliciting, and valuing, their opinions further signaled that "there were no right answers," and helped elicit honest responses.
The last two reasons that Nielsen warns against the use of focus groups actually may not apply to the majority of computer users who are visually impaired. Specifically, he suggests that in focus groups users tell you what they believe they did, not what they actually did. Although I agree that memory is highly fallible, I would argue that, on average, people who are blind have trained themselves to be more dependent on their memory than most sighted individuals (both in terms of computer use and, most likely, in terms of other skills as well). For example, it may be that the use of memory results from heavy reliance on technical support (as computer use with adaptive equipment is frequently mediated by experts); simply, people with visual impairments are accustomed to recounting details when they call the "help desk" for support. Regardless of the cause, our data clearly indicate that blind and visually impaired users can remember with a high degree of accuracy exactly which steps they took to accomplish a particular task, which keystrokes or commands they used, and the wording of error messages they received as a result. Examples such as these abound from our research.
The second, and major, reason that this population differs, again on average, from those on whom other usability research has been conducted is that these individuals seem especially motivated to clear any hurdles to accessing computer information. Having had limited access to graphical user interfaces (GUIs), and being accustomed to software that is incompatible with adaptive equipment, they are used to struggling to get the information they need. Most importantly, because this medium allows users to access information independently (some for the first time), they are extremely motivated. While we encouraged users not to spend more than a half hour on their task assignment, users usually could not complete the tasks in the given time; however, very few stopped at a half hour, and some continued until they could finish, taking as much as 10-14 hours.
One participant in the Medicare study told me that she had just spent two days purchasing airline tickets online. When I asked her why she did that, why didn't she just call a travel agent or the airlines directly, she said, "Well, I wanted to be able to do it. I wanted to see if I could."
That the participants in our studies were so motivated, so driven to succeed, that they would spend days at the task at hand underscores the main theme of this presentation: visually impaired computer users tolerate many frustrations in using computers, even on web sites meeting technical standards of "accessibility," and they do so, I believe, mainly because of the "added value" that the web offers compared to other sources of information, especially for persons who are blind or visually impaired. By "added value," we mean mainly unmediated access to information. That is:
a) the information is more consistent than that provided by various phone operators;
b) the information is considered to be more valid than what is offered by phone -- it is "in print", and thus has legal validity, it is accountable;
c) information is available "on call", so that individuals can access it whenever they want, on their own time schedule, including the middle of the night or whenever they have time to pursue the issues in more depth;
d) information can be copied verbatim and shared with others who need it;
e) there is opportunity to come across new, relevant information that they didn't know existed or might not have thought to ask about;
f) accessing it does not require one to divulge private information. Judy, who likes to keep her Medicare status confidential, went on to say, "...I think the Internet for my use is good, as I may not be happy talking over the phone, so I want as much information on the Internet as possible."
and, perhaps most importantly for people who are blind or visually impaired,
g) users have direct access to information. This has both practical and psychological implications (i.e., the self-satisfaction, or empowerment, that results from being independent).
Part III: Generalizable Findings

The third main objective is to present our results, and in particular those findings which can be applied to web sites more generally, not just the few on which we generated these data. In addition to the items discussed previously, nine findings are presented below. Some of these compare results to sighted counterparts (as in the case of the Nielsen Norman Group study discussed below). However, when asking the question, "How well did our web site fare?", it may be equally or more appropriate to compare the results to other, similar web sites or benchmarks, as also accessed by people who are blind or visually impaired. In the Medicare study, for example, this would mean sites such as WebMD or drkoop.com. Similarly, for users who are blind or visually impaired, it would also be useful to compare findings to other sources of information, such as the phone, for getting at the same data. Participants' responses are shaped not only by how they feel an individual site functions, but also by these other experiences.
1. Time spent on tasks

A discussion of the time it took individuals to accomplish assigned tasks in our studies is presented above. Generally speaking, it took participants far longer than anticipated; individuals would continue working until they had succeeded, and they were not "timed out." The Nielsen Norman Group estimates that the web is about three times easier for sighted users than for users who are blind or visually impaired. Sighted users in their study were six times more successful than users of screen readers at accomplishing given tasks, and three times more successful than users of screen magnification (see <https://www.nngroup.com/articles/beyond-accessibility-treating-users-with-disabilities-as-people/>). These numbers clearly demonstrate just how poorly the web is designed for people who are blind or visually impaired, and how far we have to go.
2. Need for clarity and consistency

People also had problems working with sites' interactivity, particularly when sites deviated too much from standard Internet conventions. Because this population relies heavily on memorization to aid navigation, if there is a convention, use it. Deviating from these norms means users have to learn a new routine for each site. Many people told us that they re-visit sites because "they know it," because they "know it's accessible." They re-visit sites because they are familiar with their layout and therefore can navigate more easily. Thus, once a site has set up a system that works well, keep it: change content, but don't change the overall gestalt. This point supports the importance of "early and often" user testing.
3. Patterns of information-seeking

Elsewhere I have described in greater detail (see Gerber 2002) the two main approaches users take when they navigate the web: scrolling and searching. Designs should take into account the fact that many users will mine for just the kernel of information they are seeking, while others are more apt to listen to the whole page, or the whole list of links, before proceeding.
4. Issue of separate screen reader (SR) or "text-only" versions vs. accessibility of default site

Most users told us they do not appreciate having a separate SR or "text-only" version of a site. They were concerned that they might not be getting the same content as the default site, and that it might not be updated as regularly. Those who did appreciate the "SR version" expected, because of its name, that it would be specially designed for them (and in this case saw that as a positive, assuring ease of use and appropriate content). If a separate version must exist, users definitely recommended calling it a "text-only" version, because this makes it a "universal design" feature. Importantly, users of screen magnification had difficulty finding the separate screen reader version in the Medicare study, didn't know it applied to them, and often preferred pages with graphics -- because too much text is harder for them to read.
As Marlene described it, "I like graphics. If I go to a site with a text version, that is not my first choice. I like the option of looking at the page, with a text version I have to scroll down more...I want to find information quickly and move on with less reading which is not possible with text version."
5. Repetitiveness of main navigation links

Having repetitive links is time-consuming and frustrating, and presents serious obstacles to navigation and orientation.
William described this problem: "The biggest thing I found was when I down arrowed and found what I was looking for, when I clicked on it I have to hear that full list of things again. It would be nice if these sites could be indexed like how tapes are. We heard it once, why do we have to hear it again? For example, from the home page, I clicked on FAQ; it would be good if it went to there but instead it starts at the top again."
Karen elaborated, "I found the repetition to be very difficult in terms of navigation. I would be on a page and I would know that this or that information was presented, and then I might come back to it and find it again and think, 'Have I been here or is this a new page?' Because it seemed to be so repetitive. So that was a little bit confusing for me."
Build into the design the ability to skip links, particularly main navigation links. Furthermore, confirm the home page, and each subsequent page, immediately once it has loaded.
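The "skip navigation" pattern recommended above can be sketched as follows: the very first link on the page jumps past the repeated navigation list, and the main content begins with a heading that confirms which page has loaded. The page template, anchor id, and link text below are illustrative, not taken from any of the sites tested.

```python
from string import Template

# Hypothetical page skeleton: a skip link precedes the navigation list,
# and the target heading doubles as confirmation of where the user landed.
page = Template("""\
<body>
  <a href="#main-content">Skip to main content</a>
  <ul>$nav_links</ul>
  <h1 id="main-content">$page_title</h1>
  $body
</body>""")

html = page.substitute(
    nav_links="<li><a href='/jobs'>Jobs</a></li>",
    page_title="Frequently Asked Questions",  # confirms the page loaded
    body="<p>...</p>",
)
print("#main-content" in html)
```

With this layout, a screen reader user who follows the skip link lands directly on the page title, hearing both a confirmation of the page and the start of its content, instead of the full navigation list again.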
6. Search functions are too complicated and don't always work well

Search functions were very often felt to be too complicated; they did not work well enough to get users the information they wanted. Searching was generally problematic. To make it more effective, users suggested creating search engines that are keyword- and text-sensitive, that recommend the next closest matches in case of misspellings or accidental keystrokes, that indicate when a search is actively under way, and that take users immediately to their results. There were also numerous suggestions about how to present the results more clearly, particularly by using less jargony, more user-driven language. Results generally were not where or what users expected. For example, on AFB's web site people had difficulty locating "jobs" (it was housed under the broader category of "community"). And with Medicare, in a section that compared nursing homes, the language reflected an agency's or researcher's perspective, not a user's -- such as "total number of health deficiencies," not something users readily knew how to interpret.
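The "recommend the next closest matches" behavior users asked for is straightforward to prototype with standard fuzzy matching. This is a minimal sketch, and the search vocabulary below is illustrative, not AFB's or Medicare's actual index.

```python
from difflib import get_close_matches

# Hypothetical search vocabulary for a site's index.
search_terms = ["medicare", "nursing homes", "prescription drugs", "jobs"]

def suggest(query, terms=search_terms):
    """Return up to three indexed terms closest to a possibly
    misspelled query, so a typo or slipped keystroke still yields
    a usable suggestion instead of zero results."""
    return get_close_matches(query.lower(), terms, n=3, cutoff=0.6)

print(suggest("medicar"))  # a likely typo still finds "medicare"
```

Even a simple fallback like this keeps a user with a screen reader from hearing "no results found" and having to retype the query blind; pairing it with plain-language result titles addresses the jargon problem noted above.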
7. Difficulty accessing publications

Documents in PDF are considered to be inaccessible. Period. Although some very high-end, expert users know how to work with Adobe Acrobat, the majority of participants in our studies (who also had more experience than the average user) had difficulty with the files, were intimidated by or did not want to download from the web, had downloads crash their systems, and considered PDF to be inaccessible to them. This again underscores the point that accessibility does not necessarily guarantee usability: when we design for the web, we need to design for an average, not a high-end, user. For a more technical explanation of the problems surrounding PDF, see Sajka and Roeder (2002): <www.afb.org/AboutPDF.asp>
8. Difficulty with the interactivity of the site

Interactivity on web sites -- particularly, although not exclusively, with forms such as order forms and "shopping carts" -- posed serious difficulty. In designing forms: A) make the tab order follow logically, include only one item per line, and label each field appropriately; B) consider longer intervals before users are "timed out," and the ability to "page back" without losing all entered data; and C) place the submit button close to the last entry. This last consideration is particularly important for users of screen magnification. The Medicare web site is riddled with places where the submit button is located after confusing and lengthy layout (including misleading "top of page" and "bottom of page" arrows). I invite you to try their nursing home comparison, and remember that you will have only a fraction of the screen visible when it is magnified, at: http://medicare.gov/NHCompare/home.asp#NewSearch
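One part of the form advice above, labeling each field appropriately, can even be checked mechanically: every input a user must fill in should have a matching label, or a screen reader has no name to announce for it. The sketch below is a simplified heuristic over hypothetical markup, not a complete form audit.

```python
from html.parser import HTMLParser

class LabelChecker(HTMLParser):
    """Record input ids and <label for=...> targets, so that inputs
    with no associated label can be reported afterward."""
    def __init__(self):
        super().__init__()
        self.input_ids, self.label_targets = [], []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "input" and "id" in attrs:
            self.input_ids.append(attrs["id"])
        elif tag == "label" and "for" in attrs:
            self.label_targets.append(attrs["for"])

# Illustrative form: "zip" is labeled, "state" is not.
form = """<form>
  <label for="zip">ZIP code</label><input id="zip">
  <input id="state">
  <input type="submit" value="Compare nursing homes">
</form>"""
checker = LabelChecker()
checker.feed(form)
unlabeled = [i for i in checker.input_ids if i not in checker.label_targets]
print(unlabeled)  # fields a screen reader cannot announce by name
```

A check like this catches the missing label; whether the tab order feels logical, or the submit button sits close enough to the last entry for a magnified view, still requires testing with actual users.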
9. Items not included that users would like to see added

In thinking about how users who are blind or visually impaired benefit from electronic media, sites built with this added value in mind will be more useful, and therefore more likely to be accessed by this population. Consider the case of Medicare once again. Users wanted additional links because they rely on online sources for prescription drug information. They wanted the ability to email Medicare with coverage questions, with replies sent to them in an accessible format. They suggested setting up a "my account" section. Those of you in the audience interested in electronic banking, as well as other fields, might do well to think of the added benefits that web access can bring to your clientele.
In conclusion, the single most important thing I want to convey is the importance of testing sites, including accessible ones, for usability. Users have told us time and again that if they had a choice between two sites to get what they needed, they would go to the one where they knew how it worked and where it was easy to get what they needed, and they would keep coming back -- using it almost as a portal, if you will. So, if you are trying to drive traffic to your site, build a site that is designed from the point of view of the end user. Test early and test often. Thank you.
Appendix: Methodological Considerations

Research design will vary according to the objectives and nature of the particular research project at hand. The following methodological considerations are lists of details to be weighed and modified according to the nature of specific projects.
- Participants should be given compensation, which may be monetary or non-monetary.
- Obtain consent for all participants about taping and uses of the data.
- Provide participants with a copy of the findings, or summary of the findings, that has been cleared with the client.
- Recruitment variables will vary depending on the nature of the project. See below.
Screening variables on which respondents should meet a minimum requirement:
a) Vision loss - minimum is self-reported, e.g., "ongoing difficulty seeing words and letters in ordinary newsprint, (even with glasses on, if usually worn);"
b) Use of the internet - Has ever used the internet, and has current access at home, work, a library or other place that the person does visit.
c) Language/literacy - English-speaking, unless some provisions are made for moderators who speak another language; literacy is presumed if the person has used the Internet.
Respondent characteristics which can be confirmed by a telephone screening interview, and used to achieve diversity, are:
a) Degree of vision loss: No usable vision vs. otherwise visually impaired.
When considering the degree of vision loss, we generally advise organizing groups according to whether they use visual (i.e., screen magnification) or non-visual means (i.e., screen readers) to access the web. In the research presented above, users of screen readers and screen magnification were grouped separately from each other whenever possible, as their concerns tended to differ; all other variables were mixed within groups.
b) Computer experience: Less experienced vs. more experienced
Diversity can also be sought on the following characteristics, but may be secondary in the selection process:
a) Geographic location: Broad census-defined regions, i.e. east, midwest, west, south
b) Age or "life stage" (e.g., school/transition age; young and middle aged adults; older adults)
c) Employment status (employed now or ever vs. never employed)
d) Educational attainment (and implied literacy level)
e) Age at loss of vision / length of time visually impaired (related but separate variables)
f) Ethnic identity (1st or 2nd generation U.S. citizen, any nationality; African-American/Blacks of 3rd+ generation; other)
g)* Health status (self-reported as "excellent, very good or good" vs. "fair or poor")
h)* Medicare eligibility and experience (very recent vs. long-time or frequent user)
* for our purposes, these last two variables only applied to the Medicare study
Additional Rationale for Testing in "Real-Life" Settings

Usability testing conducted in the user's home environment, by an anthropologist or other trained researcher, adds an ethnographic approach to the typical "laboratory" usability technique; this is an especially strong methodology because of its authenticity. That advantage has several aspects:
a) it shows how the web site works on computer hardware and access software that are what people really use, i.e., presumably older and with more limited capacity than what they would experience in a "lab" setting; this takes into account other software they have installed at home;
b) it takes place in environmental conditions (e.g., lighting, background noise, clutter in the vicinity, family distractions, etc.) that closely approximate actual usage, allowing for slight alteration due to the researcher's presence;
c) users are more at ease than in the "lab" setting, thus enhancing performance, and also tending to make their comments more meaningful in terms of their actual behavior in seeking information, particularly if the topic involves personal matters, such as health;
d) finally, and importantly, this technique clearly reduces "respondent burden," both in terms of not having to travel and in terms of comfort during the testing. This makes it possible for people to participate who would not agree to testing in a "lab" because of the travel.
It is always advisable to know what the limitations inherent in one's research design are, in order to minimize their impact, if possible, and to know how they may shape the extent and quality of the data gathered. Consequently, we compiled a list of issues which represent possible limitations. In addition to considerations presented in the body of the paper, researchers should consider the following:
a) Number of groups - Given the nature of one's project, there may be many relevant demographic, health, computer-usage, and other variables under consideration. Consider the number of groups and/or individuals involved in the study. Do they represent the minimum acceptable composition? How would you expand the project if you had unlimited time and resources (e.g., by including Spanish-speaking users)?
b) Composition and size of groups - It is desirable to make focus groups more homogeneous in respects besides whether they use visual or nonvisual access; this may only be possible with a larger number of groups. For example, in the Medicare study mentioned above, we aimed to have separate groups of beneficiaries by age and by disability status. However, limited time and resources required us to occasionally mix those types of respondents within a group.
As expected, we found it particularly difficult to locate older individuals who are beginner- or intermediate-level Internet users; older visually impaired persons typically are either not computer users at all or have been using computers for a long time and are quite expert.
c) Recruitment sources - A main limitation is the heavy reliance on a single strategy or source for recruitment (e.g., via electronic advertisements to listservs). While online sources or known computer user groups are valuable resources to achieve relatively well-targeted recruitment in a short period of time, this design will not result in representativeness, and is biased toward people skilled in using computers with assistive technology.
Use of prior research subjects has the advantage that one already knows about many of an individual's relevant characteristics, but has the disadvantage that those persons may become "professional respondents" and therefore less representative of other users.
d) Content - See above for a critique of obtaining precise or reliable information about issues of content and navigability via focus groups. It has been suggested in previous literature (e.g., <https://www.nngroup.com/articles/first-rule-of-usability-dont-listen-to-users/>) that focus groups cannot provide the same level of detail as observation, and it is possible that this will be a limitation. In our research, however, we attempted to compensate by complementing our focus group findings with in-person interviews. Moreover, there are certain characteristics of the study population which we believe actually increase the utility and accuracy of such information (again, see above).
References

American Foundation for the Blind web site: <www.afb.org>
Barnicle, Kitch. 2000. Paper presented at the California State University Northridge (CSUN) "Technology and Persons with Disabilities" Conference. See: <http://www.csun.edu/cod/conf2000/proceedings/0073Barnicle.html>
Bobby Worldwide, Center for Applied Special Technology (CAST). 1984.
Centers for Medicare and Medicaid Services (CMS) web site. Department of Health and Human Services, U.S. government. See: www.medicare.gov
Federal standards for accessibility can be found at:
Gerber, Elaine and Kirchner, Corinne. "Social Research on Use of, and Preferences for <www.medicare.gov> By People who are Blind or Visually Impaired." Unpublished report, Policy Research and Program Evaluation, AFB, December 2001.
Nielsen, Jakob. 2000. "Why You Only Need to Test With 5 Users." The Alertbox: Current Issues in Web Usability. See: <www.nngroup.com/articles/why-you-only-need-to-test-with-5-users/>.
Nielsen, Jakob. 2001. "First Rule of Usability? Don't Listen to Users." The Alertbox: Current Issues in Web Usability. See: <https://www.nngroup.com/articles/first-rule-of-usability-dont-listen-to-users/>.
Nielsen, Jakob. 2001. "Beyond Accessibility: Treating Users with Disabilities as People." The Alertbox: Current Issues in Web Usability. (Work prepared by the Nielsen Norman Group). See: <https://www.nngroup.com/articles/beyond-accessibility-treating-users-with-disabilities-as-people/>
Sajka, Janina and Joe Roeder. 2002. "PDF and Public Documents: A White Paper." See: <www.afb.org/AboutPDF.asp>
WAVE. Len Kasday, Pennsylvania's Initiative on Assistive Technology (PIAT). Institute on Disabilities, Temple University. See: www.temple.edu/inst_disabilities/piat/wave/
Web Accessibility Initiative. World Wide Web Consortium, 1997-2001 (W3C, MIT, INRIA, Keio). See: <http://www.w3.org/WAI/>