Turning the Printed Word into Speech: A Review of Open Book Ruby Edition and Kurzweil 1000
What was the first talking Windows program you used? For many consumers who are blind, it was an optical character recognition (OCR) program, and we did not know or care that we were using Windows. We knew that we had the ability to read printed documents independently and understand almost all of the words.
This article reviews Kurzweil 1000 versions 4.0 and 4.5 from Lernout & Hauspie and Open Book Ruby edition version 4.02 from Arkenstone, the two most popular scanning packages for blind computer users.
How Scanning Works
When a document is placed on a scanner, a camera first captures an image of the page. OCR software then converts the image into computerized characters and words. A speech synthesizer speaks the recognized text, and the text is then stored in the computer.
The Many Faces of Print
To determine accuracy, we used a variety of text and format types on both systems. When necessary, we adjusted scanning settings to improve accuracy.
For our simple test, we printed a document with ordinary English text on plain white paper using a laser printer and Arial 12 point print to be sure that both systems could recognize what we thought should be an easy-to-read item. We printed text in several fonts, including Times New Roman and Arial bold and italic. We introduced deliberate errors to find out what the two systems would do with a sentence beginning with the word "tke."
For tough jobs, we scanned book pages printed on colored paper, pages with light text on dark paper, pages that were partly black on white and partly white on black, faxes, business cards, magazine pages with nonlinear formats, bills, and tiny print.
We installed two Hewlett-Packard 5P scanners, one on a Pentium 500 with Windows 98 and one on a Pentium 200 with Windows 95. These installations were very frustrating. We had to update drivers from HP to get this scanner to work with Windows 98 and install and remove drivers repeatedly on both machines before Windows would recognize them correctly. We also had to change scanner settings in the Windows 98 machine to prevent a conflict between the scanner card and the DECtalk PC. These problems were not related to or caused by the OCR packages tested here, but users should be aware of potential problems. If you are not technically inclined, buying equipment from a dealer who will set it up for you will be a less challenging experience.
Open Book Ruby Edition Version 4.02
Open Book's default synthesizer, IBM's ViaVoice, is run at the start of the installation so speech is present throughout. We first installed Open Book with ViaVoice and found that the program would present only the opening screen and then cease to run with no error message. After much time and assistance from Arkenstone's technical support, we managed to solve the problem and also install Open Book to work with the DECtalk PC.
Open Book offers context-sensitive help. Its Key Describer mode allows the user to press any key and hear its function described. Its manual is available in print and braille and as a text file in the Help folder. Parts of it are accessible from within the program. We would have preferred to have also found it on the Help menu or as an item in Open Book's Library.
All of Open Book's commands are located on the number pad and can also be accessed through conventional Windows menus. One especially beginner-friendly feature is the ability to quit Open Book without saving the document and find it still open when Open Book is started up again. This feature allows the occasional user to simply scan a piece of mail and exit without knowledge of file-naming conventions or Windows menus.
Open Book read our easy-to-read test pages without errors. It did not change "tke" to "the." Open Book can read and scan at the same time. There was sometimes a brief delay in reading when the Scan key was pressed.
When black text on brown or blue paper was scanned, Open Book's recognition was much poorer with the default settings than it was when white pages with black text were scanned. To compensate, we adjusted the contrast setting (Settings/Scanning/Contrast) and had results comparable to those achieved with the same text on white paper. Although Automatic Contrast gives good results most of the time, a custom setting can make an enormous difference.
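Why a custom contrast setting can matter so much is easier to see with a small example. The sketch below is not how Open Book handles contrast internally; it is a minimal illustration, assuming the scanner delivers grayscale values from 0 (black) to 255 (white) and that the contrast setting behaves like a single cutoff between "ink" and "paper." The pixel values and the function name binarize are hypothetical.

```python
def binarize(pixels, threshold):
    """Mark each grayscale value (0 = black, 255 = white) as ink (1) or paper (0)."""
    return [[1 if value < threshold else 0 for value in row] for row in pixels]


# Hypothetical 3x4 patch: dark text (about 40) on mid-toned brown paper (about 110).
patch = [
    [110, 40, 40, 110],
    [110, 40, 110, 110],
    [110, 40, 40, 110],
]

# A cutoff tuned for white paper treats the brown paper as ink as well,
# so the letter shape disappears into a solid block.
default_result = binarize(patch, threshold=128)

# A cutoff placed between the ink and the paper values recovers the shape.
custom_result = binarize(patch, threshold=75)

for row in custom_result:
    print("".join("#" if cell else "." for cell in row))
```

With the higher cutoff every pixel counts as ink, which is roughly what poor recognition on colored paper looks like to the OCR engine. Lowering the cutoff separates text from background again.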
If Open Book's option to recognize white text on a black background is not checked, users may believe that a page is blank or miss a significant amount of information. We scanned our test page containing sections of light text on dark paper and sections of dark text on light paper and were able to significantly improve the recognition by adjusting the contrast. Open Book made errors on every pass, however. The problem seemed to be a result of varied fonts, rather than colored paper.
The scan of a catalog cover was problematic. After numerous scans and adjustments to the contrast, we were able to decipher all essential information, but the page was never without errors, and different contrast settings yielded errors in different words, meaning that we had no most-correct page.
Open Book consistently made fewer errors on its second pass than on its first pass, so we scanned all problem documents at least twice. On our test fax, Open Book made four errors, once sticking two words together, once putting an apostrophe in the middle of a word, and twice inserting line breaks where there were none in the original text. None of these errors prevented comprehension of the material.
When it scanned the sample business card, Open Book made a minor error in the word E-mail but no errors in phone numbers or other important information. It joined several lines into one, giving us the entire business card on two lines.
With some creative interpretation, we were able to determine from the American Express bill how much we owed and where to send the check. Open Book read numbers accurately, except in one case in which a letter O replaced a digit 0.
The scans of the 4 point and 6 point Times New Roman documents were mostly incomprehensible. At 4 point, numbers were read far more reliably than text, but each scan yielded different results. At 8 point, the text was mostly readable, but there were errors in some of the words and numbers.
Open Book can recognize and read in Spanish and other languages. To read text in a language other than the default, the synthesizer in use must speak that language.
Open Book's dictionary includes etymology and sample sentences. The user can navigate within the dictionary, for example, to spell words. Dictionaries and other tools are in English only.
Open Book offers the ability to launch programs such as Word, WordPerfect, or Duxbury. This feature makes creating braille from print or quoting scanned text in a word-processed document easy and convenient.
Kurzweil 1000 4.0 and 4.5
Kurzweil 1000 version 4.5 was released while this evaluation was in progress, so both 4.0 and 4.5 were tested. The Kurzweil 1000 4.0 installation program was extremely frustrating because of a bug that caused repeated crashes and failed installations. Correction required copying and renaming several files. This problem is fixed in version 4.5. However, the installation program still does not provide enough information to troubleshoot problems unless a screen reader is loaded.
Kurzweil 1000's Help key is identified when the program opens. When it is pressed before another key, the function of the second key is explained. However, this feature works only for keys on the number pad, not for function keys. The manual is accessible on the Help menu.
Kurzweil 1000 uses the number pad and the function keys for its commands. Its functions can also be accessed through conventional Windows menus. After settings in the Windows menus were changed, Kurzweil 1000 failed to give feedback when we exited the menus.
Kurzweil 1000 read the easy test pages without errors. Its default setting corrects errors, so it changed "tke" to "the" in the test document. It handles scanning in the background seamlessly, allowing the user to read continuously while scanning.
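The difference between leaving "tke" alone and changing it to "the" comes down to whether recognized words are checked against a word list. The following is a minimal sketch of that idea, not Kurzweil 1000's actual correction method, which the product does not document here. The tiny vocabulary, the 0.6 similarity cutoff, and the function names are assumptions made for illustration; the matching is done with difflib from the Python standard library.

```python
import difflib

# Hypothetical word list; a real product would use a full dictionary.
VOCABULARY = ["the", "quick", "brown", "fox", "jumps", "over", "lazy", "dog"]


def correct_word(word, vocabulary=VOCABULARY, cutoff=0.6):
    """Return the closest vocabulary word, or the word unchanged if none is close."""
    lowered = word.lower()
    if lowered in vocabulary:
        return word
    matches = difflib.get_close_matches(lowered, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else word


def correct_text(text):
    """Apply correct_word to each whitespace-separated word."""
    return " ".join(correct_word(word) for word in text.split())


print(correct_text("tke quick brown fox"))
```

A correction like this is a guess. With a larger word list, "tke" is also close to words such as "take," which is one reason some users prefer to hear exactly what is on the page and turn automatic correction off.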
The test page containing sections of both dark-on-light and light-on-dark text yielded unpredictable results. To find text in the dark region, we enabled the Recognition of Light Text on a Dark Background option. We experimented with the Dynamic Thresholding setting for scanner contrast and tried both enabling and disabling Speckle Removal. On some passes, Kurzweil 1000 read parts of words in the light-on-dark text section but found nothing meaningful. On other passes, all light-on-dark text vanished. Changing from Kurzweil's RTK recognition engine to FineEngine made matters worse, and switching back to RTK and rescanning did not restore the text we had been reading.
The page with shaded dark paper and artistically scattered light text was nearly 100 percent unreadable with Kurzweil 1000. A few partial words were discernible on some passes, but other passes caused the program to read the text and punctuation in a language other than English.
Kurzweil 1000 made six errors in our test fax, four of them in the time-date stamp. The two errors in the body of the fax did not reduce comprehension. The errors in the stamped information, however, were indicative of Kurzweil 1000's consistent difficulty with reading numbers. The number 5 often became the letter S, for example. Switching the recognition engine to FineEngine eliminated all but one of the errors in the stamp.
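Confusions such as 5 read as S, or a letter O in place of a zero, follow a pattern, so a user who exports scanned text could clean up numbers afterward. The sketch below is one hypothetical way to do that; it is not a feature of either product. It assumes that a token made up mostly of digits was meant to be a number, and it repairs only the two look-alike pairs mentioned in this review. The names LOOKALIKES, fix_numeric_token, and fix_text are our own.

```python
# Look-alike pairs reported in this review: S for 5 and O for 0.
LOOKALIKES = str.maketrans({"S": "5", "s": "5", "O": "0", "o": "0"})


def fix_numeric_token(token):
    """Repair look-alike letters only in tokens that are mostly digits."""
    alnum = [char for char in token if char.isalnum()]
    digits = [char for char in alnum if char.isdigit()]
    # Assumption: more than half digits means the token was meant to be a number.
    if alnum and len(digits) / len(alnum) > 0.5:
        return token.translate(LOOKALIKES)
    return token


def fix_text(text):
    """Apply fix_numeric_token to each whitespace-separated token."""
    return " ".join(fix_numeric_token(token) for token in text.split())


print(fix_text("Fax s02-7777 sent Sunday at 1O:45"))
```

Ordinary words such as "Sunday" are left alone because they contain no digits. The rule cannot repair a digit misread as another digit, such as a 4 read as a 6, so important numbers still need to be verified.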
Kurzweil 1000 made minor errors scanning business cards but none in phone numbers and other important information. The text was formatted correctly.
For people who return from conferences with folders of business cards, Kurzweil 1000 has one very attractive feature. The scanner boundaries can be set to pass the camera over a small area. Users can place business cards in one consistent location and save the five seconds or so it takes to scan the blank scanner surface each time.
Kurzweil 1000 read the headline of the theater flier page in the middle of the page, but otherwise the text was presented in a reasonable order. Recognition errors existed, but they can probably be attributed to the color of the paper.
Kurzweil 1000 made some errors in the numbers in our sample bill. It also gave the cardholder's name and address as the address to which the payment was to be sent. Of course, human interpretation was used to correct this error in reading columns.
When reading the 4 point and 6 point versions of the test pages, Kurzweil 1000 gave wildly varying interpretations of the page. The fax number "502-7774" was read as "502-7776" and "s02-7777," for example. The 8 point version had some errors, and the phone number was not read correctly, but most text was comprehensible.
Kurzweil 1000 easily scans and reads in Spanish and other languages. It also includes an extensive dictionary. Kurzweil 1000 has a feature to allow the user to launch another application, but this feature is not automatically set up during the installation. When the scanned document is dropped into the application, the document format is not converted, making the file unreadable in most cases.
Despite the recognition errors noted, both products performed very well. All OCR packages still make mistakes, some of them amusing. Users who are comfortable with Windows terms and use a variety of applications will probably prefer Open Book. Users who scan in multiple languages or want more help with Windows terms than Open Book provides will prefer Kurzweil 1000.
Arkenstone: "The new Ruby Edition of Open Book's extensive new low vision features, editing capabilities, talking dictionary, braille display integration, BuckScan currency identifier, and well-behaved talking Windows interface are all worth mentioning. Our new dual-OCR engine gives excellent recognition results and supports 400 dpi scanning for small fonts. We suggest potential users test Ruby on their documents to find their best solution."
Kurzweil Educational Systems Group: "It should be noted that L&H Kurzweil 1000 includes a multilingual speech synthesizer on the CD. It is not clear from the review that you can automatically summarize documents, create bookmarks, and take notes, either in the "margins" of the current document or by writing into a second document. In response to the comments made about the Application Launch feature, we think our launch facility is more flexible by allowing the user to add applications to it in the product, rather than in a setup program. Since we support over 120 document formats, we leave it to the user to pick one that is appropriate for the application, rather than choosing one ourselves."
Product: Open Book Ruby Edition.
Manufacturer: Arkenstone; NASA Ames Moffett Complex, Building 23, P.O. Box 215, Moffett Field, CA 94035-0215; phone: 800-444-4443 or 650-603-8880; fax: 650-603-8887; e-mail: <email@example.com>; web site: <www.arkenstone.org>. Price: $995.
Product: L&H Kurzweil 1000 v. 4.5.
Manufacturer: Kurzweil Educational Systems Group, Lernout and Hauspie Speech Products N.V.; 52 Third Avenue, Burlington, MA 01803; phone: 800-894-5374 or 781-203-5000; fax: 781-203-5033; e-mail: <firstname.lastname@example.org>; web site: <www.lhsl.com/education/>. Price: $995 with FlexTalk speech, $1,195 with DECtalk speech.
AccessWorld, Copyright © 2002 American Foundation for the Blind. All rights reserved.