These days, when traveling down a city street with cane or harness in hand, people with visual impairments have a variety of smartphone apps offering turn-by-turn directions, street addresses, and information about upcoming intersections. You depart. You arrive. And sometimes along the way you get to “stop and smell the roses,” or “hear the children play.” But what about the rest of the scenery, both along the way and once you reach your destination?
That’s the question Cagri Hakan Zaman and Emre Sarbak asked one another. Zaman is an MIT computer vision specialist. Sarbak is a social media entrepreneur focused on underserved communities. The two suspected that Zaman’s work with computer vision could be used to assist the blind during their travels with object recognition and identification. They began researching the possibilities and polling Boston-area blind individuals for their wish lists.
With grants from the MIT Sandbox, the US Department of Veterans Affairs, and the National Science Foundation, Zaman and Sarbak founded Cambridge research nonprofit Mediate.
They released their first smartphone app, Supersense, for Android in February of 2019, and followed up with an iOS version in March of 2020. The apps are free, but the service uses a subscription model to unlock most of its features. Pricing details are provided at the end of this article, along with a special AccessWorld reader offer.
The duo’s original goal was to use computer vision and machine learning to help blind individuals in unfamiliar environments learn what’s around them. “GPS apps do a great job of getting you from here to there, but there are a lot of things you might wish or even need to know along the way,” says Shane Lowe, Mediate’s Community Operations Manager. “Is there a mailbox on this corner? Where is the bus stop bench? Where is the door to my destination?”
The team used AI to identify over 600 different object types such as trees, chairs, and doorways, and they made that identification work in real time. The app’s Object Explorer mode uses streaming video from your smartphone camera. No need to snap photos and wait for images to be uploaded and analyzed.
Likewise, if you’re sitting in a room and want to know what’s around you, launch the Supersense app and enable Object Explorer. Slowly pan the phone and the app will identify and speak the names of furnishings: sofas, chairs, lamps, picture frames, and such. Again, the identification happens in real time, which is very handy for a quick look-around to orient yourself in an unfamiliar room or office setting.
The team then turned the Object Explorer concept on its head. Instead of scanning into the void and waiting for the app to pinpoint the nearest chair, you can use the Find menu to look for chairs and have the app announce the various seating options as you pan the phone.
The Find menu includes a wide range of categories. I offer the entire list, in order, to give you an idea of the breadth of the app’s recognition capabilities.
- Person category: person, human face, hand, and more
- Seat category: chair, bench, and more
- Door category: door, door handle
- Stairs category: stairs, ladders
- Trashcan category: trash can, waste container
- Vehicle category: car, bicycle, train, and more
- Bathroom category: toilet, sink, toothbrush, and more
- Kitchen category: kettle, microwave, oven, and more
- Cups and bottle category: coffee cup, bottle, and jug
- Utensils category: fork, knife, spoon, and more
- Bags and accessories category: handbag, umbrella, earrings, and more
- Electronics category: TV, laptop, keyboard, and more
- Animals category: cat, dog, insect, and more
- Bedroom category: wardrobe, chair, pillow, and more
- Buildings and trees category: house, tree, fountain, and more
- Clothes and shoes category: boot, sock, dress, and more
- Food category: food, fruit, tea, and more
- Office items category: whiteboard, stapler, and scissors
- Hobbies category: musical instrument, sports equipment, ball, and more
- Tables category: table, desk, nightstand, and more
- Traffic category: traffic light, parking meter, traffic sign, and more
- Work tools category: hammer, screwdriver, wrench, and more
In Object Explorer mode, Supersense searches for and announces items in the above categories. Walking along a sidewalk, for example, the app might announce “car,” “parking meter,” or, even more usefully, “tree branch.” The only direction information you will receive, however, is whatever you can glean from where your phone is pointed.
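One way to picture the difference between Object Explorer and the Find menu is as a filter over the same stream of detected labels. The sketch below is purely illustrative and is not Mediate's code: the `FIND_CATEGORIES` table holds only a few entries copied from the list above, and the `announce` function and its parameters are hypothetical names.

```python
# Illustrative only: a tiny, hypothetical slice of the Find menu.
# The category names and labels come from the article's list; the
# data structure and function are assumptions made for this sketch.
FIND_CATEGORIES = {
    "Seat": {"chair", "bench"},
    "Door": {"door", "door handle"},
    "Traffic": {"traffic light", "parking meter", "traffic sign"},
}


def announce(detections, find_category=None):
    """Return the labels to speak, in the order they were detected.

    With no category, behave like Object Explorer and speak everything.
    With a category, behave like Find and speak only matching labels.
    """
    if find_category is None:
        return list(detections)
    wanted = FIND_CATEGORIES[find_category]
    return [label for label in detections if label in wanted]


if __name__ == "__main__":
    seen = ["car", "chair", "tree", "bench"]
    print(announce(seen))          # Object Explorer style: every label
    print(announce(seen, "Seat"))  # Find style: seating only
```

In this sketch the camera pass is the same in both cases; only the choice of what to speak changes.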
By now you are likely listing any number of potential uses for Supersense. Finding that pesky TV remote. Locating an empty seat in the doctor’s waiting room. Navigating your way to the sinks in the arena-size airport restroom. All of these are possible, but with one significant caveat.
For Supersense, camera quality matters. The better your smartphone’s camera, the farther away Supersense can detect, recognize, and identify objects.
My own iPhone XR needs to be within five or six feet of a door, window, or chair to recognize it. Users of newer generation smartphones can expect better distance scanning.
Also, when the app did report a door, it did not indicate whether it was two or four feet away. These distance limitations may soon be at least partially overcome by the use of LiDAR paired with the company’s newest offering: Super LiDAR. LiDAR stands for light detection and ranging. Basically, LiDAR is similar to RADAR but it uses pulses of infrared light instead of radio waves. Emitted light bounces off objects, and distance and shape are then calculated by measuring the time it takes the reflected light to return to the receiver. It’s the technology behind Google’s mapping vehicles and self-driving cars.
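The timing arithmetic behind that description is simple enough to sketch. The example below is a minimal illustration of the general time-of-flight idea, not a description of how Super LiDAR or any particular phone sensor is implemented; the function names are hypothetical.

```python
# Illustrative time-of-flight arithmetic; function names are hypothetical
# and nothing here is taken from Super LiDAR itself.
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # exact, by definition of the meter
FEET_PER_METER = 3.28084


def distance_from_round_trip(seconds):
    """Return the one-way distance in meters for a reflected light pulse."""
    if seconds < 0:
        raise ValueError("round-trip time cannot be negative")
    # The pulse travels out and back, so the object sits at half the path.
    return SPEED_OF_LIGHT_M_PER_S * seconds / 2


def round_trip_for_distance(meters):
    """Return the out-and-back travel time in seconds for a given distance."""
    return 2 * meters / SPEED_OF_LIGHT_M_PER_S


if __name__ == "__main__":
    five_feet_in_meters = 5 / FEET_PER_METER
    t = round_trip_for_distance(five_feet_in_meters)
    print(f"Round trip for 5 feet: {t * 1e9:.1f} nanoseconds")
```

For an object five feet away the round trip comes to roughly 10 nanoseconds, which hints at why this kind of ranging relies on dedicated sensor hardware rather than an ordinary camera.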
LiDAR is currently available on the iPad Pro and iPhone 12 Pro and Pro Max only, but other manufacturers are expected to follow suit.
“With LiDAR, objects can currently be detected up to 15 to 16 feet away, and the technology is still in early days,” says Lowe. The reflected light pulses enable the app to calculate both distance and moving direction, much the same as RADAR can determine how far away an aircraft is and the direction it’s flying. With Supersense you may discover a person standing ahead. Using Super LiDAR’s more precise infrared pulse reflection you may also receive extra information, such as “a person wearing a mask,” and the distance between you and that person. Rising and falling tones provide feedback as you approach or move away from the detected person or object. Very helpful if, say, you’re standing in a moving line or reaching for a doorknob.
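To make the tone feedback concrete, here is a minimal sketch of one way a distance reading could be mapped to pitch. Everything specific in it is an assumption for illustration: the article does not say which direction the pitch moves, what frequencies are used, or how the mapping is shaped. The sketch assumes a higher tone means closer, a linear scale, a 16-foot range (the upper figure Lowe quotes), and a hypothetical function named `tone_for_distance`.

```python
def tone_for_distance(feet, max_range_feet=16.0, low_hz=220.0, high_hz=880.0):
    """Map a distance reading to a tone frequency in hertz.

    Assumptions (not from the app): pitch rises linearly as the object
    gets closer, and readings outside 0..max_range_feet are clamped.
    """
    if max_range_feet <= 0:
        raise ValueError("max_range_feet must be positive")
    clamped = min(max(feet, 0.0), max_range_feet)
    closeness = 1.0 - clamped / max_range_feet  # 0.0 at range limit, 1.0 at contact
    return low_hz + closeness * (high_hz - low_hz)


if __name__ == "__main__":
    # Walking toward a doorknob: the tone climbs as the distance shrinks.
    for feet in (16, 12, 8, 4, 0):
        print(f"{feet:>2} ft -> {tone_for_distance(feet):.0f} Hz")
```

A linear scale keeps the example easy to follow; a real app might well prefer a curve that changes faster at close range, where small differences matter most.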
Super LiDAR is available free from the iOS App Store. There is currently no subscription fee.
According to Lowe, “When we asked users what features and enhancements they would like us to work on, the responses were overwhelming. They wanted accessible, easier-to-use text recognition.” The team set to work, and with a recent update to the Android version and the initial release of Supersense for iOS, the app now includes text, currency, and bar-code scanning.
In the default Smart Scan mode, Supersense searches for brief snatches of text, currency, or bar codes, and when it finds them, the app performs automatic recognition.
If Supersense detects a longer document, it automatically switches to Document Scan mode. There, it offers spoken-English guidance, such as “Rotate to 11 o’clock position,” “Move one inch to the right,” and “Move further away from the document.” Image capture is automatic—no fumbling for a button. Recognition is both swift and accurate.
You can bypass Smart Scan and choose one of the following modes directly.
- Quick Read: Reads brief snatches of text. Recognition and voicing are automatic.
- Document Read: Scan a full page and review or save the text.
- Multipage Scanning: Scan multiple pages and then read them as a single document.
- Currency: Recognizes various denominations of US dollars, Euros, British Pounds, Australian and Canadian dollars.
- Bar Code: Auto-scans for bar codes. Variable-speed beeps help you locate the code. Works best with the phone positioned approximately one foot away from the product.
- Magnify: Enables the user to zoom in and read the text either via eText or from a magnified image from the camera. Any text is also recognized.
- Import image or PDF: Recognizes text within images and inaccessible PDF files. This feature is also available through various Share sheets.
- Read History: Lists all of your previous recognitions with the date and time and scan mode used. From here you can review, delete, or share the text.
I enjoyed using Supersense to quickly scan my mail and read letters of interest. My only complaint is that currently the read-back does not allow speech interruption. Scan a wordy flier and you’ll have to wait for voicing to finish before the app will scan another. Either that or toggle Quick Read off and back on.
Currency reading worked as well as Seeing AI, and I didn’t have to use the app’s menu to switch to a specific currency mode to accomplish it. Smart Scan did the job. Bar-code scanning also worked well, achieving better results than with Seeing AI. PDFs scanned well. Photos only scanned for text, not object or scene description. I did not evaluate the Magnify mode.
Object recognition was equally swift, and sometimes surprising. On a recent walk my dog tugged her leash and ventured off the sidewalk. She stopped and began pawing at something. Supersense announced, “Scattering moths and butterflies.” Pretty cool. However, when leaving the park I aimed the camera where I know there is a park sign. Supersense announced “Tree,” but when I switched to Quick Read mode and pointed to the same spot, the text on the sign was read flawlessly.
I found Supersense quite useful for orienting myself when left alone in a doctor’s exam room. Earlier, however, I’d tried using Furnishings mode to find an empty seat in the waiting room. I launched the app on my phone and began to pan, and that was when I thought about the people already in the waiting room who were undoubtedly wondering, “Why is that stranger aiming his camera at me? Am I going to be on YouTube? I haven’t given my permission.”
Perhaps I am overly cautious, but I can't see myself using Supersense to find a urinal in a crowded bathroom, or the shower in an occupied locker room.
Pricing, Contact Information, and a Special Offer
Supersense’s free plan includes unlimited access to Quick Read, Import Image or PDF Mode, and Read History. All other features require a monthly, yearly, or lifetime subscription:
- Monthly: $4.99
- Yearly: $49.99
- Lifetime Subscription: $99.99
Payments are billed to your iOS iTunes or Google Play Store account, depending on which version you are using.
If you'd like to put Supersense’s full version through its paces, request a call through the app’s support option or by email. Mention AccessWorld and you'll receive a free first month of subscription.
This article is made possible in part by generous funding from the James H. and Alice Teubert Charitable Trust, Huntington, West Virginia.