What you see is what you get. But for people who are blind or have low vision, Microsoft has created an AI-powered application that turns the visual world into an audible experience. The app scans and narrates the user's surroundings, giving them greater access to the information around them.
Our spotlight feature shifts the focus to the mobile application Seeing AI and how this technology is transforming lives.
About Seeing AI
Seeing AI is a Microsoft research project that brings together the power of the cloud and AI to deliver an intelligent app. This project is expanding the capabilities of computer vision systems to help visually impaired users navigate everyday tasks. It’s currently available in 70 countries and a number of languages.
The application and its functions
Seeing AI draws on AI-powered image recognition and description technology to help people with visual impairments perceive the visual elements in their surroundings through audio narration. The app identifies these elements via the user's smartphone camera and then provides a spoken description.
The app can automatically recognize contacts if photos of that person are stored on the phone, or if the user has introduced the person to the app beforehand. For people it has not yet met, it can estimate attributes such as age, gender, and emotional expression.
It offers a wide variety of functions, including short text reading, document scanning, product barcode recognition, scene preview, person recognition, color recognition, light detection, and the ability to describe images shared from other apps, including social media.
"I believe that at the end of the century the use of words and general educated opinion will have altered so much that one will be able to speak of machines thinking without expecting to be contradicted."
― Alan Turing, "Computing Machinery and Intelligence"
The algorithm behind it
As reported in an article in The Verge, Microsoft's new image-captioning algorithm should improve the performance of Seeing AI significantly: it can not only identify objects but also describe the relationships between them more precisely. The algorithm can look at a picture and report not just which objects it contains (e.g., "a person, a chair, an accordion") but how they are interacting (e.g., "a person is sitting on a chair and playing an accordion").
The algorithm, described in a preprint paper published in September, achieved the highest scores to date on nocaps, an industry benchmark for image captioning.
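The difference between a plain object list and a relationship-aware caption can be illustrated with a toy sketch. This is not Microsoft's actual model; it is a hypothetical illustration in which the captioning step is assumed to produce (subject, predicate, object) triples that are then rendered as a sentence:

```python
# Toy illustration of the two output styles, not Microsoft's real pipeline.

def caption_objects(objects):
    """Old-style output: simply enumerate the detected objects."""
    return ", ".join(objects)

def caption_relations(triples):
    """Relationship-aware output: describe how the objects interact.

    Each triple is a hypothetical (subject, predicate, object) prediction,
    e.g. ("a person", "sitting on", "a chair").
    """
    clauses = [f"{subj} is {pred} {obj}" for subj, pred, obj in triples]
    return " and ".join(clauses)

objects = ["a person", "a chair", "an accordion"]
triples = [
    ("a person", "sitting on", "a chair"),
    ("a person", "playing", "an accordion"),
]

print(caption_objects(objects))
# a person, a chair, an accordion
print(caption_relations(triples))
# a person is sitting on a chair and a person is playing an accordion
```

For a screen-reader user, the second form is far more useful: it conveys the scene, not just an inventory of its parts.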
Our Take
Seeing AI converts visual data into audio feedback and lets users explore the objects and people in photos by touch. That is the crux of the app, which uses cutting-edge technology to open a new chapter in the lives of people who are visually impaired.
What is the Product spotlight?
Product spotlight is our effort to highlight unique and innovative AI products that help businesses deliver quality, exceptional service to their customers. It is a bi-weekly blog series, with each installment focusing on a single product and the value it can create.
If you like what you read, subscribe to us and get unlimited access to intriguing discussions, interviews, and articles - all about AI.