
Hear, hear! AI is using voice notes to diagnose patients

Researchers are developing AI technology to diagnose diseases through speech analysis.

Remember how your voice over a long-distance phone call was enough for your mother to know whether you were tired or feeling low? Artificial intelligence can’t match up to a mother’s intuition, but it’s getting close enough to detect a range of diseases just by hearing a few seconds of a voice clip.

The US National Institutes of Health, in collaboration with the University of South Florida (USF), Cornell and 10 other institutions, has been collecting voice data from all over as part of its Bridge2AI programme to develop AI applications that will diagnose disease through speech analysis. Every nuance of the voice will be scrutinised, from volume, pace and intonation to vocal cord vibrations not heard by the human ear and even breathing patterns. The aim is to diagnose not just speech disorders, but also neurological diseases, mental health issues, respiratory problems and spectrum disorders like autism.

“Voice has the potential to be a biomarker for several health conditions. Creating an effective framework that incorporates huge datasets using the best of today’s technology in a collaborative manner will revolutionize the way voice is used as a tool for helping clinicians diagnose diseases and disorders,” said study lead and director of USF Health Voice Center Yaël Bensoussan in a press release.

We all know that slurring is a major indicator of stroke. But a person who speaks slowly and in a low tone might have Parkinson’s disease. Researchers can even detect cancer and depression. Maria Espinola, a psychologist and assistant professor at the University of Cincinnati College of Medicine, told the New York Times that listening to not only what a person says but also how they say it has long been key to detecting mental health disorders. “The speech of those with depression is generally more monotone, flatter and softer. They also have a reduced pitch range and lower volume. They take more pauses. They stop more often,” said Dr Espinola, adding that anxiety patients “tend to speak faster. They have more difficulty breathing”. Vocal features are being leveraged to diagnose schizophrenia and post-traumatic stress disorder as well.



“The technology we’re using now can extract meaningful features that even the human ear can’t pick up on. A lot can happen in between doctor appointments, and technology can really offer us the potential to improve monitoring and assessment in a more continuous way,” Kate Bentley, clinical psychologist and assistant professor at Harvard Medical School, told NYT.

A study published in Mayo Clinic’s digital health journal even used an Indian cohort to detect type 2 diabetes with more than 80% accuracy just by listening to a 10-second voice sample. Recordings of phone conversations, coupled with basic health data like age, BMI and gender, helped the program make a diagnosis.

But what if there is a misdiagnosis? For starters, one can speak in a low tone for various reasons, and not all of them indicate depression or Parkinson’s. Only a trained human professional can discern disease in a certain vocal tonality. At best, AI can screen patients, but an exact diagnosis is still a long way off.



There are also concerns about voice data privacy, and machine learning technologies need a wide variety of voices to work equally well for patients across nationalities, genders and ages. “You really need to have a very large, diverse and robust set of data,” Grace Chang told NYT. Chang co-founded Kintsugi, a Berkeley-based company that is developing technology for telehealth and call-centre providers so that they can identify patients who might benefit from further support. It uses voice recordings from around the world, in many different languages. By using Kintsugi’s voice-analysis program, a nurse might be prompted to take an extra minute to ask a harried parent with a colicky infant about their own well-being.

NYT News Service & agencies

BOX: What AI tools can pick up from voice clips
Type 2 diabetes
Parkinson’s, Alzheimer’s, stroke
Depression, schizophrenia, bipolar disorders and PTSD
Heart failure, COPD, pneumonia
Autism and speech delay in children
Cancer of the larynx, vocal fold paralysis




About the Author

TOI Lifestyle Desk

