We are continuing our work on automatic speech recognition in the medical field and for people with disabilities. Automatic speech recognition is increasingly tied to artificial intelligence, and several tools now aim to improve the daily lives of people with pathologies and disabilities, including speech disorders. This is the case with Apple’s Siri, Google Assistant and Amazon’s Alexa. We will see here how these voice assistants are evolving so that they can understand even “disturbed” spoken messages…
Specific requests to be made orally
According to igen.fr, Apple, Google and Amazon are working on “making their assistants more accessible to people who stutter or, more generally, who suffer from speech articulation disabilities – or dysarthria”.
Until now, when voice assistants have had difficulty understanding a query, the results have often been wrong. The assistants also tend to stop listening before the user has finished making the request.
Apple’s Siri already offers a setting for this, found under “Accessibility”, then “Side Button”, which forces the assistant to follow a voice request through to its end. With this setting enabled, users are no longer cut off in the middle of their request…
Therefore, in order to go further and make sure that voice assistants understand even poorly expressed requests, professionals rely on machine learning.
According to lebigdata.fr, machine learning is an “artificial intelligence technology that allows computers to learn without being explicitly programmed to do so”. These systems rely on data to train and refine themselves.
For example, Apple has built up a database of 28,000 audio clips of stuttering. This data will allow Siri to perfect its voice recognition system.
According to 20minutes.fr, in these clips we “hear speech defects that are analyzed by Siri. Thus, when the voice assistant is confronted with a person who stutters, it will be able to interact with them”.
For its part, Google is currently testing a prototype called Euphonia. This system is intended to let people with speech impairments communicate directly with Google Assistant and with Google Home products.
Amazon has integrated technology from the Israeli company Voiceitt into Alexa. This allows the algorithms to recognise a growing range of specific speech patterns.
According to mac4ever.com, Amazon even launched the “Alexa Fund” in December. The aim is to “create an algorithm to recognise the speech patterns of people with diction disorders”, which could also be very interesting to follow!
The principle is always the same: collect audio files containing pronunciation defects, stuttering, and articulation or elocution difficulties. These recordings are then used to train an automatic speech recognition system.
For voice assistants, automatic speech recognition tools and systems, there are different speech disabilities to work on. For example, there are five speech problems that are taken into account:
These five problems are very important in terms of sound processing.
At present, there is only one option for listening to people whose speech rate is unusual or includes pauses. Siri’s “Hold to Talk” increases the amount of time the assistant listens. The voice assistant then understands that a pause does not mark the immediate end of a voice request.
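To illustrate why a longer listening window helps, here is a minimal sketch of pause handling, assuming the audio has already been split into short frames labelled as speech or silence. The function name, the 20 ms frame length and the timeout values are hypothetical and chosen for illustration only; this is not how Siri is actually implemented.

```python
def find_end_of_request(frames, frame_ms=20, max_pause_ms=700):
    """Return the index of the frame where the request is considered
    finished, or None if the speaker never pauses long enough.

    frames: list of booleans, True when the frame contains speech.
    frame_ms and max_pause_ms are illustrative values, not real settings.
    """
    silence_ms = 0
    heard_speech = False
    for index, is_speech in enumerate(frames):
        if is_speech:
            # Any speech resets the silence counter.
            heard_speech = True
            silence_ms = 0
        elif heard_speech:
            # Count silence only once the user has started talking.
            silence_ms += frame_ms
            if silence_ms >= max_pause_ms:
                return index
    return None


# A speaker who says a few words, pauses for one second, then continues.
frames = [True] * 10 + [False] * 50 + [True] * 10 + [False] * 200

impatient_cut = find_end_of_request(frames, max_pause_ms=700)
patient_cut = find_end_of_request(frames, max_pause_ms=2000)
print(impatient_cut, patient_cut)
```

With the short timeout, the request is cut during the one-second pause, before the second group of words. With the longer timeout, the pause is tolerated and the request only ends after the speaker has really finished.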
In practical terms, improving the understanding of people with speech disabilities is not easy for voice assistants that rely on automatic speech recognition. Siri, Alexa and Google Assistant all have real difficulties.
Today, voice assistants easily recognise standard voices in good-quality audio. With speech disorders, however, oral expression varies much more from one person to another, and this variability makes the analysis and understanding of spoken messages more complex.
At Authôt, we understand these challenges and complexities perfectly.
As specialists in automatic speech recognition (“Speech to Text”), we offer two online platforms for automatic transcription, subtitling and translation, live or recorded: Authôt Live and Authôt APP.
We always make it clear to our clients that the quality of the final output of the automatic transcription depends on the recording quality. There must not be too much echo or reverberation, and background noise and speech difficulties should be avoided where possible. For good quality files, our solution is 95% reliable.
Fortunately, we also have a human proofreading service for audio files that need it, both to deal with this type of problem and to handle recordings in which several modern languages are involved.
We are glad to see tools and companies evolving, as we are, and working towards better accessibility, but this is just the beginning!
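As an illustration of how a reliability figure can be measured, the sketch below computes the word error rate (WER), a common metric for automatic transcription. It assumes, without confirmation from Authôt, that reliability is understood as word accuracy, that is 1 minus WER; the function name and the sample sentences are hypothetical.

```python
def word_error_rate(reference, hypothesis):
    """Word error rate: word-level edit distance divided by the
    number of words in the reference transcript."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in the output
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical human transcript versus hypothetical automatic output.
reference = "you speak we write the text for you"
hypothesis = "you speak we right the text you"

wer = word_error_rate(reference, hypothesis)
print(f"WER: {wer:.2f}, word accuracy: {1 - wer:.2f}")
```

Under this reading, a file transcribed at 95% would contain roughly five word errors per hundred reference words, which is why recording quality and proofreading matter for long content.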
Voice assistants are useful for simple, short queries, but for long, complex content, it’s best to call in the professionals.
All in all, it’s good news that companies like Apple, Google and Amazon are making Siri, Google Assistant and Alexa work better for people with speech impairments and disabilities. Not only have these companies understood what is at stake, but the needs are also becoming greater and more precise! Thanks to our Authôt APP and Authôt Live solutions and our experts, you can be sure of benefiting from the best of technology and people. If you are a school, an institute or a media company, do not hesitate to contact our teams!
Authôt. You speak. We write.