If your voice search has improved noticeably of late, you have Google's speech team to thank. Google recently announced that it has rolled out a system that recognizes human speech with significantly higher accuracy than before. The addition of recurrent neural networks now enables Google to comprehend complete words and therefore present more accurate results.
So what is a Recurrent Neural Network?
This acoustic model has feedback loops built in, so it can account for temporal dependencies much better than before. RNNs are able to capture words that are spoken in one breath, and that is how they return more accurate voice search results. The quality of voice recognition improves to a great extent thanks to this network.
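To make the idea of a feedback loop concrete, here is a minimal sketch of a single recurrent unit in plain Python. It is a toy, not Google's acoustic model: the function names and the weights `w_in` and `w_rec` are assumptions chosen for illustration. The point is only that each new state depends on the previous one, so an early input keeps influencing later steps.

```python
import math

def rnn_step(x, h_prev, w_in, w_rec, bias):
    """One step of a single-unit recurrent layer (scalar toy version).

    The w_rec * h_prev term is the feedback loop: the previous state
    is fed back in alongside the current input.
    """
    return math.tanh(w_in * x + w_rec * h_prev + bias)

def run_rnn(inputs, w_in=0.5, w_rec=0.9, bias=0.0):
    """Run the unit over a sequence and return the state after each step."""
    h = 0.0  # initial state: nothing has been heard yet
    states = []
    for x in inputs:
        h = rnn_step(x, h, w_in, w_rec, bias)
        states.append(h)
    return states

# A single "sound" at the first step, followed by silence.
with_feedback = run_rnn([1.0, 0.0, 0.0])
without_feedback = run_rnn([1.0, 0.0, 0.0], w_rec=0.0)

print(with_feedback)     # later states stay non-zero: the first input is remembered
print(without_feedback)  # later states drop to zero: no memory without the loop
```

With the recurrent weight set to zero the unit forgets the first input immediately, which is roughly the situation a model without feedback loops is in.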
With the system in place, the next challenge was to recognize the phonemes within a sentence without requiring a prediction at every instant in time. Using Connectionist Temporal Classification (CTC), the models were trained to output a sequence of spikes that reveals the sequence of sounds in the waveform. There was no restriction on how they did so; they only needed to get the sequence right.
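For the curious, the "only the sequence matters" idea can be sketched with the standard CTC collapsing rule: merge repeated labels, then drop the special blank label. The snippet below shows only that decoding-side rule, not the training procedure and not Google's code. The blank symbol `"-"` and the phoneme labels are made up for the example.

```python
BLANK = "-"  # assumed symbol for "no sound emitted at this frame"

def ctc_collapse(frame_labels, blank=BLANK):
    """Collapse per-frame labels into a label sequence, CTC style.

    Consecutive repeats are merged and blanks are dropped, so the exact
    timing of each spike does not matter, only the order of the labels.
    """
    decoded = []
    previous = None
    for label in frame_labels:
        if label != previous and label != blank:
            decoded.append(label)
        previous = label
    return decoded

# Two different timings of the same three hypothetical phonemes.
early = ["k", "k", "-", "ae", "-", "-", "t", "-", "-"]
late = ["-", "-", "-", "k", "-", "ae", "ae", "-", "t"]

print(ctc_collapse(early))
print(ctc_collapse(late))
```

Both timings collapse to the same three labels, which is why the model is free to place its spikes wherever it likes. A blank between two identical labels keeps them apart, so a genuinely doubled sound survives the collapse.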
Temporal dependencies hard to understand?
If you are still scratching your head over the paragraphs above, just understand one thing: Google has become better at voice search and its related functions on both Android and iOS, and this includes even nuanced speech patterns. Think back to the times you tried voice search in a noisy environment or on the highway, and you will appreciate the difference. Those with an understanding of acoustic modeling will grasp the details more readily and will be quick to identify the change.
The acoustic models Google now uses are its very latest, and they came into play after Google had tried out Gaussian Mixture Models (GMM) and then Deep Neural Networks (DNN).
It is the sophisticated gating mechanism within the RNN that lets it retain information much better than the RNNs Google tried out earlier.
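As a rough illustration of what a gating mechanism does, here is a scalar sketch of an LSTM-style memory cell, assuming that is the kind of gated RNN meant here. Everything in it is a simplification for the sake of the example: real cells use separate weight matrices per gate, whereas this toy shares one weight `w` and varies only the gate biases.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def gated_step(x, h_prev, c_prev, forget_bias, input_bias, output_bias=0.0, w=1.0):
    """One step of a scalar LSTM-style cell (toy version, shared weight w)."""
    pre = w * x + w * h_prev
    forget = sigmoid(pre + forget_bias)   # how much old memory to keep
    write = sigmoid(pre + input_bias)     # how much new information to store
    expose = sigmoid(pre + output_bias)   # how much memory to reveal as output
    candidate = math.tanh(pre)            # the new information on offer
    c = forget * c_prev + write * candidate
    h = expose * math.tanh(c)
    return h, c

def remember(steps, forget_bias, input_bias):
    """Start with memory 1.0, feed silence, and return what is left."""
    h, c = 0.0, 1.0
    for _ in range(steps):
        h, c = gated_step(0.0, h, c, forget_bias, input_bias)
    return c

kept = remember(20, forget_bias=10.0, input_bias=-10.0)   # forget gate held open
lost = remember(20, forget_bias=-10.0, input_bias=-10.0)  # forget gate held shut

print(kept)  # stays close to the stored value
print(lost)  # decays to almost nothing
```

When the forget gate stays open the stored value survives twenty steps of silence almost untouched, and when it is shut the value vanishes after the first step. A trained network learns when to do which, and that is what lets it hold on to information over longer stretches of speech.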
What can you, as an Android user, expect from this development?
The good news, according to Google, is that voice searches and commands in the Google app will now be more accurate and faster to respond while using far fewer computational resources. That goes for the Google app on both Android and iOS. Dictation on Android devices will benefit from the change as well.
Google is certainly stealing a march on Apple by coming out with innovative technological solutions to long-standing problems. That is surely going to help it maintain or even increase the market share it already enjoys.