AI and Speech Recognition: A Primer for Chatbots
--
Conversational User Interfaces (CUIs) are at the heart of the current wave of AI development. Although many applications and products out there are simply “Mechanical Turks” (machines that pretend to be automated while a hidden person is actually doing all the work), there have been many interesting advancements in speech recognition from both symbolic and statistical learning approaches.
In particular, deep learning is drastically augmenting the abilities of bots compared with traditional NLP (e.g., bag-of-words clustering, TF-IDF) and is creating the concept of “conversation-as-a-platform”, which is disrupting the apps market.
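To make the "traditional NLP" baseline concrete, here is a minimal sketch of bag-of-words TF-IDF weighting in plain Python. The whitespace tokenizer, the unsmoothed `log(N / df)` formula, and the sample utterances are illustrative assumptions, not something taken from a particular library or bot.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive assumption: lowercase and split on whitespace.
    return text.lower().split()


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    tf  = term count / document length
    idf = log(number of documents / documents containing the term)
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical user utterances, for illustration only.
utterances = [
    "book a table for two",
    "book a flight to rome",
    "cancel my flight",
]
weights = tf_idf(utterances)
```

In this toy corpus, "table" ends up weighted higher than "book" in the first utterance, because "book" also appears in the second one. Word order and context are discarded entirely, which is the limitation that deep learning models try to overcome.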
Our smartphones currently represent the most expensive real estate per square centimeter (even more expensive than the price per square meter of houses in Beverly Hills), and it is not hard to envision that having a bot as the sole interface will make this area worth almost zero.
None of this would be possible, though, without heavy investment in speech recognition research. Deep Reinforcement Learning (DRL) has been the boss in town for the past few years, and it has been fed by human feedback. However, I personally believe that we will soon move toward B2B (bot-to-bot) training for a very simple reason: the reward structure. Humans spend time training their bots only if they are sufficiently compensated for their effort.
This is not a new concept, and it is something Li Deng (Microsoft) and his group are well aware of. He actually provides a great threefold classification of AI bots:
- Bots that look around for information;
- Bots that look around for information to complete a specific task;
- Bots with social abilities and tasks (which he names social bots or chatbots).
For the first two, the reward structure is indeed fairly easy to define, while for the third it is more complex, which makes social bots harder to tackle today.
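As a rough illustration of why the reward is easy to define for a task-oriented bot, consider the sketch below. It assumes a slot-filling setup in which the task counts as done once every goal slot holds the requested value, and it charges a small cost per dialogue turn. The function name `task_reward`, the slot names, and the penalty value are all hypothetical choices made for this example.

```python
def task_reward(goal, filled_slots, turns, turn_penalty=0.05):
    """Hypothetical reward for a task-oriented bot.

    goal          -- dict of slot -> value the user wants
    filled_slots  -- dict of slot -> value the bot has collected
    turns         -- number of dialogue turns used so far
    turn_penalty  -- assumed cost per turn, to favor short dialogues
    """
    # Success means every goal slot was filled with the requested value.
    success = all(filled_slots.get(slot) == value for slot, value in goal.items())
    return (1.0 if success else 0.0) - turn_penalty * turns


# Illustrative restaurant-booking goal.
goal = {"cuisine": "italian", "party_size": 2}

completed = task_reward(goal, {"cuisine": "italian", "party_size": 2}, turns=4)
incomplete = task_reward(goal, {"cuisine": "italian"}, turns=4)
```

A completed booking scores higher than an incomplete one, and shorter dialogues score higher than longer ones. A social bot has no comparable success test to plug in, since nothing like "all slots filled" tells us whether a chat went well, and that is the difficulty described above.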