Artificially intelligent beings, once portrayed on film as gigantic machines inside mammoth control rooms, are now part of every smartphone in the form of virtual assistants. As I write this post, technology giants and startups alike are working tirelessly to build the perfect virtual assistant for everyone. In this endeavor, they have been strongly aided by the availability of inexpensive computing platforms in the form of smartphones, and significant advances in speech recognition algorithms have added to the optimism. Anyone who follows the technology will also appreciate that access to data through web APIs has boosted both research and results in the domain.
But how far have we come in achieving the desired success? Perhaps the built-in assistants in smartphones, which respond to mostly unstructured queries, are the biggest win we have today. As we move forward, however, we should look toward more specialized and powerful tools. Self-driving cars are a popular example of that category, and areas such as healthcare and cybersecurity are also likely to witness heightened virtual assistant activity in the future. Much of this success will depend on the level of back-and-forth interaction virtual assistants can achieve, as well as the degree of personalization they can accomplish. As many would agree, we want to reach a stage where the virtual assistant is so good that we don't even realize it's virtual.
Why the optimism? We can safely bet that a time when speech recognition is ubiquitous isn't far over the horizon. Speech will become the primary interface between users and their connected homes and devices, such as in-car systems. Even in the professional arena, functions that require hands-free mobility (in hospitals, warehouses, labs, and on shop floors) will see a rise in speech recognition applications. Comscore has predicted that by 2020, 200 billion voice searches will be conducted every month. As a data scientist, I must add that success will depend strongly on how well machine learning algorithms can recognize user speech without error.
Who is dominating the market? Siri, Google Now, Cortana, and Amazon Alexa are the best-known names in the virtual assistant market. Startups in the fray include the likes of Amy (a meeting scheduler), Shae (a health assistant), and Otto (an optical assistant). It is important to note, however, that the big players are likely to evolve into platforms, allowing for both hardware and software integrations.
How does machine learning work in virtual assistants? It is a combination of several algorithms, because multiple problems need to be solved. First, a semantic token parser is required to break the user's query into meaningful units. Its output feeds an expert database of semantic representations (logical networks such as Prolog, object-based databases, or functional representations). On top of that knowledge base, you implement learning structures, either built from existing knowledge or programmed directly; these serve the pattern matching algorithm. The next step is finding a valid answer in the network, which requires a separate graph-search algorithm. You then need an interface algorithm for user interaction (an IRC channel, the web, a web socket, a general socket, or an API). Finally, you need a semantic generator: an algorithm that turns the found solution into a grammatically and syntactically correct natural-language sentence, so the answer can be presented in a human-readable way.
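To make the pipeline above concrete, here is a minimal toy sketch in Python. Every name, pattern, and fact in it is hypothetical (a stand-in dictionary plays the role of the expert database, and simple token matching stands in for learned pattern matching); a real assistant would use trained NLP models and a proper semantic network.

```python
import re

# 1. Semantic token parser: reduce a raw query to normalized content tokens.
STOPWORDS = {"the", "a", "an", "is", "of", "what", "who"}

def parse_tokens(query):
    return [t for t in re.findall(r"[a-z']+", query.lower())
            if t not in STOPWORDS]

# 2. Expert database: a toy semantic network (node -> relation -> node).
KNOWLEDGE = {
    "france": {"capital": "paris"},
}

# 3. Pattern matcher: map required tokens to a (subject, relation) lookup.
PATTERNS = [
    ({"capital", "france"}, ("france", "capital")),
]

def match_pattern(tokens):
    token_set = set(tokens)
    for required, lookup in PATTERNS:
        if required <= token_set:
            return lookup
    return None

# 4. Answer search: follow the relation edge in the semantic network.
def find_answer(lookup):
    if lookup is None:
        return None
    subject, relation = lookup
    return KNOWLEDGE.get(subject, {}).get(relation)

# 5. Semantic generator: render the found node as a natural-language reply.
def generate_response(answer):
    if answer is None:
        return "Sorry, I don't know that."
    return f"The answer is {answer.title()}."

def assistant(query):
    return generate_response(find_answer(match_pattern(parse_tokens(query))))
```

With this sketch, `assistant("What is the capital of France?")` walks all five stages and produces a sentence, while an unrecognized query falls through to a polite fallback. The design point is that each stage is a separate, swappable algorithm, exactly as described above.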
As you can see, the process isn't easy. And for any aspiring data scientist, no matter how many have written off NLP as a technique that has plateaued in utility, this should be inspiring.