Aravind Ganapathiraju Detailed the Challenges in Applied Conversational AI
In an insightful Worldwide AI Webinar session, Aravind Ganapathiraju, the VP of Applied AI at Forbes AI50 company Uniphore, walked the audience through the challenges in applied conversational AI. Read on for the highlights of his keynote.
Watch his whole presentation on our website or YouTube channel.

A brief background of conversational AI and its future

Conversational AI, according to Aravind, has been around for a while. It began primarily as a replacement for traditional IVR systems, in which callers punched in numbers to step through a simple flow and accomplish a basic task. Conversational AI made that task less mundane and more natural, yet the flows were still simple.
Conversational AI systems are capable of handling complex flows. They can handle soft landings when systems make mistakes or the customer provides incorrect information.
Conversations don't reach a dead end. There are simple ways in which you can change the flow of the conversation to solicit the correct information so that the customer on the other end doesn't feel burdened by the fact that they're talking to a machine.
Advances in speech recognition have also made dealing with intents and other aspects of conversations quite easy. The future seems even brighter given the complexity of conversations we can handle today, thanks to advancements in ASR, natural language processing, and computing power.
Aravind also touched upon what he thought conversational AI would be capable of in the near future. He believed that conversational AI could handle a higher level of complexity and we would go beyond the simple self-service systems or intelligent assistants to mining information and giving guidance to humans while the conversation is happening in real-time.

Challenges in applied AI

Aravind talked a bit about the obstacles that teams have encountered with generic applied AI. 
From his experience, approaching applied AI has to start with a use case definition. Then, from a plethora of great advancements in research, you have to decide which ones are relevant to your particular use case and your particular problem. Once that is decided, you have to start digging into what you want to model, the features you want for modeling, the data that goes into the modeling (which falls under DataOps), and how you are going to deploy the models once they are built (which falls under MLOps).
Next, you’d have to think about a production system. Considering compute resource constraints and data constraints is vital. When you think about data, you also need to consider the various sources from which it is ingested into the system, which requires a data ingestion pipeline, data transformations, and several other aspects of data handling.
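The ingestion and transformation steps Aravind describes can be sketched in a few lines. This is a minimal illustration only, assuming JSON-lines transcript records as the source format; the field names (`call_id`, `text`) are invented for the example, and a real pipeline would add logging, schema validation, and quarantining of bad records.

```python
import json

def ingest(lines):
    """Parse raw JSON-lines records, skipping malformed ones."""
    for line in lines:
        try:
            yield json.loads(line)
        except json.JSONDecodeError:
            continue  # in production: log and quarantine the bad record

def transform(record):
    """Normalize one record: trim and lowercase text, drop empty utterances."""
    text = record.get("text", "").strip().lower()
    if not text:
        return None
    return {"call_id": record.get("call_id"), "text": text}

def pipeline(raw_lines):
    """Ingest then transform, keeping only records that survive both stages."""
    return [t for r in ingest(raw_lines) for t in [transform(r)] if t]
```

Keeping each stage as a small, testable function makes it easy to swap in new sources or transformations as data constraints change.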
Fortunately, technology is advancing at a rapid pace. Aravind claimed that feature engineering will soon be a thing of the past: with the advancement of transformers and sequence-to-sequence models, the need for feature engineering in deep learning has gone down significantly over the past few years.

Challenges in applied conversational AI

Aravind listed out a couple of challenges of conversational AI he has witnessed throughout his career.

1. Multifacetedness of conversation

According to Aravind, what makes conversational AI challenging is the multifacetedness of conversations. The first facet he mentioned was the domain. He talked about the release of Whisper, a large open-source ASR model from OpenAI trained on 600,000+ hours of speech, and contrasted it with his own experience of struggling to train systems with even hundreds of hours of speech. Whisper, he said, took things to the next level by changing the complexity by two orders of magnitude.
Aravind also brought up multimodal systems and infrastructure constraints as other facets of conversations.
He dug deeper into the complexity of language.
Aravind noted that there isn't a conversational AI system out there whose planning hasn't accounted for the impact of language coverage. Limited resources for a given language, multilingualism, accented speech, languages with complicated writing systems, and so on are a few aspects of language that need to be taken into consideration when approaching conversational AI.

2. Choosing the right model

Aravind encouraged teams to use simple models. In several cases, for NLP specifically, he has found that simple TF-IDF with embeddings and cosine similarity can lead to great systems with excellent accuracy.
He recommended being prudent about what you choose and about your modeling data needs. You also have to decide whether to fine-tune a pre-trained model or build one from the ground up.
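Aravind's point about simple models can be illustrated with a toy intent matcher built from TF-IDF and cosine similarity. This is a minimal stdlib-only sketch (the sample utterances are invented, and the embeddings he mentioned are omitted for brevity); a production system would more likely use a library such as scikit-learn.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (term -> weight dicts) for a list of documents."""
    tokenized = [doc.lower().split() for doc in docs]
    n = len(tokenized)
    # Document frequency: how many documents each term appears in.
    df = Counter(term for doc in tokenized for term in set(doc))
    vectors = []
    for doc in tokenized:
        tf = Counter(doc)
        vectors.append({term: (count / len(doc)) * math.log(n / df[term])
                        for term, count in tf.items()})
    return vectors

def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors represented as dicts."""
    dot = sum(a[t] * b.get(t, 0.0) for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)
```

To match a user query against example utterances, vectorize the corpus together with the query and pick the utterance with the highest cosine similarity. Simple as it is, this baseline is often a strong starting point before reaching for heavier models.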

We are entering the golden age of conversational AI

Aravind emphasized his belief that we are entering the golden age of conversational AI for three reasons:
  1. The technology is ready: With the advent of transformers and end-to-end neural approaches, access to compute today is far beyond what his generation thought possible, and there is a huge availability of pre-trained models that can be fine-tuned for a lot of applications and use cases.
  2. The people are ready: Aravind talked about “the army of practitioners” today and the availability of open-source literature that was once limited. The business is lucrative, which attracts people to pursue this as a field of interest and makes them available to contribute to building great systems.
  3. The data is ready: There is a lot of open-source data available to build conversational AI systems, both for speech recognition and natural language processing. There is now a great vendor pool that wasn’t available a few decades back, and tools for rapid development are also accessible. The problem of low-resource data, as Aravind put it, is “history at this point”.
“Conversational AI truly is ready for takeoff. We have the tools to make it happen. We have the data that's available and the vendors that can make it available for us. We have the infrastructure and the computing. We have the models and the science that has improved over the past few years that can make this really feasible, so I'm really optimistic about what conversational AI can bring us.”