We’re on the cusp of a multimodal boom

Trend alert! Multimodal AI is gaining traction. Are you ready for it? NLX is here to help bring multimodal to your self-service experiences.

Andrei Papancea

11/29/2023

Reports of innovative applications of artificial intelligence (AI) seem to emerge on a nearly daily basis. Within generative AI, one prevailing trend capturing experts' attention is multimodality. In March, OpenAI launched the multimodal GPT-4, and just the other month Meta unveiled an open-source multimodal generative AI system. Many others have continued to develop similar technology in the AI space since.

Just like video killed the radio star, multimodal is poised to overtake unimodal experiences. It leverages an enterprise's full omnichannel investment, thoughtfully weaving channels together for the best possible experience.

NLX has been doing multimodal conversational AI from the get-go. NLX defines multimodal as two or more synchronized channels: for example, a voice call and an on-screen interface working through the same conversation in tandem.
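
To make that definition concrete, here's a minimal sketch in TypeScript of what channel synchronization can look like: a single conversation state drives both a spoken prompt and an on-screen view at each step. All of the names below (MultimodalSession, VoiceChannel, ScreenChannel) are hypothetical illustrations, not NLX's actual SDK.

```typescript
// A minimal sketch of "two or more synchronized channels."
// Every name here is a hypothetical illustration, not NLX's actual SDK.

interface ConversationState {
  step: string;                                  // e.g., "collect-confirmation"
  prompt: string;                                // what the assistant says aloud
  screen?: { title: string; fields: string[] };  // what the screen shows
}

interface Channel {
  render(state: ConversationState): void;
}

class VoiceChannel implements Channel {
  render(state: ConversationState): void {
    // A real system would hand this to a text-to-speech service.
    console.log(`[voice] ${state.prompt}`);
  }
}

class ScreenChannel implements Channel {
  render(state: ConversationState): void {
    // A real system would update a web or mobile UI here.
    if (state.screen) {
      console.log(`[screen] ${state.screen.title}: ${state.screen.fields.join(", ")}`);
    }
  }
}

// The session is multimodal because a single state update fans out
// to every registered channel at once, keeping them synchronized.
class MultimodalSession {
  constructor(private channels: Channel[]) {}

  update(state: ConversationState): void {
    this.channels.forEach((channel) => channel.render(state));
  }
}

const session = new MultimodalSession([new VoiceChannel(), new ScreenChannel()]);
session.update({
  step: "collect-confirmation",
  prompt: "Please read me your confirmation number, or type it on the screen.",
  screen: { title: "Confirmation number", fields: ["6-character code"] },
});
```

The key design point is that neither channel owns the conversation; both subscribe to the same state, so a user can answer by voice or by tapping the screen and the experience stays consistent.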

If you haven’t tried our multimodal conversational AI experience co-created with AWS yet, head to The Showroom and give it a shot now. 

Our clients using multimodal self-service experiences see higher CSAT scores (+85%) and better automation rates (+80%). With generative AI powering even better intent and slot accuracy than a traditional Natural Language Processing (NLP) engine can achieve, those numbers continue to rise. The result? One of our Fortune 500 CX leaders said it best: "A win for the customer, a win for the employee, and a win for the stakeholders."
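
To give a sense of how generative AI can improve intent and slot accuracy, here's a hedged sketch: rather than matching an utterance against fixed training phrases, the model is asked to return the intent and slots as structured JSON, which handles paraphrases a traditional NLP engine might miss. The callModel function below is a hypothetical stand-in for any generative model API, not NLX's actual pipeline.

```typescript
// A hedged sketch of generative-AI-powered intent and slot extraction.
// `callModel` is a hypothetical stand-in for any generative model API;
// this is not NLX's actual pipeline.

interface NluResult {
  intent: string;                 // e.g., "CancelBooking"
  slots: Record<string, string>;  // e.g., { confirmationNumber: "ABC123" }
}

// Stand-in for a real model call; swap in your provider's SDK.
async function callModel(_prompt: string): Promise<string> {
  // Canned response so the sketch runs end to end.
  return JSON.stringify({
    intent: "CancelBooking",
    slots: { confirmationNumber: "ABC123" },
  });
}

async function extractIntentAndSlots(utterance: string): Promise<NluResult> {
  // Instead of matching against fixed training phrases, ask the model
  // directly for structured output.
  const prompt = `Reply with JSON only, shaped as
{"intent": string, "slots": {string: string}}.
Known intents: BookFlight, CancelBooking, CheckStatus.
Known slots: destination, date, confirmationNumber.
Utterance: "${utterance}"`;

  return JSON.parse(await callModel(prompt)) as NluResult;
}

// A paraphrase with little keyword overlap with typical training phrases
// still resolves cleanly, which is where generative models tend to beat
// keyword-trained NLU.
extractIntentAndSlots("Get me out of this booking, it's ABC123")
  .then((result) => console.log(result.intent, result.slots));
```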

Multimodal will expand from common devices like mobile phones and laptops and in-home virtual assistants like Amazon's Alexa and Google Assistant to the wide range of devices that surround us in our daily lives: refrigerators, doors, and so much more.

To be clear, we're not saying unimodal experiences will die off - radio and podcasts are still wildly popular, and there are times when you can't use multiple senses to reply. For example, if you're driving, you'll use voice alone rather than a multimodal onscreen-plus-voice experience to stay safe. While single-channel experiences remain highly valuable, trendlines continue to point toward a more multimodal customer experience ecosystem, with better customer satisfaction and self-service automation rates among its many benefits.

To learn more about multimodal and how you can incorporate it into your self-service experience, contact us here.

Andrei Papancea

Andrei is our CEO and Swiss Army knife for all things natural language-related.

He built the Natural Language Understanding platform for American Express, processing millions of conversations across AmEx’s main servicing channels.

As Director of Engineering, he deployed AWS across the business units of Argo Group, a publicly traded US company, and successfully guided the implementation through a technical audit (30+ AWS accounts managed).

He teaches graduate lectures on Cloud Computing and Big Data at Columbia University.

He holds an M.S. in Computer Science from Columbia University.