Ask the Expert: Your Topic Modeling and Machine Learning Questions Answered!

The Team at CallMiner

July 18, 2019

Our ongoing AI webinar series has drawn many great audience questions on artificial intelligence, machine learning, and natural language processing. Here we highlight some from our most recent webinar, How to Use Topic Modeling to Extract Conversational Insights. If you missed any of the webinars, we are replaying them all during our Webinarstock virtual conference AI day on Wednesday, July 24th.

How is topic modeling different from categories themselves? e.g. for most used words would we build categories with those words?

Topic modeling is an unsupervised learning technique, which means we don’t yet know what categories exist in the data. We use it in the initial exploration phase to find the common topics in the data. Once you discover the topics, you can use the language in those topics to create categories.

Is topic modeling supervised machine learning (ML)?

No. Topic modeling is unsupervised; its goal is to find the topics in the data. In practice, though, you will likely combine topic modeling with classification models, because the output of topic modeling becomes the input to classification. You can use classification to verify whether the topic modeling results make business sense. We have built a powerful set of tools that can build unsupervised ML topics, but any unsupervised approach still needs some human intervention, just not in the creation of the topics.

How can we validate the topic modeling results?

In most cases, machine learning models don’t have a business understanding. And because you don’t have labels to verify against, you cannot validate the topics automatically. Someone with an understanding of the particular business has to look at the topics and determine whether they are important. We are in the process of creating the most common topics, so in the future you will be able to use our massive library of topics to help make sense of your data.

What do you think is the most powerful insight we can gain from topic turns? Do the set of call topics stay fairly consistent across different types of businesses?

Finding actionable insights can alter agent behavior and make a true business impact. But what the actionable insight actually is varies by business and department. For a customer support department, for example, transfers to the wrong agent could be detrimental, and better agent training could alleviate that issue.

Why focus on turn topics instead of call topics?

Focusing on speaker turn topics allows us to generate insight into the elements of the call flow that affected your client’s outcome measure. Let’s say the customer is interested in determining the factors that affect NPS. Call topics may be able to determine why the customer called, but they cannot tell you why the NPS is high or low for that call. Turn topics allow us to distinguish between the agent’s individual turns and identify exactly what the agent said that resulted in the high or low score. We can then suggest that the client promote or discourage that behavior.
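As a rough illustration of relating turn topics to an outcome measure, the sketch below averages NPS over the calls in which each agent turn topic appears. The data, the topic names, and the function `mean_nps_by_turn_topic` are all hypothetical, and a simple average like this only shows association, not cause.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy data: each call has an NPS score and the topics assigned
# to its agent turns. Topic names are invented for illustration.
calls = [
    {"nps": 9, "turn_topics": ["greeting", "empathy", "proper_closing"]},
    {"nps": 3, "turn_topics": ["greeting", "transfer", "hold"]},
    {"nps": 8, "turn_topics": ["greeting", "empathy"]},
    {"nps": 2, "turn_topics": ["transfer", "hold"]},
]


def mean_nps_by_turn_topic(calls):
    """Average NPS of the calls in which each turn topic appears at least once."""
    scores = defaultdict(list)
    for call in calls:
        # set() so a topic repeated within one call is counted once per call
        for topic in set(call["turn_topics"]):
            scores[topic].append(call["nps"])
    return {topic: mean(values) for topic, values in scores.items()}


result = mean_nps_by_turn_topic(calls)
for topic, avg in sorted(result.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{topic:15s} {avg:.1f}")
```

A call-level topic such as "billing question" would be identical for every turn in the call, so it could not separate the "empathy" turns from the "transfer" turns the way this per-turn view does.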

How can we take the topic model and determine if the conversation was a first call resolution (FCR)?

In many situations, the metadata captured by your call center can tell you whether you have multiple calls from the same customer in a short period of time. If so, you may want to treat that as an FCR issue. If you are not satisfied with that answer, you can identify the topic turns in the call and see whether there is language related to positive customer responses, such as appreciation or a proper closing.

Also, turns are more granular than calls. Just as when designing a recipe, one wants to consider each ingredient carefully; thinking only about the meal as a whole loses that level of fine-grained refinement. Call topics are still useful when considering the customer journey, where a sequence of phone calls can lead to churn or satisfaction.
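The metadata check for repeat calls described in this answer could be sketched as follows. The record layout, the function name `flag_first_call_resolution`, and the 7-day window are assumptions made for the example; the right window depends on your business.

```python
from datetime import datetime, timedelta

# Hypothetical call metadata: (customer_id, call start time).
calls = [
    ("cust-1", datetime(2019, 7, 1, 9, 0)),
    ("cust-1", datetime(2019, 7, 2, 15, 30)),
    ("cust-2", datetime(2019, 7, 1, 10, 0)),
    ("cust-2", datetime(2019, 7, 20, 11, 0)),
    ("cust-3", datetime(2019, 7, 5, 8, 45)),
]


def flag_first_call_resolution(calls, window=timedelta(days=7)):
    """Map (customer_id, start) to True when no later call from the same
    customer starts within `window` (an assumed threshold)."""
    by_customer = {}
    for customer_id, start in calls:
        by_customer.setdefault(customer_id, []).append(start)

    flags = {}
    for customer_id, starts in by_customer.items():
        starts.sort()
        for i, start in enumerate(starts):
            next_start = starts[i + 1] if i + 1 < len(starts) else None
            repeated = next_start is not None and next_start - start <= window
            flags[(customer_id, start)] = not repeated
    return flags


fcr = flag_first_call_resolution(calls)
for (customer_id, start), resolved in sorted(fcr.items()):
    print(customer_id, start.isoformat(), "FCR" if resolved else "repeat call followed")
```

Calls flagged as followed by a repeat call are the ones worth inspecting at the turn level, for example to check whether the closing turns contain appreciation language.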

What are you using for topic clustering? LDA?

We found Latent Dirichlet Allocation (LDA) to be less accurate and hard to use. Instead, we convert each sentence into an embedding and cluster the sentence embeddings, because sentences with similar contextual meanings end up close together.
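To make the idea of clustering sentence embeddings concrete, here is a minimal sketch. It is not CallMiner’s implementation: the 2-D vectors are invented stand-ins for the output of a real sentence encoder, and the clustering is a plain k-means with cosine similarity, chosen only so the example runs without any model or library download.

```python
import math

# Stand-in "sentence embeddings": in practice these vectors would come from a
# sentence encoder. The 2-D values below are invented for illustration.
turns = {
    "I want to cancel my subscription": (0.9, 0.1),
    "Please close my account": (0.8, 0.2),
    "My bill is higher than last month": (0.1, 0.9),
    "Why was I charged twice": (0.2, 0.8),
}


def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def kmeans(vectors, k, iterations=10):
    """Plain k-means with cosine similarity; returns one cluster label per vector."""
    # Farthest-first seeding: start from the first vector, then repeatedly add
    # the vector least similar to the centroids chosen so far.
    centroids = [vectors[0]]
    while len(centroids) < k:
        centroids.append(
            min(vectors, key=lambda v: max(cosine(v, c) for c in centroids))
        )
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assign each vector to its most similar centroid.
        labels = [
            max(range(k), key=lambda j: cosine(v, centroids[j])) for v in vectors
        ]
        # Move each centroid to the mean of its members.
        for j in range(k):
            members = [v for v, label in zip(vectors, labels) if label == j]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[j] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels


sentences = list(turns)
labels = kmeans([turns[s] for s in sentences], k=2)
for sentence, label in zip(sentences, labels):
    print(label, sentence)
```

With real embeddings the vectors would have hundreds of dimensions and the number of clusters would need tuning, but the principle is the same: turns that mean similar things land in the same cluster even when they share few words.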

How are the lists of keywords built for each topic?

Traditional techniques mainly focus on the frequency of words in the sentences. You are welcome to try the traditional Latent Dirichlet Allocation (LDA) approach, but we found sentence embeddings to be more useful.
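For comparison, the traditional frequency-based approach can be sketched in a few lines: count the words in the sentences assigned to each topic and keep the most frequent ones. The topic names, sentences, stop-word list, and the helper `top_keywords` are all invented for this example.

```python
import re
from collections import Counter

# A tiny, hand-picked stop-word list (an assumption for this example).
STOP_WORDS = {"i", "my", "to", "the", "a", "is", "was", "want", "please", "why", "than"}

# Hypothetical sentences already grouped by topic.
topics = {
    "cancellation": [
        "I want to cancel my subscription",
        "Please cancel the subscription",
        "Cancel my account",
    ],
    "billing": [
        "My bill is higher than last month",
        "Why was my bill charged twice",
        "The bill was charged to my card",
    ],
}


def top_keywords(sentences, n=2):
    """Return the n most frequent non-stop-words across the given sentences."""
    counts = Counter()
    for sentence in sentences:
        words = re.findall(r"[a-z']+", sentence.lower())
        counts.update(word for word in words if word not in STOP_WORDS)
    return [word for word, _ in counts.most_common(n)]


keywords = {name: top_keywords(sentences) for name, sentences in topics.items()}
print(keywords)
```

A count like this treats "cancel" and "close my account" as unrelated because they share no words, which is the kind of gap that sentence-level embeddings are meant to address.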

What’s an “engineered category”? How is it different than the traditional list of keywords?

Our “engineered categories” combine Eureka categories that analysts created using business knowledge, sentence embeddings of the speaker turns, and acoustic features, including tone and silence length. Respectively, these contribute professional input with business understanding, sentence-level contextual meaning beyond keywords, and audio signals other than words. These are all features that a simple count of keywords doesn’t have.

What text classification algorithms do you use for noisy data? Have you tried to use VAE or other generative techniques for that task?

We actually think the noise in our text features may give us additional information about the agent or customer. As for denoising techniques like the Variational Autoencoder (VAE), we found that they have useful applications in image processing and speech recognition but didn’t give useful information for text analysis. There is some interesting research using XLNet, but we haven’t explored those directions yet.

When we talked about acoustics measures like tempo, agitation etc. is it by word and time?

They exist together, so yes, time is the key. But we don’t append the word, we append the context. Some of our acoustic measures, like silences and overtalk, are by word and time. Other kinds of acoustic measures could be useful as well.

Also, the categories we create are based on words, but we keep the timing information by putting words in context and in order. A word can have different meanings depending on when it was said in the conversation.

Do business functions play a role in defining or modifying the categories? Influence.

Absolutely. When designing a solution for clients, we always have to keep in mind how it affects the client’s business model. Take the case we referred to earlier, in which transferring phone calls to the wrong agent negatively impacts customer satisfaction. The more detailed a client can be, the more we can help direct them to the right solution.

Listen to our on-demand webinar, Sweet Emotion: Measuring Emotion for Better Experiences. Hear how through emotion you can gain actionable insights to better your Voice of the Customer and Voice of the Employee programs.
