
Training natural language models to understand new customer events in hours

by Yi Yang
Nov 17 · 2 mins

Named entity recognition (NER) aims to identify and categorize spans of text into entity classes, such as people, organizations, dates, product names, and locations. As a core language understanding task, NER powers numerous high-impact applications including search, question answering, and recommendation.

At ASAPP, we employ NER to understand customer needs, such as upgrading a plan to a specific package, making an appointment on a specific date, or troubleshooting a particular device. We then highlight these business-related entities for agents, making it easier for them to recognize and interact with key information.

Typically, training a state-of-the-art NER model requires a large amount of annotated data, which can be expensive and take weeks to obtain. Many business use cases, however, demand a much faster turnaround. For example, you can’t afford a weeks-long wait when dealing with emerging entities, like those related to an unexpected natural disaster or a fast-moving pandemic such as COVID-19.

My colleague, postdoctoral researcher Arzoo Katiyar, and I tackled this challenge with few-shot learning techniques. In few-shot learning, a model is trained from only a few labeled examples (typically one to five) per class. We applied this technique to NER by transferring knowledge from public NER datasets, which cover only general-domain entities, to each new target entity class. This is achieved by learning a similarity function between a target word and an entity class, as sketched below.
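To make the idea concrete, here is a minimal sketch of nearest-neighbor token labeling. The tokens, entity classes, and embeddings are hypothetical, and a real system would use contextual representations from a pretrained encoder rather than fixed vectors:

```python
import numpy as np

def nearest_neighbor_tag(query_embs, support_embs, support_labels):
    """Label each query token with the label of its most similar
    support (few-shot) token, using cosine similarity."""
    # Normalize so that dot products equal cosine similarities.
    q = query_embs / np.linalg.norm(query_embs, axis=1, keepdims=True)
    s = support_embs / np.linalg.norm(support_embs, axis=1, keepdims=True)
    sims = q @ s.T                 # (n_query, n_support)
    nearest = sims.argmax(axis=1)  # index of the closest support token
    return [support_labels[i] for i in nearest]

# Toy 4-dim embeddings: two business entity classes plus 'O' (Outside).
support_embs = np.array([[0.9, 0.1, 0.0, 0.0],   # "Tuesday" -> DATE
                         [0.0, 0.8, 0.2, 0.0],   # "iPhone"  -> DEVICE
                         [0.1, 0.0, 0.0, 0.9]])  # "the"     -> O
support_labels = ["DATE", "DEVICE", "O"]
query_embs = np.array([[0.8, 0.2, 0.1, 0.0],     # date-like token
                       [0.0, 0.1, 0.1, 0.9]])    # function word
print(nearest_neighbor_tag(query_embs, support_embs, support_labels))
# -> ['DATE', 'O']
```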

When we apply structured nearest neighbor learning techniques to named entity recognition (NER), we’re able to double performance, as shared in our paper published at EMNLP 2020. This translates to higher accuracy and better customer experiences.

Yi Yang, PhD

Few-shot learning techniques have been applied to language understanding tasks in the past. However, compared with the reasonable results obtained by state-of-the-art few-shot text classification systems (e.g., classifying a web page into a topic such as Troubleshooting, Billing, or Accounts), prior state-of-the-art few-shot NER performance was far from satisfactory.

Moreover, existing few-shot learning techniques typically require the training and deployment of a fresh model, in addition to the existing NER system, which can be costly and tedious.

We identified two issues with adopting existing few-shot learning techniques for NER. First, existing techniques assume that each class has a single semantic meaning. NER, however, adds a special class called ‘Outside’ for words that do not belong to any of the entities the model is looking for. A word labeled ‘Outside’ in one dataset can still be a valid entity in another, so the ‘Outside’ class actually corresponds to multiple semantic meanings.

Second, they fail to model label-label dependencies, e.g., given the phrase “ASAPP Inc”, knowing that “ASAPP” is labeled as ‘Organization’ would suggest that “Inc” is also likely to be labeled as ‘Organization’.

Our approach addresses both issues with nearest-neighbor learning and structured decoding techniques, as detailed in our paper published at EMNLP 2020.
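To illustrate the structured decoding half, here is a standard Viterbi sketch over hypothetical per-token emission scores (think nearest-neighbor similarities) and label-transition scores. It shows how labeling “ASAPP” as ‘Organization’ pulls “Inc” toward the same label; the scores are made up, and this is not the exact formulation from the paper:

```python
import numpy as np

def viterbi_decode(emissions, transitions, labels):
    """Return the highest-scoring label sequence given per-token
    emission scores and label-to-label transition scores."""
    n_tokens, n_labels = emissions.shape
    score = emissions[0].copy()
    back = np.zeros((n_tokens, n_labels), dtype=int)
    for t in range(1, n_tokens):
        # Score of the best path ending in each label at step t.
        cand = score[:, None] + transitions + emissions[t][None, :]
        back[t] = cand.argmax(axis=0)
        score = cand.max(axis=0)
    # Trace the best path backwards.
    path = [int(score.argmax())]
    for t in range(n_tokens - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return [labels[i] for i in reversed(path)]

labels = ["O", "ORG"]
# "ASAPP Inc": the second token is ambiguous on its own ...
emissions = np.array([[0.4, 0.6],
                      [0.5, 0.5]])
# ... but ORG -> ORG transitions are rewarded, so "Inc" follows "ASAPP".
transitions = np.array([[0.2, 0.0],    # from O   to (O, ORG)
                        [0.0, 0.3]])   # from ORG to (O, ORG)
print(viterbi_decode(emissions, transitions, labels))
# -> ['ORG', 'ORG']
```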

By combining these techniques, we created a novel few-shot NER system that outperforms the previous state-of-the-art system, doubling the F1 score (the standard evaluation metric for NER, similar to accuracy) from 25% to 50%. Thanks to the flexibility of the nearest-neighbor technique, the system can work on top of a conventionally deployed NER model, which eliminates the expensive process of model retraining and redeployment.


The work reflects the innovation principle of our world-class research team at ASAPP: identify the most challenging problems in customer experience, and push hard to overcome them without being confined by current limits.


Does AI really improve agent efficiency?

by Heather Reed
Oct 30 · 2 mins

At ASAPP, we develop AI-driven platform features to improve agent efficiency. To be data-driven about feature development, we need a way to measure how well we’re performing. When we introduce something new (like a new feature), we want to know how it impacts agent efficiency so that we can learn from it to drive the next features. Confounding effects act on both 1) the features we wish to analyze and 2) the response KPIs of interest, so we need specialized techniques to disentangle them and properly measure the impact of our features. This post describes those techniques.

We can think of agent efficiency as throughput of customer issues: the number of issues per agent per unit time. We can improve throughput both by decreasing agent handle time (AHT) for a single issue and by increasing concurrency (the number of issues an agent handles at the same time). This post specifically discusses how we measure the AHT gains enabled by agent augmentation and automation features in a digital messaging application.

Throughput = Concurrency / AHT
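For example, an agent handling two issues concurrently at an average handle time of 10 minutes has a throughput of twelve issues per hour.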

Calculating how much augmentation improves AHT

AHT improvement should not be measured by usage of a feature alone. Usage is a great signal of whether agents find a feature useful for doing their jobs (otherwise we would expect them not to use it), but we can think of AHT improvement as the product of two factors: feature usage and the feature’s impact on AHT when it is used.
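For example (with made-up numbers), if a feature is used on half of all issues and cuts AHT by 10% on the issues where it is used, the expected overall AHT improvement is roughly 5%.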

AHT improvement details

Usage is easy to measure: we have data events every time an agent clicks on a suggestion. These events are aggregated into tables that we can consume, and from them we can plot average daily augmentation rates over time, as in the graph below.

Augmentation usage over time
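As a minimal illustration of that aggregation, assuming a hypothetical event table with one row per issue and an augmented flag, the daily augmentation rate is just a grouped mean:

```python
import pandas as pd

# Hypothetical event data: one row per issue; column names are illustrative.
events = pd.DataFrame({
    "date":      ["2021-03-01", "2021-03-01", "2021-03-02", "2021-03-02"],
    "issue_id":  [101, 102, 103, 104],
    "augmented": [True, False, True, True],  # did the agent use a suggestion?
})

# Share of issues per day in which the agent used at least one suggestion.
daily_rate = events.groupby("date")["augmented"].mean()
print(daily_rate)  # 2021-03-01: 0.50, 2021-03-02: 1.00
```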

But measuring impact is more difficult, because other factors also contribute to AHT. For example, some types of customer issues naturally take longer to resolve than others (such as troubleshooting, where much of the handle time is spent waiting for devices to restart). If those types of issues happen to have a lower augmentation rate than others, it would be unfair to conclude that augmentation causes longer AHT when it is really the content of the issue that drives the longer handle time.

As use of agent augmentation features increases, handle time decreases, and agents can manage concurrent conversations. This results in higher throughput.

Heather Reed, PhD

Measuring impact

While randomized experiments are the gold standard for measuring the causal effect of a treatment (e.g. a feature) on KPIs of interest (e.g. AHT), observational studies also enable measurement of feature impact without requiring experimentation. This becomes tricky because there are confounding factors that may impact both:

  1. Agents’ likelihood of using augmentation (e.g. some customer issues are more augmentable than others); and
  2. AHT (e.g. seasoned agents may be “naturally” faster at solving customer problems).

We can build a causal, statistical model to account for these confounding factors, constructed in such a way that we are able to isolate the impact of the variable that we are interested in (i.e. augmentation). We solve this problem by using random effects regression analysis.

In regression analysis, the goal is to fit regression parameters that describe the relationship between features and responses of interest. In random effects modeling, the regression parameters vary by group (e.g., by individual agent or type of issue), and the model accounts for both the average group effect (i.e., across all agents) and the offset between each agent’s effect and the group average. Random effects models are beneficial when the data exhibit a clustered structure, so the observations are not independent. A further benefit is that, in the absence of much data for a member of the group (e.g., a new agent), the model shrinks the estimate for that agent toward the group average.
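As a sketch of what such a model can look like, the snippet below fits a random intercept-and-slope model on simulated data with statsmodels. The schema (aht, augmented, agent_id) is hypothetical, and this frequentist fit is only a stand-in for the Bayesian treatment implied by the posterior distributions mentioned below:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)

# Simulate issue-level data: each agent has a baseline AHT and an
# individual response to augmentation (hypothetical schema).
rows = []
for agent in range(20):
    base = rng.normal(600, 60)       # agent's baseline AHT, in seconds
    slope = rng.normal(-120, 20)     # agent's AHT change when augmented
    for _ in range(30):
        aug = rng.integers(0, 2)     # was this issue augmented?
        aht = base + slope * aug + rng.normal(0, 30)
        rows.append({"agent_id": agent, "augmented": aug, "aht": aht})
df = pd.DataFrame(rows)

# Random intercept and slope per agent: the model estimates the
# group-average effect of augmentation plus per-agent offsets,
# shrinking agents with little data toward the group average.
model = smf.mixedlm("aht ~ augmented", data=df,
                    groups=df["agent_id"], re_formula="~augmented")
result = model.fit()
print(result.params["augmented"])  # estimated average AHT change (seconds)
```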

The schematic below demonstrates the concept of a random effects model where there is a group-varying intercept and slope. The black line represents the average effect for the group (i.e. all agents), and the pink and blue lines are the offsets from the group average that correspond to specific agents.

Random effects model schematic: group-varying intercept and slope

When constructing the statistical model, we consider the clustered and/or hierarchical nature of the data. Then, we fit this regression model with production data to learn what these model coefficients are (or more accurately, we learn the posterior distributions of these coefficients).

We then use this trained model to infer insights about augmentation. We do this by using the model to answer the counterfactual question, “What would the AHT have been, if there had been no agent augmentation?”. The difference between this potential AHT outcome and the real AHT provides an estimate of the AHT savings. We can take statistics across all issues (and subsets of issues) to learn insights or assess agent efficiency (proxied as AHT savings) over time, like the blue curve below.

Handle time saving over time
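Continuing the mixed-model sketch from above, the counterfactual question can be approximated by re-predicting with augmentation switched off. Note that predict() here uses only the fixed-effects mean structure, a simplification of a full posterior analysis:

```python
# Counterfactual: what would AHT have been with no augmentation?
df_no_aug = df.assign(augmented=0)
aht_no_aug = result.predict(df_no_aug)   # fixed-effects prediction

# Estimated per-issue savings and the average over all issues.
savings = aht_no_aug - df["aht"]
print(savings.mean())
```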

Using these approaches, we’re able to measure how much handle time our augmentation features save. The blue curve in the plot above shows a 100% increase in AHT savings (from 15% to 30%) over a period of 8 months. This is the result both of measuring the impact of new features and of increasing usage of the features that drive that impact (the green curve above). In this way we can quantify the value we deliver for our customers with current features, and our Product team can use this insight to develop new ones.
