Improving Call Center Effectiveness with Pre-Trained and Fine-Tuned LLMs

19 June 2025

Serhii Serednii

Head of AI / ML

MD Finance

Many client-facing companies face significant challenges with call center efficiency. Among the most common issues are:

High turnover: Frequent departures of call center specialists force companies to continually revise their policies on hiring, training, and evaluating call center staff.
High call volumes: The large number of calls makes it impossible to monitor every interaction and ensure that a company's quality of service standards are not violated.
Multi-country and bilingual operations: Companies operating call centers in different countries or managing bilingual centers struggle to deliver uniform service quality across languages.

How to deal with it

To address these challenges, this article proposes a tool that automatically estimates call quality and returns a call score from 0 to 1. This approach allows:

Supervisors: To focus on problematic calls and collaborate with their teams to resolve identified issues.
Management: To access a convenient and traceable summary of call center efficiency.
Swift reaction to issues: Calls flagged with red alerts can be reviewed and acted on promptly.
Agent motivation: Call scores can be used to incentivize and motivate agents.

Solution

Introducing criteria

To enhance the explainability of results and reduce subjectivity in evaluations, the proposed tool assesses call quality based on predefined criteria. Each criterion carries a weight in the overall call quality estimate. These criteria can be designed by call center supervisors, operations or marketing specialists, or employees involved in call script design.

Example:

Let’s assume that experts specify that during a call, an employee should:

Introduce themselves and mention the company's name.
State the purpose of the call.
Answer all of the client's questions.
Offer at least one cross-sell (i.e., propose an additional product).
Avoid using coarse language.

Each of these criteria can be assigned a specific weight and/or marked as a red flag. In this example, the first four criteria are weighted (0.2, 0.2, 0.2, 0.4), while the use of coarse language serves as a red flag.

Each call is scored according to the assigned weights. The total score directs supervisors’ attention to specific calls, particularly those with low scores or in which red flags were triggered. Additionally, an employee's average score can serve as a motivational tool, and the overall call center average can provide top management with a clear performance indicator.
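The scoring rule from the example can be sketched in a few lines of Python. This is a minimal illustration, not the tool's actual implementation: the criterion names are hypothetical labels for the five example criteria, and it assumes that a red flag is reported alongside the score rather than changing it, which the example does not specify.

```python
from dataclasses import dataclass

# Weights from the example above; the criterion names are illustrative labels.
WEIGHTS = {
    "introduction": 0.2,
    "purpose_stated": 0.2,
    "questions_answered": 0.2,
    "cross_sell_offered": 0.4,
}
# Red flags carry no weight (assumption): they are reported separately.
RED_FLAGS = {"coarse_language"}


@dataclass
class CallResult:
    score: float          # weighted score in [0, 1]
    red_flags: list[str]  # names of the red flags that were triggered


def score_call(criteria_met: dict[str, bool]) -> CallResult:
    """Sum the weights of satisfied criteria and collect triggered red flags."""
    score = sum(w for name, w in WEIGHTS.items() if criteria_met.get(name, False))
    flags = sorted(f for f in RED_FLAGS if criteria_met.get(f, False))
    return CallResult(score=round(score, 4), red_flags=flags)


# A call where the agent skipped the cross-sell and used coarse language.
call = score_call({
    "introduction": True,
    "purpose_stated": True,
    "questions_answered": True,
    "cross_sell_offered": False,
    "coarse_language": True,
})
print(call)
```

Because the weights sum to 1, a call that satisfies every weighted criterion scores 1. Changing the criteria list only means editing the two constants, which is the adaptability the approach aims for.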

Benefits of this approach:

Highly adaptable: The list of criteria can be easily modified without altering the overall AI solution architecture.
Focused on improvement: It clearly identifies which aspects of service each call center specialist needs to improve, providing supervisors with a targeted training tool.
Explainable: The system offers clear explanations for why a call was rated as low or high quality.

General solution architecture

Solution overview:

The first part of the solution is a periodic (e.g., once per day) batch inference over call records. This part processes calls and stores the results. Batch inference consists of three steps:
1. Transcription: A speech recognition model creates a transcription of a call record.
2. Translation: The transcription is translated into English using a translation model. (This step can be skipped if the call record is already in English. Additionally, some models, such as OpenAI's Whisper, can generate an English transcription regardless of the original language.)
3. Criteria estimation: The translated text is fed to criteria estimation model(s).
The results of call processing are stored in a database.
Overall effectiveness metrics are gathered into a management report, enabling management to monitor overall call center effectiveness and performance.
Detailed estimations of each call are displayed to supervisors, who can filter calls by performance, specific criteria, or red flags, and then take actions to improve service quality at the individual employee level. At this step, supervisors can manually change the results of call evaluation; these changes are stored in the database as supervisors’ feedback.
Feedback received from supervisors is used to fine-tune the models, following CI/CD principles.
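The batch inference part of this flow can be sketched as a function that takes the three models as interchangeable callables. The names `process_batch` and `ProcessedCall`, the call identifiers, and the stub models are all assumptions made for illustration; storage and reporting are left out.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ProcessedCall:
    call_id: str
    transcript: str            # text in the original language
    english_text: str          # text passed to criteria estimation
    criteria: dict[str, bool]  # one verdict per criterion


def process_batch(
    recordings: dict[str, str],
    transcribe: Callable[[str], tuple[str, str]],
    translate: Callable[[str], str],
    estimate: Callable[[str], dict[str, bool]],
) -> list[ProcessedCall]:
    """Run transcription, optional translation, and criteria estimation per call."""
    results = []
    for call_id, audio_path in recordings.items():
        transcript, language = transcribe(audio_path)
        # The translation step is skipped when the call is already in English.
        english = transcript if language == "en" else translate(transcript)
        results.append(ProcessedCall(call_id, transcript, english, estimate(english)))
    return results


# Stub models, used only to show the data flow.
def fake_transcribe(path: str) -> tuple[str, str]:
    if "001" in path:
        return "Hello, this is Anna from Acme.", "en"
    return "Dobryi den", "uk"


def fake_translate(text: str) -> str:
    return "Good afternoon"


def fake_estimate(text: str) -> dict[str, bool]:
    return {"introduction": "this is" in text}


demo = process_batch(
    {"call-001": "call-001.wav", "call-002": "call-002.wav"},
    fake_transcribe,
    fake_translate,
    fake_estimate,
)
for item in demo:
    print(item)
```

Passing the models in as arguments keeps each step replaceable, in line with the modular design described below.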

Benefits of this solution:

Criteria-based estimation: The solution employs predefined criteria for estimation, offering benefits such as enhanced explainability, reduced subjectivity, and actionable insights.
Global scalability: It can be deployed across multiple countries and languages simultaneously with minimal adaptation.
Modular design: Each component of the solution operates independently, allowing for targeted training and improvement.

Transcription of calls

In this solution, Whisper is used for call transcription; however, any suitable speech-to-text model can be integrated if it better meets specific case requirements.

Advantages of Whisper:

Multi-language support: Whisper supports a wide variety of languages, enabling the solution to be easily adapted to a new language without requiring significant modifications.
Open source and free: Being open source, Whisper is free to use, which reduces overall costs.
High performance: Whisper delivers relatively high performance in transcription tasks. For more details on its performance metrics, refer to the official Whisper repository and related documentation.
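As a rough sketch, transcription with the open-source `openai-whisper` package could look like the following. The helper names, the `"base"` checkpoint, and the file name are assumptions; the package also needs ffmpeg installed, and the checkpoint size would be chosen per use case.

```python
from typing import Any


def load_whisper(model_name: str = "base") -> Any:
    """Load a Whisper checkpoint; requires the openai-whisper package and ffmpeg."""
    import whisper  # imported lazily so transcribe_call can be tested without it
    return whisper.load_model(model_name)


def transcribe_call(model: Any, audio_path: str) -> dict[str, str]:
    """Return the transcript and the language Whisper detected for one recording."""
    result = model.transcribe(audio_path)
    return {"text": result["text"].strip(), "language": result["language"]}


# Example usage (the file name is hypothetical):
# model = load_whisper("base")
# print(transcribe_call(model, "call-001.wav"))
```

The detected language is returned with the text so that a later step can decide whether translation is needed.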

Translation of calls

In this solution, Whisper is also used for call translation; however, any other translation model can be integrated if it better meets specific case requirements.

Advantages of Whisper for translation:

Wide range of supported languages: Whisper supports a wide variety of source languages and can produce transcriptions in English, regardless of the original language of the call.
Open source and free: Being open source, Whisper is free to use, which makes it a cost-effective option.
Contextual accuracy: Whisper is effective at capturing the overall sense of a conversation rather than delivering a strictly word-for-word translation, resulting in more natural and accurate translations.
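With the `openai-whisper` package, an English rendering is requested by passing `task="translate"` to `transcribe`, so translation works directly from the audio rather than from a transcript. The sketch below assumes a model object loaded with `whisper.load_model`; the helper name `translate_call` and the file name are hypothetical.

```python
from typing import Any, Optional


def translate_call(model: Any, audio_path: str, source_language: Optional[str] = None) -> str:
    """Ask Whisper for an English rendering of a recording in another language."""
    options = {"task": "translate"}
    if source_language is not None:
        # Optional hint such as "uk"; when omitted, Whisper detects the language.
        options["language"] = source_language
    return model.transcribe(audio_path, **options)["text"].strip()


# Example usage (the file name is hypothetical):
# import whisper
# model = whisper.load_model("base")
# print(translate_call(model, "call-002.wav", source_language="uk"))
```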

Criteria estimation (First approximation)

If you lack extensive data scored by your criteria (which is often the case), the initial approach to criteria estimation can be accomplished using one of the widely available pre-trained LLMs. These models can evaluate criteria formulated in plain language, providing estimates that can serve as a dataset for subsequent steps. Alternatively, these estimates can be used directly if fine-tuning is not required or if the costs in time and effort outweigh the benefits.

In this solution, Llama 3.2 was used for the first approximation. However, any other pre-trained LLM can be used if it better meets your specific requirements.

Advantages of Llama:

Openly available and free: Llama's weights are openly available under Meta's community license, making it accessible and cost-effective.
High performance: It performs exceptionally well in summarizing conversations and processing unstructured text, which is crucial for effective criteria estimation.
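One possible way to obtain criteria estimates from a pre-trained LLM is to list the criteria in plain language and ask for a JSON verdict. In the sketch below, `generate` stands for any function that sends a prompt to a model (for example a locally hosted Llama) and returns its text reply; that function, the prompt wording, and the criterion keys are all assumptions rather than the author's actual setup.

```python
import json
from typing import Callable

# Criteria from the earlier example, phrased as statements to verify.
CRITERIA = {
    "introduction": "The agent introduced themselves and mentioned the company's name.",
    "purpose_stated": "The agent stated the purpose of the call.",
    "questions_answered": "The agent answered all of the client's questions.",
    "cross_sell_offered": "The agent offered at least one additional product.",
    "coarse_language": "The agent used coarse language.",
}


def build_prompt(transcript: str) -> str:
    """Compose a prompt asking for one true/false verdict per criterion."""
    lines = [f'- "{key}": {text}' for key, text in CRITERIA.items()]
    return (
        "Evaluate the call transcript against each criterion.\n"
        "Reply with a single JSON object mapping each key to true or false.\n\n"
        "Criteria:\n" + "\n".join(lines) + f"\n\nTranscript:\n{transcript}\n"
    )


def estimate_criteria(transcript: str, generate: Callable[[str], str]) -> dict[str, bool]:
    """Call the model and keep only well-formed verdicts for known criteria."""
    reply = generate(build_prompt(transcript))
    # Models often wrap JSON in extra text, so extract the outermost object.
    start, end = reply.find("{"), reply.rfind("}")
    if start == -1 or end < start:
        raise ValueError("model reply contains no JSON object")
    verdicts = json.loads(reply[start:end + 1])
    return {key: verdicts[key] for key in CRITERIA if isinstance(verdicts.get(key), bool)}
```

Criteria missing from the reply, or answered with something other than true or false, are dropped rather than guessed, so they can be routed to a supervisor for manual review.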

Feedback loop tool

To provide detailed results to supervisors and gather human feedback on model outcomes, this solution includes a specialized feedback loop tool.

Main functionality:

Highlighting calls with quality issues or red flags: The tool allows supervisors to quickly identify calls where the specialist performed unprofessionally, enabling swift action.
In-depth insights on call quality: It provides a comprehensive, objective summary of what went wrong during the call, which helps supervisors communicate specific issues to specialists.
Feedback collection: Users can submit feedback on the model's accuracy across all stages, including transcription, translation, and criteria estimation.

This tool can be developed with varying levels of complexity, depending on the specific requirements of each implementation case.
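At its simplest, the feedback side of such a tool needs a place to keep supervisor corrections next to the model's estimates. The sketch below uses SQLite from the Python standard library; the table layout, column names, and identifiers are hypothetical and only illustrate the idea of storing feedback without overwriting the original estimate.

```python
import sqlite3

SCHEMA = """
CREATE TABLE estimates (
    call_id TEXT, criterion TEXT, model_value INTEGER,
    PRIMARY KEY (call_id, criterion)
);
CREATE TABLE feedback (
    call_id TEXT, criterion TEXT, supervisor_value INTEGER, supervisor TEXT,
    created_at TEXT DEFAULT CURRENT_TIMESTAMP
);
"""


def record_feedback(db, call_id, criterion, value, supervisor):
    """Store a supervisor correction without overwriting the model's estimate."""
    db.execute(
        "INSERT INTO feedback (call_id, criterion, supervisor_value, supervisor) "
        "VALUES (?, ?, ?, ?)",
        (call_id, criterion, int(value), supervisor),
    )


def disagreements(db):
    """Return (call_id, criterion, model_value, supervisor_value) rows that differ."""
    return db.execute(
        "SELECT e.call_id, e.criterion, e.model_value, f.supervisor_value "
        "FROM estimates e JOIN feedback f "
        "ON e.call_id = f.call_id AND e.criterion = f.criterion "
        "WHERE e.model_value != f.supervisor_value"
    ).fetchall()


db = sqlite3.connect(":memory:")
db.executescript(SCHEMA)
# The model said no cross-sell was offered; the supervisor disagrees.
db.execute("INSERT INTO estimates VALUES ('call-001', 'cross_sell_offered', 0)")
record_feedback(db, "call-001", "cross_sell_offered", True, "supervisor-a")
print(disagreements(db))
```

Rows returned by `disagreements` are the cases where the model and the supervisor differ, which makes them natural candidates for a later fine-tuning dataset.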

Fine-tuning the criteria estimation model (second approximation)

Feedback gathered from the supervisors is used to fine-tune the models and improve the accuracy of criteria estimates. The fine-tuning process is integrated into a CI/CD pipeline, ensuring continuous updates as new human feedback becomes available. The LoRA (Low-Rank Adaptation) approach is used for this fine-tuning task.

Advantages of the LoRA approach:

Resource efficiency: LoRA requires fewer parameters to be updated during fine-tuning, reducing computational overhead and speeding up the training process.
Improved generalization: Because only low-rank modifications are trained while the base weights stay frozen, LoRA can reduce the risk of overfitting, helping the fine-tuned model generalize across various call scenarios and adapt to new data.
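The resource argument can be made concrete with a small calculation. LoRA keeps a weight matrix W of shape d_out x d_in frozen and trains two low-rank factors, B (d_out x r) and A (r x d_in), whose product is added to W. The matrix size and rank below are hypothetical values chosen for illustration, not the settings used in this solution.

```python
def full_params(d_out: int, d_in: int) -> int:
    """Parameters updated when the whole weight matrix is fine-tuned."""
    return d_out * d_in


def lora_trainable_params(d_out: int, d_in: int, rank: int) -> int:
    """Parameters in the two low-rank factors B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_out + d_in)


# Hypothetical 4096 x 4096 projection matrix adapted with rank 8.
d, r = 4096, 8
full = full_params(d, d)
lora = lora_trainable_params(d, d, r)
print(f"full: {full:,}  LoRA: {lora:,}  ratio: {lora / full:.4%}")
```

For this example matrix, the trainable factors hold 65,536 parameters against roughly 16.8 million in the full matrix, which is under half a percent.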

Summary

The tool described in this article has proven effective at improving contact center quality, demonstrating a universal approach applicable to various call center tasks, such as telesales and debt collection. Its modular and scalable architecture allows organizations to extend the solution to multiple countries and languages while adapting it to different operational requirements.

External

This content is provided by an external author without editing by Finextra. It expresses the views and opinions of the author.


