How We Assessed the Quality of a Leading Belgian Bank’s AI-Powered Chatbot

Client Background

Our client is a prominent Belgian bank offering a comprehensive range of financial services to individuals and businesses. To enhance customer experience, the bank deployed a chatbot designed to address common inquiries and streamline interactions. As the chatbot became an increasingly critical tool for customer service, assessing its performance and identifying opportunities for improvement became a top priority.

Objective

The engagement’s primary objective was to evaluate the bank’s chatbot, focusing on whether and how it leverages modern AI technologies such as Natural Language Understanding (NLU) and Natural Language Processing (NLP). Sailpeak aimed to provide actionable insights that would help the chatbot deliver more relevant, accurate, and impactful interactions while addressing risks such as hallucinations and misinformation. The assessment also benchmarked the chatbot’s performance against industry standards, giving the client a clear roadmap for future innovation.

Challenges

The chatbot relied on outdated decision-tree logic, which limited its ability to handle complex user interactions and adapt dynamically. Key challenges included:

  1. Lack of Advanced AI Integration: The chatbot did not utilize modern frameworks such as Large Language Models (LLMs) or Retrieval-Augmented Generation (RAG), reducing its ability to provide nuanced responses.
  2. Risk of Hallucination: There was a pressing need to mitigate risks of generating incorrect or misleading information, a known challenge in AI-based systems.
  3. Scalability Concerns: As customer inquiries grew more sophisticated, the chatbot risked being unable to meet evolving demands.

Sailpeak’s Approach

We employed a structured and innovative methodology to assess the chatbot’s quality, combining functionality evaluation with relevance and understanding benchmarks. Our approach integrated modern AI principles and focused on identifying areas for technological transformation.

Phase 1: Functionality Assessment

We evaluated the chatbot’s core functionalities against our chatbot assessment framework:

  • Human Agent Escalation: Seamless handoff to a human agent when required.
  • Chat Persistence: Retention of context across sessions to maintain continuity.
  • Feedback Mechanisms: Consistency and depth of user feedback collection.
  • Multilingual Support: Capability to fluently handle interactions in multiple languages.
  • Conversation History Retrieval: Access to past interactions for reference.
  • Navigation Context Awareness: Ability to understand the user’s position in their journey.
  • Dynamic Navigation Buttons: Presence of interactive buttons for smoother guidance.

Based on the functionalities present, we place each chatbot into a tier such as Basic, Comprehensive, or Advanced.
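
For illustration only, here is a minimal sketch of how such a functionality scorecard could be encoded. The criterion names mirror the list above, but the scoring scheme and tier thresholds are assumptions, not Sailpeak’s actual framework.

```python
# Hypothetical functionality scorecard (illustrative only, not Sailpeak's real framework).
# Each criterion from the assessment is marked present/absent; the total maps to a tier.

FUNCTIONALITY_CRITERIA = [
    "human_agent_escalation",
    "chat_persistence",
    "feedback_mechanisms",
    "multilingual_support",
    "conversation_history_retrieval",
    "navigation_context_awareness",
    "dynamic_navigation_buttons",
]

def functionality_tier(observed: dict[str, bool]) -> str:
    """Map observed functionality coverage to an illustrative tier."""
    score = sum(observed.get(criterion, False) for criterion in FUNCTIONALITY_CRITERIA)
    if score <= 3:          # thresholds are assumptions for illustration only
        return "Basic"
    if score <= 5:
        return "Comprehensive"
    return "Advanced"

# Example: a chatbot that only escalates to humans and supports multiple languages.
print(functionality_tier({"human_agent_escalation": True, "multilingual_support": True}))  # -> "Basic"
```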

Phase 2: Understanding & Relevance Assessment

We assessed the chatbot’s ability to respond to two categories of inquiries:

  1. Informative Questions: Queries on banking domains like Accounts & Cards, Claims, Loans, and Appointments (e.g., “How can I open an account?”).
  2. Intelligent Questions: Complex or out-of-scope questions requiring contextual understanding (e.g., “What technology powers this chatbot?”).

Based on how well a chatbot understands questions and generates relevant answers, we place it into a tier such as Limited, Informative, or Conversational.
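
As a sketch only, one way to structure such a question set and roll reviewer scores up into a tier is shown below. The two categories mirror the assessment above and two of the sample questions come from it; the extra question, the 0–2 scoring scale, and the tier thresholds are placeholders, not the client’s real test set.

```python
# Hypothetical question set and scoring roll-up for the understanding & relevance review.
from dataclasses import dataclass

@dataclass
class TestQuestion:
    category: str      # "informative" or "intelligent"
    domain: str        # e.g. "Accounts & Cards", "Claims", "Loans", "Appointments"
    text: str

QUESTIONS = [
    TestQuestion("informative", "Accounts & Cards", "How can I open an account?"),
    TestQuestion("informative", "Loans", "What documents do I need for a loan application?"),
    TestQuestion("intelligent", "Out of scope", "What technology powers this chatbot?"),
]

def relevance_tier(scores: list[int]) -> str:
    """Map reviewer scores (0-2 per answer) to an illustrative tier."""
    avg = sum(scores) / len(scores)
    if avg < 0.8:            # thresholds are assumptions for illustration only
        return "Limited"
    if avg < 1.6:
        return "Informative"
    return "Conversational"
```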

Results

The assessment revealed that the chatbot was Basic in functionality and Informative in understanding and relevance. While it managed straightforward banking queries and escalations effectively, it struggled with:

  • Handling complex or out-of-scope questions, which exposed gaps in contextual comprehension and adaptability.
  • Providing dynamic, contextually rich responses due to a lack of modern AI integrations like RAG and advanced NLU models.

Recommendations and Impact

Sailpeak provided the bank with a comprehensive roadmap to evolve its chatbot into a state-of-the-art conversational AI system:

  • Integrate Retrieval-Augmented Generation (RAG): Enhance the chatbot’s ability to retrieve and ground responses in factual, context-specific information (a minimal sketch of this pattern follows the list).
  • Upgrade NLU/NLP Models: Implement advanced models for deeper contextual understanding and more fluid conversations.
  • Establish Guardrails: Deploy mechanisms to minimize risks of hallucinations and ensure response accuracy.
  • Enhance Benchmarking Practices: Regularly evaluate the chatbot’s performance against industry-leading conversational agents.
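
The sketch below illustrates the recommended RAG pattern together with a simple grounding guardrail. It is an assumption-laden illustration rather than the bank’s implementation: the keyword retriever, the generate() callback, and the overlap-based check are all placeholders for production components such as a vector search and an LLM API.

```python
# Minimal RAG sketch with a crude grounding guardrail (illustrative assumptions throughout).

def retrieve(query: str, knowledge_base: list[str], top_k: int = 3) -> list[str]:
    """Naive keyword retriever standing in for a real vector or BM25 search."""
    terms = set(query.lower().split())
    ranked = sorted(knowledge_base, key=lambda doc: -len(terms & set(doc.lower().split())))
    return ranked[:top_k]

def is_grounded(answer: str, sources: list[str], min_overlap: float = 0.5) -> bool:
    """Crude guardrail: flag answers whose wording barely overlaps the retrieved sources."""
    answer_terms = set(answer.lower().split())
    source_terms = set(" ".join(sources).lower().split())
    return bool(answer_terms) and len(answer_terms & source_terms) / len(answer_terms) >= min_overlap

def answer_with_rag(query: str, knowledge_base: list[str], generate) -> str:
    """Retrieve context, generate an answer, and fall back to escalation if grounding fails."""
    sources = retrieve(query, knowledge_base)
    draft = generate(query, sources)   # generate() would wrap the bank's chosen LLM
    if not is_grounded(draft, sources):
        return "I'm not certain about that, so let me connect you with an advisor."
    return draft
```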

This transformation would not only improve customer satisfaction but also position the bank as an innovator in leveraging AI technologies for superior service delivery. By addressing both present limitations and future challenges, our assessment provided actionable insights to drive impactful, long-term improvements.
