Automated Testing, Training and Monitoring for Chatbots
Building chatbots is challenging – quality, accuracy, and reliability are all critical to delivering high levels of customer satisfaction.
And though AI has improved by leaps and bounds in recent years, it still requires constant attention to work well. That is where Bespoken can help.
We provide a full-cycle program for managing AI/NLU-based systems:
Crowd-Sourced User Testing
We assist our customers with initial utterance gathering using our Device Service in conjunction with crowd-sourced task testing providers such as Amazon Mechanical Turk and Applause. Our team will gather input from real users to assist with:
- Functional testing: Ensure the application works correctly with real users
- Utterance gathering: Acquire a complete picture of what real users will say and how they will say it
- Usability evaluation: Validate the design and UX of your application via objective and subjective feedback from actual users
The output of our crowd-sourced testing is then used as the basis for creating a comprehensive automated testing regimen.
Our automated testing assists across several key concerns for chat-based systems:
- Functional testing: ensuring the system works correctly and is bug-free, automatically and repeatably
- Monitoring: we run tests on a routine basis to ensure everything is working well in your system, and if there are issues, we let you know right away
- Accuracy testing: we measure performance to ensure that users are consistently understood across every utterance; when they are not understood correctly, we make specific recommendations
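To make the accuracy-testing idea concrete, it can be thought of as the fraction of test utterances for which the recognized intent matches the expected one. The sketch below is illustrative only – the intent names and matching rule are assumptions for demonstration, not Bespoken's actual scoring method:

```python
# Illustrative accuracy calculation: the fraction of utterances whose
# recognized intent matches the expected intent.
# The intent names below are hypothetical examples.

def accuracy(results: list[tuple[str, str]]) -> float:
    """results is a list of (expected_intent, recognized_intent) pairs."""
    if not results:
        return 0.0
    correct = sum(1 for expected, recognized in results if expected == recognized)
    return correct / len(results)

results = [
    ("RewardsInfo", "RewardsInfo"),
    ("EarnPoints", "EarnPoints"),
    ("EarnPoints", "Fallback"),   # a misrecognized utterance
    ("RewardsInfo", "RewardsInfo"),
]
print(accuracy(results))  # → 0.75
```

A drop in this number across test runs is the kind of signal that triggers the specific recommendations mentioned above.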
All of this is driven by our unified API for conversation:
We support testing Conversational AI via the following platforms and channels:
And that is not a complete list of the platforms we support – if you don’t see your preferred one listed, just reach out. We probably have you covered, and if not, extending our Device Service to support your platform or channel of choice is a small effort for us.
How It Works
Here’s a sample test script:
```yaml
---
- test: Rewards test
- hi: Hey, nice to chat with you!
- rewards program: Being rewarded for eating, ahh... what a dream!
- learn about rewards: Dropping knowledge on the most delicious program there is!
- about points: I've got answers
- can i earn points anywhere: You can earn points anywhere in the US! Just make sure you
```
Each line represents an interaction with the bot. The part on the left-hand side of the colon is what the customer says to the bot; the part on the right-hand side is what we expect back in response.
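Conceptually, the test runner works like a scripted dialogue: for each line it sends the utterance and checks that the reply contains the expected text. Here is a minimal Python sketch of that loop – the `send_message` function and its canned replies are hypothetical stand-ins, not Bespoken's actual implementation:

```python
# Minimal sketch of a scripted-dialogue test runner.
# NOTE: send_message() is a hypothetical stand-in for whatever
# transport the chatbot uses (web chat, voice, SMS, etc.).

def send_message(utterance: str) -> str:
    # Hypothetical canned bot, for demonstration purposes only.
    replies = {
        "hi": "Hey, nice to chat with you!",
        "about points": "I've got answers",
    }
    return replies.get(utterance, "Sorry, I didn't get that.")

def run_test(script: list[tuple[str, str]]) -> list[bool]:
    """Send each utterance in order; check the reply contains the expected text."""
    results = []
    for utterance, expected in script:
        reply = send_message(utterance)
        results.append(expected in reply)
    return results

script = [
    ("hi", "Hey, nice to chat with you!"),
    ("about points", "I've got answers"),
]
print(run_test(script))  # each True marks a passing step
```

In practice the comparison can be looser than a substring match (for example, allowing alternative phrasings), but the step-by-step send-and-verify structure is the same.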
Here is an actual test being run by our Device Service in action against the web chatbot for Chipotle:
As you can see, we work through each interaction in the test step-by-step, confirming at each step that the correct response is received. Once completed, a comprehensive report is created for the set of tests that were run:
Ongoing Reporting and Alerting
Beyond the individual results for each test run, we also provide reporting on what is happening over time. Take a look here:
What’s more, you can set up highly granular alerting to notify you and your team when particular events occur. This is particularly useful for distinguishing between critical events that demand an “all-hands-on-deck” response and more routine bugs or minor, temporary outages.
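To illustrate what granular alert routing means, here is a small Python sketch that sorts monitoring events into channels by severity. The event fields, thresholds, and channel names are illustrative assumptions, not Bespoken's actual alerting rules:

```python
# Sketch of granular alert routing: critical failures page the on-call
# team immediately, while minor issues are batched into a daily digest.
# All severity rules and channel names here are hypothetical.

def classify(event: dict) -> str:
    """Map a monitoring event to an alert channel."""
    if event["type"] == "outage" and event["consecutive_failures"] >= 3:
        return "page-on-call"       # all-hands-on-deck
    if event["type"] in ("outage", "accuracy-drop"):
        return "team-channel"       # needs attention soon
    return "daily-digest"           # routine, review later

events = [
    {"type": "outage", "consecutive_failures": 5},
    {"type": "accuracy-drop", "consecutive_failures": 0},
    {"type": "flaky-test", "consecutive_failures": 1},
]
print([classify(e) for e in events])
# → ['page-on-call', 'team-channel', 'daily-digest']
```

Requiring several consecutive failures before paging anyone is a common way to keep one-off, temporary outages from waking up the whole team.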
It’s easy to get started – you can check out our sample project here:
Or just email us at firstname.lastname@example.org and we will be happy to set you up with a guided trial.