Providing first-class content or delightful functionality is not enough to ensure the success of a voice application. An engaged user returns only if their experience has been satisfactory — meaning that, in addition to having their needs met, they encounter no errors while using the product. The only way to ensure that happens is by testing.
We know it is not advisable to rely solely on unit tests (even with code coverage above 90%). We may be very good at programming and at analyzing all possible scenarios, but we cannot be certain that the components and services we depend on are always working properly. End-to-End (E2E) testing is necessary.
And when we say E2E testing for voice experiences we mean:
- The voice app as a whole (from Alexa/Google through infrastructure to voice app)
- The speech recognition
- The interaction model and the natural language understanding
At Bespoken we have focused on minimizing the complexity of E2E testing for voice. And as a result of our efforts, we offer simple and powerful tools that allow complex use-cases to be verified with straightforward test scripts. In addition to the ease of creation and maintenance, one of the major advantages of using test scripts is that their execution can be automated quite simply, reducing manual work and optimizing the use of resources.
However, all this has not been enough for us, and we wanted to go a step further by building plug-and-play solutions to bring you test automation at the click of a button. Sounds great, right?
In this article we’ll explore our comprehensive, recommended approach to voice testing: the best guarantee of highly engaged and happy users!
How It Works
Our solution leverages a state-of-the-art approach that encompasses:
- Simple, powerful YAML-based test scripts
- Automated execution from a CI environment such as GitLab
- Real-time reporting and notifications via DataDog
It is ready to go on day one yet still allows for extensive customization. Getting started is as easy as creating a test script using our simple YAML-based syntax.
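For illustration, a minimal test script might look like the following — the skill name, utterances, and expected responses are placeholders; the structure follows our YAML test format:

```yaml
---
configuration:
  locale: en-US

---
- test: Launch and greet the user
- "open my sample skill": "welcome to my sample skill"
- "help": "here is what you can do"
```

Each interaction line pairs an utterance (what the user says) with the expected response (what the voice app should say back).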
The tests can be executed from a CI environment like GitLab, or from your console. The output is shown in the console, and a report is created under the test_output/report folder.
We also leverage an amazing tool called DataDog for real-time visual and historical reporting as well as notifications. DataDog provides incredible flexibility and allows data to be sliced and diced as needed in our standard reporting dashboard.
How To Get Started
Take a look at our End-to-End Automation Solution here.
We provide directions there on how to get set up. The quick highlights are:
- Create a Bespoken account
- Make a copy of our sample project
- Add in your tests (as shown here)
- Run it!
It is that easy. And if you have questions, just email us at firstname.lastname@example.org.
One of the big benefits of using industrial-strength DevOps tools like GitLab is that they allow you to run tests on any schedule you like. With the GitLab scheduler, you can:
- Run a subset of tests every hour
- Run the full suite of tests every day
- Run in-depth speech recognition tests every week
These schedules can be set and configured in a matter of minutes. See how to do it here.
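As a sketch, a scheduled test job in .gitlab-ci.yml might look like the following. The image name and script commands are assumptions about your project setup; the schedules themselves are created in GitLab under CI/CD > Schedules:

```yaml
e2e-tests:
  image: node:lts        # assumes a Node.js project
  script:
    - npm install
    - npm test           # runs the test scripts
  only:
    - schedules          # execute only when triggered by a pipeline schedule
```

You can then create one schedule per cadence (hourly, daily, weekly), each passing a variable to select which subset of tests to run.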
We use a filter in our sample project. A filter is a module that performs additional processing at key stages of test execution. In this case, the filter captures metrics in the onTestEnd stage and sends them to DataDog.
Filters are referenced in the test configuration file.
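For example, the reference might look like this in the configuration file (the file and module paths here are illustrative):

```json
{
  "filter": "./node_modules/datadog-filter/src/filter-module"
}
```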
Notice that the reference points to the node_modules folder, since the filter has been defined as a dependency (see the package.json file).
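The dependency entry in package.json might look like this — the package name and source are placeholders for the actual filter plugin:

```json
{
  "dependencies": {
    "datadog-filter": "^1.0.0"
  }
}
```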
Take a look at the plugin here, and feel free to extend and customize it to meet your test reporting needs.
Leveraging Multi-Locale Testing
Check out this sample project to learn how to test a skill supporting German, English, and Spanish with just a single set of tests. The sample project includes both unit and E2E tests and is part of the tutorial series we have created with Dabble Lab; check out the video here.
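As an illustrative sketch, a single test file can target several locales by listing them in the configuration, with the utterances and expected responses resolved from per-locale resource files (the keys and values below are placeholders):

```yaml
---
configuration:
  locales: en-US, de-DE, es-ES   # the same tests run once per locale

---
- test: Launch in every locale
- "open my sample skill": "welcome"   # strings localized per locale
```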
Leveraging Multi-Platform Testing
We love to simplify the challenges of voice testing, which is why you can also use a single set of tests to check every version of your app: for Alexa, for Google Assistant, and potentially for other platforms as well. See how to do it in this sample project.
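As a sketch, switching the target platform can be as simple as changing a configuration value while the tests themselves stay untouched — the exact keys depend on your setup:

```yaml
---
configuration:
  type: e2e
  platform: alexa              # change to google to run the same tests on Google Assistant
  virtualDeviceToken: ${TOKEN} # token for the target platform's virtual device
```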
Speech Recognition Testing
Besides running functional and regression tests with our E2E test scripts, we highly recommend testing and tuning the speech recognition before deploying your voice experiences.
Speech recognition and NLU testing helps you identify which utterances are not being understood correctly and, just as importantly, verifies that your invocation name is free of issues.
These tests leverage a variety of phrasings and accents. You can even simulate challenging sound conditions, such as crowded places or noisy environments like streets and public transit stations.
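For example, a recognition-focused test might probe several phrasings of the same request to see which ones the platform transcribes correctly — the skill name and responses below are placeholders:

```yaml
---
- test: Invocation name variations
- "open my sample skill": "welcome"
- "launch my sample skill": "welcome"
- "ask my sample skill to start": "welcome"
```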
Bespoken Test Robot
What if your testing needs go beyond smart speakers? You might have deployed your apps on platforms like Siri or Facebook Portal, or you might want to test IoT devices with display and voice capabilities. How can you perform extensive testing without resorting to tedious manual work? Simple: leave it to our test robots.
Bespoken’s test robots are the world’s best diagnostic tools for voice that perform E2E testing using analog output and input. This way we are not just testing the software, but the hardware as well. Our Test Robot ensures your device delivers flawless and incredible Voice Experiences to your customers. Contact us to get started.
Bespoken – Just 30 minutes can save 300 hours
Difficult to believe? Perhaps, but it just happens to be true! We’ve saved our customers more than 100,000 hours of manual testing in the last year alone. And feel free to put us to the test. Reach out to our team – we can tell you all about our solution, then get you set up with a free demo that proves everything we describe above. Or just get cranking with our sample project:
Yours in testing and tuning,