March 3, 2020 in Blog

Listen to the Music 🎼

How to Ensure Great Experiences with Alexa, Google, Siri, and More!
TL;DR: About to launch your streaming audio app? Read this article first and learn how to replace tedious manual testing with automated tests that take a fraction of the time and money. This is how testing works with Bespoken 🤖.

Streaming Voice Experiences are Great – WHEN They Work

Streaming audio has been a boon for radio stations, podcasters, and music streaming services. It has brought listeners back into the fold, extended listening hours for stalwart customers and introduced a new, nearly frictionless way for listeners to access the content they love.

In some ways, it may seem like found money. But those working in the space know it’s not that simple. There are several challenges:

  • There may be thousands, if not millions, of unique pieces of content your listeners are requesting. How do you ensure they are all understood by the myriad platforms?
  • The platforms, and even your backend services, are constantly evolving. How do you make sure they are working consistently and repeatably?
  • The right content is often contextual – how do you account for the location and preferences of your listeners? How do you measure where it’s working and where it’s not?
  • Successful streaming requires a handoff between your content delivery network and many other cloud services. How can you be sure each radio station’s streaming is OK – not just before you publish the app, but at all times?
  • And that’s not even to mention the challenge of speech recognition across the countless accents, environments, and languages of your users. The combinations can become head-spinning.

Manually testing these scenarios is tedious and impractical for even simple cases. For more complex and in-depth testing, it is outright impossible. Fortunately, Bespoken offers automated testing to comprehensively and consistently ensure your voice applications are working. And if they aren’t, we will make sure you know before your users do.

Our tools are built for anyone working with streaming audio. We offer turnkey solutions to solve all the problems described above, across Alexa, Google, Siri and more. We are a one-stop automation shop for your streaming testing needs.

How It Works

Our solution leverages a state-of-the-art approach that encompasses:

  • Scheduling
  • Monitoring
  • Reporting
  • Notifications

It is ready to go on day one yet still allows for extensive customization. Getting started is as easy as creating a spreadsheet with the list of utterances to test. It looks like this:

Utterance | Stream URL
play K T C Z
play W Z T U
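
To make the input format concrete, here is a minimal Python sketch of reading such a sheet once it is exported as CSV. It is an illustration only, not Bespoken's own loader: the `StreamTest` class, the `load_tests` function, and the example.com stream URLs are all hypothetical.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class StreamTest:
    """One row of the input sheet: what to say and which stream should play."""
    utterance: str
    expected_url: str


def load_tests(csv_text: str) -> list:
    """Parse a two-column sheet (Utterance, Stream URL) exported as CSV."""
    reader = csv.DictReader(io.StringIO(csv_text))
    tests = []
    for row in reader:
        utterance = (row.get("Utterance") or "").strip()
        if not utterance:
            continue  # ignore rows without an utterance
        expected_url = (row.get("Stream URL") or "").strip()
        tests.append(StreamTest(utterance, expected_url))
    return tests


# Placeholder URLs: the real stream URLs are not shown in this article.
SAMPLE = """Utterance,Stream URL
play K T C Z,https://example.com/streams/ktcz
play W Z T U,https://example.com/streams/wztu
"""

tests = load_tests(SAMPLE)
```

Each `StreamTest` then carries everything a single check needs: the phrase to speak and the stream that should start playing.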


By default, the output is also created in a simple spreadsheet. It looks like this:

Utterance | Expected Stream URL | Actual Stream URL | Success? | Alexa Provider | Transcript
play K T C Z | … | … | TRUE | iHeartRadio Live Radio | getting your cities 97.1 station from iHeartRadio
play W Z T U | … | … | TRUE | iHeartRadio Live Radio | getting you to 94.9 station from iHeartRadio
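
The Success? column can be thought of as a comparison between the expected and actual stream URLs. The Python sketch below shows one way to derive it and to summarize a run. It assumes an exact URL match after trimming whitespace, which may be stricter or looser than what Bespoken actually applies; `StreamResult`, `pass_rate`, and the example.com URLs are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class StreamResult:
    """One row of the output sheet."""
    utterance: str
    expected_url: str
    actual_url: str
    transcript: str

    @property
    def success(self) -> bool:
        # Assumption: a test passes only when both URLs match exactly
        # once surrounding whitespace is removed.
        return self.expected_url.strip() == self.actual_url.strip()


def pass_rate(results) -> float:
    """Return the fraction of results that succeeded (0.0 for an empty run)."""
    if not results:
        return 0.0
    return sum(1 for r in results if r.success) / len(results)


results = [
    StreamResult("play K T C Z", "https://example.com/streams/ktcz",
                 "https://example.com/streams/ktcz", "getting your station"),
    StreamResult("play W Z T U", "https://example.com/streams/wztu",
                 "https://example.com/streams/other", "getting your station"),
]
failures = [r.utterance for r in results if not r.success]
```

Listing the failing utterances alongside the pass rate makes it easy to see which stations need attention after each run.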

We also leverage an amazing tool called DataDog for real-time visual and historical reporting, as well as notifications. DataDog provides incredible flexibility and allows for data to be sliced and diced as needed. Here is our standard reporting dashboard:

How To Get Started

Take a look at our Streaming Audio Automation Solution here:

We provide directions there on getting set up. The quick highlights are:

  • Create a Bespoken account
  • Make a copy of our sample project
  • Add in your tests (as shown above)
  • Run it!

It is that easy. And if you have questions, just send an email to us at

Going Deeper

Custom Scheduling

One of the big benefits of using industrial-strength DevOps tools like GitLab is that they allow you to set any schedule you like for running tests. With the GitLab scheduler, you can:

  • Run a subset of tests every hour
  • Run a full suite of tests every day
  • Run in-depth speech recognition tests every week

These schedules can be set and configured in a matter of minutes. See how to do it here.

Custom Sources

Our sample project by default uses CSV files (tables) as the source for running tests. But we are hardly limited to that. For example, let’s say you want to generate utterances from an API. Easy!

Custom source code to load data from that API might look like this:
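
The snippet below is only an illustrative sketch in Python, not Bespoken's actual custom source interface. It assumes a hypothetical API that returns a JSON list of stations, each with `call_sign` and `stream_url` fields; the function names `fetch_json` and `load_utterances` are likewise made up for this example.

```python
import json
from urllib.request import urlopen


def fetch_json(url: str) -> str:
    """Download the response body as text with a plain HTTP GET."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8")


def load_utterances(api_url: str, fetch=fetch_json) -> list:
    """Turn a JSON list of stations into utterance / stream URL records.

    `fetch` is injectable so the loader can be exercised without a network.
    """
    stations = json.loads(fetch(api_url))
    records = []
    for station in stations:
        # Spell the call sign letter by letter, e.g. "KTCZ" -> "K T C Z".
        spoken = " ".join(station["call_sign"].upper())
        records.append({
            "utterance": f"play {spoken}",
            "stream_url": station["stream_url"],
        })
    return records
```

Passing a stand-in for `fetch` that returns canned JSON lets you check the loader offline before pointing it at a real endpoint.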

Even for non-programmers, we hope that doesn’t look too complicated. And if you have questions, again, just ping us. We probably have the solution on the tip of our tongues.

Leverage Recorded Audio

Some of our customers use real audio recordings to test their voice experiences, and we support that too. Instead of setting your utterances as phrases, just hand us the URL of the audio file, and we can test with that just as well. And if you need help gathering data, we can assist with that, too.
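
As a rough illustration of mixing phrases and recordings in one test list, the Python sketch below decides whether an entry is a text utterance or an audio file URL. The detection rule (an http or https URL ending in a known audio extension) and the names `is_audio_url` and `describe_input` are assumptions made for this example, not a description of how Bespoken detects audio.

```python
from urllib.parse import urlparse

# Assumption: recordings are identified by these common file extensions.
AUDIO_EXTENSIONS = (".wav", ".mp3", ".ogg", ".flac")


def is_audio_url(utterance: str) -> bool:
    """Return True if the entry looks like a link to an audio recording."""
    parsed = urlparse(utterance.strip())
    return (parsed.scheme in ("http", "https")
            and parsed.path.lower().endswith(AUDIO_EXTENSIONS))


def describe_input(utterance: str) -> dict:
    """Label an entry as recorded audio or as a phrase to be spoken."""
    kind = "audio" if is_audio_url(utterance) else "text"
    return {"kind": kind, "value": utterance.strip()}


inputs = [
    describe_input("play K T C Z"),
    describe_input("https://example.com/recordings/play-ktcz.wav"),
]
```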

Speech Recognition Testing

Besides running functional and regression tests with our end-to-end tools, we highly recommend also running speech recognition tests before publishing your voice experiences so you can deploy them with confidence.

Speech recognition and NLU testing helps you identify which utterances are not being understood correctly by the voice service you are planning to use and, more importantly, verifies that your invocation name is free from issues.

These kinds of tests leverage our text-to-speech (TTS) capabilities, where we use lifelike Google WaveNet voices with different pitches and accents. You can even simulate challenging sound conditions like crowded places, or noisy environments like streets or public transportation stations.
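
To give a sense of how quickly these combinations grow, here is a small Python sketch that expands utterances, voices, and noise conditions into a full test matrix. The voice and noise labels are placeholders rather than real voice identifiers or Bespoken settings, and `build_matrix` is a hypothetical helper.

```python
from itertools import product


def build_matrix(utterances, voices, noises) -> list:
    """Return one test case per (utterance, voice, noise) combination."""
    return [
        {"utterance": u, "voice": v, "noise": n}
        for u, v, n in product(utterances, voices, noises)
    ]


# Placeholder labels, chosen only for illustration.
utterances = ["play K T C Z", "play W Z T U"]
voices = ["us-english-low-pitch", "british-english-high-pitch"]
noises = ["none", "crowd", "street"]

matrix = build_matrix(utterances, voices, noises)
```

Two utterances, two voices, and three noise conditions already yield twelve cases, which is why running them by hand stops being practical so quickly.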

Bespoken – Just 30 minutes can save 300 hours

Sound incredible? It’s not – we’ve saved our customers more than 55,000 hours of manual testing in the last year alone. And feel free to put us to the test. Reach out to our team – we can tell you all about our solution, then get you set up with a proof-of-value that demonstrates everything we describe above. Or just get cranking with our sample project:

Yours in testing and tuning,

