March 20, 2020 in Blog

Alexa, Are The Lights On?

Ensure your voice-enabled smart home devices are working, every time and all the time, with Bespoken🔊🏠
TL;DR: About to launch your smart home device? Read this article first to learn how to save time and money by avoiding tedious manual testing.


You won’t know how convenient it is to talk to your bedroom lights until you’ve tried it. And what once seemed to be science fiction is now affordable, easy to set up, and above all easy to use.

Take a smart bulb: such a versatile invention. You can change its color to create a more pleasant environment, or use cold lighting to concentrate better when you work. You can also automate it with routines; for example, you can discourage thieves by turning the lights on at certain times of the day while you are out.

If we add the ability to control them by voice, the whole experience becomes nearly frictionless. For instance, when you’re about to fall asleep, there’s no need to get out of bed to turn the lights off. You just say, “Alexa, turn off the lights” 😴.

Now, verifying that these experiences work consistently for users is where it gets a bit tricky. In fact, for a manufacturer, it is necessary to test complex scenarios involving:

  • A large number of models and device types. Some manufacturers offer smart bulbs, switches, plugs, vacuum cleaners, pet feeders, cameras, motion sensors, fans, etc.
  • Each device has many configuration parameters that can be changed by users, such as colors, brightness, power level, automation and execution routines, connection with other devices or applications, etc.
  • The platforms, and even your backend services, are constantly evolving.
  • The right action to execute is often contextual – how do you account for the location and preferences of your users?
  • Your users phrase their requests in myriad ways, across many accents and languages.
  • Your users issue voice commands in noisy environments such as family rooms, cars, social gatherings, etc.

Manually testing these scenarios is tedious and impractical for even simple cases. For more complex and in-depth testing, it is outright impossible. Fortunately, Bespoken offers automated testing to comprehensively and consistently ensure your voice applications are working. And if they aren’t, we will make sure you know before your users do.

Our tools are built for anyone working with smart home devices. We offer turn-key solutions to solve all the problems described above, across Alexa, Google, Siri and more. We are a one-stop automation shop for your smart home testing needs.

How It Works

Our solution takes a state-of-the-art approach that encompasses:

  • Scheduling
  • Monitoring
  • Reporting
  • Notifications

It is ready to go on day one yet still allows for extensive customization. Getting started is as easy as creating a spreadsheet with the list of utterances to test. Here’s an example:

Utterance | Transcript
show me james bond movies on office roku | getting james bond from roku
turn office bulb on | okay
set brightness on office bulb to 30% | *
turn office plug off | okay
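
The spreadsheet above can be exported as CSV and read into records before testing. Here is a minimal sketch, assuming a header row with `utterance` and `transcript` columns and cells without embedded commas or quotes. The function name `parseUtterances` is made up for illustration and is not part of the Bespoken tooling.

```javascript
// Hypothetical sketch: parse a two-column CSV of utterances and expected transcripts.
// Column names mirror the example table; the parsing rules are assumptions.
const csvText = [
  'utterance,transcript',
  'show me james bond movies on office roku,getting james bond from roku',
  'turn office bulb on,okay',
  'set brightness on office bulb to 30%,*',
  'turn office plug off,okay',
].join('\n');

function parseUtterances(text) {
  const [headerLine, ...rows] = text.trim().split('\n');
  const headers = headerLine.split(',').map((h) => h.trim().toLowerCase());
  return rows
    .filter((row) => row.trim() !== '')
    .map((row) => {
      // Naive split: assumes no commas or quotes inside a cell.
      const cells = row.split(',').map((c) => c.trim());
      return Object.fromEntries(headers.map((h, i) => [h, cells[i] ?? '']));
    });
}

const records = parseUtterances(csvText);
console.log(records);
```

For real spreadsheets with quoted cells, a dedicated CSV parser would be a safer choice than the naive split used here.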

By default, the output is also created in a simple spreadsheet. It looks like this:

Utterance | Actual Transcript | Expected Transcript | Success? | Error | Raw Data URL
show me james bond movies on office roku | getting james bond from roku | getting james bond from roku | TRUE | | https://batch-tester…
turn office bulb on | okay | okay | TRUE | | https://batch-tester…
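
To make the output columns concrete, here is a rough sketch of how a result row could be derived from an expected and an actual transcript. It is not Bespoken's implementation: the `evaluate` and `normalize` helpers are hypothetical, the comparison rule (case-insensitive, whitespace-collapsed) is an assumption, and so is the reading of `*` as "accept any reply".

```javascript
// Hypothetical sketch of deriving one result row.
// Assumption: "*" in the expected transcript accepts any reply.
function normalize(text) {
  return text.trim().toLowerCase().replace(/\s+/g, ' ');
}

function evaluate(utterance, expected, actual) {
  const success =
    expected.trim() === '*' || normalize(expected) === normalize(actual);
  return {
    utterance,
    actualTranscript: actual,
    expectedTranscript: expected,
    success,
    error: success ? '' : `expected "${expected}" but heard "${actual}"`,
  };
}

const rows = [
  evaluate('turn office bulb on', 'okay', 'Okay'),
  evaluate('set brightness on office bulb to 30%', '*', 'office bulb set to thirty percent'),
  evaluate('turn office plug off', 'okay', 'sorry, office plug is not responding'),
];
console.log(rows);
```

With these made-up replies, the first two rows pass and the third fails with a populated error field.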

We also leverage an amazing tool called DataDog for real-time visual and historical reporting as well as notifications. DataDog provides incredible flexibility and allows for data to be sliced and diced as needed. Here is our standard reporting dashboard:

How To Get Started

Take a look at our Smart Home Automation Solution here:

We provide directions there on how to get set up. The quick highlights are:

  • Create a Bespoken account
  • Make a copy of our sample project
  • Add in your tests (as shown here)
  • Run it!

It is that easy. And if you have questions, just send an email to us at

Going Deeper

Custom Scheduling

One of the major benefits of using industrial-strength DevOps tools like GitLab is that they let you set any schedule you like for running tests. With the GitLab scheduler, you can:

  • Run a subset of tests every hour
  • Run a full suite of tests every day
  • Run in-depth speech recognition tests every week

These schedules can be set and configured in a matter of minutes. See how to do it here.
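
GitLab pipeline schedules can carry their own variables, so one way to map the three cadences above onto a single project is to read such a variable in the test runner. The sketch below assumes a hypothetical variable named `TEST_SUITE` and hypothetical CSV file names; neither comes from the Bespoken sample project.

```javascript
// Hypothetical mapping from a schedule variable to the test files to run.
const SUITES = {
  hourly: ['smoke.csv'],
  daily: ['smoke.csv', 'full-regression.csv'],
  weekly: ['speech-recognition.csv'],
};

function selectSuite(name) {
  // Guard against inherited keys such as "constructor".
  if (!Object.prototype.hasOwnProperty.call(SUITES, name)) {
    const known = Object.keys(SUITES).join(', ');
    throw new Error(`Unknown suite "${name}"; expected one of: ${known}`);
  }
  return SUITES[name];
}

// TEST_SUITE is an assumed variable name, set separately on each GitLab schedule.
const suiteName = process.env.TEST_SUITE || 'hourly';
console.log(selectSuite(suiteName));
```

Failing loudly on an unknown suite name keeps a mistyped schedule variable from silently running nothing.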

Custom sources and interceptors

By default, our sample project uses CSV files (tables) as the source for running tests, but it is hardly limited to that. For example, you can generate utterances from an API by using a custom source module. Write your module and reference it in the configuration file:

   "source": "./src/read-from-api.js"
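
A custom source module could look roughly like the sketch below. The exact interface the configuration expects is not shown in this article, so the function names (`toRecords`, `loadAll`), the payload shape, and the use of `*` as a fallback are all assumptions for illustration. Check the sample project for the real contract.

```javascript
// Hypothetical ./src/read-from-api.js: turns an API payload into test records.
// The payload shape ({ commands: [{ phrase, reply }] }) is an assumption.
function toRecords(payload) {
  return (payload.commands || [])
    .filter((c) => typeof c.phrase === 'string' && c.phrase.trim() !== '')
    .map((c) => ({
      utterance: c.phrase.trim(),
      // Fall back to "*" (assumed to mean "any reply") when no reply is given.
      expectedTranscript: c.reply ? c.reply.trim() : '*',
    }));
}

// Fetches the payload and converts it. `fetchJson` is injected so the module
// can be exercised without a network; the default relies on global fetch (Node 18+).
async function loadAll(url, fetchJson = async (u) => (await fetch(u)).json()) {
  return toRecords(await fetchJson(url));
}

module.exports = { toRecords, loadAll };

// Example with a canned payload instead of a live endpoint.
const sample = {
  commands: [
    { phrase: 'turn office bulb on', reply: 'okay' },
    { phrase: 'set brightness on office bulb to 30%' },
    { phrase: '   ' }, // blank phrases are skipped
  ],
};
console.log(toRecords(sample));
```

Keeping the payload-to-record mapping separate from the network call makes the mapping easy to test on its own.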

And you don’t need to just take Alexa’s word for it that the specified utterance worked: you can easily verify the status of the device with your own APIs within the tests. Read here to learn more.
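
As an illustration of that idea, the sketch below compares the state reported by a device API with the state the utterance should have produced. Everything here is hypothetical: `verifyDeviceState`, the result fields, and the stand-in `fakeDeviceApi` are invented for this example and do not reflect the actual interceptor interface.

```javascript
// Hypothetical post-test check: compare the device state reported by your own
// API with the state the utterance should have produced.
function verifyDeviceState(result, expectedState, getDeviceState) {
  const actualState = getDeviceState(result.deviceId) || {};
  const mismatches = Object.keys(expectedState).filter(
    (key) => actualState[key] !== expectedState[key]
  );
  return {
    ...result,
    success: result.success && mismatches.length === 0,
    error: mismatches.length
      ? `device state mismatch on: ${mismatches.join(', ')}`
      : result.error,
  };
}

// Stand-in for a real device API call (assumption: synchronous for brevity).
const deviceStates = { 'office-bulb': { power: 'on', brightness: 100 } };
const fakeDeviceApi = (deviceId) => deviceStates[deviceId];

const voiceResult = {
  utterance: 'set brightness on office bulb to 30%',
  deviceId: 'office-bulb',
  success: true, // the assistant answered as expected...
  error: '',
};
// ...but the bulb never changed brightness, so the combined check fails.
const checked = verifyDeviceState(voiceResult, { power: 'on', brightness: 30 }, fakeDeviceApi);
console.log(checked.success, checked.error);
```

This catches the case where the assistant replies correctly but the device itself never changes state.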

Leveraging Recorded Audio

Some of our customers use real audio recordings to test their voice experiences, and we support that too. Instead of writing your utterances as phrases, just give us the URL of the audio file and we will test with that. And if you need help gathering audio data, we can assist with that as well.

Speech Recognition Testing

Besides running functional and regression tests with our end-to-end tools, we highly recommend also running speech recognition tests before publishing your voice experiences, so you can deploy them with confidence.

Speech recognition and NLU testing helps you identify which utterances are not understood correctly by the voice service you plan to use and, more importantly, verifies that your invocation name is free from issues.

These tests exercise speech-to-text (STT) recognition by feeding it lifelike synthetic Google WaveNet voices with different pitches and accents. You can even simulate challenging sound conditions, such as crowded places, busy streets, or public transportation stations.

Bespoken – Just 30 minutes can save 300 hours

Sounds incredible? It’s not – we’ve saved our customers more than 100,000 hours of manual testing in the last year alone. And feel free to put us to the test. Reach out to our team – we can tell you all about our solution, then get you set up with a free demo that proves everything we describe above. Or just get cranking with our sample project:


Yours in testing and tuning,
