Alexa Skill Automation – Testing, Integration, and Delivery

TL;DR: Alexa skill automation has not been the first order of business for voice app developers. For many, the code and UX come first. Well, not anymore.

Developers love automated testing and deployment, but until recently, these have not been possible for voice apps. Luckily, with Bespoken’s suite of tools, first-class Alexa skill automation and testing are now achievable! We took one of our homegrown skills, applied our set of testing tools, and turned it into a showcase of best practices for Alexa skill testing and automation.

We brought together:

  • ASK CLI – for Lambda deployment and skill updates
  • Bespoken Tools (bst test) – for skill unit tests and end-to-end testing using the Alexa Skill Management API (SMAPI)
  • Circle CI – for continuous integration and deployment
  • Codecov – for code coverage tracking and reporting

This gives us a full-featured automation platform – one that ensures our skill is always working. And the assurance is complete: thanks to our tools, we can verify that things are working both at the code level and as a whole system. The net result is a fun little skill that is a serious showcase for automation. We’ll go through it piece by piece, starting with our Alexa skill automation for unit testing.

Unit Testing

For unit-testing Alexa skills, bst test uses our Virtual Alexa library. Virtual Alexa emulates the behavior of Alexa and generates JSON request payloads as if they were coming from it. It’s already integrated with Jest, so it’s a great way to ensure code quality. Here is a sample test:
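bst test scripts are written in YAML: each interaction pairs an utterance (or request type) with the expected prompt. A minimal sketch for our skill – the utterances and prompts shown here are illustrative:

    ---
    configuration:
      locale: en-US

    ---
    - test: Launch and start a game
    # Each line sends an utterance and asserts on the prompt that comes back
    - LaunchRequest: "welcome to guess the price"
    - "two": "tell us the name of player one"
    - "caterina": "tell us the name of player two"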

(For a more complete write-up on unit-testing and this skill, take a look here)

Continuous Integration (CI)

Now that our unit-tests are in place, our next step is setting up continuous integration to run these tests whenever we make changes. There are a lot of great tools for this – we prefer CircleCI, but Travis, Jenkins, CodeShip, etc. are also great choices.

For running our unit tests, we need a circle.yml file, with a line like so in it:
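Something along these lines – CircleCI 1.0 syntax, with an illustrative Node version:

    machine:
      node:
        version: 8.1.4

    dependencies:
      pre:
        # Install the ASK CLI and Bespoken Tools for the build
        - npm install -g ask-cli bespoken-tools

    test:
      override:
        - bst test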

The last line, bst test, is the key one. After doing some setup (such as establishing which Node version to run and installing the ASK CLI), it actually runs the tests. Our project is set up so that every push triggers them.

Here is what our dashboard in Circle looks like for our last few runs:

[Screenshot: our CircleCI dashboard]
All green, which is great – feel free to take a look for yourself.

Code Coverage

With our unit tests in place, the next piece is code coverage. For this, we use Codecov, which, like CircleCI, is free for open-source projects. It is easy to work with and provides nice graphs and visualizations of what’s happening with your unit tests over time.
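Wiring it up is typically one extra step in circle.yml after the tests run – a sketch using Codecov’s bash uploader, assuming coverage output from bst test is enabled:

    test:
      override:
        - bst test
      post:
        # Upload the coverage report that bst test produces to Codecov
        - bash <(curl -s https://codecov.io/bash)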

Check out their interactive sunburst graph – it’s a fun way to explore unit test coverage.

[Screenshot: Codecov’s sunburst coverage graph]

End-To-End Testing

End-to-end testing is also known as integration testing, but we use the term end-to-end to distinguish it from the “typical” unit testing done via a CI system. In our case, we are deploying our code to a dev environment every time we commit to master – more on that in a moment. But before we do that deployment, we want to make sure our system as a whole is working. To that end, we use our tools in e2e mode.

What is the difference between our e2e and unit tests? Great question – the essential one is that our unit tests work with Virtual Alexa, which merely emulates Alexa by mimicking its behavior. The e2e tests use the real Alexa: they rely on the SMAPI simulation tools to interact with Alexa, which in turn calls our skill. Both are testing our skill, but in different ways.

Bespoken unit tests are for:

  • Running unit tests against code, with minimal dependencies
  • Measuring depth of testing and code coverage
  • Ensuring code is working properly

Bespoken e2e tests, on the other hand, are best for:

  • Ensuring the interaction model is configured properly (remember, it is using the real Alexa Voice Service)
  • Ensuring infrastructure (such as Dynamo and S3) is all in place and working correctly
  • Ensuring there are no speech recognition issues

We will expand on the last point in future posts, but suffice it to say, most Alexa developers have run into the situation where they designed an interaction model that looked great on paper but did not survive first contact with the “enemy”: real users speaking to real devices.

The e2e tests can help tease these issues out. And since they are part of our Alexa skill automation process, they ensure that as intents are added, as the code is enhanced, and as Alexa’s machine learning evolves, everything is still working perfectly. Awesome, right?

So, enough background – let’s look at an actual integration test:
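A sketch in the same YAML format – the configuration shown assumes the SMAPI-based simulation mode, and the early prompts are illustrative, while the final interaction is the one quoted below:

    ---
    configuration:
      locale: en-US
      # Run against the real Alexa via SMAPI simulation rather than Virtual Alexa
      type: simulation

    ---
    - test: Play a round against the real skill
    - "open guess the price": "welcome to guess the price"
    - "two": "tell us the name of player one"
    - caterina: "let's start the game: Jordi your product is * Guess the price"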

It looks pretty similar to our unit test. Not surprising – they use the same entry point, an utterance, to test with. But as just explained, under the covers it is quite different. Also note that the expectations in our e2e tests are a bit simpler, such as this line:

- caterina: "let's start the game: Jordi your product is * Guess the price"
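The asterisk is a wildcard – with the real Alexa on the other end, responses can vary slightly, so looser matches keep the e2e tests from being brittle.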

Now, let’s bring it all together.

Continuous Deployment

In CircleCI, we set up our continuous deployment to run our end-to-end tests whenever commits are made to master. Then we deploy.
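In CircleCI 1.0 terms, that is a deployment section in circle.yml – a sketch, where the e2e test pattern and script name are illustrative:

    deployment:
      development:
        branch: master
        commands:
          # Run the end-to-end suite first (the argument filters to our e2e test files)
          - bst test e2e
          # Then deploy the Lambda
          - ./deploy.sh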

Our deployment is done via a shell script, which uses the ASK CLI. Our shell script (sketched after this list):

  • Sets up the AWS credentials (from environment variables securely set in Circle)
  • Sets up the ASK credentials (again, from secure environment variables)
  • Packages the Lambda code into a zip file
  • Uploads it using the ASK CLI
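A sketch of such a script – the paths, function name, and credential handling are illustrative, and the ASK CLI invocation uses v1 syntax:

    #!/usr/bin/env bash
    set -e

    # Set up AWS credentials from environment variables securely set in Circle
    mkdir -p ~/.aws
    cat > ~/.aws/credentials <<EOF
    [default]
    aws_access_key_id = ${AWS_ACCESS_KEY_ID}
    aws_secret_access_key = ${AWS_SECRET_ACCESS_KEY}
    EOF

    # Set up the ASK CLI credentials the same way
    mkdir -p ~/.ask
    echo "${ASK_CLI_CONFIG}" > ~/.ask/cli_config

    # Package the Lambda code into a zip file
    cd lambda/custom && npm install --production && zip -rq ../../skill.zip . && cd ../..

    # Upload it using the ASK CLI (v1 syntax; exact flags may vary by CLI version)
    ask lambda upload -f guess-the-price -s skill.zip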

We use a shell script because the shell has so many handy file-manipulation tools; it lets us do all of the steps above succinctly and easily.

With that in place, our deployment is set to run automatically whenever pull requests are merged. So we know that when updates are made, a new development version will be delivered to our testers to work with right away. Everything is in sync, and we have a smooth, highly-assured build pipeline.

And what about production? We do not auto-deploy to production – a manual step is required. But it’s a simple one: just tag a release with a name like “prod-*” and it will be pushed to production. In this way, we use a manual trigger to kick off our automated workflow.
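In circle.yml, that trigger is a tag filter on the deployment section – a sketch, assuming our deploy.sh accepts a stage argument:

    deployment:
      production:
        tag: /prod-.*/
        commands:
          # Only runs when a tag matching prod-* is pushed
          - ./deploy.sh production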

Summary

We’ve gone through a lot here – unit testing, CI, CD, and end-to-end testing.

You’re well on your way to Alexa skill automation. We will be expanding on these different points in future posts – we know it is a lot to take in all at once.

Feel free to use this project as a template for creating your own highly-automated, highly-tested Alexa skill pipeline. And if you would like to go into more depth, as well as talk to the author (John Kelvie), sign up for one of our webinars as we do a deep dive on skill testing and automation – you can register here.
