December 15, 2017 in Blog

Voice Is Eating The World

TL;DR: Introducing the Human Operating System and why two billion voice devices is not the ceiling – it’s the floor.

Bespoken creates automated solutions for monitoring and testing voice experiences. We consider our job a humble but necessary one, and for people who enjoy troubleshooting and automation, one that is filled with many sublime pleasures.

But sometimes, we feel the need to step back, take a look around, and make an extremely bold proclamation. Today is just such a day – so sit down, put down any potentially stain-inflicting liquids, and brace yourself! Because we have news for you – even if you aren’t excited about the subtle pleasures of well-instrumented systems, you should be excited about voice.

Voice technology represents a revolution that is going to make the mobile install-base (2B+) seem quaint. That’s right – for voice devices, two billion is not the ceiling – it’s the floor.

Voice Technology is Everywhere

Don’t believe me? Let’s see where voice has been added already:

At Home

On-The-Go

In Healthcare

In the Government

Hospitality

Finance

Education

In the Office

You might scoff – does anyone really need a voice-enabled refrigerator? Will I ever really ask “Alexa, how old is the milk?” It doesn’t matter whether you will; the point is that you can. Even if you cross off half of the items on this list, it still leads to a far, far larger install-base than we have with phones – orders of magnitude larger. We each have one phone (and maybe Walter White had two) – in my eye-line right now, I have a TV, a thermostat, two light bulbs, and a smart speaker. That’s one room. Now count the whole house. Now picture it five years from now. The game is already over.

Voice-Addressable Devices Are Even More Ubiquitous

But perhaps you are still not convinced. Well, let’s pile on even more devices. Consider this – even if your next washer isn’t voice-enabled, will it be voice-addressable? I.e., will you be able to control it and get information about it via your voice? Again, maybe you don’t want to – but it already has an embedded microcontroller, a WiFi adapter, and an API built by some poor nerd somewhere deep in the bowels of Samsung, wondering, “when is someone going to use my software?” Well, your time has come, my friend!

(I’m not even touching on all the enabling technologies that Amazon is pumping out – like Lambda@Edge, the Gadgets API, microsecond response-time DBs, etc. But these pieces are all pulling together to make it possible for anything with a battery to play a part in this brave new connected world.)

This is where the distinction between voice-enabled and voice-addressable gets blurred. Consider two scenarios:

1) You have Echo Dots embedded in every room that control all your smart devices

2) You have devices everywhere with microphones (and microcontrollers) embedded in them

It’s a distinction without a difference – the devices will be at the tip of your tongue, available via a microphone from one of several potential providers. And it brings us to our real thesis.

Voice is the Operating System of the Internet of Things

The march of voice technology is not the march of Artificial Intelligence – it is the march of the Internet of Things. The AI is exciting, even tantalizing, but it is not the main course. The main course is that we have so many devices that are now network-enabled and API-controllable, but with no easy way to interact with them.
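For the nerds in the audience, here is a rough sketch of what "voice as the operating system" could look like in code: one router that takes a spoken command and calls the right device API, no matter which device it is. Everything in it is made up for illustration. `Device`, `VoiceRouter`, and the toy "turn on/off the X" parsing are assumptions, not how Alexa or any real smart-home API works.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Device:
    """A hypothetical network-enabled device exposing a tiny on/off API."""
    name: str
    is_on: bool = False

    def turn_on(self) -> str:
        self.is_on = True
        return f"{self.name} is on"

    def turn_off(self) -> str:
        self.is_on = False
        return f"{self.name} is off"


class VoiceRouter:
    """Hypothetical single entry point: one voice layer in front of many device APIs."""

    def __init__(self) -> None:
        self.devices: Dict[str, Device] = {}

    def register(self, device: Device) -> None:
        # Devices are looked up by lowercase name, since speech has no capitals.
        self.devices[device.name.lower()] = device

    def handle(self, utterance: str) -> str:
        # Toy parsing (an assumption): "turn on|off [the] <device name>".
        words = utterance.lower().strip(" .!?").split()
        if len(words) < 3 or words[0] != "turn" or words[1] not in ("on", "off"):
            return "Sorry, I didn't understand that"
        rest = words[2:]
        if rest[0] == "the":
            rest = rest[1:]
        name = " ".join(rest)
        if not name:
            return "Sorry, I didn't understand that"
        device = self.devices.get(name)
        if device is None:
            return f"Sorry, I can't find {name}"
        return device.turn_on() if words[1] == "on" else device.turn_off()


router = VoiceRouter()
router.register(Device("TV"))
router.register(Device("bedroom lamp"))
print(router.handle("Turn on the TV"))             # TV is on
print(router.handle("Turn off the bedroom lamp"))  # bedroom lamp is off
```

The router never cares which microphone heard you or who made the lamp. A real system would swap the string matching for proper speech recognition and intent parsing, but the shape stays the same: one front door, many devices behind it.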

Single-device or single-purpose apps are pointless – no one wants to hunt around for the app that corresponds to the particular lightbulb they chose for the corner lamp in their bedroom (which is not the same as the bulb on their bedside table).

And chatbots are more readily accessible, but still suffer from the same issue – even if you don’t have to hunt for an app, you need to find your phone. Step one, ask Alexa where it is.

The genius of Alexa is not its ability to carry on conversations, play podcasts, or tell you the average rainfall in the Amazon river basin. It is that it can turn the lights off and the TV on.

That is why Amazon cannot give away the blueprints for building voice-enabled devices fast enough. They created immense intellectual property in the first Echo, combining great speech recognition with an exceptional microphone array (the first true voice-specific device). And within months of releasing it, they were practically putting it into the public domain. Why do this?

The Human Operating System

Because Bezos knows he is playing for a far, far bigger prize than selling more dishwasher detergent, or having a big hit consumer device – those are such modest, Apple-esque aspirations. Not that Amazon minds the extra income, of course, and they have a very efficient system for picking it up. But the big prize is that Bezos gets to own the first truly ubiquitous operating system: the means by which we address all this amazing, ever-present computational power, and the tool that finally uncaps it. It turns our voice, and our very intentions, into the operating system of the world that surrounds us. Jeff sees this. As does Google (which is very much in this fight). And we do too.

And so as this revolution unfolds, we are delighted to be here, helping to keep it all running smoothly. Now take a breath, pick up your coffee, have a sip, and say “Alexa, play Tears for Fears.” “Everybody Wants to Rule the World” – nice melody, eh?

4 responses to “Voice Is Eating The World”

  1. At the beginning there was the keyboard, then the mouse, the trackpad, the touchscreen … and now the voice. The epic struggle of the humans to make these stupid boxes do something useful. I think the next is telepathy. Or the computers will just simply guess what we want.

    • From voice to telepathy – doesn’t seem like that much of a stretch. As long as the vocabulary and syntax is appropriately limited to ensure proper ATR (Automated Thought Recognition) 🙂
