At Amazon, we are heavily invested in machine learning (ML), and are developing new tools to help developers quickly and easily build, train, and deploy ML models. The power of ML is in its ability to unlock a new set of capabilities that create value for consumers and businesses. A great example of this is the way we are using ML to deal with one of the world’s biggest and most tangled datasets: human speech.
Voice-driven conversation has always been the most natural way for us to communicate. Conversations are personal and they convey context, which helps us to understand each other. Conversations continue over time, and develop history, which in turn builds richer context. The challenge was that technology wasn’t capable of processing real human conversation.
The interfaces to our digital systems have been dictated by the capabilities of our computer systems: keyboards, mice, graphical interfaces, remotes, and touch screens. Touch made things easier; it let us tap on screens to get the app that we wanted. But what if touch isn’t possible or practical? Even when it is, the proliferation of apps has created a sort of “app fatigue”. We are forced to hunt for the app that we need, and often end up not using many of the apps that we already have. None of these approaches is particularly natural. As a result, they fail to deliver a truly seamless and customer-centric experience that integrates our digital systems into our analog lives.
Voice becomes a game changer
Using your voice is powerful because it’s spontaneous, intuitive, and enables you to interact with technology in the most natural way possible. It may well be considered the universal user interface. When you use your voice, you don’t need to adapt and learn a new user interface. Voice interfaces don’t need to be application-centric, so you don’t have to find an app to accomplish the task that you want. All of these benefits make voice a game changer for interacting with all kinds of digital systems.
Until two or three years ago, we did not have the capabilities to process voice at scale and in real time. The availability of large-scale voice training data, the advances made in software with processing engines such as Caffe, MXNet, and TensorFlow, and the rise of massively parallel compute engines with low-latency memory access, such as the Amazon EC2 P3 instances, have made voice processing at scale a reality.
Today, the power of voice is most commonly used in the home or in cars to do things like play music, shop, control smart home features, and get directions. A variety of digital assistants are playing a big role here. When we released Amazon Alexa, our intelligent, cloud-based voice service, we built its voice technology on the AWS Natural Language Processing platform powered by ML algorithms. Alexa is constantly learning, and she has tens of thousands of skills that extend beyond the consumer space. But by using the stickiness of voice, we think there are even more scenarios that can be unlocked at work.
Helping more people and organizations use voice
People interact with many different applications and systems at work. So why aren’t voice interfaces being used to enable these scenarios? One impediment is the ability to manage voice-controlled interactions and devices at scale, and we are working to address this with Alexa for Business. Alexa for Business helps companies voice-enable their spaces, corporate applications, people, and customers.
To use voice in the workplace, you really need three things. First, you need a management layer, which is where Alexa for Business plays. Second, you need a set of APIs to integrate with your IT apps and infrastructure. Third, you need voice-enabled devices everywhere.
Voice interfaces are a paradigm shift, and we’ve worked to remove the heavy lifting associated with integrating Alexa voice capabilities into more devices. For example, Alexa Voice Service (AVS), a cloud-based service that provides APIs to interface with Alexa, enables products built using AVS to have access to Alexa capabilities and skills.
We’re also making it easy to build skills for the things you want to do. This is where the Alexa Skills Kit and the Alexa Skills Store can help both companies and developers. Some organizations may want to control who has access to the skills that they build. In those cases, Alexa for Business lets them create a private skill that only employees in their organization can access. In just a few months, our customers have built hundreds of private skills that help voice-enabled employees do everything from getting internal news briefings to asking what time their help desk closes.
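To make the help desk example concrete, here is a minimal sketch of what the backend of such a skill might look like as an AWS Lambda function written in Python. It handles the raw JSON request that Alexa sends to a custom skill and returns the standard response envelope. The intent name `HelpDeskHoursIntent` and the closing time are hypothetical placeholders, not taken from any real customer skill, and a production skill would more likely use the Alexa Skills Kit SDK than hand-built dictionaries.

```python
# Hypothetical values for illustration only; neither comes from a real skill.
HELP_DESK_INTENT = "HelpDeskHoursIntent"
HELP_DESK_CLOSING_TIME = "6 PM"


def build_response(text, end_session=True):
    """Wrap plain text in the JSON envelope that a custom skill returns to Alexa."""
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": text},
            "shouldEndSession": end_session,
        },
    }


def lambda_handler(event, context):
    """Entry point: route the incoming Alexa request by its type and intent name."""
    request = event.get("request", {})
    request_type = request.get("type")

    if request_type == "LaunchRequest":
        # The user opened the skill without asking anything specific.
        return build_response(
            "Welcome. Ask me when the help desk closes.", end_session=False
        )

    if request_type == "IntentRequest":
        intent_name = request.get("intent", {}).get("name")
        if intent_name == HELP_DESK_INTENT:
            return build_response(
                f"The help desk closes at {HELP_DESK_CLOSING_TIME}."
            )

    # Anything else falls through to a generic reply.
    return build_response("Sorry, I can't help with that yet.")
```

Whether such a skill is published publicly or kept private to an organization is a distribution setting rather than something in the handler code, so the same function could serve either case.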
Just as Alexa is making smart homes easier to use, it can do the same for the workplace. Alexa can control the environment, help you find directions, book a room, report an issue, or find transportation. One of the biggest applications of voice in the enterprise is in conference rooms, and we’ve built some special skills in this area to allow people to be more productive.
For example, many meetings fail to start on time. It’s usually a struggle to find the dial-in information, punch in the numbers, and enter a passcode every time a meeting starts. With Alexa for Business, the administrator can configure the conference rooms and integrate calendars with the devices. When you walk into a meeting, all you have to say is “Alexa, start my meeting”. Alexa for Business automatically knows what the meeting is from the integrated calendar, mines the dial-in information, dials into the conference provider, and starts the meeting. You can also configure Alexa for Business to automatically lower the projector screen, dim the lights, and more. People who work from home can take advantage of these capabilities as well. By using Amazon Echo in their home office and asking Alexa to start the meeting, employees who have Alexa for Business in their workplace are automatically connected to the meeting on their calendar.
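The “start my meeting” flow described above (find the current meeting on the calendar, then mine its dial-in details) can be illustrated with a short Python sketch. This is not how Alexa for Business is implemented, which the article does not describe. The event dictionary layout, the function names, and the regular expressions are all assumptions made for the example.

```python
import re
from datetime import datetime

# Assumed patterns: a phone-like run of digits, and a numeric code that
# follows a keyword such as "passcode". Real invitations vary widely.
DIAL_IN_PATTERN = re.compile(r"(\+?\d[\d\-\s().]{6,}\d)")
PASSCODE_PATTERN = re.compile(
    r"\b(?:passcode|pin|conference id)\b\D*(\d{4,})", re.IGNORECASE
)


def find_current_meeting(events, now):
    """Return the first event whose time window contains `now`, else None."""
    for event in events:
        if event["start"] <= now < event["end"]:
            return event
    return None


def extract_dial_in(description):
    """Pull a dial-in number and optional passcode out of free-form text."""
    phone = DIAL_IN_PATTERN.search(description)
    if phone is None:
        return None
    passcode = PASSCODE_PATTERN.search(description)
    return {
        "number": phone.group(1).strip(),
        "passcode": passcode.group(1) if passcode else None,
    }


def start_my_meeting(events, now):
    """Combine both steps: locate the meeting, then mine its dial-in details."""
    meeting = find_current_meeting(events, now)
    if meeting is None:
        return None
    return extract_dial_in(meeting["description"])


# Hypothetical calendar entry for the room.
room_calendar = [
    {
        "start": datetime(2018, 3, 13, 10, 0),
        "end": datetime(2018, 3, 13, 11, 0),
        "description": "Weekly sync. Dial-in: +1 555-010-0199, passcode: 482913#",
    }
]
details = start_my_meeting(room_calendar, datetime(2018, 3, 13, 10, 5))
```

In this sketch the hard part is the free-text parsing. A real system would probably lean on structured conference metadata from the calendar provider where it exists, and fall back to pattern matching only when it does not.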
Voice interfaces will really hit their stride when we begin to see more voice-enabled applications. Today, Alexa can interact with many corporate applications including Salesforce, Concur, ServiceNow, and more. IT developers who want to take advantage of voice interfaces can enable their custom apps using the Alexa Skills Kit, and make their skills available just for their organization. There are a number of agencies and SIs that can help with this, and there are code repositories with code examples for AWS services.
We’re seeing a lot of interesting use cases with Alexa for Business from a wide range of companies. Take WeWork, a provider of shared workspaces and services. WeWork has adopted Alexa, managed by Alexa for Business, in their everyday workflow. They have built private skills for Alexa that employees can use to reserve conference rooms, file help tickets for their community management team, and get important information on the status of meeting rooms. Alexa for Business makes it easy for WeWork to configure and deploy Alexa-enabled devices, and the Alexa skills that they need to improve their employees’ productivity.
The next generation of corporate systems and applications will be built using conversational interfaces, and we’re beginning to see this happen with customers using Alexa for Business in their workplace. Want to learn more? If you are attending Enterprise Connect in Orlando next week, I encourage you to attend the AWS keynote on March 13 given by Collin Davis. Collin’s team has focused on helping customers use voice to manage everyday tasks. He’ll have more to share about the advances we’re seeing in this space, and what we’re doing to help our customers be successful in a voice-enabled era.
When it comes to enabling voice capabilities at home and in the workplace, we’re here to help you build.