Using the Actions SDK to Develop for the Google Assistant

In today's post I'm going to show how to develop a simple app for the Google Assistant. For developing this app, I will be using the Actions SDK. My next post will use Dialogflow (formerly api.ai) instead. After reading both posts, you will hopefully know enough to decide which approach is better suited for you.

About the app

In my home town, I usually take the bus when taking our youngest to kindergarten before heading off to work. In winter, and also due to some long-lasting street works, one of the two lines near our home is late nearly all the time - but, naturally, unreliably so. So every morning I wonder when to leave for the bus. We do have an app for bus information in Münster, but some limitations of this app make it unsuitable for the task.

Thus I'm going to build an assistant app that tells me when the next buses are due and whether we should hurry up or can idle around a bit longer. This post covers the first version, built with the Actions SDK; the next post will present an improved version built with Dialogflow.

Here's a sketch of what a conversation with this app should look like:

Sample dialog of the app
Sample dialog of the app - Made with botsociety.io

On the right is what I'm going to say. On the left are the assistant's answers. The first one comes from the assistant itself - I changed its color to emphasize the difference. The second and third are responses from the app. After the last response the app finishes - and with it the current assistant action. Thanks to botsociety.io, you can even see this prototype showcased on their website.

What is the Actions SDK

The Actions SDK is one way to develop apps for the Google Assistant. As the SDK's name implies, you create actions, which pair snippets of user speech with your responses to them. Let's go a bit deeper into the base concepts here.

The relevant things you need to know are:

  • Actions
  • Intents
  • Fulfillment

Actions

An action is a wrapper combining an intent and the fulfillment for this intent. For a truly conversational app, where there's a dialog between the user and the assistant, apps have multiple actions. But often it makes sense to create a one-shot app. In this case the app has only one action and answers directly with the response the user is looking for. The sample app would work well as a one-shot app. I only added one more action to make it a better sample for this post.

Intents

We already know intents from Android. And with Actions on Google they are not much different. An intent specifies what the user intends your app to do.

In contrast to Android, though, you are very limited when it comes to intents with the Actions SDK.

Every app has to define an actions.intent.MAIN intent, which is used to start your app if no custom intent is better suited to the invocation phrase. Depending on the kind of app, this intent might directly provide the answer and then stop the app, or it might introduce the app to the user in some way.

For the start of your app you can also define custom intents, together with the user utterances that trigger them. You can even add parameters to those utterances (for example numbers or a date). If the Assistant detects that the user used one of those phrases, it starts your app with this intent.
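To make this more concrete, here is a sketch of what an action package (action.json) with a MAIN intent and one custom intent could look like. The app name, intent names, query patterns, and fulfillment URL are made up for illustration, and the exact schema may differ between SDK versions:

```json
{
  "actions": [
    {
      "description": "Default welcome action",
      "name": "MAIN",
      "fulfillment": { "conversationName": "bus_times" },
      "intent": { "name": "actions.intent.MAIN" }
    },
    {
      "description": "Directly ask for the next bus of a line",
      "name": "NEXT_BUS",
      "fulfillment": { "conversationName": "bus_times" },
      "intent": {
        "name": "com.example.bus.NEXT_BUS",
        "trigger": {
          "queryPatterns": [
            "when does the next bus leave",
            "when does line $SchemaOrg_Number:line leave"
          ]
        }
      }
    }
  ],
  "conversations": {
    "bus_times": {
      "name": "bus_times",
      "url": "https://example.com/assistant/fulfillment"
    }
  }
}
```

Both actions point to the same conversation, so the same fulfillment endpoint handles either entry point.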

If you were to build a more complex app with the Actions SDK, you would use one intent most of the time: actions.intent.TEXT. Meaning: You have one intent that has to deal with nearly all possible user input. And thus your app would have to decide what to do based on the text spoken or typed in by the user.

There are a few more intents available, for example actions.intent.OPTION, which is useful if you present a set of options to the user, of which she might select one. But when dealing with user input, most of the time you will have to handle actions.intent.TEXT.
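To get a feel for what this means in practice, here is a minimal sketch of the kind of routing your fulfillment would have to do on raw actions.intent.TEXT input. The keyword matching, action names, and reply texts are my own inventions, not part of the SDK:

```javascript
// Naive routing of raw user text, as you'd have to do when almost
// everything arrives via actions.intent.TEXT. A real app would need
// proper natural language processing here.
function routeUtterance(rawText) {
  const text = rawText.toLowerCase();
  if (/\b(bye|stop|cancel)\b/.test(text)) {
    return { action: 'quit', speech: 'Okay, see you at the bus stop!' };
  }
  if (text.includes('next bus')) {
    return { action: 'nextBus', speech: 'The next bus leaves in 7 minutes.' };
  }
  // Fallback: we could not make sense of the input.
  return { action: 'fallback', speech: "Sorry, I didn't get that." };
}

console.log(routeUtterance('When is the next bus?').action); // "nextBus"
console.log(routeUtterance('bye').action); // "quit"
```

Even this toy example shows the problem: a handful of regexes and substring checks break down quickly (note how "stop" already collides with "bus stop"), which is exactly why Dialogflow takes over this part in the next post.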

This limitation is why I wrote in my intro post about the Assistant that the Actions SDK is mostly useful when you are proficient with natural language processing or have a simple project with a command-like interface.

Unless you have some language processing super powers it boils down to this:

⇨ Use the Actions SDK for one-shot apps. These are apps that provide the required answer directly after being invoked and then stop. They give you just this one result. A typical example would be setting a timer to ten minutes.

⇨ For all other apps - those that are truly conversational, where there are multiple paths to follow and where you want your user to provide more information during the conversation - I recommend using Dialogflow (api.ai) instead of the Actions SDK. I will cover Dialogflow in my next post.

Fulfillment

Fulfillment is just another word for your backend. Every action must be backed by some backend of yours. This is where you generate the text spoken to the user. Or — on devices that have a graphical user interface — that's where you create visual cues for your user.

Of course the response is based on which intent was triggered and what additional cues the user gave in her question.

When Google calls your backend, it sends a JSON payload that contains plenty of information, as you will see later in this post. Some of the things you get access to:

  • The spoken text as Google has detected it
  • The intent invoked
  • The parameters to the intent
  • The session id
  • Information about the device's capabilities

In this post I'm only going to make use of some of them, but I'm going to cover more aspects in future posts.
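As a rough, hand-written sketch, a request payload carrying the items listed above might look something like this. The field names follow the conversation webhook format as I understand it, and all values are placeholders, so treat this as an illustration rather than a reference:

```json
{
  "user": { "userId": "some-opaque-user-id" },
  "conversation": {
    "conversationId": "1234567890",
    "type": "NEW"
  },
  "inputs": [
    {
      "intent": "actions.intent.MAIN",
      "rawInputs": [
        { "inputType": "VOICE", "query": "talk to my bus app" }
      ],
      "arguments": []
    }
  ],
  "surface": {
    "capabilities": [
      { "name": "actions.capability.AUDIO_OUTPUT" },
      { "name": "actions.capability.SCREEN_OUTPUT" }
    ]
  }
}
```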

The next picture shows where your app - or more precisely, your fulfillment code - fits into the overall picture. The Assistant knows which app to call based on the invocation phrase used by the user. It also provides voice recognition of the user's utterances and text-to-speech transformation of your answers to the user. Your app then takes the recognized text and creates an appropriate response to it.
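The response side can be sketched similarly. The helper below is my own, again hand-written from my understanding of the webhook format rather than taken from any SDK; it only illustrates the key decision every response has to make - whether the conversation continues or ends:

```javascript
// Builds the JSON body the fulfillment sends back to the Assistant.
// If expectUserResponse is true, the mic stays open for a follow-up
// (and we declare we can handle free text next); otherwise the app -
// and with it the action - ends after speaking.
function buildResponse(textToSpeech, expectUserResponse) {
  if (expectUserResponse) {
    return {
      expectUserResponse: true,
      expectedInputs: [{
        inputPrompt: {
          richInitialPrompt: {
            items: [{ simpleResponse: { textToSpeech } }]
          }
        },
        possibleIntents: [{ intent: 'actions.intent.TEXT' }]
      }]
    };
  }
  return {
    expectUserResponse: false,
    finalResponse: {
      richResponse: { items: [{ simpleResponse: { textToSpeech } }] }
    }
  };
}

console.log(JSON.stringify(
  buildResponse('The next bus leaves in 7 minutes.', false), null, 2));
```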

Action SDK workflow
Action SDK workflow - picture © Google, Inc.

Preparations

When you want to use Actions on Google your account must be prepared for that. You have to do three things:

  1. Create a project within the Actions on Google Dev Console
  2. Install the gactions command line tool
  3. Initialize the project using the gactions tool

In the next paragraphs I'm going to cover those steps.

Create a project within the Actions on Google Dev Console

The very first step is to use the Dev Console to create a project. If it's your first project, check the terms of service - since by clicking the "Create Project" button you agree to adhere to them:

Creating a project on the dev console
Creating a project on the Actions on Google Dev Console

Afterwards you have to decide which API to use to build your project:

Actions on Google API / SDK selection
Actions on Google API / SDK selection

As you can see, there are three prominent ways to develop apps: the Actions SDK, or one of the two services, Dialogflow and Converse AI. If you click on "Actions SDK", you will see a tiny popup where you can copy a gactions command. So the next step is to install the gactions command line tool.

Install the gactions command line tool

You can download the gactions tool from Google's site. It's a binary file you can store wherever you see fit. Make sure to make this file executable (if you're using Linux or Mac OS X) and - since you're going to use this tool a lot - I recommend adding it to your PATH.

If you already have this tool installed, you can use the selfupdate option to update it:


gactions selfupdate

That's all that is needed in this step.
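For orientation, the invocations you will need once the tool is installed look roughly like this. The project ID and file name are placeholders, and the exact flags may differ between gactions versions, so check `gactions --help` against your copy:

```shell
# Make the downloaded binary executable (Linux / Mac OS X)
chmod +x gactions

# Push an action package to your project
gactions update --action_package action.json --project my-project-id

# Put the app into the test state so you can try it in the simulator
gactions test --action_package action.json --project my-project-id
```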