How to develop an Alexa Skill

Skills are like apps for Alexa. With an interactive voice interface, Alexa gives users a hands-free way to interact with your skill. Users can use their voice to perform everyday tasks like checking the news, listening to music, or playing a game. Users can also use their voice to control cloud-connected devices.

Alexa Skills Kit

The Alexa Skills Kit is a collection of self-service APIs, tools, documentation, and code samples that make it fast and easy to add custom skills to Alexa.

Developing a skill for Alexa has two main parts. First, you define the input of the skill, that is, how the user will interact with it. This interaction has a few components.

Skill Invocation Name

Users say an invocation name to begin an interaction with a particular skill. For example, the user can say “Alexa, ask Temperature Lookup for the temperature in Berlin”. This invokes the skill Temperature Lookup. The user can also invoke it by saying “Alexa, open Temperature Lookup”.

There can be multiple ways to invoke the skill and pass information to it. If the information the skill requires is not provided in the initial invocation, the skill prompts the user for it.

Intents

An intent is a command or task to execute, such as “start cleaning”, “tell me the weather”, or “play this song for me”. Every intent is supposed to resolve to one unique action. A skill can have multiple intents, meaning it can do multiple things. Intents can also be chained. If you ask “Tell me the weather currently in Berlin”, Alexa responds with the information. You can then ask “How is the weather there tomorrow?” without specifying the city name, and the skill can treat this as a continuation of the previous intent.

Sample utterances and Slots

An intent can be invoked in multiple ways, for example “Tell me Berlin temperature”, “Tell me temperature of Berlin”, or “What is the temperature in Berlin”, and of course many more. These are called sample utterances, and they all invoke the same intent.

Berlin here is referred to as a slot, or fill-in-the-blank, which the Alexa service parses out and provides to us as a variable input. Alexa understands multiple types of slots, such as numbers, dates, durations, phone numbers, names of famous people, and geographical areas, as well as custom values.
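To make the pieces above concrete, here is a rough sketch of how the invocation name, intent, sample utterances, and slot could be described together. It is written as a Python dictionary that mirrors the JSON interaction model of a skill. The slot type AMAZON.City and the helper slot_names are assumptions made for illustration, so check the slot type reference before relying on them.

```python
import json

# Hypothetical interaction model for the "Temperature Lookup" example.
# The built-in slot type "AMAZON.City" is an assumption; a custom slot
# type listing your own city names would work the same way.
interaction_model = {
    "interactionModel": {
        "languageModel": {
            "invocationName": "temperature lookup",
            "intents": [
                {
                    "name": "TemperatureIntent",
                    "slots": [{"name": "CityName", "type": "AMAZON.City"}],
                    # {CityName} marks the blank that Alexa fills in.
                    "samples": [
                        "tell me {CityName} temperature",
                        "tell me temperature of {CityName}",
                        "what is the temperature in {CityName}",
                    ],
                }
            ],
        }
    }
}


def slot_names(model, intent_name):
    """Return the slot names declared for the given intent (helper for illustration)."""
    intents = model["interactionModel"]["languageModel"]["intents"]
    for intent in intents:
        if intent["name"] == intent_name:
            return [slot["name"] for slot in intent.get("slots", [])]
    return []


if __name__ == "__main__":
    # Print the JSON form of the model.
    print(json.dumps(interaction_model, indent=2))
```

All three sample utterances map to the same TemperatureIntent, and the text matched by {CityName} arrives in the request as the slot value.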

{
  "session": {
    "new": false,
    "sessionId": "someSessionId",
    "attributes": {},
    "user": {
      "userId": "userId"
    },
    "application": {
      "applicationId": "skillId"
    }
  },
  "version": "1.0",
  "request": {
    "intent": {
      "slots": {
        "CityName": {
          "name": "CityName",
          "value": "Berlin"
        }
      },
      "name": "TemperatureIntent"
    },
    "type": "IntentRequest",
    "requestId": "requestId1234"
  }
}

The request part contains the name of the intent being invoked, and the slots part contains the information that was parsed out of the utterance.

The session attributes are used to carry information from one interaction to the next, so that context is preserved between utterances when needed.
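As a minimal sketch of how session attributes can carry context, the snippet below remembers the last city and falls back to it when a follow-up request arrives without a CityName value. The attribute key lastCity and both function names are hypothetical. The sketch assumes the response may carry a top-level sessionAttributes object that Alexa sends back in the next request of the same session.

```python
def resolve_city(event):
    """Return the city from the current slot, else from the session attributes."""
    slots = event["request"]["intent"].get("slots", {})
    city = slots.get("CityName", {}).get("value")
    if city is None:
        # "lastCity" is a hypothetical key we stored on an earlier turn.
        attributes = event["session"].get("attributes") or {}
        city = attributes.get("lastCity")
    return city


def build_response(speech, session_attributes):
    """Build a response that keeps the session open and stores attributes."""
    return {
        "version": "1.0",
        "sessionAttributes": session_attributes,
        "response": {
            "outputSpeech": {"type": "PlainText", "text": speech},
            "shouldEndSession": False,  # keep listening for a follow-up
        },
    }


if __name__ == "__main__":
    first_turn = {
        "session": {"new": True, "attributes": {}},
        "request": {
            "type": "IntentRequest",
            "intent": {
                "name": "TemperatureIntent",
                "slots": {"CityName": {"name": "CityName", "value": "Berlin"}},
            },
        },
    }
    city = resolve_city(first_turn)
    print(build_response(f"Looking up {city}.", {"lastCity": city}))
```

On the next turn the request would contain "attributes": {"lastCity": "Berlin"}, so a question such as “how is the weather there tomorrow” can be resolved without the user repeating the city.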

Server side options

This is the second part. The response can be generated by a Lambda function hosted on AWS or by a REST service running in your own environment. In the Alexa developer portal, you configure the destination for the request shown above. Templates are available for handling the request.

Lambda

The Lambda option is straightforward: you only need to provide the Lambda function's ARN to the skill to do the mapping.

exports.handler = function(event, context) {
    var request = event.request;
    var session = event.session; // carries attributes from one turn to the next

    if (request.type === "LaunchRequest") {
      handleLaunchRequest(context); // no action, just prompt the user for more info

    } else if (request.type === "IntentRequest") {

      if (request.intent.name === "TemperatureIntent") {
        handleTemperatureIntent(request, context);

      } else if (request.intent.name === "WeatherIntent") {
        handleWeatherIntent(request, context, session);

      } else {
        throw new Error("Unknown intent");
      }
    } else if (request.type === "SessionEndedRequest") {
      context.succeed(); // session is over; there is nothing to say back
    } else {
      throw new Error("Unknown request type");
    }
}

All you then have to do is formulate a response and send it back.

function handleTemperatureIntent(request,context) {
  let cityName = request.intent.slots.CityName.value;
  let temperature = getTemperature(cityName);

  var response = {
    version: "1.0",
    response: {
      outputSpeech: {
        type: "SSML",
        // backticks make this a template literal, so ${...} is interpolated
        ssml: `<speak>The temperature in ${cityName} is ${temperature}.</speak>`
      }
    }
  };

  context.succeed(response);
}

Web service

If you want to host your own service to provide the response, it must be served over HTTPS and follow this format:

from flask import Flask, request

app = Flask(__name__)

@app.route('/alexa_end_point', methods=['POST'])
def alexa():
    event = request.get_json()
    req = event['request']

    if req['type'] == 'LaunchRequest':
        return handle_launch_request()

    elif req['type'] == 'IntentRequest':

        if req['intent']['name'] == 'TemperatureIntent':
            return handle_temperature_intent(req)

        elif req['intent']['name'] == 'WeatherIntent':
            return handle_weather_intent(req)

        else:
            return "", 400  # unknown intent

    elif req['type'] == 'SessionEndedRequest':
        return "", 200  # session is over; nothing to say back

    return "", 400  # unknown request type
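The handler functions called above are not shown, so here is one way they might look. This is only a sketch under a few assumptions: get_temperature is a hypothetical stand-in for a real weather lookup, build_speech_response is a helper invented for this example, and the handlers return plain dictionaries, which recent Flask versions (1.1 and later) serialize to JSON automatically. It also prompts the user when the city is missing, as described in the invocation section.

```python
from html import escape


def get_temperature(city_name):
    """Hypothetical stand-in for a real weather lookup."""
    return "21 degrees Celsius"


def build_speech_response(ssml, end_session=True):
    """Wrap SSML text in the response structure Alexa expects."""
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "SSML", "ssml": ssml},
            "shouldEndSession": end_session,
        },
    }


def handle_launch_request():
    # No action yet: ask for the missing information and keep the session open.
    return build_speech_response(
        "<speak>Which city would you like the temperature for?</speak>",
        end_session=False,
    )


def handle_temperature_intent(req):
    slots = req["intent"].get("slots", {})
    city_name = slots.get("CityName", {}).get("value")
    if not city_name:
        # The slot was not filled, so prompt the user for the city.
        return handle_launch_request()
    temperature = get_temperature(city_name)
    # Escape the slot value so characters like "&" do not break the SSML.
    return build_speech_response(
        f"<speak>The temperature in {escape(city_name)} is {temperature}.</speak>"
    )


if __name__ == "__main__":
    req = {
        "type": "IntentRequest",
        "intent": {
            "name": "TemperatureIntent",
            "slots": {"CityName": {"name": "CityName", "value": "Berlin"}},
        },
    }
    print(handle_temperature_intent(req))
```

Keeping the response building in one helper means the Lambda and the web service variants can share the same structure, and only the transport differs.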

Easy..!!

Rarely do we find men who willingly engage in hard, solid thinking. There is an almost universal quest for easy answers and half-baked solutions. Nothing pains some people more than having to think. – Martin Luther King, Jr.

Cheers – Amit Tomar