Part 1 of this blog post described how to get started with the Bing Speech API. This part describes how to enhance the speech-to-text capabilities with intent detection. If you haven’t already, I recommend that you read part 1.
Microsoft Cognitive Services are a new set of cloud APIs that can be used from practically any technical platform (Android apps, iOS apps, Windows apps, websites, etc.) to add capabilities such as natural language processing, speech, vision and knowledge exploration. Like much of artificial intelligence, these tasks are easy for humans but difficult for computers. Now you can use the results of Microsoft Research, backed by powerful cloud computing, to bring these capabilities to your own applications.
Language Understanding Intelligent Service (LUIS)
To enable intent and keyword detection in the Bing Speech API, you will need to use another service: LUIS. LUIS is a service that helps your application extract meaning from text. It works in three steps:
- You build a model of intents and entities.
- You train the model (for instance by typing in text examples).
- You can use the trained model on any text.
Intents
Intents can be seen as commands. For example, the sentence “I want a beer” in a restaurant means that the intent is to place an order. If the customer then says “I want another one”, the intent is the same again.
Entities
Entities are the subjects referenced by intents. In the first example, the entity is “beer”. In the second example, it is “another one”, which implicitly means beer. It is up to your application to translate the entity “another one” to “beer”.
Entities can also be of some prebuilt types, like numbers, datetimes, encyclopedia items, etc.
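Translating “another one” back to “beer” is application logic, not something LUIS does for you. Here is a minimal sketch of one way to do it, assuming the application simply remembers the last concrete item that was ordered. The class name, the method name and the list of back-reference phrases are all hypothetical and only serve as an illustration.

```python
from typing import Optional

# Hypothetical phrases that point back at a previously ordered item.
BACK_REFERENCES = {"another one", "another", "one more", "the same"}


class OrderContext:
    """Remembers the last concrete item so vague entities can be resolved."""

    def __init__(self) -> None:
        self.last_item: Optional[str] = None

    def resolve(self, entity: str) -> Optional[str]:
        """Return the concrete item for an entity, or None if it is unknown."""
        text = entity.strip().lower()
        if text in BACK_REFERENCES:
            # A back-reference only has a meaning after an earlier order.
            return self.last_item
        # Anything else is treated as a concrete item and remembered.
        self.last_item = text
        return text


context = OrderContext()
first = context.resolve("beer")
second = context.resolve("another one")
print(first, second)
```

A real application would probably keep one such context per customer or conversation, and decide what to do when a back-reference arrives before any order has been placed.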
Getting a LUIS API key
First you will need an API key for LUIS, which you can request from the Azure Portal. Click New -> Intelligence & Analytics -> Cognitive Services APIs.
Choose API type: Language Understanding Intelligent Service (LUIS). Also enter an account name and complete the remaining fields. You can select the free pricing tier F0.
Your account should be created within a few minutes. Open it in the All resources list and select Keys. Copy any of the keys.
Building the LUIS model
Go to the LUIS website, click on Sign In and then on New App.
The endpoint key can be entered later. Just choose a name for your application and click Create.
You will now arrive at the dashboard page, where you can edit your LUIS application (your model). Note the application ID. You will need it later.
- Register a couple of intents and entities. Enter some sample texts for all intents.
- Go to the Train & Test page. Click on Train. Test your model by typing some sentences.
- Go to the Publish App page. Add the key you received from the Azure Portal. Click the Publish button.
Connecting to the LUIS application from the Speech API
You can now provide the Bing Speech API with the application ID and subscription key for your LUIS application (model). The application ID can be found on the application dashboard in LUIS (or simply in the URL of your LUIS application). The subscription key comes from the Azure Portal and is the one you registered on the Publish App page.
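The Speech API passes these two values on to LUIS for you. If you want to check them independently of speech recognition, one option is to query the published model directly over HTTP. The sketch below only builds the request URL and does not send anything. The host name, region and parameter names are assumptions based on the LUIS v2.0 REST endpoint, so compare them with the endpoint URL shown for your own application before relying on them. The function name and the placeholder values are hypothetical.

```python
from urllib.parse import urlencode

# Assumed endpoint shape; the region and version may differ for your app.
LUIS_BASE_URL = "https://westus.api.cognitive.microsoft.com/luis/v2.0/apps"


def build_luis_url(app_id: str, subscription_key: str, query: str) -> str:
    """Build a query URL for a published LUIS application."""
    # urlencode escapes the query text so it is safe to put in a URL.
    params = urlencode({"subscription-key": subscription_key, "q": query})
    return f"{LUIS_BASE_URL}/{app_id}?{params}"


url = build_luis_url("YOUR-APP-ID", "YOUR-KEY", "Tell me about Bruce Springsteen")
print(url)
```

Opening the resulting URL in a browser, with your real application ID and key filled in, should return a JSON document if both values are valid.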
In the demo C# application on GitHub, there is nowhere in the user interface to enter the application ID and subscription key. Instead, they are set in a configuration file.
To locate the setting, search the source code files for “appid” using a case-insensitive search. You will find it in app.config.
Enter your own application ID and subscription key. Now you can start the program again and enable intent detection.
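An easy mistake here is to leave one of the placeholder values in place. As a rough illustration, the following sketch reads the appSettings entries from a configuration file of this kind and reports the ones that still look unfilled. The key names, the sample values and the rule that a placeholder starts with “your” are all assumptions made for this example, so check them against the actual app.config in the demo project.

```python
import xml.etree.ElementTree as ET

# Hypothetical app.config content; the real key names may differ.
SAMPLE_CONFIG = """<configuration>
  <appSettings>
    <add key="LuisAppID" value="yourLuisAppID" />
    <add key="LuisSubscriptionID" value="0123456789abcdef" />
  </appSettings>
</configuration>
"""


def read_app_settings(xml_text: str) -> dict:
    """Return the key/value pairs found under <appSettings>."""
    root = ET.fromstring(xml_text)
    return {
        node.get("key"): node.get("value")
        for node in root.findall("./appSettings/add")
    }


def unfilled_settings(settings: dict) -> list:
    """Return keys whose values are empty or still look like placeholders."""
    return sorted(
        key for key, value in settings.items()
        if not value or value.lower().startswith("your")
    )


settings = read_app_settings(SAMPLE_CONFIG)
missing = unfilled_settings(settings)
print(missing)
```

In this sample only the application ID is reported, because the subscription key has already been replaced with a made-up value.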
Finally, you can test the application again with intent detection enabled.
Not only will it detect that the intent is “tellmeAbout”, but it will also recognize that Bruce Springsteen is of the “encyclopedia” entity type and is known as a musical artist, film actor and person.
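To act on such a result in your own code, you would typically pick out the top intent and group the entities by their text. The sketch below does this, assuming the result arrives as JSON text shaped like a LUIS v2.0 reply. The sample response is hand-written for illustration: the field names, the entity type names and the score are assumptions, not output captured from the service.

```python
import json

# Hypothetical response, shaped like a LUIS v2.0 reply (values are invented).
SAMPLE_RESPONSE = """
{
  "query": "tell me about bruce springsteen",
  "topScoringIntent": {"intent": "tellmeAbout", "score": 0.97},
  "entities": [
    {"entity": "bruce springsteen", "type": "builtin.encyclopedia.people.person"},
    {"entity": "bruce springsteen", "type": "builtin.encyclopedia.music.artist"},
    {"entity": "bruce springsteen", "type": "builtin.encyclopedia.film.actor"}
  ]
}
"""


def summarize(response_text: str):
    """Return the top intent name and a mapping of entity text to its types."""
    data = json.loads(response_text)
    intent = data.get("topScoringIntent", {}).get("intent")
    entities = {}
    for item in data.get("entities", []):
        # The same text can be reported several times, once per matching type.
        entities.setdefault(item["entity"], []).append(item["type"])
    return intent, entities


intent, entities = summarize(SAMPLE_RESPONSE)
print(intent)
for text, types in entities.items():
    print(text, "->", ", ".join(types))
```

Grouping by entity text makes it easy to see that one phrase matched several types, as in the Bruce Springsteen example above.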