The year 2020 is full of surprises – me rummaging through Siri documentation is one of the smaller ones. I had an idea for a new feature in one of my iOS apps, and it quickly turned into an expedition into all things Siri. To save you from embarking on a similar journey, I thought I’d share what I’ve learned, and the missteps along the way, using a real-world app as a case study.
Back in 2010, I released NumberRace, a fast-paced version of the Numbers Game round from mid-afternoon UK game show Countdown. Over the years I’ve intermittently added new features as time allows, including support for Auto Layout and Apple Watch.
This year, during The Situation We Find Ourselves In, I revived the app and finally got round to implementing a solver. Given a target number and six initial smaller numbers, the solver would attempt to find a way of reaching the target number by adding, subtracting, multiplying and dividing the initial smaller numbers.
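As an aside, the core of such a solver can be sketched in a few lines of Swift. This is a naive brute-force version, not necessarily how NumberRace does it: at each step it picks two numbers from the pool, combines them with one of the four operations (keeping results whole and positive, as the game rules require), and recurses with the result substituted back into the pool.

```swift
// A naive brute-force sketch of the numbers-game solver described above.
func solve(numbers: [Int], target: Int) -> String? {
    // Each pool entry pairs a value with the expression that produced it.
    func search(_ pool: [(value: Int, expr: String)]) -> String? {
        if let hit = pool.first(where: { $0.value == target }) {
            return hit.expr
        }
        guard pool.count > 1 else { return nil }
        for i in 0..<pool.count {
            for j in 0..<pool.count where i != j {
                let (a, b) = (pool[i], pool[j])
                var candidates = [
                    (value: a.value + b.value, expr: "(\(a.expr) + \(b.expr))"),
                    (value: a.value * b.value, expr: "(\(a.expr) × \(b.expr))")
                ]
                // Subtraction must stay positive; division must be exact.
                if a.value > b.value {
                    candidates.append((value: a.value - b.value, expr: "(\(a.expr) − \(b.expr))"))
                }
                if b.value > 0 && a.value % b.value == 0 {
                    candidates.append((value: a.value / b.value, expr: "(\(a.expr) ÷ \(b.expr))"))
                }
                // Remove the two numbers we used, then try each combination.
                var rest = pool
                rest.remove(at: max(i, j))
                rest.remove(at: min(i, j))
                for candidate in candidates {
                    if let solution = search(rest + [candidate]) {
                        return solution
                    }
                }
            }
        }
        return nil
    }
    return search(numbers.map { (value: $0, expr: "\($0)") })
}
```

A full six-number game explores a lot of states, so a production solver would want pruning or memoisation, but this is enough to illustrate the idea.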
With the solver in place, inspiration struck. Wouldn’t it be great if Siri could take you through the solving process interactively? Siri could ask for the target number and the initial numbers, and read out the solution. A few weeks ago I started to investigate how to achieve this goal.
I usually think of Siri as ‘that thing that replies when you talk to your phone’, but it may be more helpful to consider Siri as ‘a virtual assistant that helps make actions available when you need them’. There are a number of different features accessible through the set of developer APIs known as SiriKit, not all of them voice-related. Exploring some of these features may help us towards our goal. Here goes.
A shortcut is a link to an action (or, in Apple terminology, an intent) that can be handled by your app. In the case of NumberRace, we could create a ‘Solve Game’ intent that presents the solver immediately to the user. A user can add this intent in the Shortcuts app and use it to open the solver. Furthermore, a user can associate a shortcut with a spoken phrase in the Shortcuts app so that the solver can be presented by saying ‘Hey Siri, Solve Game’. Or even ‘Hey Siri, Do a Clever Number Thing’, if the user chooses a flashier spoken phrase to open the shortcut.
But all this assumes that the user is familiar enough with the Shortcuts app. How else can we make our solver more accessible to the user?
As a developer, you can configure your app to notify Siri that a user has carried out a particular activity. These notifications are called donations. In our NumberRace app, for example, every time a user taps our ‘Solve Game’ button, the app can donate a ‘Solve Game’ intent to Siri. Siri keeps track of these donations and works out whether there’s a regular pattern to them. Siri can then predict when the user is likely to make use of our intents and present this as a suggestion on the lock screen or the search screen. Our hypothetical user may be a fan of Countdown and watch the show at the same time every weekday. In that case, Siri may make the ‘Solve Game’ suggestion available on the user’s lock screen at the time when Countdown is broadcast.
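In code, a donation is only a few lines. Here’s a sketch, assuming an intent definition file that generates a `SolveGameIntent` class – the class name and invocation phrase are illustrative:

```swift
import Intents

// Donate a 'Solve Game' intent each time the user opens the solver,
// so Siri can learn the pattern and surface a suggestion.
func donateSolveGameIntent() {
    let intent = SolveGameIntent()  // assumed to be generated from the intent definition file
    intent.suggestedInvocationPhrase = "Solve game"

    let interaction = INInteraction(intent: intent, response: nil)
    interaction.donate { error in
        if let error = error {
            print("Failed to donate intent: \(error.localizedDescription)")
        }
    }
}
```

We’d call this from the ‘Solve Game’ button’s action, alongside presenting the solver itself.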
Turning suggestions into shortcuts
Can we convert suggestions into shortcuts? That is, can we attach a custom spoken phrase to a suggestion so that we can access it through Siri’s voice interface? The answer is yes, but the way of doing this is somewhat cumbersome.
A developer can add an ‘Add to Siri’ button to the app’s UI. Tapping this button prompts the user for a spoken phrase and, once one is provided, creates a shortcut. For my app, I could add an ‘Add to Siri’ button to the solver screen to take the user through the process of adding the Solve Game intent as a shortcut. I’m not sure that adding the button, at a cost of some screen real estate and added complexity to the UI, is worth the trouble.
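For completeness, wiring up the button looks something like this – again assuming the generated `SolveGameIntent` class, and with the view controller name purely illustrative:

```swift
import UIKit
import IntentsUI

// Adds an 'Add to Siri' button to the solver screen. The button supplies
// view controllers for adding or editing the voice shortcut, which we present.
class SolverViewController: UIViewController, INUIAddVoiceShortcutButtonDelegate {

    override func viewDidLoad() {
        super.viewDidLoad()

        let button = INUIAddVoiceShortcutButton(style: .blackOutline)
        button.shortcut = INShortcut(intent: SolveGameIntent())
        button.delegate = self

        button.translatesAutoresizingMaskIntoConstraints = false
        view.addSubview(button)
        NSLayoutConstraint.activate([
            button.centerXAnchor.constraint(equalTo: view.centerXAnchor),
            button.bottomAnchor.constraint(equalTo: view.safeAreaLayoutGuide.bottomAnchor,
                                           constant: -20)
        ])
    }

    // A full implementation would also set these view controllers' delegates
    // so they can be dismissed when the user finishes.
    func present(_ addVoiceShortcutViewController: INUIAddVoiceShortcutViewController,
                 for addVoiceShortcutButton: INUIAddVoiceShortcutButton) {
        present(addVoiceShortcutViewController, animated: true)
    }

    func present(_ editVoiceShortcutViewController: INUIEditVoiceShortcutViewController,
                 for addVoiceShortcutButton: INUIAddVoiceShortcutButton) {
        present(editVoiceShortcutViewController, animated: true)
    }
}
```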
Alternatively, the user could perform the same task manually, by navigating to ‘Siri & Search’ in the Settings app and recording a custom phrase there. Again, this doesn’t feel particularly frictionless, relying on the user to visit the Settings app to perform this task.
We’ve been using the term intents a lot already, but what are they exactly? Intents enable developers to use Siri to provide an interactive voice interface to their apps. They come in two flavours: system intents, and custom intents.
Apple introduced system intents in iOS 10. System intents are limited to handling common tasks such as sending messages, paying bills, or playing media content. (There’s a full list of tasks over at Apple’s Human Interface Guidelines site.) When a user says, for example, ‘Hey Siri, use WhatsApp to send a message’, Siri is able to interpret this as a Send Message system intent. Siri guides the flow of conversation, asking for more information interactively and calling on the target app to supply the data.
Unsurprisingly, there isn’t a system intent for solving a numbers game. That’s where custom intents come in, and these were introduced in iOS 12.
Custom intents can be written to handle actions not covered by system intents. This gives us a little more freedom to do what we want. However, a custom intent needs to belong to a particular category, so that Siri is able to respond using the correct verbs. For example, setting the ‘Order’ category ensures that Siri says order-related things during the interaction. (Again, the full list of categories is available at the Human Interface Guidelines site.)
Again, there’s no category specific enough for our solver, but there is a handy generic category that uses the verb ‘Run’. We’ll settle for that for now.
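To give a taste of what’s ahead, handling a custom intent looks roughly like this. The `SolveGameIntent` class, the `SolveGameIntentHandling` protocol and the response codes are all assumed to be generated by Xcode from an intent definition file; the real names and parameters depend on how that file is set up, and we’ll go through it properly in Part 2.

```swift
import Intents

// Lives in the app's Intents extension. INExtension routes each incoming
// intent to an object capable of handling it.
class IntentHandler: INExtension {
    override func handler(for intent: INIntent) -> Any? {
        guard intent is SolveGameIntent else { return nil }
        return SolveGameIntentHandler()
    }
}

// Conforms to the protocol Xcode generates from the intent definition file.
class SolveGameIntentHandler: NSObject, SolveGameIntentHandling {
    func handle(intent: SolveGameIntent,
                completion: @escaping (SolveGameIntentResponse) -> Void) {
        // A real handler would read the target and initial numbers from the
        // intent's parameters, run the solver, and include the result in the
        // response that Siri reads out.
        completion(SolveGameIntentResponse(code: .success, userActivity: nil))
    }
}
```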
Intents and shortcuts
There is a drawback, however, when it comes to custom intents. While Siri can easily interpret sentences related to system intents, such as ‘use WhatsApp to send a message’, the same isn’t true of custom intents. Even though we specify the dialogue needed to handle our custom intent, Siri currently can’t invoke it without extra work on the user’s part: the user first needs to create a shortcut and attach a specific phrase to it. Our user then speaks that phrase to invoke the custom intent.
So where does that leave us? My desired end goal is already unachievable. It is not possible for a user, after installing the app, to say, ‘Hey Siri, use NumberRace to solve a numbers game’ and have Siri interpret the phrase ready for NumberRace to do its thing.
I’m therefore going to have to revise my expectations downwards – the user needs to create a shortcut to access the NumberRace solver through Siri. It’s not ideal, but we can also use suggestions to present the solver to the user – assuming of course that the suggestion is surfaced in the first place. We’ll be looking at suggestions in more detail in Part 4.
We now have the first steps of a vague plan. First, we’ll create a custom intent to implement a shortcut. We’ll have a crack at this in Part 2, and see how far it takes us.