Slide 1

Slide 1 text

HACKING(?) SiriKit Roberto Esteban Torres - @esttorhe, Mobilization, 2016

Slide 2

Slide 2 text

I remember being at home, sitting on the couch watching the keynote, when they announced what we had all been hoping for.

ANNOUNCEMENT

Slide 3

Slide 3 text

APPLE FINALLY RELEASES Siri

Slide 4

Slide 4 text

Right there and then I started thinking of all the possible applications for such a powerful framework. Can you imagine being able to use the power of Siri to command your app?

APPLE FINALLY RELEASES Siri maybe

Slide 5

Slide 5 text

Back when Siri was opened up to us developers I was working for a company called Brewbot, which made a machine that helped you brew beer and control the whole process from your device. Overall it was a really nice product and a super ambitious idea. If you didn't know, making beer is a very hands-on task, and the whole premise of Brewbot's product at the time was to keep it hands-on, so you felt like you were making beer rather than making a coffee with a Keurig machine.

A LITTLE BIT OF HISTORY Brewbot

Slide 6

Slide 6 text

This is a very polarizing topic in the brewing community, but there's a point when you have to stir the mash. In our case, the app would send a push message (PM) telling the user to stir and would then wait until the user confirmed they were done with the requested task. As you can imagine, having your phone out while stirring is quite risky: it could fall and shatter its screen (which actually happened to us), or it could take a dive into the mash and turn into a brick. So you can imagine how I was going crazy thinking about the possibility of telling Siri stuff like:

> Siri: begin brewing False Bottomed Girls
> Siri: continue the brew

Which would potentially minimize the risk of damaging your phone.

STIRRING THE MASH

Slide 7

Slide 7 text

Keep in mind that all these possible scenarios were running through my head in just the few seconds between Apple saying «we are releasing SiriKit» and Apple saying «it will only work for a subset of applications». Needless to say, I felt devastated.

THEN TRAGEDY

Slide 8

Slide 8 text

I understand the reasoning behind Apple's decision; at least in my mind it's because Apple is setting up the basics and defining how Siri-capable apps should behave, and only once everything is nice and smooth will they unleash Siri's power for us. There were 2 possible paths I could take at that point: give up on my crazy ideas (which I sort of did) or maybe…

LIMITATIONS

Slide 9

Slide 9 text

To check whether there's a way to circumvent the current limitations, or even to «hack» its behaviour, we first need to understand how it works.

COULD IT BE «hacked»?

Slide 10

Slide 10 text

First the «boring» and easy-to-follow part: you configure your app's capabilities, entitlements, etc., which is no biggie. Then comes the nice part. How do you enable it, or even integrate it in a nice way, so the user can interact with it from anywhere on their phone, and maybe even say «Yo, Siri, do something»?

How does it work??

Slide 11

Slide 11 text

The most interesting part is that integrating Siri the way we want happens via app extensions. If you don't know what those are, think of them as mini-apps that get embedded inside your binary/app and are installed on your device along with the «main» app (like Today widgets or Watch apps). They come bundled with your app but they run outside of the app's realm, so in order to communicate to and from the main app you have to use shared containers to «surpass» Apple's sandboxing.

App Extensions

Slide 12

Slide 12 text

These are the app extensions you need to support SiriKit. You don't necessarily need BOTH; it depends on what you want to support in SiriKit and how.

INTENTS & UI INTENTS

Slide 13

Slide 13 text

An «intent» is the extension that gets called when you execute a Siri command for your app; it's called an intent because it's what Siri will try to call if there's a match. Basically it's the way to represent the user's «intent» when requesting a command. The way this works is that you say something like:

INTENT

Slide 14

Slide 14 text

Here I've highlighted the important words. First we obviously launch Siri. Then we tell it to Send something; this tells Siri that the app we want must be registered as one that can send messages, which narrows down the set of candidate apps. Next comes the content of the message (which doesn't have to be provided when launching the intent), then the receiver, specified by saying to whom we want to send it (also not required). Finally, by saying using we name the app that should handle our intent. With all this information Siri can now deconstruct everything.

HEY SIRI, Send Hello to Travis using Commander

Slide 15

Slide 15 text

Now Siri can call our intent and tell it exactly this: your user intends to send a message with this information. There you can look up the recipient in your own list of contacts (or piggyback on the phone's contact book) and decide whether you can handle the message or not.

▸ Type: Message app
▸ Recipient: Travis
▸ Content: Hello
▸ Application: Commander
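As a rough illustration of that breakdown, here is a minimal sketch in plain Swift. `ParsedRequest` and `missingParameters` are hypothetical names made up for this example; they are not SiriKit types, and this is not how Siri represents a request internally. It only mirrors the four bullets and the fact that recipient and content are optional.

```swift
// Hypothetical model of the pieces Siri extracts from a spoken request.
// These names are NOT part of SiriKit; they only mirror the slide's list.
struct ParsedRequest {
    let domain: String       // e.g. "Message app"
    let recipient: String?   // optional: may be missing from the utterance
    let content: String?     // optional: may be missing from the utterance
    let application: String  // the app named after "using"

    /// Names of the optional pieces the user left out of the utterance.
    var missingParameters: [String] {
        var missing = [String]()
        if recipient == nil { missing.append("recipient") }
        if content == nil { missing.append("content") }
        return missing
    }
}

// "Hey Siri, send Hello to Travis using Commander"
let full = ParsedRequest(domain: "Message app",
                         recipient: "Travis",
                         content: "Hello",
                         application: "Commander")

// "Hey Siri, send a message using Commander"
let partial = ParsedRequest(domain: "Message app",
                            recipient: nil,
                            content: nil,
                            application: "Commander")
```

In the second case both optional pieces are missing, which is exactly the situation the recipient and content resolution steps later in the talk deal with.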

Slide 16

Slide 16 text

Back to the components required to support SiriKit: we have the UI Intent, which gives visual feedback to the user and is probably the most interesting one for my hack project. The UI Intent can be a complex view or a pretty simple one, depending on your app, and should look something like this:

UI INTENT

Slide 17

Slide 17 text

Here you can see some HORRIBLE UI I used when I was playing around with Siri. There are a few things to notice in this screenshot.

1. Travis: for the longest time I've wanted to launch Travis jobs with something other than a commit or the web UI. Why? No idea; I just wanted to. So basically my hack app will hack Siri to communicate with Travis.
2. From what I told you before, we can infer that the app is called Commander.
3. And the most important part: the UI… and of course I'm kidding; it's the Recipient at the bottom.

Slide 18

Slide 18 text

I said send it to Travis, but the UI reported a different recipient, which sounds like an error… Or maybe it isn't; this pretty much tells us that the data can be «tweaked», hacked or tinkered with, meaning that we can actually take advantage of SiriKit to…

Travis to Charlie's

Slide 19

Slide 19 text

We still have a long way to go before being able to say something like that. We still need to figure out a way to parse our actions and map our recipients, perhaps…

Slide 20

Slide 20 text

But first, what is this app we are going to build? It has a couple of parts. The first is a way to query Travis CI for a list of repos and their last build state. The second is a way for us to trigger a failed build so it can run again.

Commander

Slide 21

Slide 21 text

So, let's see some code and dissect it so we can verify how «hackable» this is going to be.

SEE SOME CODE

Slide 22

Slide 22 text

Maybe you can't see it, but we'll dissect it bit by bit so we can all understand what's going on and how to abuse it.

    func resolveRecipients(forSendMessage intent: INSendMessageIntent, with completion: @escaping ([INPersonResolutionResult]) -> Void) {
        var resolutionResults = [INPersonResolutionResult]()
        for recipient in recipients {
            …
            switch matchingContacts.count {
            case 2 ... Int.max:
                // We need Siri's help to ask user to pick one from the matches.
                resolutionResults += [INPersonResolutionResult.disambiguation(with: matchingContacts)]
            case 1:
                // We have exactly one matching contact
                resolutionResults += [INPersonResolutionResult.success(with: recipient)]
            case 0:
                // We have no contacts matching the description provided
                resolutionResults += [INPersonResolutionResult.unsupported()]
            …

Slide 23

Slide 23 text

First, what happens if multiple contacts match our recipient? As you can see, we can send «feedback»/information back to the user so they can choose which contact they want. In our case this can be totally abused by completely ignoring any kind of «contacts» concept and using our own entities instead. The other scenarios are self-explanatory: when we have one and only one contact, we return success and set that contact as the receiver; when we have NO contacts, we report that back to Siri so the user can provide a valid recipient.

WHAT HAPPENS WHEN MULTIPLE CONTACTS «MATCH»?

    case 2 ... Int.max:
        // We need Siri's help to ask user to pick one from the matches.
        resolutionResults += [INPersonResolutionResult.disambiguation(with: matchingContacts)]

FEEDBACK TO THE USER
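To make the three outcomes easier to play with outside an extension, here is a minimal sketch in plain Swift. `Resolution` and `resolve(recipient:against:)` are stand-ins invented for this example (they only imitate the shape of `INPersonResolutionResult`), the prefix matching is an assumption, and the «contacts» are really repo names, which is the abuse described above.

```swift
// Stand-in for the three resolution outcomes; NOT a SiriKit type.
enum Resolution: Equatable {
    case disambiguation([String]) // several matches: ask the user to pick one
    case success(String)          // exactly one match
    case unsupported              // nothing matched
}

// Hypothetical matcher: treats our own entities (repo names) as "contacts".
// Matching by case-insensitive prefix is an assumption made for this sketch.
func resolve(recipient: String, against known: [String]) -> Resolution {
    let needle = recipient.lowercased()
    let matches = known.filter { $0.lowercased().hasPrefix(needle) }
    switch matches.count {
    case 0:
        return .unsupported
    case 1:
        return .success(matches[0])
    default:
        return .disambiguation(matches)
    }
}

// Illustrative repo names only.
let knownRepos = ["RxViewModel", "RxExample", "travisboss"]
```

With this list, `resolve(recipient: "rx", against: knownRepos)` lands in the disambiguation case, while "travis" resolves to a single entry and "miguel" is unsupported.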

Slide 24

Slide 24 text

For the first part I thought about having a predefined contact called Travis and then sending it a command like fetch.

CALLING Travis

Slide 25

Slide 25 text

Pretty simple, and it works pretty nicely. Why? We are defaulting to Travis as the recipient, so in our code we can check whether the recipient is Travis and then proceed.

Send a message to Travis using Commander

Slide 26

Slide 26 text

As you can see here, when confirming the command we get to actually execute the Travis API calls, and we can even fail and give feedback to the user if the service failed; there's even a flag to indicate that our service is down (a messaging service, though).

CONFIRMING THE COMMAND

    // Handle the completed intent (required).
    func handle(sendMessage intent: INSendMessageIntent, completion: @escaping (INSendMessageIntentResponse) -> Void) {
        let userActivity = NSUserActivity(activityType: NSStringFromClass(INSendMessageIntent.self))
        let response = INSendMessageIntentResponse(code: .success, userActivity: userActivity)
        // Here we should be hitting Travis's API
        completion(response)
    }

Slide 27

Slide 27 text

Basically we have most of the pseudocode ready. We are still lacking a way to parse the commands that the user is sending. We could do that in multiple parts of the app, but I think the best place to do so is the resolveContent function.

VERIFYING THE COMMANDS

Slide 28

Slide 28 text

The code is pretty simple. At the top we define our actions as an enum with a String raw type so we can easily «parse» the commands sent to Siri. Inside the actual «resolve» we verify that the recipient is either Travis or a valid repo and that the user issued an expected command. If one of those requirements isn't met we can fail and tell the user:

VERIFYING THE COMMANDS

    private enum Actions: String {
        case execute
        case status
    }

    func resolveContent(forSendMessage intent: INSendMessageIntent, with completion: @escaping (INStringResolutionResult) -> Void) {
        // Extract the 1st recipient (only expect 1 here)
        if let recipients = intent.recipients, let recipient = recipients.first {
            let key = (recipient.customIdentifier ?? recipient.displayName).lowercased()
            if !self.repos.contains(key) {
                // Repos should be read from a shared container DB
                completion(INStringResolutionResult.unsupported())
                return // Stop here so completion isn't called a second time below
            }
        }

        // Extract the command
        if let text = intent.content, !text.isEmpty {
            // Extract the actual action
            if Actions(rawValue: text.lowercased()) != nil {
                completion(INStringResolutionResult.success(with: text))
            } else {
                completion(INStringResolutionResult.unsupported())
            }
        } else {
            completion(INStringResolutionResult.needsValue())
        }
    }
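The command check on its own can be sketched without the Intents framework, which makes it easy to try out. `ContentResolution` and `resolveCommand(_:)` are hypothetical names for this example that imitate the three `INStringResolutionResult` outcomes; trimming whitespace before matching is an extra assumption, not something the slide's code does.

```swift
import Foundation

// Same idea as the slide: a String-backed enum doubles as a tiny parser.
enum Action: String {
    case execute
    case status
}

// Stand-in for the three outcomes used on the slide; NOT a SiriKit type.
enum ContentResolution: Equatable {
    case success(Action) // a recognized command
    case unsupported     // some text, but not a command we know
    case needsValue      // nothing was said, so a value is still needed
}

// Hypothetical helper: maps the dictated text to one of the outcomes above.
func resolveCommand(_ text: String?) -> ContentResolution {
    // Assumption: dictated text may carry stray whitespace, so trim it first.
    guard let raw = text?.trimmingCharacters(in: .whitespacesAndNewlines),
          !raw.isEmpty else {
        return .needsValue
    }
    guard let action = Action(rawValue: raw.lowercased()) else {
        return .unsupported
    }
    return .success(action)
}
```

Because the enum's raw values are the spoken commands themselves, adding a new command is just a matter of adding a case.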

Slide 29

Slide 29 text

Here the message is kind of misleading, but that's the whole point of hacking/piggybacking on another service. We are simply failing because we are trying to send a message to Miguel, who is neither Travis nor a repo on the «predefined» account.

Slide 30

Slide 30 text

Here we are sending a message to a valid recipient but trying to execute an invalid command, and thus we fail with a more «valid» message.

Slide 31

Slide 31 text

OK, we are almost there. We need to hit the Travis API based on several different actions and recipients. If the recipient is Travis and the command is status, we fetch the list of repos and their state. Here I wanted to give the user visual feedback on the state of the repos via a custom UI Intent. The way it works is that you specify, in the extension's Info.plist, the type of intent for which you want to support the UI. Sadly, for messaging apps the only valid value is for sending, which means there's no actual way to give the user immediate feedback on the status of the repos.

PUTTING ALL THE PARTS TOGETHER
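The «which API call do we make» decision can be sketched as a small pure function. Everything here is hypothetical naming for illustration (`Command`, `TravisOperation`, `operation(recipient:command:knownRepos:)`), and no real Travis endpoints are involved. It only encodes the two combinations the talk describes: Travis plus status lists the repos, and a known repo plus execute re-runs its build.

```swift
enum Command: String {
    case execute
    case status
}

// Hypothetical description of what we would ask Travis to do.
enum TravisOperation: Equatable {
    case listRepositories               // recipient "Travis" + "status"
    case restartLastBuild(repo: String) // a known repo + "execute"
}

// Returns nil for combinations the talk does not describe.
// `knownRepos` is assumed to hold lowercased repo names.
func operation(recipient: String,
               command: Command,
               knownRepos: Set<String>) -> TravisOperation? {
    let key = recipient.lowercased()
    switch (key, command) {
    case ("travis", .status):
        return .listRepositories
    case (_, .execute) where knownRepos.contains(key):
        return .restartLastBuild(repo: key)
    default:
        return nil
    }
}

// Illustrative data only.
let repos: Set<String> = ["rxviewmodel"]
```

Keeping this mapping separate from the networking code would also make it easy to test without ever talking to Travis.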

Slide 32

Slide 32 text

Not to worry too much; we can keep on hacking Siri to fit our needs. A couple of slides ago we said we were going to read the repos from a shared container DB. Basically, when we execute the fetch we store the results in a shared container (in case we also want to support querying and launching jobs from the actual app and not just the extensions). Then, the next time the user tries to send a message and doesn't specify a recipient, we can «hijack» the recipient check and populate it with the list of repos and a state flag/emoji.

SO WHAT THEN?
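One possible shape for that shared store, sketched with plain Foundation: the repo list is saved as JSON in a directory that both the app and the extension can reach. `RepoSnapshot`, `RepoStore` and the `repos.json` file name are assumptions for this example. In a real app the directory would presumably come from `FileManager.default.containerURL(forSecurityApplicationGroupIdentifier:)` with your own App Group identifier; here it is passed in so the sketch can run anywhere.

```swift
import Foundation

// Hypothetical snapshot of what the "status" fetch returned.
struct RepoSnapshot: Codable, Equatable {
    let name: String
    let lastBuildPassed: Bool
}

// Hypothetical store that keeps the snapshots in a single JSON file.
struct RepoStore {
    let fileURL: URL

    // `containerDirectory` stands in for the shared App Group container.
    init(containerDirectory: URL) {
        fileURL = containerDirectory.appendingPathComponent("repos.json")
    }

    // Called after a successful fetch.
    func save(_ repos: [RepoSnapshot]) throws {
        let data = try JSONEncoder().encode(repos)
        try data.write(to: fileURL, options: .atomic)
    }

    // Called from the extension; an unreadable or missing file means "no repos yet".
    func load() -> [RepoSnapshot] {
        guard let data = try? Data(contentsOf: fileURL),
              let repos = try? JSONDecoder().decode([RepoSnapshot].self, from: data)
        else { return [] }
        return repos
    }
}
```

A JSON file is only one option; shared `UserDefaults` or a small database in the same container would serve the same purpose.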

Slide 33

Slide 33 text

Here is where we actually abuse Siri the most. We should map over our list of repos, build our INPerson entities from that data and pass them back to Siri as a disambiguation; this will generate this output:

    // If no recipients were provided we'll need to prompt for a value.
    if recipients.count == 0 {
        // TODO: Read from the DB/store the list of repos and statuses
        let person = INPerson(personHandle: INPersonHandle(value: "RxViewModel", type: .unknown),
                              nameComponents: nil,
                              displayName: "✅\tRxViewModel",
                              image: nil,
                              contactIdentifier: nil,
                              customIdentifier: "RxViewModel")
        // Offer our repo «contacts» as the options to choose from
        completion([INPersonResolutionResult.disambiguation(with: [person])])
        return
    }
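The mapping step that the TODO hints at could look roughly like this. `Repo`, `Candidate` and `candidates(from:)` are hypothetical names, and the ❌ flag for a failed build is an assumption (the slide only shows ✅). Each `Candidate` carries the two values that would feed `customIdentifier` and `displayName` when building the INPerson.

```swift
// Hypothetical record read back from the shared store.
struct Repo {
    let name: String
    let lastBuildPassed: Bool
}

// The two values we would pass on to INPerson for each repo.
struct Candidate: Equatable {
    let identifier: String  // would feed `customIdentifier`
    let displayName: String // would feed `displayName`
}

// Prefixes each repo name with a state emoji, as on the slide.
func candidates(from repos: [Repo]) -> [Candidate] {
    return repos.map { repo in
        // Assumption: ❌ marks a failed build.
        let flag = repo.lastBuildPassed ? "✅" : "❌"
        return Candidate(identifier: repo.name,
                         displayName: "\(flag)\t\(repo.name)")
    }
}
```

Since the identifier stays free of the emoji, the later recipient check can keep comparing against plain repo names.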

Slide 34

Slide 34 text

Here we are giving the user enough visual feedback and information to make an informed decision. Once the user selects a repo, Siri will ask for a message, or in this case a command.

Slide 35

Slide 35 text

Here we can select which repo we want by tapping on it and then saying execute, which is a recognized task. Then again, inside the function where we confirm sending the message, we just need to extract the info we require and call the Travis API with the «re-run» command, and that will launch a new job on the desired repo.

Slide 36

Slide 36 text

Well, obviously we have proven that Siri is quite powerful, because it can interpret language really well (caveat: depending on your language). English is OK-ish, same as Spanish, but I've heard that German needs to be quite robotic for Siri to pick up context. The feedback and concepts are not very obvious or readable, given that you need to abuse the vocabulary set for a completely different type of app.

SHORTCOMINGS

Slide 37

Slide 37 text

Basically SiriKit is quite powerful and can easily be abused to fit our needs, but it's still a long way from being the native assistant we all want to integrate into our apps. Hopefully next year, or maybe in a couple, Apple will loosen the restrictions on which apps can use SiriKit, and then maybe we can come up with really nice integrations. The fitness apps' UI intents are more dynamic and could maybe be abused in a really nice manner; it's just that for this specific scenario the vocabulary and foundations for fitness felt even weirder.

TAKEAWAYS

Slide 38

Slide 38 text

THANKS dzięki

Slide 39

Slide 39 text

YOU CAN FIND ME:

▸ Twitter: @esttorhe
▸ Github: esttorhe
▸ Blog: estebantorr.es
▸ Code sample¹: github.com/esttorhe/travisboss

#yatusabes

¹ Code will be uploaded next week