I’m sitting here in a house with more robot assistants than most sci-fi movies. Next to me is an Amazon Echo, the original tower I got when they first went on sale. It’s sitting there unplugged after my 5th or 6th try to make the relationship work. It’s nothing personal, it just feels like we’re ships in the night. In the kitchen is a Google Home Mini, its mute switch flipped on after it also made me angry with how it did things. Finally there are my iPhone and Mac, with an assistant seemingly designed to make someone like me angry. How did we get here?
As an engineer I spend my days in the terminal using a series of CLIs, or command line interfaces. This is the black and white text that looks exciting in movies, but in reality, once you learn how to use them, CLIs are no more or less exciting than the GUI applications that came with your new computer. The difference between the GUI applications and the CLI ones is the level of strictness. There is ONE correct way to do things with a CLI. You need to enter the EXACT right combination of strings and integers into the interface to get back the correct result. They’re not supposed to be friendly or easy to pick up. The focus of a CLI is empowering the power user to get their job done faster, whereas a GUI needs to strike more of a balance between power and ease of use.
GUIs are a bit different. These are applications like Firefox, or Chrome, or Slack. These visual designs are supposed to balance the needs of power users, who want advanced settings and keyboard shortcuts, against the ease of use that newer users expect. While a CLI is frustrating when you are trying to figure out the correct combination to make it work, once you have it the CLI is remarkably consistent. It’s the same experience every time (for years and years usually). GUIs are constantly changing, resulting in an experience that is easier to pick up but can change on you at any moment.
Some voice assistants feel like a CLI right now. Calling them “AI” or “smart assistants” or, jesus, even “assistants” is a bit of a joke. Replace the black and white terminal with my voice and it’s exactly the same experience. Alexa needs me to speak the right magic incantation in order for it to play the podcast I want. Google understands the podcast, but it’s decided that the act of adding something to a list needs to live in an application called Google Express. Why can’t Alexa just go out and get podcasts? No idea. Why does Google feel the need to reinvent the idea of having a list of stuff? I don’t know either.
The worst, though, is Siri. CLIs can be frustrating for sure, but at least when you learn the right way to operate them it feels like a key opening a door. Google and Alexa feel much this way to me. While I find it irritating that I need to tell Alexa to have this other app play a podcast, once I use it a few times and memorize it I’m fine with it. Why does Google put stuff into a weird app? I don’t know, but now I know how to do it and life moves on. Siri is literally the worst of both worlds to me. It demands the precise syntax of the CLI while providing the frustrating inconsistency of the GUI.
The cardinal sin Siri commits is not that the utility doesn’t try or even that the utility never works. The sin Siri commits is that the same words said to the same device in a relatively short period of time seem to produce such infuriatingly different results. For all their faults, the other devices spying on me at least fail predictably. Relying on Siri in the car to send a text message or call someone is like having a personal assistant who hates you and wants you to fail.
Outside of their limited interfaces, these devices drive me insane because of how we talk about them. People talk about machine learning, AI, robots replacing jobs, human beings getting replaced. There’s so much discussion AROUND the things and what they MEAN that nobody ever sits down and says “what do people actually use these goddamn things for?” The answer, of course, is nothing amazing. Listen to music, set timers, add things to shopping lists and check the weather. All this incredible technology and servers and cloud but what do we use them for? The exact same things we have always done in the kitchen, except now my egg timer doesn’t work if the fucking WiFi cuts out.
I’m confident that for the next 3-5 years this will be something people talk about. We’ll talk about how amazing they are and how clever the people who make them are. But at the end of the day we’ll use them for basically the same thing we used to have a radio and a timer for. Nobody will ever mention that: all these servers, power and packets flying through the air to replace a $4.99 Sony radio. But it’s true. Then after a while they’ll stop working, or there will be an update, or Amazon will simply stop caring about them. And we’ll all move on.