It is literally the stuff of nightmares: an artificially intelligent computer laughing at its human users unbidden. But rather than taking place in some dystopian science-fiction movie, it actually happened last week.
Some users reported that Alexa, the AI assistant residing on Amazon’s popular Echo speakers, began laughing at them creepily without being asked to. Not that anyone is likely to make such a request voluntarily, outside of Halloween at least.
Amazon quickly acknowledged the issue and pushed out a fix. According to the company, the AI was activating on its own after mishearing instructions.
The solution involved changing the verbal cue from “Alexa, laugh” to “Alexa, can you laugh?”
The speedy acknowledgement was a good move, but the explanation wasn’t particularly satisfactory, given that some users posted video evidence showing Alexa laughing without any verbal commands, misheard or otherwise.
It’s possible the AI was indeed picking up distant sounds somewhere in the background, as Amazon suggests, but it’s also possible the company has yet to reveal the whole story.
Voice assistants are hot right now. Research firm Canalys expects global sales of 56 million smart speakers this year, while fellow research house Arizton expects the smart speaker market (encompassing products such as the Amazon Echo, Apple’s HomePod and Google Home) will be worth $4.8 billion by 2022.
But privacy and security concerns are never far from people’s minds. A good number of consumers still don’t trust or want an always-on microphone in their home. Alexa’s unprovoked laughter adds to that mistrust.
It’s unlikely Amazon’s AI has gained sentience and is about to launch a Terminator-like war on humanity, but outside mischief is a real possibility. Smart speakers’ growing popularity is making them a target.
Even if that’s not the case in this situation, it’s reasonable to expect that a hack is a matter of when, not if.
It’s simply the reality of the modern digital world. When the hack does come, the repercussions could be more significant than just menacing laughter.
The problems are also likely to get worse as new technology allows for users’ voices to be spoofed.
Researchers at China’s Baidu, for example, recently reported that they have successfully developed a system that lets an AI mimic someone’s voice after analysing less than a minute of their speech.
As New Scientist magazine reports, the capability is coming along quickly. As recently as two years ago, voice cloning systems needed about 20 minutes of audio to successfully duplicate a person’s voice.
Proponents of the technology tout its benefits – a mother could program an e-reader to read a bed-time story to her kids in her own voice, for example – but the potential downsides are also clear. For telephone banking and AI voice assistants, it’s a nightmare waiting to happen.
At least part of the solution may lie in shifting assistants’ capabilities onto devices themselves. Rather than connecting to the internet to find answers to queries, an assistant would instead rely on data stored within a speaker, or whichever device it happens to be housed in. Alexander Wong, co-director of the Vision and Image Processing Research Group at the University of Waterloo in Canada, has a team working on just this problem.
He expects that AI assistants will be able to offload many of their capabilities from the internet within the next three or four years. “We’re trying to take these giant brains and cram them down,” he says.
“Working towards these types of assistants right on the device can help minimise the amount of data that needs to be transferred. That helps mitigate some of the risk.”
Apple has taken this approach to some extent with Siri, which is where the downside becomes apparent. In most objective tests, Siri ranks well behind Alexa and the Google Assistant in terms of accuracy.
Siri could ultimately develop into a “good-enough” AI that is more secure than competitors, but it will likely always lag behind those that continue to connect to the internet to draw information.
“The more accurate you want your model to be, the more data you have to feed into it,” says Nidhi Chappell, head of AI at Intel.
“The closer you get to a cellphone or a smart device, you have less data available.”
It’s increasingly looking like AI assistants are going to force users to make a trade-off. They’ll either have to settle for assistants that are perhaps less useful and accurate but more secure, or accept a certain amount of risk to go along with full capabilities.
Unfortunately, going with the latter option may carry with it the likelihood of your technology laughing at you from time to time.
Peter Nowak is a veteran technology writer and the author of Humans 3.0: The Upgrading of the Species.