This morning, I asked my Alexa-enabled Bosch coffee machine to make me a coffee. Instead of running my routine, it told me it couldn’t do that. Ever since I upgraded to Alexa Plus, Amazon’s generative-AI-powered voice assistant, it has failed to reliably run my coffee routine, coming up with a different excuse almost every time I ask.
It’s 2025, and AI still can’t reliably control my smart home. I’m beginning to wonder if it ever will.
The potential for generative AI and large language models to take the complexity out of the smart home, making it easier to set up, use, and manage connected devices, is compelling. So is the promise of a “new intelligence layer” that could unlock a proactive, ambient home.
But this year has shown me that we are a long way from any of that. Instead, our reliable but limited voice assistants have been replaced with “smarter” versions that, while better conversationalists, can’t consistently do basic tasks like operating appliances and turning on the lights. I want to know why.

This wasn’t the future we were promised.
It was back in 2023, during an interview with Dave Limp, that I first became intrigued by the possibilities of generative AI and large language models for improving the smart home experience. Limp, then the head of Amazon’s Devices & Services division that oversees Alexa, was describing the capabilities of the new Alexa they were soon to launch (spoiler alert: it wasn’t soon).
Along with a more conversational assistant that could actually understand what you said no matter how you said it, what stood out to me was the promise that this new Alexa could use its knowledge of the devices in your smart home, combined with the hundreds of APIs they plugged into it, to give the assistant the context it needed to make your smart home easier to use.
From setting up devices to controlling them, unlocking all their features, and managing how they can interact with other devices, a smarter smart home assistant seemed to hold the potential to not only make it easier for enthusiasts to manage their gadgets but also make it easier for everyone to enjoy the benefits of the smart home.
Fast-forward three years, and the most useful smart home AI upgrade we have is AI-powered descriptions for security camera notifications. It’s handy, but it’s hardly the sea change I had hoped for.
It’s not that these new smart home assistants are a complete failure. There’s a lot I like about Alexa Plus; I even named it as my smart home software pick of the year. It is more conversational, understands natural language, and can answer many more random questions than the old Alexa.
While it sometimes struggles with basic commands, it can understand complex ones; saying “I want it dimmer in here and warmer” will adjust the lights and crank up the thermostat. It’s better at managing my calendar, helping me cook, and other home-focused features. Setting up routines with voice is a huge improvement over wrestling with the Alexa app, even if running them isn’t as reliable.

Google has promised similar capabilities with its Gemini for Home upgrade to its smart speakers, though that’s rolling out at a glacial pace, and I haven’t been able to try it beyond some on-the-rails demos. I was able to test Gemini for Home’s feature that attempts to summarize what’s happened at my home using AI-generated text descriptions from Nest camera footage. It was wildly inaccurate. As for Apple’s Siri, it’s still firmly stuck in the last decade of voice assistants, and it appears it will stay there for a while longer.
The problem is that the new assistants aren’t as consistent at controlling smart home devices as the old ones. While they were often frustrating to use, the old Alexa and Google Assistant (and the current Siri) would almost always turn on the lights when you asked them to, provided you used precise nomenclature.
Today, their “upgraded” counterparts struggle with consistency in basic functions like turning on the lights, setting timers, reporting on the weather, playing music, and running the routines and automations on which many of us have built our smart homes.
I’ve noticed this in my testing, and online forums are full of users who have encountered it. Amazon and Google have acknowledged the struggles they’ve had in making their revamped generative AI-powered assistants reliably perform basic tasks. And it’s not limited to smart home assistants; ChatGPT can’t consistently tell time or count.
Why is this, and will it ever get better? To understand the problem, I spoke with two professors in the field of human-centric artificial intelligence who have experience with agentic AI and smart home systems. My takeaway from those conversations is that, while it’s possible to make these new voice assistants do almost exactly what the old ones did, it will take a lot of work, and that’s possibly work most companies just aren’t interested in doing.
Considering there are limited resources in this field and ample opportunity to do something much more exciting (and more profitable) than reliably turn on the lights, that’s the way they’re moving, according to experts I spoke with. Given all these factors, it seems the easiest way to improve the technology is to just deploy it in the real world and let it improve over time. Which is likely why Alexa Plus and Gemini for Home are in “early access” phases. Basically, we’re all beta testers for the AI.
The bad news is it could be a while until it gets better. In his research, Dhruv Jain, assistant professor of Computer Science & Engineering at the University of Michigan and head of the Soundability Lab, has also found that newer models of smart home assistants are less reliable. “It’s more conversational, people like it, people like to talk to it, but it’s not as good as the previous one,” he says. “I think [tech companies’] model has always been to release it fairly fast, collect data, and improve on it. So, over a few years, we might get a better model, but at the cost of those few years of people wrestling with it.”

The inherent problem appears to be that the old and new technologies don’t mesh. So, to build their new voice assistants, Amazon, Google, and Apple have had to throw out the old and build something entirely new. However, they quickly discovered that these new LLMs were not designed for the predictability and repetitiveness that their predecessors excelled at. “It was not as trivial an upgrade as everyone originally thought,” says Mark Riedl, a professor at the School of Interactive Computing at Georgia Tech. “LLMs understand a lot more and are open to more arbitrary ways to communicate, which then opens them to interpretation and interpretation mistakes.”
Basically, LLMs just aren’t designed to do what prior command-and-control-style voice assistants did. “Those voice assistants are what we call ‘template matchers,’” explains Riedl. “They look for a keyword, when they see it, they know that there are one to three additional words to expect.” For example, you say “Play radio,” and they know to expect a station call code next.
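To make the “template matcher” idea concrete, here is a minimal Python sketch of how such a system might work. It is an illustration only, not how Alexa or Google Assistant was actually built: the template table, the action names, and the `match_command` function are all hypothetical.

```python
# Hypothetical templates: keyword phrase -> (action name, min extra words, max extra words).
# These names are invented for illustration and do not come from any real assistant.
TEMPLATES = {
    ("play", "radio"): ("play_radio", 1, 1),  # expects one word: the station call code
    ("turn", "on"): ("turn_on", 1, 3),        # expects one to three words naming a device
    ("turn", "off"): ("turn_off", 1, 3),
}


def match_command(utterance):
    """Return (action, slot_words) if the utterance fits a template, else None."""
    words = utterance.lower().split()
    for keyword, (action, min_slots, max_slots) in TEMPLATES.items():
        # The utterance must start with the keyword phrase...
        if tuple(words[:len(keyword)]) != keyword:
            continue
        # ...and be followed by the expected number of additional words.
        slots = words[len(keyword):]
        if min_slots <= len(slots) <= max_slots:
            return action, slots
    return None


print(match_command("Play radio KEXP"))                      # ('play_radio', ['kexp'])
print(match_command("I want it dimmer in here and warmer"))  # None
```

The same input always produces the same result, which is the predictability the old assistants had. The cost is visible in the second call: anything phrased outside the templates is simply not understood.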
LLMs, on the other hand, “bring in a lot of stochasticity, randomness,” explains Riedl. Asking ChatGPT the same prompt multiple times may produce different responses. This is part of their value, but it’s also why when you ask your LLM-powered voice assistant to do the same thing you asked it yesterday, it might not respond the same way. “This randomness can lead to misunderstanding basic commands because sometimes they try to overthink things too much,” he says.
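One common source of that randomness in language models is temperature-based sampling: instead of always choosing the highest-scoring option, the model draws from a probability distribution over options. The Python sketch below illustrates the effect on a toy scale. The intent names and scores are made up, and nothing here is claimed to reflect how Alexa Plus or Gemini for Home actually chooses a response.

```python
import math
import random

# Hypothetical scores a model might assign to interpretations of "turn on the lights".
INTENT_SCORES = {
    "turn_on_lights": 2.0,
    "ask_clarifying_question": 1.0,
    "play_music": 0.5,
}


def pick_intent(scores, temperature, rng):
    """Pick one intent; temperature 0 is greedy, higher values add randomness."""
    if temperature == 0:
        # Deterministic: always take the top-scoring interpretation.
        return max(scores, key=scores.get)
    intents = list(scores)
    # Softmax-style weights: dividing by the temperature flattens or sharpens them.
    weights = [math.exp(scores[name] / temperature) for name in intents]
    return rng.choices(intents, weights=weights, k=1)[0]


rng = random.Random(0)  # fixed seed so the demo is repeatable
greedy = {pick_intent(INTENT_SCORES, 0, rng) for _ in range(200)}
sampled = {pick_intent(INTENT_SCORES, 1.0, rng) for _ in range(200)}
print(greedy)   # only the top-scoring intent
print(sampled)  # more than one intent shows up across repeated requests
```

With the temperature at zero, 200 identical requests get the same answer every time. With sampling turned on, the same request occasionally lands on a different interpretation, which is the behavior Riedl describes.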
To fix this, companies like Amazon and Google have developed ways to integrate LLMs with the APIs at the heart of our smart homes (and most of everything we do on the web). But this has potentially created a new problem.
“The LLMs now have to construct a function call to an API, and it has to work a whole lot harder to correctly create the syntax to get the call exactly right,” Riedl posits. Where the old systems just waited for the keyword, LLM-powered assistants now have to lay out an entire code sequence that the API can recognize. “It has to keep all that in memory, and it’s another place where it can make mistakes.”
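A rough sketch shows how many things have to line up for a generated function call to succeed. It assumes the model emits its call as a JSON string with a `name` and `arguments`, which is a common convention but not something the companies have confirmed for these products. The `DEVICE_API` table, its function names, and its parameters are hypothetical.

```python
import json

# Hypothetical device API: function name -> required argument names and types.
DEVICE_API = {
    "set_light": {"device_id": str, "brightness": int},
    "brew_coffee": {"device_id": str, "size": str},
}


def parse_call(raw):
    """Validate a model-generated call. Returns ((name, args), None) or (None, error)."""
    try:
        call = json.loads(raw)
    except json.JSONDecodeError as exc:
        return None, f"invalid JSON: {exc.msg}"
    if not isinstance(call, dict):
        return None, "call must be a JSON object"
    name = call.get("name")
    if not isinstance(name, str) or name not in DEVICE_API:
        return None, f"unknown function: {name!r}"
    spec = DEVICE_API[name]
    args = call.get("arguments")
    if not isinstance(args, dict) or set(args) != set(spec):
        return None, f"{name} expects exactly these arguments: {sorted(spec)}"
    for key, expected in spec.items():
        if not isinstance(args[key], expected):
            return None, f"argument {key!r} should be of type {expected.__name__}"
    return (name, args), None


good = '{"name": "brew_coffee", "arguments": {"device_id": "bosch-1", "size": "large"}}'
truncated = '{"name": "brew_coffee", "arguments": {"device_id": "bosch-1"'
wrong_type = '{"name": "set_light", "arguments": {"device_id": "lamp-2", "brightness": "dim"}}'

for raw in (good, truncated, wrong_type):
    print(parse_call(raw))
```

Only the first call would reach the device. A dropped brace or a word where a number belongs is enough to make the request fail, even though a person reading any of the three would know what was meant.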
All of this is a scientific way of explaining why my coffee machine sometimes won’t make me a cup of coffee, or why you might run into trouble getting Alexa or Google’s assistant to do something it used to do just fine.
So, why did these companies abandon a technology that worked for something that doesn’t? Because of its potential. A voice assistant that, rather than being limited to responding to specific inputs, can understand natural language and take action based on that understanding is infinitely more capable.
“What all the companies that make Alexa and Siri and things like that really want to do is chaining of services,” explains Riedl. “That’s where you want a general language understanding, something that can understand complex relationships through tasks and how they’re conveyed by speech. They can invent the if-else statements that chain everything together, on the fly, and dynamically generate the sequence.” They can become agentic.
This is why you throw away the old technology, says Riedl, because it had no chance of doing this. “It’s about the cost-benefit ratio,” says Jain. “[The new technology] is not ever going to be as accurate at this as the non-probabilistic technology before, but the question is whether that sufficiently high accuracy, plus the expanded range of possibilities the new technology offers, is worth more than a 100 percent accurate non-probabilistic model.”
One solution is to use multiple models to power these assistants. Google’s Gemini for Home consists of two separate systems: Gemini and Gemini Live. Anish Kattukaran, head of product at Google Home and Nest, says the aim is to eventually have the more powerful Gemini Live run everything, but today, the more tightly constrained Gemini for Home is in charge. Amazon similarly uses multiple models to balance its various capabilities. But it’s an imperfect solution that has led to inconsistency and confusion in our smart homes.
Riedl says that no one has really figured out how to train LLMs to understand when to be very precise and when to embrace randomness, meaning even the “tame” LLMs can still get things wrong. “If you wanted to have a machine that just was never random at all, you could tamp it all down,” says Riedl. But that same chatbot would not be more conversational or able to tell your kid fantastical bedtime stories, both capabilities that Alexa and Google are touting. “If you want it all in one, you’re really making some tradeoffs.”
These struggles in its deployment in the smart home could be a harbinger of broader issues for the technology. If AI can’t turn on the lights reliably, why should anyone rely on it to do more complex tasks, asks Riedl. “You have to walk before you can run.”
But tech companies are known for their propensity to move fast and break things. “The story of language models has always been about taming the LLMs,” says Riedl. “Over time, they become more tame, more reliable, more trustworthy. But we keep pushing into the fringe of those spaces where they’re not.”
Riedl does believe in the path to a purely agentic assistant. “I don’t know if we ever get to AGI, but I think over time we do see these things at least being more reliable.” The question for those of us dealing with these unreliable AIs in our homes today, however, is whether we are willing to wait, and at what cost to the smart home in the meantime.