Google’s John Mueller said that LLM systems can’t use files like llms.txt to determine which websites to surface for a given query.
He made the comments on a recent episode of Search Off the Record, the podcast from Google’s Search Relations team.
His remark points to a broader signal problem, not just intentional gaming. Even a well-written llms.txt file is still self-reported information from the site that wants to be chosen.
For discovery, Mueller pointed back to normal HTML pages and internal links.
What Mueller Said
The conversation started with a question about whether publishers should convert websites to Markdown for LLMs. Mueller and co-host Martin Splitt agreed that HTML is still the foundation for crawling and discovery.
The discussion got specific when Mueller turned to llms.txt. He described the discovery use case as a dead end:
“It’s basically you’re telling these systems, like, I have the best website ever. And here are all of the pages that everyone must go to. And you must buy all of my products or whatever you put in there. So in LLM system, it basically, by design, can’t trust what is here as a way of differentiating between different websites.”
His argument comes down to differentiation. If sites use llms.txt to promote themselves, the files can all make similar claims. An LLM deciding which site best answers a query still needs another way to differentiate between them.
What ‘By Design’ Might Mean
“By design” could mean two different things, and Mueller didn’t explain which.
One reading is architectural. LLM systems evaluate web content and can’t use self-reported files when picking sources.
The other reading treats it as a signal problem. Self-reported signals lose value when everyone provides them. Meta keywords stopped working for the same reason. Every site stuffed them, and search engines couldn’t extract a useful ranking signal.
Both readings reach the same conclusion on discovery. But they imply different things about whether the limitation could change over time.
Where Mueller Sees A Role
Mueller didn’t reject all uses of llms.txt. He carved out one case where it could help:
“If someone is already on your website, maybe some kind of automated system is helpful.”
He used the example of an agent trying to buy a photo from a specific site. The LLM would visit the site and look for instructions on how to complete the purchase.
The argument splits discovery from navigation. llms.txt can’t help an LLM choose which site to visit. But it could help once the agent is already there, like a store directory for someone who already walked in.
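As a rough illustration of the navigation case, the sketch below shows how an agent that has already arrived at a site might read an llms.txt file to find a purchase page. It assumes the Markdown layout described in the llms.txt proposal (H2 section headings followed by link lists). The shop, its URLs, and the helper names `parse_llms_txt` and `find_links` are hypothetical and do not come from Mueller’s remarks.

```python
import re

# Matches a Markdown list item such as "- [Title](url): optional note".
LINK_ITEM = re.compile(
    r"^\s*[-*]\s*\[(?P<title>[^\]]+)\]\((?P<url>[^)\s]+)\)(?::\s*(?P<note>.*))?$"
)

# Hypothetical llms.txt body for an imaginary photo shop (illustration only).
SAMPLE = """# Example Photo Shop

> Hypothetical store used only for illustration.

## Checkout

- [Buy a print](https://shop.example/buy.md): Steps to purchase a photo
- [Licensing](https://shop.example/license.md)

## Optional

- [About](https://shop.example/about.md): Company background
"""


def parse_llms_txt(text):
    """Group the links in an llms.txt body by their H2 section heading."""
    sections = {}
    current = None
    for line in text.splitlines():
        if line.startswith("## "):
            current = line[3:].strip()
            sections[current] = []
        elif current is not None:
            match = LINK_ITEM.match(line)
            if match:
                sections[current].append(
                    {
                        "title": match["title"],
                        "url": match["url"],
                        "note": (match["note"] or "").strip(),
                    }
                )
    return sections


def find_links(sections, keyword):
    """Return links whose title or note mentions the keyword (case-insensitive)."""
    needle = keyword.lower()
    return [
        link
        for links in sections.values()
        for link in links
        if needle in link["title"].lower() or needle in link["note"].lower()
    ]


if __name__ == "__main__":
    # The agent is already on the site and only wants directions to checkout.
    for link in find_links(parse_llms_txt(SAMPLE), "purchase"):
        print(link["title"], link["url"])
```

Nothing in this sketch compares the site with any other. The file is consulted only after the site has been chosen, which is the distinction Mueller drew.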
Beyond The Gaming Argument
Mueller has called building Markdown pages for bots “a stupid idea”. He’s also compared llms.txt to the keywords meta tag.
SEJ’s Roger Montti wrote that llms.txt is “inherently untrustworthy” because nothing stops site owners from adding self-serving content. SE Ranking’s analysis of 300,000 domains found no link between llms.txt adoption and citation frequency in LLM answers.
Those arguments focused on what happens when people game the files. Mueller’s podcast remark adds the nuance that there’s no mechanism inside the files to help an LLM pick one site over another.
Why This Matters
The gaming argument against llms.txt has always had a counterargument available. Platforms could learn to penalize manipulation, the way search engines handled spammy structured data.
The differentiation argument leaves a harder problem. Penalizing manipulation may address abuse, but it doesn’t explain how self-reported files help an LLM choose one site over another. Your most accurate llms.txt file still can’t tell an LLM to pick your site over a competitor’s.
Looking Ahead
Standards for how agents navigate sites haven’t settled yet, Mueller acknowledged. He mentioned WebMCP alongside other file types under discussion.
None have become a standard. By his estimate, it could take six months to a year, or longer, for agentic systems to settle on a format. The discovery layer, where HTML and internal linking already work, isn’t part of that discussion.