Google Exposes The Fundamental Flaw Of LLMs.txt

Jun 18, 2026 04:56 PM

Google’s John Mueller and Martin Splitt talked about LLMs.txt and markdown, with Mueller offering a surprising fact about the original intent of LLMs.txt and also explaining why the proposed standards have severe shortcomings.

What Discovery Is And Why It Matters

In the context of information retrieval (search), discovery is about a search engine discovering that a specific web page exists. Discovery is a part of the overall search engine architecture.

Search Engine Architecture:

  1. Discovery
    Discovering the URL (adding it to the crawl queue).
  2. Crawling
    Downloading and parsing the content.
  3. Indexing
    The process of analyzing the raw data and storing it in a structured database optimized for retrieval.
  4. Ranking
    The part that everyone’s interested in.
  5. Serving
    This is the last step, which is serving the ranked web pages in the search results.

The above is a simplified overview of what search is, and Discovery is the very first part of the process that ultimately ends with ranking and serving links to websites.

The takeaway here is that Discovery is a critical part of getting a web page queued for crawling, indexed, ranked, and ultimately shown in the search results. Without Discovery a web page is invisible.
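
The five stages above can be sketched as a toy pipeline. This is a minimal illustration, not how Google or any real search engine is implemented: the class name ToySearchEngine, its methods, the word-count scoring, and the example.com URLs are all assumptions made up for this example. It only shows why a page that is never discovered cannot be served.

```python
from dataclasses import dataclass, field


@dataclass
class ToySearchEngine:
    """Hypothetical five-stage pipeline: discovery, crawling, indexing, ranking, serving."""
    crawl_queue: list = field(default_factory=list)
    index: dict = field(default_factory=dict)

    def discover(self, url: str) -> None:
        # Stage 1 (Discovery): learn that a URL exists and queue it for crawling.
        if url not in self.crawl_queue:
            self.crawl_queue.append(url)

    def crawl_and_index(self, web: dict) -> None:
        # Stages 2 and 3 (Crawling, Indexing): fetch each queued URL from the
        # stand-in "web" dict and store its words for later retrieval.
        for url in self.crawl_queue:
            content = web.get(url, "")
            self.index[url] = content.lower().split()

    def serve(self, query: str) -> list:
        # Stages 4 and 5 (Ranking, Serving): score indexed pages by how often
        # the query terms appear, then return matching URLs, best first.
        terms = query.lower().split()
        scored = [
            (sum(words.count(term) for term in terms), url)
            for url, words in self.index.items()
        ]
        return [url for score, url in sorted(scored, reverse=True) if score > 0]


# A stand-in for the web: two pages exist, but only one gets discovered at first.
web = {
    "https://example.com/a": "buy a photograph",
    "https://example.com/b": "photograph prints and photograph frames",
}

engine = ToySearchEngine()
engine.discover("https://example.com/a")
engine.crawl_and_index(web)
print(engine.serve("photograph"))  # page b is missing: it was never discovered
```

In this sketch, page b has the stronger match for the query, yet it stays out of the results until discover() is called for its URL and the crawl is run again.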

Now here is why this is important: Discovery is not a part of the proposed LLMs.txt standard.

Original Intent Of LLMs.txt

John Mueller said that he met one of the people responsible for creating the LLMs.txt proposal, and that the creator explained that LLMs.txt was never about making a site discoverable. It was never meant to be a part of that process.

This is an important point because many site owners are spending time, money, and effort generating LLMs.txt for the purpose of getting discovered and ranked in LLMs. That means that the reason people are using LLMs.txt is in conflict with the actual purpose of LLMs.txt, which has nothing to do with Discovery.

Mueller explained:

“So I talked with, I think, one of the people who created that proposal a while back. And the idea was actually not to create something that makes it easier for search engines or LLM systems to discover all of your content, but almost more that if an LLM already knows about your site and wants to find out what else is here, then that might be an approach.

And I think the aspect of using this as a way to optimize for Discovery by AI systems or Discovery by search systems, that doesn’t make any sense at all.”

Mueller next explained that many people are using LLMs.txt in the hope of aiding the process of Discovery, despite the fact that that’s not the purpose of LLMs.txt.

He then pivoted to the fact that LLMs.txt files are inherently untrustworthy because they are a site owner’s own statement of what their site’s content is about, which may or may not match what’s in the actual HTML.

He continued:

“Because it’s basically you’re telling these systems, like, I have the best website ever. And here are all of the pages that everyone must go to. And you must buy all of my products or whatever you put in there.

So in an LLM system, it… basically, by design, can’t trust what is here as a way of differentiating between different websites.”

Agentic Instructions

Mueller then said that some of these proposed standards could be useful for helping an AI agent, which sounds like he may be talking about the Web Model Context Protocol (WebMCP).

He explained:

“If someone is already on your website, maybe some kind of automated system is helpful. Where if it goes, I want to go to Martin’s Splitt and buy a photograph, then the LLM system can go to your website and can look around, like, how do you buy a photograph? Maybe he has some guidelines for me as an agent for buying photographs. That kind of makes sense.

But going off and saying, I want to buy a photograph, which website has one, the system is not going to go to your website and five others and say, who has some automated information? But rather, they’re trying, going to try to find the best website…”

LLMs.txt Is Not About Getting Discovered By AI

Mueller circled back to how people are misconstruing LLMs.txt as a way to be discovered by AI systems.

He reasoned about this point:

“I think from that point of view, optimizing as a way of being discovered, that doesn’t make sense.

But what happens when an agent is on your website? I think that also just generally seems to be an open area for discussion at the moment, in that there’s LLMs.txt as a proposal. There are different JSON files and well-known file types that are in discussion.

There’s WebMCP, which I think tries to do something similar, where they say, well, you’re on this page now, but we have a programmatic interface for this, another specific URL or a specific mechanism.

I think those are then almost different discussions.”

Discovery And Ranking Are Still Tied To HTML

Mueller completed his thought by underlining the point that Discovery happens at the HTML level.

He explained:

“So the generic SEO perspective of how do I find a website that sells me a photograph is almost going to be completely bound to HTML pages and normal web pages.

And then if a user decides to go to a specific service, then within that service, then there is a little bit more room for maybe helping an agent or an LLM system to find the right approach.

But what is interesting, of course, is lots of ideas. And none of these have basically crystallized as the one thing that everyone will use. So I’m sure over the next, I don’t know, half year, year, or maybe longer, it’s going to take a bit. And some of these agentic systems are going to kind of unify around some standard file type or system or something.”

Mueller wasn’t pushing the WebMCP standard, but if AI agents become a way that users interact with websites, then it’s going to be something like WebMCP, not LLMs.txt, that will be useful for websites, particularly for ecommerce sites.

WebMCP is naturally the better fit for ecommerce because it focuses on giving AI agents actionable capabilities, such as filtering products, searching for and identifying products, comparing different products, and adding a product to a shopping cart.

AI agents are able to navigate using the website HTML, which was designed for humans. WebMCP makes it easier for AI agents to successfully interact with the website, something that LLMs.txt does not do.

Neither LLMs.txt nor WebMCP helps a website get discovered by AI, and neither of them was created for that purpose. The Discovery part, the first step toward ranking, all happens with HTML. If that’s the case, what’s your next move?

Listen To Google’s Search Off The Record Episode 111

Featured Image by Shutterstock/Master1305
