
Friday, March 7, 2025

A long-standing tool for website owners, robots.txt has been in active use for over 30 years and is broadly supported by crawler operators (such as tools for site owners, services, and search engines). In this edition of the robots refresher series, we'll take a closer look at robots.txt as a flexible way to tell robots what you want them to do (or not do) on your website.

Getting started with robots.txt

The way these files work is simple: you make a text file called "robots.txt" and then upload it to your website; if you're using a content management system (CMS), it's likely even easier. You can leave your robots.txt file empty (or not have one at all) if your whole site may be crawled, or you can add rules to manage crawling. For example, to tell all bots (also known as crawlers, robots, spiders) to stay out of your "add to cart" page, you could write this in your robots.txt file:

user-agent: *
disallow: /cart
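If you want to sanity-check a rule like this programmatically, Python's standard library ships a parser for the Robots Exclusion Protocol. A minimal sketch (the bot name and URLs here are made-up placeholders):

```python
from urllib import robotparser

# The same two-line rule from above, parsed directly from a string.
rules = """\
user-agent: *
disallow: /cart
"""

parser = robotparser.RobotFileParser()
parser.parse(rules.splitlines())

# The "add to cart" page is blocked for any bot...
print(parser.can_fetch("examplebot", "https://example.com/cart"))   # False
# ...but the rest of the site stays crawlable.
print(parser.can_fetch("examplebot", "https://example.com/about"))  # True
```

This only checks how one particular parser reads your rules; individual crawlers may interpret edge cases differently.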

More specific things you can do with robots.txt

robots.txt is the Swiss Army knife of expressing what you want different robots to do or not do on your website: it can be just a few lines, or it can be complex with more elaborate rules targeting very specific URL patterns. You can use a robots.txt file for solving technical issues (such as unnecessary paginated pages), or for editorial or personal reasons (such as just not wanting certain things crawled). For example, you could:

Inform multiple bots (but not all) about the same rule

This group tells both examplebot and otherbot to stay away from the /search path.

user-agent: examplebot
user-agent: otherbot
disallow: /search

Tell one bot to avoid paths that contain a specific piece of text

For example, you could tell documentsbot not to crawl any file that contains ".pdf" in its name.

user-agent: documentsbot
disallow: *.pdf

Tell a bot it may crawl your blog, but not the drafts

user-agent: documentsbot
allow: /blog/
disallow: /blog/drafts/

Block a crawler from part of your website, while allowing other crawlers to access your site

This robots.txt file disallows the mentioned aicorp-trainer-bot from accessing anything other than the home page, while allowing other crawlers (such as search engines) to access the site.

user-agent: *
allow: /

user-agent: aicorp-trainer-bot
disallow: /
allow: /$

Leave a comment for your future self

You can start a line with # to remind yourself why you put a certain rule there.

# I don't want bots in my highschool photos
user-agent: *
disallow: /photos/highschool/
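Parsers treat everything after a # as a comment, so a note like this never affects the rules themselves. Using the Python standard-library parser again as a stand-in for a crawler (the bot name and URLs are made up):

```python
from urllib import robotparser

rules = """\
# I don't want bots in my highschool photos
user-agent: *
disallow: /photos/highschool/
"""

parser = robotparser.RobotFileParser()
parser.parse(rules.splitlines())

# The comment line is ignored; the disallow rule still applies.
print(parser.can_fetch("anybot", "https://example.com/photos/highschool/1999.jpg"))  # False
# Sibling paths outside the disallowed prefix remain crawlable.
print(parser.can_fetch("anybot", "https://example.com/photos/vacation/"))            # True
```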

For even more, you can check out our list of useful robots.txt rules.

Making changes to your robots.txt file (practically)

The Robots Exclusion Protocol (REP) works by putting together rules ("allow" or "disallow") and specifying which robots these rules apply to. You don't need to learn programming or fiddle with tools; you can just put these rules in a text file and upload it to your website.

For most websites, it's even simpler than that! If you're using a CMS, it usually has something already built in to help you change your robots.txt file. For example, some CMSes let you customize your robots.txt file using checkboxes or with a simple form, and many have plugins that help you set up and write rules for your robots.txt file. To check what's possible within your CMS, you can do a search for the name of your CMS + "edit robots.txt file".

Once you've got things set up, you can also test to make sure your file is set up how you intended. There are many testing tools built by the web community to help with this, such as TametheBot's robots.txt testing tool and this robots.txt parser that use the open-source robots.txt parser library.
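For a quick programmatic check, Python's standard library also includes urllib.robotparser. Note that it implements the original REP only, without Google-style wildcards like * and $ in paths, so use a dedicated tester for those rules. A sketch checking the earlier multi-bot group (all bot names are placeholders):

```python
from urllib import robotparser

# A group where two named bots share one rule.
rules = """\
user-agent: examplebot
user-agent: otherbot
disallow: /search
"""

parser = robotparser.RobotFileParser()
parser.parse(rules.splitlines())

# Both named bots are kept out of /search...
print(parser.can_fetch("examplebot", "https://example.com/search"))  # False
print(parser.can_fetch("otherbot", "https://example.com/search"))    # False
# ...while a bot not named in any group defaults to "everything allowed".
print(parser.can_fetch("somebot", "https://example.com/search"))     # True
```

To test a live site instead of a string, you can construct the parser with the robots.txt URL and call its read() method before can_fetch().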

If you have any questions about robots.txt, you can find us on LinkedIn or chat with like-minded experts in our community forums.

Posted by Martin Splitt and John Mueller, Search relations team


[[["Easy to understand","easyToUnderstand","thumb-up"],["Solved my problem","solvedMyProblem","thumb-up"],["Other","otherUp","thumb-up"]],[["Missing the accusation I need","missingTheInformationINeed","thumb-down"],["Too analyzable / excessively galore steps","tooComplicatedTooManySteps","thumb-down"],["Out of date","outOfDate","thumb-down"],["Samples / codification issue","samplesCodeIssue","thumb-down"],["Other","otherDown","thumb-down"]],[],[],[]]