WikiPlus

How to Create a robots.txt File: Step-by-Step Guide [2026]

Creating a robots.txt file correctly is one of the most fundamental technical SEO tasks for any website. A robots.txt file tells search engine crawlers which URLs they may crawl and which to skip, helping you manage crawl budget and keep crawlers out of private or low-value sections. WikiPlus Robots Generator at wikiplus.co produces a valid, syntax-correct robots.txt file from a simple form in under two minutes. Generating the file requires no server access or technical knowledge, although you still need to upload the result to your site root. This step-by-step guide covers every field and decision.

What a robots.txt File Does and Why It Matters

A robots.txt file is a plain text file placed at the root of your domain (yourdomain.com/robots.txt) that communicates crawling instructions to web robots, primarily search engine crawlers such as Googlebot and Bingbot. It uses a simple directive syntax: User-agent specifies which crawler a group of rules applies to, Disallow lists the path prefixes that crawler should skip, Allow explicitly permits paths within a disallowed directory, and Sitemap points crawlers to your XML sitemap. Robots.txt is a crawling directive, not an indexing directive: a disallowed page can still be indexed if other pages link to it. For indexation control, use the noindex robots meta tag (or the X-Robots-Tag HTTP header), and leave the page crawlable so the crawler can actually read that instruction. For crawl budget management on large sites, robots.txt is the primary tool.
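
To make the four directives concrete, here is a minimal sketch that parses a small rule set with Python's standard urllib.robotparser module and asks which URLs may be crawled. The example.com URLs, the paths, and the may_crawl helper are assumptions made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for example.com covering all four directives.
# Allow is listed before the broader Disallow because Python's parser
# applies the first matching rule in file order. Google instead picks
# the most specific (longest) match, which gives the same result here.
ROBOTS_TXT = """\
User-agent: *
Allow: /admin/help/
Disallow: /admin/
Sitemap: https://example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def may_crawl(url: str, agent: str = "Googlebot") -> bool:
    """Return True if the rules above let `agent` crawl `url`."""
    return parser.can_fetch(agent, url)


print(may_crawl("https://example.com/blog/post"))       # not covered by any rule
print(may_crawl("https://example.com/admin/settings"))  # under Disallow: /admin/
print(may_crawl("https://example.com/admin/help/faq"))  # under Allow: /admin/help/
print(parser.site_maps())                               # sitemap URLs found
```

Note that this only answers the crawling question. Whether a URL ends up in search results is a separate matter that robots.txt does not decide.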

How WikiPlus Robots Generator Works

WikiPlus Robots Generator at wikiplus.co runs entirely in your browser, so no data is uploaded to a server. You select which crawlers you want to address (All robots, Googlebot, Bingbot, or specific bots), specify which paths to disallow (admin areas, login pages, search result pages, duplicate parameter URLs), add any explicit Allow rules needed to override a broad Disallow, and add your sitemap URL. The tool assembles these inputs into correctly formatted robots.txt syntax, displays a live preview, and lets you copy the output or download it as robots.txt. The generator validates your input in real time, flagging common errors such as paths that do not begin with a forward slash.
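
The assembly step is simple enough to sketch in a few lines. The following is an illustration of the general idea only, not WikiPlus's actual implementation; the function name build_robots_txt and its parameters are hypothetical.

```python
def build_robots_txt(user_agent, disallow, allow=(), sitemap=None):
    """Assemble robots.txt text for a single user-agent group.

    Simplified validation: every path must begin with '/'. A real
    generator would also need to accept wildcard patterns.
    """
    for path in (*disallow, *allow):
        if not path.startswith("/"):
            raise ValueError(f"path must begin with '/': {path!r}")

    lines = [f"User-agent: {user_agent}"]
    lines += [f"Allow: {path}" for path in allow]
    lines += [f"Disallow: {path}" for path in disallow]
    if sitemap:
        # Sitemap is independent of any user-agent group, so it is
        # separated from the group by a blank line.
        lines += ["", f"Sitemap: {sitemap}"]
    return "\n".join(lines) + "\n"


print(build_robots_txt(
    "*",
    disallow=["/admin/", "/login/"],
    allow=["/admin/help/"],
    sitemap="https://example.com/sitemap.xml",
))
```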

Step-by-Step: Creating Your robots.txt

Open WikiPlus Robots Generator at wikiplus.co and work through the form:
Step one: choose your user-agent target. Use an asterisk (*) to address all crawlers, or target specific ones.
Step two: add Disallow rules for each path you want to block. Common blocks: /admin/, /login/, /cart/, /checkout/, /?s= (WordPress search results), /wp-json/, /tag/, /category/ (if paginated tag/category pages create duplicate content).
Step three: add explicit Allow rules if you need to permit a specific page within a broadly disallowed directory.
Step four: paste your XML sitemap URL (typically yoursite.com/sitemap.xml or yoursite.com/sitemap_index.xml).
Step five: click Generate.
Step six: copy the output and create a file named robots.txt (all lowercase) in your site root directory. Upload it to the root, not to a subfolder.
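
For readers who script their deployments, the last step can be sketched as follows. The rule set is only an example of what steps one to four might produce, the yoursite.com sitemap URL is a placeholder, and save_robots_txt is a hypothetical helper that assumes you have file-system access to the site root.

```python
from pathlib import Path

# Example output of steps one to four. The paths and sitemap URL are
# placeholders, not recommendations for every site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /login/
Disallow: /cart/
Disallow: /checkout/
Disallow: /?s=

Sitemap: https://yoursite.com/sitemap.xml
"""


def save_robots_txt(site_root: str, content: str = ROBOTS_TXT) -> Path:
    """Write `content` to <site_root>/robots.txt and return the path."""
    root = Path(site_root)
    if not root.is_dir():
        raise NotADirectoryError(f"site root not found: {site_root}")
    target = root / "robots.txt"  # all lowercase, directly in the root
    target.write_text(content, encoding="utf-8")
    return target
```

Calling save_robots_txt("/path/to/site/root") with your own directory writes the file in place; the path shown here is illustrative.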

Common Mistakes That Break Crawl Control

The most dangerous robots.txt mistake is a Disallow: / rule that blocks all crawlers from the entire site. This is surprisingly common after migrations or CMS installations that default to a maintenance-mode robots.txt. Check your robots.txt immediately after any major site change. Other common errors: blocking CSS and JavaScript files that Googlebot needs to render pages (Google renders pages much as a browser does, so it needs these files to evaluate layout and mobile usability); disallowing URLs with parameters using overly broad patterns that also block valid pages; placing the robots.txt file in a subdirectory rather than the site root; and getting letter case wrong. Directive names such as User-agent and Disallow are case-insensitive, but the paths in your rules are case-sensitive (/Admin/ does not match /admin/) and the file itself must be named robots.txt in lowercase. WikiPlus Robots Generator produces correctly formatted syntax, which removes the syntax-level errors, though you should still review which paths you block.
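
A few of these mistakes can be caught mechanically before you upload the file. The sketch below is a deliberately small check, not a full validator; lint_robots_txt and its warning messages are hypothetical, and it flags a blanket Disallow even in a group where you might have intended one (for example, to shut out a single unwanted bot).

```python
def lint_robots_txt(text: str) -> list[str]:
    """Return human-readable warnings for a few common mistakes."""
    warnings = []
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()  # directive names are case-insensitive
        if field not in ("allow", "disallow"):
            continue
        if value and not value.startswith(("/", "*")):
            warnings.append(f"line {number}: path should begin with '/'")
        if field == "disallow":
            if value in ("/", "/*"):
                warnings.append(f"line {number}: blocks the entire site")
            elif value.rstrip("$*").lower().endswith((".css", ".js")):
                warnings.append(f"line {number}: blocks CSS or JavaScript")
    return warnings


sample = """\
User-agent: *
Disallow: /
Disallow: /*.css$
disallow: admin/
Allow: /blog/
"""
for warning in lint_robots_txt(sample):
    print(warning)
```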

Frequently Asked Questions

Where does a robots.txt file go?
The robots.txt file must be placed at the root of your domain, accessible at yourdomain.com/robots.txt. It cannot be in a subdirectory, and each subdomain needs its own file. For WordPress sites, upload it to the site's document root (often the public_html or www folder) using your hosting control panel file manager or FTP. For static sites, place it in the root public or dist folder that your web server serves. For Shopify, robots.txt is generated by the platform and customised through the robots.txt.liquid theme template. Verify placement by visiting yourdomain.com/robots.txt in a browser after uploading.
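
Because the location is fixed per host, the URL a crawler requests can be derived from any page URL. A minimal sketch, assuming absolute URLs as input; robots_txt_url is a hypothetical helper name.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the root-level robots.txt URL for the host of `page_url`."""
    parts = urlsplit(page_url)
    if not parts.scheme or not parts.netloc:
        raise ValueError(f"expected an absolute URL, got {page_url!r}")
    # Keep scheme and host, replace path, and drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_txt_url("https://yourdomain.com/blog/post?page=2"))
print(robots_txt_url("https://shop.yourdomain.com/cart"))
```

The second call shows why a subdomain is treated separately: it resolves to a different robots.txt URL than the main domain.
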
Does robots.txt prevent pages from being indexed?
No. Robots.txt controls crawling (whether Googlebot visits a URL), not indexing (whether a URL appears in search results). A disallowed page can still be indexed if other pages link to it: Google can index the URL without crawling it. To prevent indexation, use the noindex robots meta tag on the page itself, and do not block that page in robots.txt, because a crawler that is not allowed to fetch the page never sees the tag. A common mistake is blocking pages in robots.txt and assuming they will not appear in search results. They can and often do.
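
The interaction between the two mechanisms can be sketched in code. The example below uses Python's standard html.parser to look for a robots meta tag and then applies the rule that noindex only takes effect on a crawlable page. The class and function names are hypothetical, and the check ignores the X-Robots-Tag header and bot-specific meta tags for brevity.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            for item in attr.get("content", "").split(","):
                self.directives.add(item.strip().lower())


def has_noindex(html: str) -> bool:
    """Return True if the page carries a noindex robots meta directive."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


def noindex_is_effective(html: str, blocked_by_robots_txt: bool) -> bool:
    """A noindex tag only works if crawlers are allowed to fetch the page."""
    return has_noindex(html) and not blocked_by_robots_txt


page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
print(noindex_is_effective(page, blocked_by_robots_txt=False))
print(noindex_is_effective(page, blocked_by_robots_txt=True))
```
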
Should I block my WordPress /wp-admin/ in robots.txt?
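It is reasonable to, and WordPress already does so by default: when no physical robots.txt file exists, WordPress serves a virtual one that disallows /wp-admin/ and allows /wp-admin/admin-ajax.php. The main reason to disallow /wp-admin/ is to avoid spending crawl budget on admin URLs. It is not a security measure, because robots.txt is publicly readable and badly behaved bots ignore it. The correct rule is: Disallow: /wp-admin/. Note that you should also allow /wp-admin/admin-ajax.php, because some plugins use this endpoint for front-end functionality that Googlebot needs to render pages correctly.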
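
The Allow rule works because crawlers that follow the Robots Exclusion Protocol standard, Googlebot included, apply the most specific matching rule (the longest path), with Allow winning a tie. Here is a minimal sketch of that precedence for plain path prefixes; is_allowed and the RULES list are illustrative, and wildcard patterns are deliberately not handled.

```python
# (directive, path prefix) pairs mirroring the rules discussed above.
RULES = [
    ("disallow", "/wp-admin/"),
    ("allow", "/wp-admin/admin-ajax.php"),
]


def is_allowed(path: str, rules=RULES) -> bool:
    """Apply most-specific-match precedence to plain path prefixes."""
    best_length, allowed = -1, True  # no matching rule means allowed
    for directive, prefix in rules:
        if not path.startswith(prefix):
            continue
        is_allow = directive == "allow"
        # The longer prefix wins; on a tie, Allow beats Disallow.
        if len(prefix) > best_length or (len(prefix) == best_length and is_allow):
            best_length, allowed = len(prefix), is_allow
    return allowed


print(is_allowed("/wp-admin/admin-ajax.php"))  # longer Allow rule wins
print(is_allowed("/wp-admin/options.php"))     # only the Disallow matches
print(is_allowed("/blog/hello-world/"))        # no rule matches
```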