What is the robots.txt file?
The robots.txt file tells search engine crawlers which parts of the site they may access, which in turn affects what content gets indexed. Learn how to create and configure it effectively.
Introduction
The robots.txt file is a simple text file that is part of the Robots Exclusion Protocol (REP). It contains instructions for search engines on how to access and index the site. To be effective, the robots.txt file must be placed in the root folder of the site (for example, https://domeniu.ro/robots.txt).
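Because the file always sits at the root of the site, its address can be derived from any page URL. The snippet below is a minimal sketch of that idea using Python's standard urllib.parse module; the helper name robots_url and the page path /blog/articol.html are illustrative assumptions, not part of the original article.

```python
from urllib.parse import urljoin


def robots_url(page_url: str) -> str:
    """Return the robots.txt address for the site that serves page_url.

    Hypothetical helper: joining with an absolute path ("/robots.txt")
    discards the page's own path and keeps only the scheme and host.
    """
    return urljoin(page_url, "/robots.txt")


# Example page path is made up for illustration.
print(robots_url("https://domeniu.ro/blog/articol.html"))
```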
The Importance of the robots.txt File
The robots.txt file is crucial for managing how bots interact with the site. Many bots crawl sites aggressively, which can degrade performance. By using this file, you can:
- Control the search engines' access to the site's content.
- Allow indexing only by desired bots (e.g., Google, Bing).
- Restrict access to sensitive folders or files.
Usage examples
1. Blocking a specific search engine
To block access for the Bing search engine (bingbot), include the following lines in the robots.txt file:
User-agent: bingbot
Disallow: /
Explanation:
User-agent: Specifies the search engine robot to which the settings apply.
Disallow: Defines the sections of the site to which the robot does not have access. The symbol / blocks access to the entire site.
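One way to check how such rules are interpreted is to feed them to Python's standard urllib.robotparser module. This is only a sketch: the rules are parsed from an in-memory string rather than downloaded from the site, and "Googlebot" is used here simply as an example of a robot that the rule does not name.

```python
from urllib.robotparser import RobotFileParser

# The rules from this example, one directive per line.
rules = """\
User-agent: bingbot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())  # parse from memory, no network request

# bingbot is named in the file and is blocked from the whole site.
bing_allowed = parser.can_fetch("bingbot", "https://domeniu.ro/")

# A robot with no matching User-agent group is allowed by default.
google_allowed = parser.can_fetch("Googlebot", "https://domeniu.ro/")

print("bingbot allowed:", bing_allowed)
print("Googlebot allowed:", google_allowed)
```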
2. Blocking all search engines
To prevent all search engines from accessing the site, use:
User-agent: *
Disallow: /
Explanation: The symbol * (wildcard) in the User-agent field applies to all search engines, and / blocks access to the entire site.
3. Restricting access to certain folders or files
To block access to a folder and a specific file, configure:
User-agent: *
Disallow: /blog/
Disallow: /newsletter.php
Explanation: All search engines will be blocked from accessing the /blog/ folder and the /newsletter.php file.
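The effect of path rules like these can be checked per URL with Python's standard urllib.robotparser module. The sketch below assumes a few sample paths (/blog/post.html, /contact.html) that do not come from the article; is_allowed is a hypothetical helper, and "Googlebot" stands in for any robot covered by the * group.

```python
from urllib.robotparser import RobotFileParser

# The rules from this example, one directive per line.
rules = """\
User-agent: *
Disallow: /blog/
Disallow: /newsletter.php
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())  # parse from memory, no network request


def is_allowed(path: str, agent: str = "Googlebot") -> bool:
    """Hypothetical helper: may `agent` fetch `path` on the example site?"""
    return parser.can_fetch(agent, "https://domeniu.ro" + path)


# Sample paths are illustrative; only /blog/ and /newsletter.php are in the rules.
sample_paths = ["/", "/blog/", "/blog/post.html", "/newsletter.php", "/contact.html"]
results = {path: is_allowed(path) for path in sample_paths}

for path, allowed in results.items():
    print(f"{path}: {'allowed' if allowed else 'blocked'}")
```

Rules match by path prefix, so everything under /blog/ is blocked while the rest of the site stays accessible.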
Creating the robots.txt file
To create a robots.txt file, you can use an online generator, which helps you customize the rules quickly and without errors. Such tools can be found by searching for "Robots.txt Generator".
After you have created the file, upload it to the root directory of the site using a file manager or an FTP client.
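The file can also be produced by a short script. The following is a minimal sketch, not a replacement for a full generator: build_robots_txt is a hypothetical helper that only emits User-agent and Disallow lines, and the file is written to the current local directory, from where it still has to be uploaded to the site root.

```python
from pathlib import Path


def build_robots_txt(rules: dict[str, list[str]]) -> str:
    """Build robots.txt text from {user-agent: [disallowed paths]}.

    Hypothetical helper: each user-agent becomes one group, with one
    Disallow line per path and a blank line between groups.
    """
    groups = []
    for agent, paths in rules.items():
        lines = [f"User-agent: {agent}"]
        lines += [f"Disallow: {path}" for path in paths]
        groups.append("\n".join(lines))
    return "\n\n".join(groups) + "\n"


if __name__ == "__main__":
    # Same rules as in usage example 3.
    content = build_robots_txt({"*": ["/blog/", "/newsletter.php"]})
    # Written locally; upload it to the site's root directory afterwards.
    Path("robots.txt").write_text(content, encoding="utf-8")
    print(content, end="")
```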