Download the PHP package bopoda/robots-txt-parser without Composer
On this page you can find all versions of the PHP package bopoda/robots-txt-parser. These versions can be downloaded and installed without Composer; dependencies are resolved automatically.
Package: robots-txt-parser
Short description: PHP class for parsing robots.txt files according to the Google and Yandex specifications.
License: MIT
Homepage: https://github.com/bopoda/robots-txt-parser
Information about the package robots-txt-parser
robots-txt-parser
RobotsTxtParser: PHP class for parsing all the directives of a robots.txt file.
RobotsTxtValidator: PHP class for checking whether a URL is allowed or disallowed according to robots.txt rules.
Try the online demo of RobotsTxtParser on live domains.
Parsing follows the Google and Yandex robots.txt specifications.
Latest improvements:
- Parses the Clean-param directive according to the clean-param syntax.
- Removes comments (everything following the '#' character, up to the first line break, is disregarded).
- Improved Host parsing: Host is an intersectional directive and should refer to the user-agent '*'; if there are multiple Host values, search engines use the first one.
- Removed unused methods from the class, refactored the code, and corrected the visibility of the class properties.
- Added more test cases, including test cases that cover all the new functionality.
- Added the RobotsTxtValidator class to check whether a URL is allowed to be crawled.
- Since version 2.0, RobotsTxtParser is significantly faster.
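The comment and Host rules in the list above can be illustrated with a small standalone sketch. It is not the package's implementation: `stripRobotsComments` and `firstHost` are hypothetical helper names, and the matching logic is only an assumption about how such rules might be applied.

```php
<?php
declare(strict_types=1);

/**
 * Illustrative only (hypothetical helper, not part of the package).
 * Drops everything from '#' up to the end of each line.
 */
function stripRobotsComments(string $content): string
{
    $clean = [];
    // \R matches any line-break style (\n, \r\n, \r).
    foreach (preg_split('/\R/', $content) as $line) {
        $hashPos = strpos($line, '#');
        if ($hashPos !== false) {
            $line = substr($line, 0, $hashPos);
        }
        $clean[] = rtrim($line);
    }
    return implode("\n", $clean);
}

/**
 * Illustrative only (hypothetical helper, not part of the package).
 * Returns the first Host value in the file, or null when there is none.
 */
function firstHost(string $content): ?string
{
    foreach (preg_split('/\R/', stripRobotsComments($content)) as $line) {
        // Directive names are matched case-insensitively.
        if (preg_match('/^\s*host\s*:\s*(\S+)/i', $line, $m) === 1) {
            return $m[1];
        }
    }
    return null;
}

$robots = "User-agent: * # all bots\n"
        . "Disallow: /tmp # scratch area\n"
        . "Host: example.com\n"
        . "Host: www.example.com\n";

echo firstHost($robots), "\n"; // example.com
```

Only the first Host line is returned, mirroring the "search engines use the first one" rule; later Host lines are ignored rather than treated as errors.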
Supported Directives:
- DIRECTIVE_ALLOW = 'allow';
- DIRECTIVE_DISALLOW = 'disallow';
- DIRECTIVE_HOST = 'host';
- DIRECTIVE_SITEMAP = 'sitemap';
- DIRECTIVE_USERAGENT = 'user-agent';
- DIRECTIVE_CRAWL_DELAY = 'crawl-delay';
- DIRECTIVE_CLEAN_PARAM = 'clean-param';
- DIRECTIVE_NOINDEX = 'noindex';
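As a rough illustration of how a single robots.txt line maps onto one of these directives, here is a minimal sketch. The directive names are copied from the list above; `splitDirective` is a hypothetical helper and does not reflect how the package itself tokenizes lines.

```php
<?php
declare(strict_types=1);

// Directive names copied from the list above.
const SUPPORTED_DIRECTIVES = [
    'allow', 'disallow', 'host', 'sitemap',
    'user-agent', 'crawl-delay', 'clean-param', 'noindex',
];

/**
 * Hypothetical helper: splits "Name: value" into [name, value],
 * or returns null when the line is not a supported directive.
 */
function splitDirective(string $line): ?array
{
    // Split on the first colon only, so URLs in values stay intact.
    $parts = explode(':', $line, 2);
    if (count($parts) !== 2) {
        return null;
    }
    $name = strtolower(trim($parts[0]));
    if (!in_array($name, SUPPORTED_DIRECTIVES, true)) {
        return null;
    }
    return [$name, trim($parts[1])];
}

print_r(splitDirective('Sitemap: https://example.com/sitemap.xml'));
```

Limiting `explode` to two parts matters for values such as sitemap URLs, which contain colons of their own.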
Installation
Install the latest version with Composer: composer require bopoda/robots-txt-parser
Run tests
Run the PHPUnit tests from the project root, for example with vendor/bin/phpunit if PHPUnit is installed through Composer.
Usage example
You can start the parser by getting the content of a robots.txt file from a website:
Or simply use the contents of the file as input (i.e. when the content is already cached):
This will output:
To validate a URL, use the RobotsTxtValidator class:
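A minimal sketch of how the two classes might be combined is shown here. The class names come from this page; the method names `getRules()` and `isUrlAllow()`, the constructor arguments, and the `isCrawlAllowed` wrapper are assumptions, so check them against the installed version of the package before relying on them.

```php
<?php
// Assumes the package is installed and Composer's autoloader is loaded, e.g.:
// require __DIR__ . '/vendor/autoload.php';

/**
 * Hypothetical wrapper: parses robots.txt content and asks the validator
 * whether the given URL may be crawled by the given user agent.
 */
function isCrawlAllowed(string $robotsTxt, string $url, string $userAgent = '*'): bool
{
    // Assumed API: the parser takes the file content and exposes getRules();
    // the validator takes those rules and exposes isUrlAllow().
    // If the installed version is namespaced, import the classes with `use`.
    $parser = new RobotsTxtParser($robotsTxt);
    $validator = new RobotsTxtValidator($parser->getRules());

    return (bool) $validator->isUrlAllow($url, $userAgent);
}

// Example call (requires the package and network access):
// $content = file_get_contents('http://example.com/robots.txt');
// var_dump(isCrawlAllowed($content, '/', 'MyAwesomeBot'));
```

Wrapping the calls in one function keeps the fetch step separate, so cached robots.txt content can be passed in just as easily as freshly downloaded content.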
Contribution
Feel free to create a PR in this repository. Please follow the PSR coding style.
See the list of contributors who participated in this project.
Final Notes:
Please use version 2.0 or later, which follows the same rules but performs significantly better.
All versions of robots-txt-parser with dependencies
ext-mbstring Version *