Download the PHP package webignition/robots-txt-file without Composer

On this page you can find all versions of the PHP package webignition/robots-txt-file. You can download/install these versions without Composer; dependencies are resolved automatically.

FAQ

After the download, you only need one include: require_once('vendor/autoload.php');. After that, import the classes you need with use statements.
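
For instance, a minimal entry script might look like this sketch (the \webignition\RobotsTxt\File\File class name is taken from the package overview further down this page):

<?php

// Import the classes you need with use statements
use webignition\RobotsTxt\File\File;

// Load the autoloader shipped in the downloaded vendor folder
require_once('vendor/autoload.php');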

Example:
If you use only one package, a project is not needed. But if you use more than one package, it is not possible to import the classes with use statements without a project.

In general, it is recommended to always use a project to download your libraries; an application normally needs more than one library.
Some PHP packages are not free to download and are therefore hosted in private repositories. In that case, credentials are needed to access such packages. Please use the auth.json textarea to enter your credentials if a package comes from a private repository. You can look here for more information.

  • Some hosting environments cannot be reached via a terminal or SSH, so Composer cannot be used there.
  • Using Composer is sometimes complicated, especially for beginners.
  • Composer needs a lot of resources, which are sometimes not available on a simple webspace.
  • If you are using private repositories, you don't need to share your credentials. You can set up everything on our site and then give your team members a simple download link.
  • Simplify your Composer build process: use our command line tool to download the vendor folder as a binary. This makes your build process faster, and you don't need to expose your credentials for private repositories.

Information about the package robots-txt-file

robots-txt-file

Introduction

Overview

Handles robots.txt files:

  • parse a robots.txt file into a model
  • inspect the directives that apply to a given user agent
  • check whether a user agent is allowed to access a URL path
  • extract sitemap URLs

Robots.txt file format refresher

Let's quickly go over the format of a robots.txt file so that you can understand what you can get out of a \webignition\RobotsTxt\File\File object.

A robots.txt file contains a collection of records. A record provides a set of directives to a specified user agent. A directive instructs a user agent to do something (or not do something). A blank line is used to separate records.

Here's an example with two records:

User-agent: Slurp
Disallow: /

User-Agent: *
Disallow: /private

The first record instructs the user agent 'Slurp' that it is not allowed access to '/' (i.e. the whole site); the second instructs all other user agents that they are not allowed access to '/private'.

A robots.txt file can optionally contain directives that apply to all user agents irrespective of the specified records. These are included as a set of directives that are not part of any record. A common use is the sitemap directive.

Here's an example with directives that apply to everyone and everything:

User-agent: Slurp
Disallow: /

User-Agent: *
Disallow: /private

Sitemap: http://example.com/sitemap.xml

Usage

Parsing a robots.txt file from a string into a model
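
Here's a minimal sketch, assuming the package exposes a \webignition\RobotsTxt\File\Parser class with setSource() and getFile() methods (check the package source for the exact API):

<?php

use webignition\RobotsTxt\File\Parser;

// Fetch the raw robots.txt content, e.g. over HTTP
$source = file_get_contents('http://example.com/robots.txt');

// Parse the string into a \webignition\RobotsTxt\File\File model
$parser = new Parser();
$parser->setSource($source);
$robotsTxtFile = $parser->getFile();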

This might not be too useful on its own. You'd normally be retrieving information from a robots.txt file because you are a crawler and need to know what you are (or are not) allowed to access, or because you're a tool or service that needs to locate a site's sitemap.xml file.

Inspecting a model to get directives for a user agent

Let's say we're the 'Slurp' user agent and we want to know what's been specified for us:
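
The sketch below assumes an Inspector class that takes the parsed file and a user agent string; the DirectiveList and its get() method are mentioned in the next paragraph, but verify the exact class and method names against the package source:

<?php

use webignition\RobotsTxt\File\Parser;
use webignition\RobotsTxt\Inspector\Inspector;

$parser = new Parser();
$parser->setSource(file_get_contents('http://example.com/robots.txt'));
$robotsTxtFile = $parser->getFile();

// Assumed API: resolve the directives that apply to the 'slurp' user agent
$inspector = new Inspector($robotsTxtFile);
$inspector->setUserAgent('slurp');

// $directiveList is the DirectiveList referred to below
$directiveList = $inspector->getDirectives();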

Ok, now we have a DirectiveList containing a collection of directives. We can call $directiveList->get() to get the directives applicable to us.

This raw set of directives is available in the model because it is there in the source robots.txt file. Often this raw data isn't immediately useful as-is. Maybe we want to inspect it further?

Check if a user agent is allowed to access a url path

That's more like it, let's inspect some of that data in the model.
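
Continuing the sketch above, an allowed/disallowed check could look like this (the isAllowed() method is an assumption; confirm the name in the package source):

<?php

use webignition\RobotsTxt\File\Parser;
use webignition\RobotsTxt\Inspector\Inspector;

$parser = new Parser();
$parser->setSource(file_get_contents('http://example.com/robots.txt'));

$inspector = new Inspector($parser->getFile());
$inspector->setUserAgent('slurp');

// Assumed method: true if 'slurp' may access the given URL path
$isAllowed = $inspector->isAllowed('/private/secret.html');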

Extract sitemap URLs

A robots.txt file can list the URLs of all relevant sitemaps. These directives are not specific to a user agent.

Let's say we're an automated web frontend testing service and we need to find a site's sitemap.xml to get a list of URLs that need testing. We know the site's domain, we know where to look for the robots.txt file, and we know that it might specify the location of the sitemap.xml file.
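
One possible sketch; the getNonGroupDirectives(), getByField(), first() and getValue() calls are assumptions about how the model exposes the user-agent-independent directives described above:

<?php

use webignition\RobotsTxt\File\Parser;

$parser = new Parser();
$parser->setSource(file_get_contents('http://example.com/robots.txt'));
$robotsTxtFile = $parser->getFile();

// Assumed API: directives that are not part of any record, filtered to 'sitemap'
$sitemapDirectives = $robotsTxtFile->getNonGroupDirectives()->getByField('sitemap');

// Value of the first sitemap directive, e.g. http://example.com/sitemap.xml
$firstSitemapUrl = (string)$sitemapDirectives->first()->getValue();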

Cool, we've found the URL for the first sitemap listed in the robots.txt file. There may be many, although just the one is most common.

Filtering directives for a user agent to a specific field type

Let's get all the disallow directives for Slurp:
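
A sketch under the same assumptions as above; filtering a DirectiveList by field via getByField() is an assumed capability, so check the package source:

<?php

use webignition\RobotsTxt\File\Parser;
use webignition\RobotsTxt\Inspector\Inspector;

$parser = new Parser();
$parser->setSource(file_get_contents('http://example.com/robots.txt'));

$inspector = new Inspector($parser->getFile());
$inspector->setUserAgent('slurp');

// Assumed API: keep only the 'disallow' directives from Slurp's directive list
$slurpDisallowDirectives = $inspector->getDirectives()->getByField('disallow');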

Building

Using as a library in a project

To use this package as a dependency in another project, add it to that project's composer.json and update your dependencies:

"require": {
    "webignition/robots-txt-file": "*"      
}

This will get you the latest version. Check the list of releases for specific versions.

Developing

This project's external dependencies are managed with Composer. Get and install Composer first.

# Make a suitable project directory
mkdir ~/robots-txt-file && cd ~/robots-txt-file

# Clone repository
git clone git@github.com:webignition/robots-txt-file.git

# Retrieve/update dependencies
composer update

# Run code sniffer and unit tests
composer cs
composer test

Testing

Have a look at the project on Travis for the latest build status, or give the tests a go yourself.

cd ~/robots-txt-file
composer test

All versions of robots-txt-file with dependencies

Requires:

  • php: >=7.2.0
  • ext-json: *
  • ext-mbstring: *
  • webignition/disallowed-character-terminated-string: >=2,<3
