Package: luyadev/luya-module-crawler
Short description: A full search page crawler to enable complex and customized searching abilities.
License: MIT
Homepage: https://luya.io
Crawler
An easy-to-use full-website page crawler that provides search results for your page. The crawler module gathers information about all pages on the configured domain and stores the index in the database. From there you can create search queries to provide search results. There are also helper methods which provide intelligent search results by splitting the input into multiple search queries (used by default).
Installation
Install the module via composer:
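For example, from your project root, using the package name from this page:

```sh
composer require luyadev/luya-module-crawler
```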
After installation via Composer, include the module in your configuration file within the `modules` section.
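A minimal sketch of such a `modules` section; the class names `luya\crawler\frontend\Module` and `luya\crawler\admin\Module` are assumptions based on the module's naming conventions, so check the module documentation for the exact values:

```php
'modules' => [
    // Frontend module: provides the search route and the console commands.
    'crawler' => [
        'class' => 'luya\crawler\frontend\Module',
        // The domain the crawler starts from (see below).
        'baseUrl' => 'https://luya.io/',
    ],
    // Admin module: makes the crawler index visible in the admin UI.
    'crawleradmin' => 'luya\crawler\admin\Module',
],
```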
where `baseUrl` is the domain from which the crawler should gather all information.
After setting up the module in your config, run the migrations and the import command (to set up the permissions):
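With the standard LUYA console binary (the path follows the cronjob example further down), that is:

```sh
./vendor/bin/luya migrate
./vendor/bin/luya import
```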
Running the Crawler
To run the crawler process, execute the `crawler/crawl` console command. You should put this command in a cronjob to make sure your index stays up to date:
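For example, using the LUYA console binary:

```sh
./vendor/bin/luya crawler/crawl
```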
Make sure your page is served as UTF-8 (`<meta charset="utf-8"/>`) and that the language is set: `<html lang="<?= Yii::$app->composition->langShortCode; ?>">`. In order to provide current crawl results, create a cronjob which crawls the page each night:

```sh
cd httpdocs/current && ./vendor/bin/luya crawler/crawl
```
Crawler Arguments
All crawler arguments for `crawler/crawl`; an example would be `crawler/crawl --pdfs=0 --concurrent=5 --linkcheck=0`:

| name | description | default |
|---|---|---|
| linkcheck | Whether all links should be checked after the crawler has indexed your site | true |
| pdfs | Whether PDFs should be indexed by the crawler or not | true |
| concurrent | The number of concurrent page crawls | 15 |
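Combined with the cronjob path from above, a full invocation might look like this:

```sh
cd httpdocs/current && ./vendor/bin/luya crawler/crawl --pdfs=0 --concurrent=5 --linkcheck=0
```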
Stats
You can also get statistic results by enabling a cronjob which executes each week:
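A sketch of such a cronjob, assuming the statistics console command is `crawler/statistic` (run `./vendor/bin/luya help crawler` to confirm the exact command name):

```sh
cd httpdocs/current && ./vendor/bin/luya crawler/statistic
```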
Create search form
Make a POST request with `query` to the `crawler/default/index` route and render the view as follows:
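A minimal sketch of such a view; `$results`, `$query`, and the `preview()` helper are assumptions about what the module's default action passes to the view, so check the module's own view files for the exact variable names:

```php
<?php

use yii\helpers\Html;

/* Form posting the `query` parameter to the crawler/default/index route.
   Html::beginForm() also renders the CSRF token Yii requires for POST requests. */
?>
<?= Html::beginForm(['/crawler/default/index'], 'post'); ?>
    <?= Html::textInput('query', $query ?? ''); ?>
    <?= Html::submitButton('Search'); ?>
<?= Html::endForm(); ?>

<?php foreach ($results as $item): /* assumed: list of crawler index models */ ?>
    <h3><?= Html::encode($item->title); ?></h3>
    <p><?= $item->preview($query); /* assumed helper highlighting the query in the result */ ?></p>
<?php endforeach; ?>
```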
Crawler Settings
You can use crawler tags to trigger certain events or store information:
| tag | example | description |
|---|---|---|
| CRAWL_IGNORE | `<!-- [CRAWL_IGNORE] -->Ignore this<!-- [/CRAWL_IGNORE] -->` | Excludes certain content from indexing. |
| CRAWL_FULL_IGNORE | `<!-- [CRAWL_FULL_IGNORE] -->` | Ignores the full page; keep in mind that links found inside the ignored page will still be added to the index. |
| CRAWL_GROUP | `<!-- [CRAWL_GROUP]api[/CRAWL_GROUP] -->` | Sometimes you want to group your results by a section of a page; this tag lets the crawler know which group/section the current page belongs to, so you can group your results by the group field. |
| CRAWL_TITLE | `<!-- [CRAWL_TITLE]My Title[/CRAWL_TITLE] -->` | If you want to make sure your customized title is always used, the CRAWL_TITLE tag enforces your title for the page. |
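Since the tags are plain HTML comments, they can be placed directly in a layout or view file; a sketch combining the tags from the table above (the group name and texts are illustrative):

```html
<!-- [CRAWL_TITLE]My custom page title[/CRAWL_TITLE] -->
<!-- [CRAWL_GROUP]blog[/CRAWL_GROUP] -->
<div>
    This content will be indexed.
    <!-- [CRAWL_IGNORE] -->This navigation will not be indexed.<!-- [/CRAWL_IGNORE] -->
</div>
```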
Dependencies
All versions of luya-module-crawler depend on:

- nadar/crawler `^1.3`
- nadar/stemming `^1.0`
- smalot/pdfparser `^2.1`