Download the PHP package dachcom-digital/dynamic-search-data-provider-crawler without Composer

On this page you can find all versions of the php package dachcom-digital/dynamic-search-data-provider-crawler. It is possible to download/install these versions without Composer. Possible dependencies are resolved automatically.


Information about the package dynamic-search-data-provider-crawler

Dynamic Search | Data Provider: Web Crawler


A spider crawler extension for Pimcore Dynamic Search.

Release Plan

| Release | Supported Pimcore Versions | Supported Symfony Versions | Release Date | Maintained | Branch |
| --- | --- | --- | --- | --- | --- |
| 3.x | 11.0 | ^6.2 | 28.09.2023 | Feature Branch | master |
| 2.x | 10.0 - 10.6 | ^5.4 | 19.12.2021 | No | 2.x |
| 1.x | 6.6 - 6.9 | ^4.4 | 18.04.2021 | No | 1.x |

Installation

Dynamic Search Bundle

You need to install and enable the Dynamic Search Bundle first; see its documentation for details. After that, proceed as follows:

Add Bundle to bundles.php:
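
A minimal sketch of such a registration entry. The bundle class name `DsWebCrawlerBundle\DsWebCrawlerBundle` is an assumption and should be checked against the installed package; the `['all' => true]` value is the usual Symfony convention for enabling a bundle in every environment.

```php
<?php
// config/bundles.php

return [
    // ... other bundles, including the Dynamic Search Bundle itself ...

    // Assumption: the crawler provider ships its bundle class under this name.
    \DsWebCrawlerBundle\DsWebCrawlerBundle::class => ['all' => true],
];
```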


Basic Setup
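
A hedged sketch of what a basic setup might look like, written as a Symfony PHP configuration file (the same keys can be expressed in YAML). The option names and the normalizer identifier come from the tables on this page; the surrounding structure (`context`, `default`, `data_provider`), the service name `web_crawler`, the file path, and the host `http://your-domain.test` are assumptions for illustration only.

```php
<?php
// config/packages/dynamic_search.php (hypothetical file name)

use Symfony\Component\DependencyInjection\Loader\Configurator\ContainerConfigurator;

return static function (ContainerConfigurator $container): void {
    $container->extension('dynamic_search', [
        'context' => [
            // "default" is an assumed context name.
            'default' => [
                'data_provider' => [
                    // Assumption: service identifier of the crawler provider.
                    'service' => 'web_crawler',
                    'options' => [
                        // Applied to every dispatch type.
                        'always' => [
                            'own_host_only' => true,
                        ],
                        // Used when the whole site is crawled.
                        'full_dispatch' => [
                            'seed' => 'http://your-domain.test', // placeholder host
                            'max_link_depth' => 15,
                        ],
                        // Used when a single resource is re-crawled.
                        'single_dispatch' => [
                            'host' => 'http://your-domain.test', // placeholder host
                        ],
                    ],
                    'normalizer' => [
                        'service' => 'web_crawler_localized_resource_normalizer',
                        'options' => [
                            'locales' => ['de', 'en'],
                            'skip_not_localized_documents' => false,
                        ],
                    ],
                ],
            ],
        ],
    ]);
};
```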


Provider Options

always

| Name | Default Value | Description |
| --- | --- | --- |
| own_host_only | false | |
| allow_subdomains | false | |
| allow_query_in_url | false | |
| allow_hash_in_url | false | |
| allowed_mime_types | ['text/html', 'application/pdf'] | |
| allowed_schemes | ['http'] | |
| content_max_size | 0 | |

full_dispatch

| Name | Default Value | Description |
| --- | --- | --- |
| seed | null | |
| valid_links | [] | |
| user_invalid_links | [] | |
| max_link_depth | 15 | |
| max_crawl_limit | 0 | |

single_dispatch

| Name | Default Value | Description |
| --- | --- | --- |
| host | null | |

Resource Normalizer

DefaultResourceNormalizer

Identifier: web_crawler_default_resource_normalizer
Normalize simple documents.
Options: none

LocalizedResourceNormalizer

Identifier: web_crawler_localized_resource_normalizer
Scaffold localized documents.

Options:

| Name | Default Value | Allowed Type | Description |
| --- | --- | --- | --- |
| locales | all languages enabled in Pimcore | array | |
| skip_not_localized_documents | true | bool | If false, an exception is thrown when a document/object has no valid locale |

Transformer

Scaffolder

HttpResponseHtmlDataScaffolder

Identifier: http_response_html_scaffolder
Simple object scaffolder.
Supported types: VDB\Spider\Resource with content-type text/html.

HttpResponsePdfDataScaffolder

Identifier: http_response_pdf_scaffolder
Simple object scaffolder.
Supported types: VDB\Spider\Resource with content-type application/pdf.

PimcoreElementScaffolder

Identifier: pimcore_element_scaffolder
Simple object scaffolder.
Supported types: Asset, Document, DataObject\Concrete.

Field Transformer

UriExtractor

Identifier: resource_uri_extractor
Supported Scaffolder: http_response_html_scaffolder, http_response_pdf_scaffolder

Return Type: string|null
Options: none

LanguageExtractor

Identifier: resource_language_extractor
Supported Scaffolder: http_response_html_scaffolder, http_response_pdf_scaffolder

Return Type: string|null
Options: none

MetaExtractor

Identifier: resource_meta_extractor
Supported Scaffolder: http_response_html_scaffolder

Return Type: string|null
Options:

| Name | Default Value | Allowed Type | Description |
| --- | --- | --- | --- |
| name | null | string | The name of the meta tag to fetch the value from |

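To illustrate what the `name` option selects, here is a minimal sketch using PHP's DOM extension. The function `extractMetaContent` is hypothetical and is not the bundle's implementation; it only shows how the `content` attribute of a `<meta name="...">` tag could be looked up, returning `string|null` like the extractor.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not part of the bundle): returns the content of the
 * first <meta> tag whose name attribute matches $name, or null if none exists.
 */
function extractMetaContent(string $html, string $name): ?string
{
    if ($html === '') {
        return null; // DOMDocument::loadHTML() rejects empty input
    }

    $document = new DOMDocument();

    // Real-world HTML is rarely valid; collect parser warnings silently.
    $previous = libxml_use_internal_errors(true);
    $document->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($document->getElementsByTagName('meta') as $meta) {
        // Meta names are compared case-insensitively.
        if (strcasecmp($meta->getAttribute('name'), $name) === 0) {
            $content = trim($meta->getAttribute('content'));

            return $content === '' ? null : $content;
        }
    }

    return null;
}
```

With `name` set to `description`, such a lookup would return the page's meta description, or null when the tag is absent or empty.
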
HtmlTagExtractor

Identifier: resource_html_tag_content_extractor
Supported Scaffolder: http_response_html_scaffolder

Return Type: string|null
Options: none

TextExtractor

Identifier: resource_text_extractor
Supported Scaffolder: http_response_html_scaffolder, http_response_pdf_scaffolder

Return Type: string|null
Options:

| Name | Default Value | Allowed Type | Description |
| --- | --- | --- | --- |
| content_start_indicator | <!-- main-content --> | string | Marks the beginning of the indexable page content |
| content_end_indicator | <!-- /main-content --> | string | Marks the end of the indexable page content |
| content_exclude_start_indicator | null | null\|string | Marks the beginning of the text to be excluded from indexing |
| content_exclude_end_indicator | null | null\|string | Marks the end of the text to be excluded from indexing |

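The indicator options can be pictured with a small sketch. The function `extractIndexableText` is hypothetical and is not the bundle's implementation; it assumes the indicators are literal strings (typically HTML comments) and that the result is null when the start or end marker is missing, which the bundle may handle differently.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not part of the bundle): returns the plain text found
 * between the start and end indicators, minus any excluded ranges.
 */
function extractIndexableText(
    string $html,
    string $startIndicator = '<!-- main-content -->',
    string $endIndicator = '<!-- /main-content -->',
    ?string $excludeStart = null,
    ?string $excludeEnd = null
): ?string {
    $start = strpos($html, $startIndicator);
    if ($start === false) {
        return null; // assumption: no start marker means nothing to index
    }
    $start += strlen($startIndicator);

    $end = strpos($html, $endIndicator, $start);
    if ($end === false) {
        return null; // assumption: no end marker means nothing to index
    }

    $content = substr($html, $start, $end - $start);

    // Drop every range enclosed by the exclude indicators, if both are set.
    if ($excludeStart !== null && $excludeEnd !== null) {
        $pattern = '/' . preg_quote($excludeStart, '/') . '.*?' . preg_quote($excludeEnd, '/') . '/s';
        $content = preg_replace($pattern, ' ', $content) ?? $content;
    }

    // Remove markup (including comments) and collapse whitespace.
    $text = trim((string) preg_replace('/\s+/', ' ', strip_tags($content)));

    return $text === '' ? null : $text;
}
```

In this sketch, navigation and footer markup outside the two main-content markers never reaches the index, and a block wrapped in the exclude markers is cut out of the remaining text.
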
TitleExtractor

Identifier: resource_title_extractor
Supported Scaffolder: http_response_html_scaffolder, http_response_pdf_scaffolder

Return Type: string|null
Options: none


Copyright and License

Copyright: DACHCOM.DIGITAL
For licensing details please visit LICENSE.md

Upgrade Info

Before updating, please check our upgrade notes!


All versions of dynamic-search-data-provider-crawler with dependencies

Requires:
  • pimcore/pimcore ^11.0
  • vdb/php-spider ^0.7
  • dachcom-digital/dynamic-search ^3.0