Download the PHP package nizek/crawler without Composer
Selenium PHP Crawler
A PHP package to automate web crawling and element retrieval using Selenium WebDriver. This package allows you to connect to a Selenium server, navigate to web pages, and interact with elements by tags, classes, IDs, and CSS selectors. The package is set up for Chrome in headless mode, making it suitable for use in server environments.
Requirements
- PHP 7.4 or newer
- Selenium Server
- ChromeDriver
- Chromium
Using Docker
If you prefer Docker, a compatible Dockerfile is available for this package. Build an image from it and start a container to run the Selenium server.
Once the container is running, you can reach the Selenium server by opening http://localhost:4444 in your browser.
Usage
Initialization
To use the Crawler class, instantiate it using its static factory method, which provides a preconfigured WebDriver instance connected to Selenium.
Setting the URL
To set the URL for the crawler to navigate to, call setUrl() with the target address.
Methods
The following methods are provided for interacting with and retrieving elements from the webpage:
setUrl(string $url)
Sets the URL for the crawler to visit.
Parameters:
$url: The URL to navigate to.
Returns:
Returns the Crawler instance for method chaining.
Example:
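A minimal sketch of chaining on setUrl(). It assumes `$crawler` is an already-initialized Crawler instance; the helper name `fetchPageHtml` is hypothetical and not part of the package.

```php
<?php
// Hypothetical helper: $crawler is assumed to be an initialized Crawler instance.
function fetchPageHtml(object $crawler, string $url): string
{
    // setUrl() returns the Crawler itself, so the next call can be chained.
    return $crawler->setUrl($url)->getPageContent();
}
```

Because the helper only relies on the two documented methods, it can be exercised with any object that exposes them.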
parseXMLUrls()
Parses all URLs from <loc> tags in the XML content of the current page (for example, a sitemap).
Returns:
An array of URLs found in <loc> tags on the page.
Example:
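One way this might be used to read a sitemap, as a sketch. `$crawler` is assumed to be an initialized Crawler instance, and `sitemapUrls` is a hypothetical helper name; the de-duplication step is an addition of this sketch, not documented package behaviour.

```php
<?php
// Hypothetical helper: $crawler is assumed to be an initialized Crawler instance.
function sitemapUrls(object $crawler, string $sitemapUrl): array
{
    // Point the crawler at the XML document, then collect every <loc> value.
    $urls = $crawler->setUrl($sitemapUrl)->parseXMLUrls();

    // Drop duplicates and reindex so callers get a plain list.
    return array_values(array_unique($urls));
}
```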
getElementByTagName(string $tagName)
Finds the first element with the given tag name.
Parameters:
$tagName: The name of the tag to search for.
Returns:
A WebElement object representing the element.
Example:
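A short sketch that reads the text of the first `h1` on the page. It assumes `$crawler` is an initialized Crawler already pointed at a URL, and that the returned WebElement exposes `getText()` as php-webdriver elements do; `firstHeadingText` is a hypothetical helper name.

```php
<?php
// Hypothetical helper: $crawler is assumed to be an initialized Crawler instance.
function firstHeadingText(object $crawler): string
{
    $heading = $crawler->getElementByTagName('h1');

    // getText() is assumed to exist on the returned WebElement (php-webdriver API).
    return trim($heading->getText());
}
```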
getElementsByTagName(string $tagName)
Finds all elements with the given tag name.
Parameters:
$tagName: The name of the tag to search for.
Returns:
An array of WebElement objects representing the elements.
Example:
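As an illustration, the sketch below collects the `href` of every `a` tag. `$crawler` is assumed to be an initialized Crawler already pointed at a URL, `getAttribute()` is assumed to be available on each WebElement (as in php-webdriver), and `linkTargets` is a hypothetical helper name.

```php
<?php
// Hypothetical helper: $crawler is assumed to be an initialized Crawler instance.
function linkTargets(object $crawler): array
{
    $hrefs = [];
    foreach ($crawler->getElementsByTagName('a') as $link) {
        // getAttribute() is assumed to return null when the attribute is absent.
        $href = $link->getAttribute('href');
        if ($href !== null && $href !== '') {
            $hrefs[] = $href;
        }
    }
    return $hrefs;
}
```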
getElementBySelector(string $selector)
Finds the first element with the given selector.
Parameters:
$selector: The CSS selector to search for.
Returns:
A WebElement object representing the element.
Example:
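A sketch with a selector that combines a tag, a class, and an attribute value. The selector `a.button[target="_blank"]` and the helper name `firstExternalButtonHref` are illustrative only; `$crawler` is assumed to be an initialized Crawler already pointed at a URL, and `getAttribute()` is assumed to exist on the returned WebElement.

```php
<?php
// Hypothetical helper: $crawler is assumed to be an initialized Crawler instance.
function firstExternalButtonHref(object $crawler): ?string
{
    // Tag "a", class "button", attribute target equal to "_blank".
    $button = $crawler->getElementBySelector('a.button[target="_blank"]');

    // getAttribute() is assumed to behave as in php-webdriver.
    return $button->getAttribute('href');
}
```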
A selector can combine a tag name, a class name, and an attribute value, so it works like filtering the elements of the page.
getElementsBySelector(string $selector)
Finds all elements with the given selector.
Parameters:
$selector: The CSS selector to search for.
Returns:
An array of WebElement objects representing the elements.
Example:
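For instance, the labels of a menu could be gathered as sketched here. The selector `ul.menu > li` and the helper name `menuItemLabels` are made up for illustration; `$crawler` is assumed to be an initialized Crawler already pointed at a URL, and `getText()` is assumed to exist on each WebElement.

```php
<?php
// Hypothetical helper: $crawler is assumed to be an initialized Crawler instance.
function menuItemLabels(object $crawler): array
{
    $labels = [];
    // Direct <li> children of any <ul class="menu">.
    foreach ($crawler->getElementsBySelector('ul.menu > li') as $item) {
        $labels[] = trim($item->getText());
    }
    return $labels;
}
```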
getElementByClassName(string $className)
Finds the first element with the given class name.
Parameters:
$className: The name of the class to search for.
Returns:
A WebElement object representing the element.
Example:
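Since the package is described as supporting interaction with elements, this sketch clicks the first element carrying a given class. It assumes `$crawler` is an initialized Crawler already pointed at a URL and that the WebElement offers `click()` as php-webdriver elements do; `clickFirstByClass` is a hypothetical helper name.

```php
<?php
// Hypothetical helper: $crawler is assumed to be an initialized Crawler instance.
function clickFirstByClass(object $crawler, string $className): void
{
    // Pass the bare class name, without a leading dot.
    $element = $crawler->getElementByClassName($className);

    // click() is assumed to exist on the returned WebElement (php-webdriver API).
    $element->click();
}
```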
getElementsByClassName(string $className)
Finds all elements with the given class name.
Parameters:
$className: The name of the class to search for.
Returns:
An array of WebElement objects representing the elements.
Example:
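Because the method is documented as returning an array, counting matches is straightforward. In this sketch `$crawler` is assumed to be an initialized Crawler already pointed at a URL, and `countByClass` is a hypothetical helper name.

```php
<?php
// Hypothetical helper: $crawler is assumed to be an initialized Crawler instance.
function countByClass(object $crawler, string $className): int
{
    // The documented return value is an array of WebElement objects.
    return count($crawler->getElementsByClassName($className));
}
```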
getElementById(string $id)
Finds the element with the given ID.
Parameters:
$id: The ID of the element to search for.
Returns:
A WebElement object representing the element.
Example:
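A sketch that reads the current value of a form field located by ID. `$crawler` is assumed to be an initialized Crawler already pointed at a URL, `getAttribute()` is assumed to exist on the returned WebElement, and `inputValueById` is a hypothetical helper name.

```php
<?php
// Hypothetical helper: $crawler is assumed to be an initialized Crawler instance.
function inputValueById(object $crawler, string $id): ?string
{
    // Pass the bare ID, without a leading "#".
    $input = $crawler->getElementById($id);

    // getAttribute() is assumed to behave as in php-webdriver.
    return $input->getAttribute('value');
}
```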
getPageContent()
Retrieves the inner HTML content of the current page.
Returns:
A string containing the inner HTML of the page.
Example:
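The returned string can be inspected with ordinary string functions. This sketch checks whether the page mentions a phrase, ignoring case; `$crawler` is assumed to be an initialized Crawler already pointed at a URL, and `pageContains` is a hypothetical helper name.

```php
<?php
// Hypothetical helper: $crawler is assumed to be an initialized Crawler instance.
function pageContains(object $crawler, string $needle): bool
{
    // getPageContent() is documented as returning the page HTML as a string.
    $html = $crawler->getPageContent();

    // stripos() returns false when the needle is absent, 0 when it is at the start.
    return stripos($html, $needle) !== false;
}
```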