PHP code example of thewinterwind / arachnid

1. Go to the download page for the thewinterwind/arachnid library and choose the download type `require`.

2. Extract the ZIP file and open index.php.

3. Add this code to index.php:
    
        
    <?php
    require_once('vendor/autoload.php');

    // Start developing here.

    

thewinterwind / arachnid example snippets


    $url = 'http://www.example.com';
    $linkDepth = 3;

    // Initiate crawl
    $crawler = new \Arachnid\Crawler($url, $linkDepth);
    $crawler->traverse();

    // Get link data
    $links = $crawler->getLinks();
    print_r($links);
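Once the crawl finishes, the returned array can be post-processed like any other PHP array. The following is a minimal sketch, not part of the library: `groupLinksByStatus` is a hypothetical helper, and it assumes that `getLinks()` returns an array keyed by URL whose entries may contain a `status_code` field. Check the `print_r` output of your own crawl before relying on that shape. The sample data is made up so the snippet runs without the crawler.

```php
<?php
// Hypothetical helper: groups URLs by an ASSUMED 'status_code' field.
// The real structure returned by getLinks() may differ.
function groupLinksByStatus(array $links): array
{
    $groups = [];
    foreach ($links as $url => $info) {
        // Fall back to 'unknown' when the assumed field is missing.
        $status = (is_array($info) && isset($info['status_code']))
            ? (string) $info['status_code']
            : 'unknown';
        $groups[$status][] = $url;
    }
    return $groups;
}

// Made-up sample data standing in for $crawler->getLinks().
$sampleLinks = [
    'http://www.example.com'         => ['status_code' => 200],
    'http://www.example.com/missing' => ['status_code' => 404],
    'http://www.example.com/about'   => [],
];

print_r(groupLinksByStatus($sampleLinks));
```

In real use you would pass `$crawler->getLinks()` instead of `$sampleLinks`.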

## Advanced Usage

The crawler accepts additional options.

To pass additional options to the underlying Guzzle client, supply an array of options as the third constructor argument or pass it to `setCrawlerOptions`:


    
    // The third parameter holds the options used to configure the Guzzle client
    $crawler = new \Arachnid\Crawler('http://github.com', 2, [
        'auth' => array('username', 'password'),
    ]);

    // Or use the separate `setCrawlerOptions` method
    $options = array(
        'curl' => array(
            // These two settings disable TLS certificate checks; avoid them outside of testing
            CURLOPT_SSL_VERIFYHOST => false,
            CURLOPT_SSL_VERIFYPEER => false,
        ),
        'timeout' => 30,
        'connect_timeout' => 30,
    );

    $crawler->setCrawlerOptions($options);
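If several crawlers share most of their settings, one way to keep the options consistent is to merge per-crawl overrides onto a set of defaults before handing them over. This is a minimal sketch under stated assumptions: `buildCrawlerOptions` is a hypothetical helper, not part of Arachnid, and the default values are illustrative. The keys `timeout`, `connect_timeout`, `verify`, and `auth` are standard Guzzle request options.

```php
<?php
// Hypothetical helper: merges caller overrides onto conservative defaults.
function buildCrawlerOptions(array $overrides = []): array
{
    $defaults = [
        'timeout'         => 30,   // seconds allowed for the whole request
        'connect_timeout' => 30,   // seconds allowed to establish the connection
        'verify'          => true, // keep TLS certificate verification enabled
    ];

    // Later arrays win, so overrides replace defaults key by key.
    return array_replace($defaults, $overrides);
}

$options = buildCrawlerOptions([
    'timeout' => 10,
    'auth'    => ['username', 'password'],
]);

print_r($options);
```

The resulting array could then be passed as the third constructor argument or to `setCrawlerOptions`, as in the examples above.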


You can inject a [PSR-3][psr3] compliant logger object, such as [Monolog][monolog], to monitor crawler activity:

        
    // ... initialize crawler
    $crawler = new \Arachnid\Crawler($url, $linkDepth);

    // Set a logger for crawler activity (compatible with PSR-3)
    $logger = new \Monolog\Logger('crawler logger');
    $logger->pushHandler(new \Monolog\Handler\StreamHandler(sys_get_temp_dir().'/crawler.log'));
    $crawler->setLogger($logger);