Download the PHP package kaishiyoku/hera-rss-crawler without Composer

On this page you can find all versions of the PHP package kaishiyoku/hera-rss-crawler. It is possible to download and install these versions without Composer; dependencies are resolved automatically.

FAQ

After the download, you have to add a single include: require_once('vendor/autoload.php');. After that you can import the classes with use statements.

Example:
If you use only one package, a project is not needed. But if you use more than one package, it is not possible to import the classes with use statements without a project.

In general, it is recommended to always use a project to download your libraries, since an application normally needs more than one library.
Some PHP packages are not free to download and are therefore hosted in private repositories. In that case, credentials are needed to access such packages. Please use the auth.json textarea to enter your credentials if a package comes from a private repository. You can look here for more information.

  • Some hosting environments are not accessible via a terminal or SSH. In that case it is not possible to use Composer.
  • Using Composer can be complicated, especially for beginners.
  • Composer needs a lot of resources, which are sometimes not available on a simple webspace.
  • If you are using private repositories, you don't need to share your credentials. You can set up everything on our site and then provide a simple download link to your team members.
  • Simplify your Composer build process: use our command line tool to download the vendor folder as a binary. This makes your build process faster, and you don't need to expose your credentials for private repositories.

Information about the package hera-rss-crawler

About

This project aims to make fetching and parsing RSS feeds easier. With Hera RSS you can discover, fetch, and parse RSS feeds.

Installation

  1. simply run composer require kaishiyoku/hera-rss-crawler
  2. create a new crawler instance using $heraRssCrawler = new HeraRssCrawler()
  3. discover a feed, for example $feedUrls = $heraRssCrawler->discoverFeedUrls('https://laravel-news.com/')
  4. pick the feed you'd like to use; if multiple feeds were discovered, pick one of them
  5. fetch the feed: $feed = $heraRssCrawler->parseFeed($feedUrls->get(0))
  6. fetch the articles: $feedItems = $feed->getFeedItems()
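
Put together, the steps above can be sketched as a short script. Note that the namespace Kaishiyoku\HeraRssCrawler and the getTitle() accessor are assumptions based on the package name and typical feed item APIs; check the package source for the exact names.

```php
<?php

require_once('vendor/autoload.php');

// The namespace is an assumption based on the package name.
use Kaishiyoku\HeraRssCrawler\HeraRssCrawler;

// 2. create a new crawler instance
$heraRssCrawler = new HeraRssCrawler();

// 3. discover the feed URLs of a website
$feedUrls = $heraRssCrawler->discoverFeedUrls('https://laravel-news.com/');

// 4./5. pick one of the discovered feeds and parse it
$feed = $heraRssCrawler->parseFeed($feedUrls->get(0));

// 6. fetch the articles
foreach ($feed->getFeedItems() as $feedItem) {
    // getTitle() is an assumed accessor; adjust to the actual feed item API.
    echo $feedItem->getTitle() . PHP_EOL;
}
```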

Breaking Changes

Version 6.x

Version 5.x

Version 4.x

Version 3.x

Available crawler options

Determines how many retries will be made when parsing or discovering feeds throws an exception, e.g. if the feed was unreachable.
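
A minimal sketch of configuring this option; the setter name setRetryCount() is an assumption, so check the HeraRssCrawler class for the actual method:

```php
<?php

require_once('vendor/autoload.php');

use Kaishiyoku\HeraRssCrawler\HeraRssCrawler;

$heraRssCrawler = new HeraRssCrawler();

// Retry up to 5 times before giving up on an unreachable feed.
// setRetryCount() is a hypothetical name for this option's setter.
$heraRssCrawler->setRetryCount(5);
```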

Set your own logger instance, e.g. a simple file logger.
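
Since monolog/monolog is already a dependency of the package, a simple file logger can be built with it. The setLogger() call below is an assumption (the crawler presumably accepts a PSR-3 logger):

```php
<?php

require_once('vendor/autoload.php');

use Kaishiyoku\HeraRssCrawler\HeraRssCrawler;
use Monolog\Handler\StreamHandler;
use Monolog\Logger;

// A simple file logger writing to crawler.log.
// Logger::DEBUG works on both Monolog 2 and 3 (Monolog 3 also offers Level::Debug).
$logger = new Logger('hera-rss-crawler');
$logger->pushHandler(new StreamHandler(__DIR__ . '/crawler.log', Logger::DEBUG));

$heraRssCrawler = new HeraRssCrawler();

// setLogger() is an assumed setter name for this option.
$heraRssCrawler->setLogger($logger);
```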

Useful for websites which redirect to another subdomain when visiting the site, e.g. for Reddit.
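
A sketch of such a URL replacement, assuming a setter named setUrlReplacementMap() (hypothetical name; check the class for the actual option):

```php
<?php

require_once('vendor/autoload.php');

use Kaishiyoku\HeraRssCrawler\HeraRssCrawler;

$heraRssCrawler = new HeraRssCrawler();

// Map the redirecting host to a stable one, e.g. for Reddit.
// setUrlReplacementMap() is a hypothetical setter name.
$heraRssCrawler->setUrlReplacementMap([
    'www.reddit.com' => 'old.reddit.com',
]);
```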

With that you can set your own feed discoverers.

You can even write your own, just make sure to implement the FeedDiscoverer interface:
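
A sketch of a custom discoverer. The interface's namespace and method signature are not shown in this document, so everything below apart from the FeedDiscoverer name is an assumption; mirror the actual interface from the package source instead:

```php
<?php

use GuzzleHttp\Client;
use Illuminate\Support\Collection;
use Kaishiyoku\HeraRssCrawler\FeedDiscoverers\FeedDiscoverer;

// Hypothetical discoverer that simply probes a conventional feed path.
// The discover() signature is assumed, not taken from the package source.
class FeedDiscovererByConventionalPath implements FeedDiscoverer
{
    public function discover(Client $httpClient, string $url): Collection
    {
        $feedUrl = rtrim($url, '/') . '/feed';

        try {
            // Only return the URL if it actually responds successfully.
            $httpClient->get($feedUrl);

            return new Collection([$feedUrl]);
        } catch (\Throwable $e) {
            return new Collection();
        }
    }
}
```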

The default feed discoverers are as follows:

The ordering is important here because the discoverers are called sequentially until at least one feed URL has been found.

That means that once a discoverer has found a feed, the remaining discoverers won't be called.

If you want to mainly discover feeds by using HTML anchor elements, the FeedDiscovererByHtmlAnchorElements discoverer should be the first discoverer in the collection.
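
For example, to try anchor-element discovery first (the setFeedDiscoverers() setter and the discoverer namespace below are assumptions):

```php
<?php

require_once('vendor/autoload.php');

use Illuminate\Support\Collection;
use Kaishiyoku\HeraRssCrawler\HeraRssCrawler;
use Kaishiyoku\HeraRssCrawler\FeedDiscoverers\FeedDiscovererByHtmlAnchorElements;

$heraRssCrawler = new HeraRssCrawler();

// Put the anchor-element discoverer first; append the remaining default
// discoverers after it in their original order.
// setFeedDiscoverers() and the namespace are assumed names.
$heraRssCrawler->setFeedDiscoverers(new Collection([
    new FeedDiscovererByHtmlAnchorElements(),
    // ...the other default discoverers...
]));
```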

Available crawler methods

Simply fetch and parse the feed of a given feed URL. If no consumable RSS feed is found, null is returned.
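
Because null signals failure, guard the result before using it; a minimal sketch:

```php
<?php

require_once('vendor/autoload.php');

use Kaishiyoku\HeraRssCrawler\HeraRssCrawler;

$heraRssCrawler = new HeraRssCrawler();

$feed = $heraRssCrawler->parseFeed('https://laravel-news.com/feed');

if ($feed === null) {
    // No consumable RSS feed was found at the given URL.
    echo 'Unable to parse the feed.' . PHP_EOL;
} else {
    echo count($feed->getFeedItems()) . ' items fetched.' . PHP_EOL;
}
```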

Discover feeds from a website URL and return all parsed feeds in a collection.

Discover feeds from a website URL and return all found feed URLs in a collection. The crawler tries multiple ways to discover feeds, in the following order:

  1. discover feed URLs by content type
    if the given URL is already a valid feed, return this URL
  2. discover feed URLs by HTML head elements
    find all feed URLs inside an HTML document
  3. discover feed URLs by HTML anchor elements
    get all anchor elements of an HTML document and return the URLs of those which include rss
  4. discover feed URLs by Feedly
    fetch feed URLs using the Feedly API

Fetch the favicon of the feed's website. If none is found, null is returned.

Check if a given URL is a consumable RSS feed.
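
A sketch of such a check; the method name checkIfConsumableFeed() is a hypothetical placeholder, since the document doesn't show the actual name:

```php
<?php

require_once('vendor/autoload.php');

use Kaishiyoku\HeraRssCrawler\HeraRssCrawler;

$heraRssCrawler = new HeraRssCrawler();

// checkIfConsumableFeed() is an assumed method name.
if ($heraRssCrawler->checkIfConsumableFeed('https://laravel-news.com/feed')) {
    echo 'This URL is a consumable RSS feed.' . PHP_EOL;
}
```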

Contribution

Found any issues or have an idea to improve the crawler? Feel free to open an issue or submit a pull request.

Plans for the future

Author

Email: [email protected]
Website: https://andreas-wiedel.de


All versions of hera-rss-crawler with dependencies

Requires:

  • php: ^8.1
  • ext-json: *
  • ext-dom: *
  • ext-simplexml: *
  • ext-libxml: *
  • symfony/dom-crawler: ^5.4.21|^6.2.7
  • symfony/css-selector: ^5.4.21|^6.2.7
  • guzzlehttp/guzzle: ^7.5.0
  • illuminate/support: ^9.0|^10.0|^11.0
  • nesbot/carbon: ^2.66.0
  • laminas/laminas-xml: ^1.5.0
  • laminas/laminas-feed: ^2.20.0
  • monolog/monolog: ^2.9.1|^3.3.1
Composer command for our command line client (download client): this client runs in every environment, so you don't need a specific PHP version. The first 20 API calls are free. Standard composer command:
