Download the PHP package edwinhuish/querylist without Composer
On this page you can find all versions of the php package edwinhuish/querylist. It is possible to download/install these versions without Composer. Possible dependencies are resolved automatically.
Package querylist
Short Description Simple, elegant, extensible PHP web scraper (crawler/spider) that uses CSS3 DOM selectors and is based on DomQuery.
License MIT
Homepage http://querylist.cc
Information about the package querylist
QueryList
QueryList is a simple, elegant, extensible PHP web scraper (crawler/spider), based on DomQuery.
Chinese documentation
Features
- Uses the same CSS3 DOM selectors as jQuery
- Offers the same DOM manipulation API as jQuery
- Includes a generic list-crawling routine
- Includes a capable HTTP request suite that simplifies complex requests such as simulated logins, browser spoofing, and HTTP proxies
- Handles garbled text caused by character-encoding mismatches
- Provides powerful content filtering: you can use jQuery selectors to filter content
- Has a highly modular design and is easy to extend
- Has an expressive API
- Has a wealth of plugins
Through plug-ins you can easily implement things like:
- Multithreaded crawling
- Crawling pages rendered dynamically by JavaScript (PhantomJS/headless WebKit)
- Downloading images to local storage
- Simulating browser behavior, such as submitting forms
- Web crawling
- ...
Requirements
- PHP >= 7.1
Installation
Install with Composer: composer require edwinhuish/querylist
Usage
DOM Traversal and Manipulation
- Crawl all image links on GitHub
- Crawl Google search results
- More usage
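As a rough sketch of the DOM traversal listed above, the snippet below collects image links from an inline HTML string rather than a live site. It assumes this fork keeps the `QL\QueryList` class of upstream QueryList 4 together with its `html()`, `find()` and `attrs()` methods, and that Composer's autoloader lives at `vendor/autoload.php`. The sample markup is made up for illustration.

```php
<?php
// Assumption: the package was installed with Composer and the fork exposes
// the same QL\QueryList entry point as upstream QueryList 4.
require __DIR__ . '/vendor/autoload.php';

use QL\QueryList;

// Hypothetical sample markup; a real crawl would load a page instead.
$html = <<<HTML
<div id="gallery">
    <a href="/first"><img src="/img/first.png" alt="First"></a>
    <a href="/second"><img src="/img/second.png" alt="Second"></a>
</div>
HTML;

// Load the markup, select every <img> inside #gallery with a CSS selector,
// and collect the value of each src attribute.
$links = QueryList::html($html)->find('#gallery img')->attrs('src');

// The result is assumed to be iterable (a collection upstream).
foreach ($links as $src) {
    echo $src, PHP_EOL;
}
```

Swapping `QueryList::html($html)` for a call that loads a remote page should leave the selector chain unchanged, assuming the fork mirrors upstream in this respect.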
List crawl
Crawl the title and link of each item in the Google search results list:
Results:
More uses of extract():
Result:
Handlers
Use the handle() method to add handlers to QueryList.
HTTP Client (GuzzleHttp)
- Log in to GitHub with a cookie
- Use an HTTP proxy
- Simulate a login
Submit forms
Log in to GitHub
Bind function extension
Define a custom myHttp method as an extension:
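A minimal sketch of such an extension, assuming the fork keeps the `bind()`, `getInstance()`, `setHtml()`, `find()` and `texts()` methods of upstream QueryList 4 and binds the closure to the QueryList instance. The URL and selector are placeholders.

```php
<?php
// Assumption: installed with Composer; QL\QueryList as in upstream QueryList 4.
require __DIR__ . '/vendor/autoload.php';

use QL\QueryList;

$ql = QueryList::getInstance();

// Register a custom method named myHttp. Inside the closure, $this is
// assumed to be the QueryList instance, as it is upstream.
$ql->bind('myHttp', function ($url) {
    // Fetch the page with plain PHP instead of the built-in HTTP client.
    $html = file_get_contents($url);
    if ($html === false) {
        throw new RuntimeException("Could not fetch {$url}");
    }
    $this->setHtml($html);

    // Returning $this keeps the call chainable.
    return $this;
});

// Placeholder URL and selector, for illustration only.
$titles = $ql->myHttp('https://example.com')->find('h1')->texts();

foreach ($titles as $title) {
    echo $title, PHP_EOL;
}
```

Returning the instance from the closure is what lets the custom method sit in the same chain as the built-in selectors.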
Or package the extension as a class, and then bind it:
Using plugins
- Use the PhantomJS plugin to crawl pages rendered dynamically by JavaScript:
- Use the CURL multithreading plugin to crawl GitHub Trending with multiple threads:
Plugins
- jae-jae/QueryList-PhantomJS: Use PhantomJS to crawl pages rendered dynamically by JavaScript.
- jae-jae/QueryList-CurlMulti: Curl multithreading.
- jae-jae/QueryList-AbsoluteUrl: Convert relative URLs to absolute URLs.
- jae-jae/QueryList-Rule-Google: Google searcher.
- jae-jae/QueryList-Rule-Baidu: Baidu searcher.
View more QueryList plugins and QueryList-based products: QueryList Community
Contributing
Code contributions to QueryList are welcome. For details on contributing plugins, see the QueryList Plugin Contributing Guide.
Author
Jaeger [email protected]
If this library is useful to you, say thanks by buying me a beer :beer:!
License
QueryList is licensed under the MIT license. See the LICENSE file for more details.
All versions of querylist with dependencies
ext-dom Version *
ext-iconv Version *
tightenco/collect Version ^5
jaeger/g-http Version ^1.1
edwinhuish/domquery Version ^1.0
pguardiario/phpuri Version ^1.0