Statements (158)
Predicate | Object |
---|---|
gptkbp:instance_of |
gptkb:website
|
gptkbp:bfsLayer |
3
|
gptkbp:bfsParent |
gptkb:Library
|
gptkbp:can_be_extended_by |
custom middlewares
custom pipelines
custom spiders |
gptkbp:developed_by |
gptkb:Scrapinghub
Scrapy contributors
Scrapy maintainers |
gptkbp:first_released |
gptkb:2010
|
gptkbp:has |
gptkb:document
tutorials
community support
web-based interface
command line interface
active community
support for multiple data formats
built-in support for exporting data
scrapy shell
spider management
support for distributed crawling |
gptkbp:has_documentation |
https://docs.scrapy.org/en/latest/
|
gptkbp:has_feature |
Command line interface
asynchronous processing
Asynchronous processing
Middleware support
Extensible architecture
middleware support
support for testing
logging support
support for extensions
Support for caching
Integration with other libraries
support for command line interface
Support for logging
Support for testing
caching support
support for multiple data formats
support for cookies
support for webhooks
item pipelines
scrapy shell
spider management
Support for scraping web pages with dynamic loading
Built-in support for handling requests
Item pipelines
Selectors based on XPath and CSS
Support for cookies and sessions
Support for distributed scraping
Support for proxies
Support for scraping APIs
Support for scraping Atom feeds
Support for scraping CSV
Support for scraping Excel files
Support for scraping HTML
Support for scraping JSON
Support for scraping JavaScript-heavy websites
Support for scraping PDFs
Support for scraping RSS feeds
Support for scraping XML
Support for scraping dynamic content
Support for scraping forms
Support for scraping images
Support for scraping multiple pages
Support for scraping static content
Support for scraping text files
Support for scraping web pages with AJAX
Support for scraping web pages with authentication
Support for scraping web pages with captchas
Support for scraping web pages with iframes
Support for scraping web pages with rate limiting
Support for scraping web pages with redirects
Support for signals
Support for throttling requests
Support for user agents
Support for web crawling
built-in support for handling requests
support for JSON and XML output
support for SQLite and MongoDB
support for XPath and CSS selectors
support for cloud-based scraping
support for distributed scraping
support for exporting to CSV and JSON
support for proxies
support for scraping APIs
support for scraping JavaScript-heavy websites
support for scraping dynamic content
support for scraping with APIs
support for scraping with GraphQL APIs
support for scraping with Playwright
support for scraping with Puppeteer
support for scraping with RESTful APIs
support for scraping with SOAP APIs
support for scraping with Selenium
support for scraping with Splash
support for scraping with headless browsers
support for scraping with scraping applications
support for scraping with scraping approaches
support for scraping with scraping frameworks
support for scraping with scraping libraries
support for scraping with scraping methodologies
support for scraping with scraping platforms
support for scraping with scraping practices
support for scraping with scraping services
support for scraping with scraping solutions
support for scraping with scraping strategies
support for scraping with scraping techniques
support for scraping with scraping technologies
support for scraping with scraping tools
support for scraping with web scraping services
support for scraping with web services
support for scraping with webhooks
support for sessions
support for signals
support for user agents
Support for scraping web pages with session management
Built-in support for exporting data in various formats |
https://www.w3.org/2000/01/rdf-schema#label |
Scrapy
|
gptkbp:is_available_on |
gptkb:Py_PI
gptkb:archive |
gptkbp:is_compatible_with |
gptkb:Pandas
gptkb:software_framework
gptkb:drug |
gptkbp:is_integrated_with |
gptkb:Scrapy_Cloud
databases
APIs |
gptkbp:is_known_for |
flexibility
high performance
scalability
ease of use |
gptkbp:is_part_of |
data science toolkit
|
gptkbp:is_supported_by |
gptkb:Scrapinghub
|
gptkbp:is_used_by |
gptkb:physicist
gptkb:software
data scientists |
gptkbp:is_used_for |
data mining
data extraction
web crawling |
gptkbp:is_used_in |
gptkb:academic_research
gptkb:film_production_company
e-commerce
price comparison
content aggregation
data journalism
social media analysis
news aggregation
SEO analysis
job scraping
real estate data collection |
gptkbp:latest_version |
2.5.0
|
gptkbp:license |
gptkb:BSD_License
|
gptkbp:provides |
asynchronous processing
middleware support
item pipelines |
gptkbp:repository |
https://github.com/scrapy/scrapy
|
gptkbp:supports |
gptkb:Python_3
Python 3.6+ |
gptkbp:written_in |
gptkb:Library
|