[TOC]
# osmosis
Web scraper for Node.js.
https://github.com/rchipka/node-osmosis
# scrape-it
A Node.js scraper for humans. https://github.com/IonicaBizau/scrape-it
# node-crawler
https://github.com/bda-research/node-crawler
# supercrawler
https://github.com/brendonboshell/supercrawler
A web crawler. Supercrawler automatically crawls websites. Define custom handlers to parse content. Obeys robots.txt, rate limits and concurrency limits.
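To make "obeys robots.txt" concrete, here is a minimal dependency-free sketch of the idea. It is not Supercrawler's implementation: the function names `parseDisallowRules` and `isAllowed` are hypothetical, and the parser is deliberately simplified (it reads only `User-agent: *` groups and ignores `Allow`, wildcards, and `Crawl-delay`).

```javascript
// Simplified sketch: collect Disallow prefixes from "User-agent: *" groups.
// Assumption: Allow rules, wildcards and Crawl-delay are ignored here.
function parseDisallowRules(robotsTxt) {
  const rules = [];
  let appliesToAll = false; // true while inside a "User-agent: *" group
  let lastWasAgent = false; // consecutive User-agent lines share one group
  for (const rawLine of robotsTxt.split(/\r?\n/)) {
    const line = rawLine.replace(/#.*$/, '').trim(); // strip comments
    if (line === '') continue;
    const idx = line.indexOf(':');
    if (idx === -1) continue; // not a "field: value" line
    const field = line.slice(0, idx).trim().toLowerCase();
    const value = line.slice(idx + 1).trim();
    if (field === 'user-agent') {
      appliesToAll = lastWasAgent ? appliesToAll || value === '*' : value === '*';
      lastWasAgent = true;
    } else {
      lastWasAgent = false;
      // An empty Disallow value means "nothing is disallowed", so skip it.
      if (field === 'disallow' && appliesToAll && value !== '') {
        rules.push(value);
      }
    }
  }
  return rules;
}

// A path is allowed when no Disallow prefix matches its start.
function isAllowed(path, rules) {
  return !rules.some((prefix) => path.startsWith(prefix));
}

// Usage with a made-up robots.txt body:
const exampleRules = parseDisallowRules(
  'User-agent: *\nDisallow: /private/\n\nUser-agent: BadBot\nDisallow: /\n'
);
console.log(isAllowed('/blog/post', exampleRules));
console.log(isAllowed('/private/data.html', exampleRules));
```

A real crawler would normally delegate this to a maintained robots.txt parser and combine it with rate and concurrency limits, which the listed libraries handle for you.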
# x-ray
https://github.com/matthewmueller/x-ray
The next web scraper. See through the noise.
# headless-chrome-crawler
Distributed crawler powered by Headless Chrome https://github.com/yujiosaka/headless-chrome-crawler
# simplecrawler
Flexible event-driven crawler for Node.js.
https://github.com/simplecrawler/simplecrawler
# goose-parser
Universal scraping tool that lets you extract data using multiple environments.
https://github.com/redco/goose-parser
# apify
https://github.com/apify/apify-js
The scalable web crawling and scraping library for JavaScript.
# webster
https://github.com/zhuyingda/webster
A reliable high-level web crawling & scraping framework for Node.js.
# schabbi-webscraper
Small and easy to use NodeJS webcrawler project. Returns basic information about the crawled sites. https://github.com/PatrickSchababerle/schabbi-webscraper
# website-scraper
Download a website to a local directory (including all CSS, images, JS, etc.).
https://github.com/website-scraper/node-website-scraper
# cheerio-httpcli
https://www.npmjs.com/package/cheerio-httpcli
# ScrapingBee
To learn how to avoid getting blocked, read ScrapingBee's [complete guide](https://www.scrapingbee.com/blog/web-scraping-without-getting-blocked/). If you don't want to deal with this yourself, you can use their [web scraping API](https://www.scrapingbee.com/).
## Resources
Would you like to read more? Check these links out:
* [Node.js Website](https://nodejs.org/en/about/) - Contains documentation and a lot of information on how to get started.
* [Puppeteer's Docs](https://developers.google.com/web/tools/puppeteer) - Contains the API reference and guides for getting started.
* [Playwright](https://www.scrapingbee.com/blog/playwright-web-scraping/) - An alternative to Puppeteer, backed by Microsoft.
* [ScrapingBee's Blog](https://www.scrapingbee.com/blog/) - Contains a lot of information about web scraping on multiple platforms.
* [Handling infinite scroll with Puppeteer](https://www.scrapingbee.com/blog/infinite-scroll-puppeteer/)
* [Node-unblocker](https://www.scrapingbee.com/blog/node-unblocker/) - A Node.js package to facilitate web scraping through proxies.
# GitHub scraper / crawler search
[Search · scrapper (github.com)](https://github.com/search?l=JavaScript&o=desc&p=97&q=scrapper&s=&type=Repositories)