Simple scraper.
Simon Vieille 8826cd2ba8
release
2020-04-14 16:32:56 +02:00

Scraper

This project is a basic tool for scraping data from a web page using a CSS selector.

For example, you can retrieve the number of releases of a project hosted on GitHub:

node src/index.js \
  --url https://github.com/foo/bar \
  --selector '.repository-content .numbers-summary li:nth-child(4) a' \
  --tags \
  --breaks \
  --spaces \
  --trim

...will show XXX releases.

For more help, run node src/index.js --help.
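The clean-up flags in the command above are not described here, so the following is only a minimal sketch of what such a post-processing step could look like. It assumes that `--tags` strips HTML tags, `--breaks` replaces line breaks, `--spaces` collapses repeated spaces, and `--trim` removes surrounding whitespace. The `cleanText` helper and its option names are hypothetical and are not taken from `src/index.js`.

```javascript
// Hypothetical helper, NOT the project's actual implementation.
// Assumed meaning of each option (check --help for the real behaviour):
//   tags   -> strip HTML tags
//   breaks -> replace line breaks with a single space
//   spaces -> collapse runs of spaces or tabs into a single space
//   trim   -> remove leading and trailing whitespace
function cleanText(html, options = {}) {
  let text = html;

  if (options.tags) {
    // Naive tag removal; good enough for small fragments, not a full HTML parser.
    text = text.replace(/<[^>]*>/g, "");
  }
  if (options.breaks) {
    text = text.replace(/[\r\n]+/g, " ");
  }
  if (options.spaces) {
    text = text.replace(/[ \t]{2,}/g, " ");
  }
  if (options.trim) {
    text = text.trim();
  }

  return text;
}

// Made-up fragment resembling what a selector might match.
const raw = '<a href="/foo/bar/releases">\n  <span>12</span>\n  releases\n</a>';

console.log(cleanText(raw, { tags: true, breaks: true, spaces: true, trim: true }));
// prints "12 releases"
```

Applying the steps in this order (tags, breaks, spaces, trim) matters in the sketch: removing tags and line breaks leaves extra spaces behind, which the later steps then clean up.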

Installation

Requirements:

  • node >= 10
  • yarn

$ git clone https://gitnet.fr/deblan/scraper.git
$ cd scraper
$ yarn