Scraper
=======

This project is a basic tool to scrape data from a website
using a CSS selector.

For example, you can retrieve the number of releases of a project hosted on GitHub:

```
node src/index.js \
  --url https://github.com/foo/bar \
  --selector '.repository-content .numbers-summary li:nth-child(4) a' \
  --tags \
  --breaks \
  --spaces \
  --breaks \
  --trim
```

...will show `XXX releases`.

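The effect of the cleanup flags on the selected markup can be pictured with a small sketch. It is only an illustration: the `cleanText` helper, its option names, and the meaning given to each flag (`--tags` strips HTML tags, `--breaks` removes line breaks, `--spaces` collapses whitespace, `--trim` trims both ends) are assumptions, not the project's actual code. `node src/index.js --help` remains the reference.

```javascript
// Hypothetical helper: the name, options, and behaviour are assumptions
// made for illustration, not the project's actual implementation.
function cleanText(html, options = {}) {
  let text = html;

  if (options.tags) {
    // Assumed meaning of --tags: drop anything that looks like an HTML tag.
    text = text.replace(/<[^>]*>/g, '');
  }
  if (options.breaks) {
    // Assumed meaning of --breaks: turn line breaks into single spaces.
    text = text.replace(/[\r\n]+/g, ' ');
  }
  if (options.spaces) {
    // Assumed meaning of --spaces: collapse runs of spaces and tabs.
    text = text.replace(/[ \t]+/g, ' ');
  }
  if (options.trim) {
    // Assumed meaning of --trim: remove leading and trailing whitespace.
    text = text.trim();
  }

  return text;
}

// Made-up markup standing in for the element matched by the selector.
const raw = '<a href="/foo/bar/releases">\n  <span>12</span>\n  releases\n</a>';

console.log(cleanText(raw, { tags: true, breaks: true, spaces: true, trim: true }));
// prints "12 releases"
```

Without any option the input is returned unchanged, which is one plausible reason for exposing each cleanup step as a separate flag.
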
For more options, run `node src/index.js --help`.

Installation
------------

Requirements:

* Node.js >= 10
* Yarn

```
$ git clone https://gitnet.fr/deblan/scraper.git
$ cd scraper
$ yarn
```