diff --git a/README.md b/README.md
index b551d74..2a4f667 100644
--- a/README.md
+++ b/README.md
@@ -1,70 +1,326 @@
SmartISSPosts
==============================
-_Work In Progress_
-Project to identify nice pics from live ISS
+## Goal
-## Poster examples
+The goal of this project was to exploit my database of ISS pictures, coming from my [ISS-HDEV-wallpaper project](https://github.com/prise6/ISS-HDEV-wallpaper). These pictures were captured from the ISS HDEV live stream, and most of them were posted on [instagram](https://www.instagram.com/earthfromiss/). I decided to cluster the images in order to identify which ones could be worth posting, and to find out which clusters are ugly...
-![Poster 1](data/poster_1.jpg)
-![Poster 2](data/poster_2.jpg)
-![Poster 3](data/poster_3.jpg)
+Unfortunately, HDEV stopped sending data on July 18, 2019, and on August 22, 2019 it was declared to have reached its end of life... :'(
-
-Project Organization
-------------
-
- ├── LICENSE
- ├── Makefile <- Makefile with commands like `make data` or `make train`
- ├── README.md <- The top-level README for developers using this project.
- ├── data
- │ ├── external <- Data from third party sources.
- │ ├── interim <- Intermediate data that has been transformed.
- │ ├── processed <- The final, canonical data sets for modeling.
- │ └── raw <- The original, immutable data dump.
- │
- ├── docs <- A default Sphinx project; see sphinx-doc.org for details
- │
- ├── models <- Trained and serialized models, model predictions, or model summaries
- │
- ├── notebooks <- Jupyter notebooks. Naming convention is a number (for ordering),
- │ the creator's initials, and a short `-` delimited description, e.g.
- │ `1.0-jqp-initial-data-exploration`.
- │
- ├── references <- Data dictionaries, manuals, and all other explanatory materials.
- │
- ├── reports <- Generated analysis as HTML, PDF, LaTeX, etc.
- │ └── figures <- Generated graphics and figures to be used in reporting
- │
- ├── requirements.txt <- The requirements file for reproducing the analysis environment, e.g.
- │ generated with `pip freeze > requirements.txt`
- │
- ├── setup.py <- makes project pip installable (pip install -e .) so src can be imported
- ├── src <- Source code for use in this project.
- │ ├── __init__.py <- Makes src a Python module
- │ │
- │ ├── data <- Scripts to download or generate data
- │ │ └── make_dataset.py
- │ │
- │ ├── features <- Scripts to turn raw data into features for modeling
- │ │ └── build_features.py
- │ │
- │ ├── models <- Scripts to train models and then use trained models to make
- │ │ │ predictions
- │ │ ├── predict_model.py
- │ │ └── train_model.py
- │ │
- │ └── visualization <- Scripts to create exploratory and results oriented visualizations
- │ └── visualize.py
- │
- └── tox.ini <- tox file with settings for running tox; see tox.testrun.org
-
-
---------
+My new goal is to create nice posters composed of the different kinds of clusters I found.
Project based on the cookiecutter data science project template. #cookiecutterdatascience
+## Poster examples
+
+![Poster 1](data/poster_1.jpg)
+![Poster 2](data/poster_2.jpg)
+![Poster 3](data/poster_3.jpg)
+
+## Environment
+
+I use Docker; see the `docker-compose.yaml` file. Most of my routines are in the `Makefile`.
+
+#### Manage containers
+```
+make docker_start
+make docker_stop
+```
+
+#### Inside jupyter container
+
+I usually start a console inside my Jupyter container (tensorflow jhub):
+```
+make docker_bash
+```
+
+Then initialize the environment:
+```
+make requirements
+```
+
+I use Visual Studio Code outside my container. To execute some code, I use the console and type, for example:
+
+```
+python -m iss.exec.bdd
+# or
+make populate_db
+```
+
+To use the VS Code debugger, I use *ptvsd*:
+```
+make debug src.exec.bdd
+```
+
+#### Config
+
+The project configuration file is in `config/config_.yaml`. See the template in the source.
+
+#### .env
+
+The root directory contains a `.env` file with some environment variables:
+
+```
+MODE=dev
+PROJECT_DIR="/home/jovyan/work"
+```
+
+The `MODE` value determines which `config/config_MODE.yaml` configuration gets loaded. See `iss/tools/config.py`.
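
The mechanism can be sketched as follows (a minimal illustration; the function names are assumptions, the real logic lives in `iss/tools/config.py`):

```python
import os

# Sketch of MODE-driven config selection plus ${VAR} expansion in the
# yaml values (names here are illustrative, not the project's API).
def config_path(mode: str) -> str:
    return f"config/config_{mode}.yaml"

def expand(value: str) -> str:
    # os.path.expandvars resolves ${PROJECT_DIR}-style placeholders
    return os.path.expandvars(value)

os.environ["PROJECT_DIR"] = "/home/jovyan/work"
print(config_path("dev"))         # config/config_dev.yaml
print(expand("${PROJECT_DIR}/data"))  # /home/jovyan/work/data
```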
+
+## Steps
+
+### Synchronize images
+
+ISS images are stored online on a personal server; I need to collect all of them (>14k images):
+
+```
+make sync_collections
+```
+
+I use the `data/raw/collections` directory.
+
+I have a history of ISS locations for some of the images, stored in `data/raw/history/history.txt`:
+
+```
+12.656456313474;-75.371420423828;20180513-154001;Caribbean Sea
+-43.891574367708;-21.080797293704;20180513-160001;South Atlantic Ocean
+-10.077472167643;-82.562993796116;20180513-172001;South Pacific Ocean
+-51.783078834111;-3.9925568092913;20180513-174001;South Atlantic Ocean
+27.255631526786;-134.89231579188;20180513-184001;North Pacific Ocean
+```
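
Each line can be parsed as follows (a hypothetical sketch; the field order is inferred from the sample above: latitude, longitude, timestamp, place):

```python
from dataclasses import dataclass

# Hypothetical parser for one history.txt line; the class and function
# names are illustrative, not the project's actual code.
@dataclass
class IssLocation:
    latitude: float
    longitude: float
    timestamp: str   # matches the image filename stem, e.g. 20180513-154001
    place: str

def parse_line(line: str) -> IssLocation:
    lat, lon, ts, place = line.strip().split(";")
    return IssLocation(float(lat), float(lon), ts, place)

loc = parse_line("12.656456313474;-75.371420423828;20180513-154001;Caribbean Sea")
print(loc.place)  # Caribbean Sea
```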
+
+Here is an extract of `config/config_dev.yml`:
+
+```yaml
+directory:
+  project_dir: ${PROJECT_DIR}
+  data_dir: ${PROJECT_DIR}/data
+  collections: ${PROJECT_DIR}/data/raw/collections
+  isr_dir: ${PROJECT_DIR}/data/isr
+```
+
+### Populate DB
+
+I use a MySQL database, running in a container, with the following tables:
+
+* locations: the history file
+* embedding: the clustering results
+
+```
+make populate_db
+```
+
+*Adminer is also running, to monitor the MySQL database.*
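
The two tables can be sketched like this (the project uses MySQL; `sqlite3` here is only a stand-in to illustrate, and the column names are assumptions):

```python
import sqlite3

# Hypothetical schema sketch of the two tables described above; the
# timestamp column is assumed to be the key shared with the image files.
con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE locations (
    timestamp TEXT PRIMARY KEY,   -- e.g. 20180513-154001
    latitude REAL, longitude REAL, place TEXT)""")
con.execute("""CREATE TABLE embedding (
    timestamp TEXT PRIMARY KEY,
    x REAL, y REAL,               -- 2-D embedding coordinates
    cluster INTEGER)""")
con.execute("INSERT INTO locations VALUES "
            "('20180513-154001', 12.656456, -75.371420, 'Caribbean Sea')")
place, = con.execute("SELECT place FROM locations").fetchone()
print(place)  # Caribbean Sea
```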
+
+### Sampling images
+
+My clustering consists in using an auto-encoder to learn a latent representation of my images. The latent representations are then used in a classical clustering.
+
+I split the images into train, test and validation sets:
+
+```
+make sampling
+```
+
+Here is an extract of `config/config_dev.yml`:
+
+```yaml
+sampling:
+  autoencoder:
+    seed: 37672
+    proportions:
+      train: 0.7
+      test: 0.2
+      valid: 0.1
+    directory:
+      from: collections
+      base: ${PROJECT_DIR}/data/processed/models/autoencoder
+      train: ${PROJECT_DIR}/data/processed/models/autoencoder/train/k
+      test: ${PROJECT_DIR}/data/processed/models/autoencoder/test/k
+      valid: ${PROJECT_DIR}/data/processed/models/autoencoder/valid/k
+```
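
The sampling step can be sketched as a shuffled split driven by the proportions and seed above (the function name and exact logic are assumptions, not the project's code):

```python
import random

# Shuffle the file list with the configured seed, then cut it into
# train/test/valid slices according to the configured proportions.
def split(files, proportions, seed):
    rng = random.Random(seed)
    shuffled = list(files)
    rng.shuffle(shuffled)
    n_train = int(proportions["train"] * len(shuffled))
    n_test = int(proportions["test"] * len(shuffled))
    return {
        "train": shuffled[:n_train],
        "test": shuffled[n_train:n_train + n_test],
        "valid": shuffled[n_train + n_test:],   # remainder
    }

parts = split([f"img_{i}.jpg" for i in range(100)],
              {"train": 0.7, "test": 0.2, "valid": 0.1}, seed=37672)
print({k: len(v) for k, v in parts.items()})
# {'train': 70, 'test': 20, 'valid': 10}
```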
+
+### Training auto-encoder
+
+Newbie here, I tried home-made models:
+
+* Simple auto-encoder: `iss/models/SimpleAutoEncoder.py`
+* Simple convolutional auto-encoder: `iss/models/SimpleConvAutoEncoder.py` **<- model selected**
+* Variational auto-encoder: `iss/models/VariationalAutoEncoder.py`
+* Variational convolutional auto-encoder: `iss/models/VariationaConvlAutoEncoder.py`
+
+
+Here is an extract of `config/config_dev.yml`:
+
+```yaml
+models:
+  simple_conv:
+    save_directory: ${PROJECT_DIR}/models/simple_conv
+    model_name: model_dev
+    sampling: autoencoder
+    input_width: 48
+    input_height: 27
+    input_channel: 3
+    latent_width: 6
+    latent_height: 3
+    latent_channel: 16
+    learning_rate: 0.001
+    epochs: 2
+    batch_size: 128
+    verbose: 0
+    initial_epoch: 0
+    workers: 1
+    use_multiprocessing: false
+    steps_per_epoch: 4
+    validation_steps: 2
+    validation_freq:
+    activation: sigmoid
+    callbacks:
+      csv_logger:
+        directory: ${PROJECT_DIR}/models/simple_conv/log
+        append: true
+      checkpoint:
+        directory: ${PROJECT_DIR}/models/simple_conv/checkpoint
+        verbose: 1
+        period: 20
+      tensorboard:
+        log_dir: ${PROJECT_DIR}/models/simple_conv/tensorboard
+        limit_image: 5
+        floyd: True
+```
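
As a quick sanity check of what this config implies: the encoder compresses each 48x27x3 image into a 6x3x16 latent code, i.e. a 13.5x reduction:

```python
from math import prod

# Compression ratio implied by the input/latent dimensions above.
input_size = prod((48, 27, 3))    # values per input image
latent_size = prod((6, 3, 16))    # values per latent code
print(input_size, latent_size, input_size / latent_size)
# 3888 288 13.5
```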
+
+I created a simple training framework and launch it with:
+
+```
+make training
+# or
+python -m iss.exec.training --model-type=simple_conv
+```
+
+Currently, I use [floydhub](https://www.floydhub.com/) to train my models.
+
+I added a `floyd.yml` file in the root directory containing something like this:
+
+```yaml
+env: tensorflow-1.12
+task:
+  training:
+    input:
+      - destination: /iss-autoencoder
+        source: prise6/datasets/iss/1
+    machine: gpu
+    description: training autoencoder (simple_conv)
+    command: mv .env-floyd .env && make training
+
+  training_prod:
+    input:
+      - destination: /iss-autoencoder
+        source: prise6/datasets/iss/1
+    machine: gpu2
+    description: training autoencoder (simple_conv)
+    command: mv .env-floyd .env && make training
+```
+
+I use a special config file for floydhub, so I provide a different `.env` file.
+
+The training dashboard and dataset are public and available [here](https://www.floydhub.com/prise6/projects/smart-iss-posts/22).
+
+I also tested Google Colab and trained the final model with it, but the results are very similar to the floydhub model.
+
+### Clustering
+
+I had fun with different approaches:
+
+* Classical clustering (PCA + k-means + hierarchical clustering): `iss/clustering/ClassicalClustering.py`
+* Advanced clustering: `iss/clustering/AdvancedClustering.py` (not really used)
+* Not2Deep clustering (see [paper](https://github.com/rymc/n2d)): `iss/clustering/N2DClustering.py` **<- selected**
+* DBSCAN clustering: `iss/clustering/DBScanClustering.py` (not really used)
+
+Clusterings are trained on a sample of ~2.5k images. I create 50 clusters in order to find clusters with very similar images.
+
+```
+make exec_clustering
+```
+
+Parameters are in `config/config_dev.yml`:
+
+```yaml
+clustering:
+  n2d:
+    version: 3
+    model:
+      type: 'simple_conv'
+      name: 'model_colab'
+    umap:
+      random_state: 98372
+      metric: euclidean
+      n_components: 2
+      n_neighbors: 5
+      min_dist: 0
+    kmeans:
+      n_clusters: 50
+      random_state: 883302
+    save_directory: ${PROJECT_DIR}/models/clustering/n2d
+```
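
The last step of the N2D pipeline is a plain k-means on the 2-D UMAP embedding. As a toy illustration of that step only (pure Python; the project itself presumably relies on a library implementation):

```python
import random

# Toy k-means on 2-D points: assign each point to its nearest center,
# then move each center to the mean of its assigned points, and repeat.
def kmeans(points, k, seed, iters=20):
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k),
                          key=lambda c: (p[0] - centers[c][0]) ** 2 +
                                        (p[1] - centers[c][1]) ** 2)
            clusters[nearest].append(p)
        centers = [
            (sum(p[0] for p in cl) / len(cl), sum(p[1] for p in cl) / len(cl))
            if cl else centers[i]
            for i, cl in enumerate(clusters)
        ]
    return clusters

# Two well-separated blobs of three points each.
blobs = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
clusters = kmeans(blobs, k=2, seed=883302)
print(sorted(len(c) for c in clusters))  # [3, 3]
```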
+
+
+#### Embeddings
+
+I save the UMAP/t-SNE embeddings of the latent space to plot them with bokeh:
+
+*(screenshot)*
+![umap_bokeh](data/umap_bokeh.png)
+
+I populate my embedding MySQL table with the `iss/exec/bdd.py` script.
+
+#### Silhouette
+
+I compute the silhouette score on the latent representation for every cluster to assess its quality:
+
+![silhouette](data/silhouettes_score.png)
+
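
As a reminder of what the scores below mean: the silhouette of a single point is s = (b - a) / max(a, b), where a is its mean distance to its own cluster and b its mean distance to the nearest other cluster:

```python
# Silhouette of one point: a = mean intra-cluster distance,
# b = mean distance to the nearest other cluster.
def silhouette(a: float, b: float) -> float:
    return (b - a) / max(a, b)

print(silhouette(a=1.0, b=5.0))  # 0.8  -> compact, well-separated cluster
print(silhouette(a=4.0, b=5.0))  # 0.2  -> overlapping clusters
print(silhouette(a=5.0, b=4.0))  # -0.2 -> point probably misassigned
```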
+
+#### Mosaic plot
+
+Example of a 0.2 silhouette score (cluster 1):
+
+![cluster_01](data/cluster_01.png)
+
+Another example of a 0.2 silhouette score (cluster 39):
+
+![cluster_39](data/cluster_39.png)
+
+We see why the score is low, but we can also spot the pattern that groups these images together.
+
+Example of a 0.8 silhouette score (cluster 10):
+
+![cluster_10](data/cluster_10.png)
+
+The live stream is off, so these images are easy to cluster.
+
+Example of a negative silhouette score (cluster 35):
+
+![cluster_35](data/cluster_35.png)
+
+A bit messy.
+
+
+#### Facets
+
+*WIP*
+
+
+### Posters
+
+I generate multiple posters based on a template; see the poster examples at the top.
+
+```
+make posters
+```
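
A template-based poster boils down to computing a grid of paste boxes and filling each cell with an image from a chosen cluster. A toy sketch of the layout step (all sizes here are assumptions, not the project's actual template):

```python
# Compute (x, y, width, height) boxes for a uniform grid with a fixed
# margin between cells; an image library would then paste one picture
# per box.
def grid_boxes(poster_w, poster_h, cols, rows, margin):
    cell_w = (poster_w - (cols + 1) * margin) // cols
    cell_h = (poster_h - (rows + 1) * margin) // rows
    return [
        (margin + c * (cell_w + margin), margin + r * (cell_h + margin),
         cell_w, cell_h)
        for r in range(rows) for c in range(cols)
    ]

boxes = grid_boxes(2480, 3508, cols=4, rows=6, margin=20)  # A4 at 300 dpi
print(len(boxes), boxes[0])  # 24 (20, 20, 595, 561)
```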
## Personal Note:
diff --git a/data/cluster_01.png b/data/cluster_01.png
new file mode 100644
index 0000000..c8e65ea
Binary files /dev/null and b/data/cluster_01.png differ
diff --git a/data/cluster_10.png b/data/cluster_10.png
new file mode 100644
index 0000000..12f97c2
Binary files /dev/null and b/data/cluster_10.png differ
diff --git a/data/cluster_35.png b/data/cluster_35.png
new file mode 100644
index 0000000..e57c597
Binary files /dev/null and b/data/cluster_35.png differ
diff --git a/data/cluster_39.png b/data/cluster_39.png
new file mode 100644
index 0000000..51afb4d
Binary files /dev/null and b/data/cluster_39.png differ
diff --git a/data/silhouettes_score.png b/data/silhouettes_score.png
new file mode 100644
index 0000000..84b1afa
Binary files /dev/null and b/data/silhouettes_score.png differ
diff --git a/data/umap_bokeh.png b/data/umap_bokeh.png
new file mode 100644
index 0000000..21119b1
Binary files /dev/null and b/data/umap_bokeh.png differ