### Content

* `README.md` - you are reading me right now :)
* `final_run/` - logs from final run of `experiments.ipynb`
* `common` - symlink to `../common/` directory with all common `Python` files
* `constants.py` - file with constants and helper functions regarding them
* `scrape_levels.py` - script for scraping all levels
* `scrape_user_data.py` - script for scraping time data
* `state_graph.py` - class for creating graph and computing all features
* `comp_features.py` - script for creating features file from data
* `experiments.ipynb` - Jupyter notebook for running experiments
* `graphs.ipynb` - Jupyter notebook for drawing graphs

## Running

The data directory has the same path as this directory, but with `data` in place of `code`.
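A minimal sketch of that path convention, in case you want to derive the data directory in your own scripts. The helper name `data_dir_for` and the example path are hypothetical, and the sketch assumes that only the last path component named `code` is swapped; the repository's own scripts may resolve the directory differently.

```python
from pathlib import Path


def data_dir_for(code_dir: Path) -> Path:
    """Return the data directory matching a code directory.

    Assumption: the last path component named "code" is replaced by
    "data"; everything else in the path stays the same.
    """
    parts = list(code_dir.parts)
    # Search from the end so an earlier "code" component is left alone.
    for i in range(len(parts) - 1, -1, -1):
        if parts[i] == "code":
            parts[i] = "data"
            return Path(*parts)
    raise ValueError(f"no 'code' component in {code_dir}")


# Hypothetical location, used only for illustration.
print(data_dir_for(Path("/home/user/project/code/sokoban")))
```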

If you don't use `virtualenv` as recommended in the README of the root directory, run
everything with `python3` instead of `python`.

### Scraping data

For scraping **all** levels, run:
```
python scrape_levels.py
```

This script opens the Firefox browser, logs in and navigates through all instances.

It creates a `levels` directory in the data directory, containing the descriptions
of all levels.

For scraping user time data run:
```
python scrape_user_data.py
```

This script generates a file in the data directory whose name is given by `constants.USER_TIME_FILE`.

The IDs of the levels we are interested in are hardcoded in `scrape_user_data.py`.

### Computing features

The file whose name is held in `constants.USER_TIME_FILE` has to be in the data directory.

Create the directories `graph`, `tmp` and `bins`. The first will contain all graphs, the second all
temp files, such as input files created for `C++`, and the last all executable files.

We do not create these directories automatically, because the first two are symlinks to an HDD,
since my SSD is too small.
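Since the directories are not created for you, a quick check before a long run can save time. The following is only a sketch: the helper `missing_dirs` is hypothetical and not part of this repository, and the base directory is passed in explicitly, so point it at wherever you created the directories.

```python
from pathlib import Path
from typing import List

# Directory names taken from the instructions above.
REQUIRED_DIRS = ("graph", "tmp", "bins")


def missing_dirs(base: Path) -> List[str]:
    """Return the required directory names that do not exist under base.

    Path.is_dir() follows symlinks, so a symlink pointing to a
    directory on another disk counts as present.
    """
    return [name for name in REQUIRED_DIRS if not (base / name).is_dir()]


if __name__ == "__main__":
    # Assumption: the directories live in the current working directory.
    missing = missing_dirs(Path.cwd())
    if missing:
        print("Missing directories:", ", ".join(missing))
    else:
        print("All required directories are present.")
```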

For computing all features run:
```
python comp_features.py
```

This script creates or replaces the file in the data directory
whose name is given by `constants.FEATURES_FILE`.

To recompute one or more features, the file `constants.FEATURES_FILE` has to be
in the data directory. Then run:
```
python comp_features.py <feature1> <feature2> ...
```

It replaces the columns of the listed features with new values and leaves the other features unchanged.
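The column-replacement behaviour can be illustrated with a small standalone sketch. It is not the code from `comp_features.py`: the function `replace_features`, the in-memory row format, and the feature names `boxes` and `depth` are all assumptions made for the example.

```python
from typing import Dict, List

Row = Dict[str, float]


def replace_features(rows: List[Row], recomputed: List[Row], names: List[str]) -> List[Row]:
    """Return copies of rows with only the named feature columns replaced."""
    if len(rows) != len(recomputed):
        raise ValueError("row counts differ")
    updated = []
    for old, new in zip(rows, recomputed):
        merged = dict(old)  # keep every existing column
        for name in names:
            merged[name] = new[name]  # overwrite only the requested features
        updated.append(merged)
    return updated


# Hypothetical feature table with one row per level.
existing = [{"level": 1, "boxes": 3, "depth": 10}]
fresh = [{"boxes": 4}]
print(replace_features(existing, fresh, ["boxes"]))
```

Here only `boxes` changes; `level` and `depth` keep their old values, and the original rows are not modified.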

### Creating histograms

Create directory `images/cz_sokoban_hist`.

Open Jupyter notebook `graphs.ipynb` and run all cells.

It generates histograms for all features in the images directory and
generates a LaTeX string for all figures, which is copied to the clipboard.

### Running experiments

Open Jupyter notebook `experiments.ipynb`.

Every run generates a new log directory.

First, it loads data, then it runs `GridSearch`, `RFECV` and `RFE`.

All results are displayed as tables and saved to log files.
