Getting Started with Reproducible-ML
We recommend installing Miniconda or Conda and using it to create a virtual environment.
Creating a virtual environment
- Create a virtual environment from the environment file, which can be found in the root directory of this project (the file name environment.yml is assumed here):
conda env create -f environment.yml -n reproducible
- Activate the new environment:
source activate reproducible
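To confirm that the activation took effect, a small check like the following can help. It is only a sketch: it assumes that conda exports the CONDA_DEFAULT_ENV variable on activation, and the helper names active_conda_env and is_env_active are hypothetical, not part of this project.

```python
import os
import sys


def active_conda_env(environ=None):
    """Return the name of the active conda environment, or None if unset."""
    environ = os.environ if environ is None else environ
    return environ.get("CONDA_DEFAULT_ENV")


def is_env_active(expected="reproducible", environ=None):
    """Check whether the expected environment (assumed name) is active."""
    return active_conda_env(environ) == expected


# Show which interpreter is running and which environment conda reports.
print("interpreter:", sys.executable)
print("active env:", active_conda_env())
```

If the printed interpreter path does not point into the reproducible environment, the activation step most likely has not been run in the current shell.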
Set up MongoDB
We recommend MongoDB because it is a NoSQL database that makes it easy to store arbitrary JSON documents. Sacred [14] itself recommends MongoDB.
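Start by downloading and extracting the binaries for Ubuntu 16.04 (run these commands from your home directory so that the PATH entry below matches):
mkdir mongodb
cd mongodb
wget https://fastdl.mongodb.org/linux/mongodb-linux-x86_64-ubuntu1604-4.0.3.tgz
tar -xzf mongodb-linux-x86_64-ubuntu1604-4.0.3.tgz --strip-components=1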
Add the MongoDB binaries to your PATH in your .bashrc file:
export PATH="$HOME/mongodb/bin:$PATH"
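Then you can start the MongoDB daemon, mongod (the log file name mongod.log is an arbitrary choice; --logpath must point to a file, not a directory):
mkdir reproducible-db
mongod --dbpath reproducible-db --fork --logpath reproducible-db/mongod.log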
More information about MongoDB can be found in its official documentation.
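One way to check that the daemon is accepting connections is to try opening a TCP connection to it. The sketch below assumes mongod listens on its default port 27017 on localhost (no --port option is passed in the command above); the helper name port_is_open is hypothetical.

```python
import socket


def port_is_open(host="localhost", port=27017, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False


# 27017 is MongoDB's default port; adjust if the daemon was started differently.
print("mongod reachable:", port_is_open())
```

A successful connection only shows that something is listening on the port. It does not verify that the listener is MongoDB or that the database is usable.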
How to use Sacredboard
Once MongoDB is set up, you can track your experiments in the database by adding -m sacred to the command:
python <train.py with params> -m sacred
This creates a new entry in the Sacred MongoDB database. If you access the server via SSH, make sure you forward a port, e.g.:
ssh username@someserver.ch -L 10000:localhost:10000
Then you can start Sacredboard with:
sacredboard -m sacred --port 10000 --no-browser
Workflow
Now that everything is set up, you can run some experiments and tests.
From the root folder /reproducible-ml you can run experiments as modules:
python -m exps.brain.gan
Preferably, run your code on a GPU. To run it on CPUs instead, replace the tensorflow-gpu package with the tensorflow package in your environment.
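To see which of the two packages is currently installed in the active environment, a check along these lines may be useful. It is a sketch that assumes Python 3.8 or newer (for importlib.metadata); the helper names installed_version and tensorflow_variant are hypothetical.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def tensorflow_variant():
    """Return (package_name, version) for the first TensorFlow package found."""
    for name in ("tensorflow-gpu", "tensorflow"):
        version = installed_version(name)
        if version is not None:
            return name, version
    return None


# Prints None when neither package is installed in the active environment.
print("TensorFlow package:", tensorflow_variant())
```

This only reports package metadata. It does not check whether a GPU is actually visible to TensorFlow at run time.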
There are several tests you can run:
python -m unittest datasets.brain.test_serialize
Or run all tests at once:
python -m unittest
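For readers adding their own tests, the following is a minimal sketch of a test module that python -m unittest would pick up through its default discovery, provided the file name starts with test and the folder is an importable package. The serialize and deserialize helpers and the record contents are hypothetical stand-ins; they do not reflect what datasets.brain.test_serialize actually contains.

```python
import json
import unittest


def serialize(record):
    """Hypothetical helper: encode a record as a JSON string."""
    return json.dumps(record, sort_keys=True)


def deserialize(payload):
    """Hypothetical helper: decode a JSON string back into a record."""
    return json.loads(payload)


class TestSerializeRoundTrip(unittest.TestCase):
    def test_round_trip_preserves_record(self):
        # Made-up example record; real datasets will look different.
        record = {"subject": "s01", "shape": [64, 64], "label": 1}
        self.assertEqual(deserialize(serialize(record)), record)

    def test_output_is_a_string(self):
        self.assertIsInstance(serialize({"label": 0}), str)
```

Saved as, say, test_example.py inside a package folder, the module can be run on its own with python -m unittest followed by its dotted module path, in the same way as the command shown above.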