CI/CD for ML models and datasets

RU / Day 2 / 12:30 / Track 3

Many people in the industry know the situation: a data science (DS) model is deployed quickly, and a month later, when it needs to be retrained on new data or with a new feature, it turns out that the data scientists can no longer do it.

Taking a model into production means not only packing it into some container, but also recording how it was trained and monitoring how it performs. A detailed description of how the model was obtained prevents the loss of knowledge and experimental results.

Odnoklassniki is building a process in which:

  • all training parameters, dependencies, and artifacts are committed to git;
  • models are trained automatically in a controlled environment;
  • models are reviewed and merged into master;
  • models are rolled out to production.
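
As a rough illustration of the first point, here is a minimal sketch of a training manifest that could be committed to git next to the code. It is not Odnoklassniki's actual tooling: the function names, the manifest fields, and the choice of SHA-256 are all assumptions made for this example. The idea is only that parameters and content hashes of the inputs are written to a small text file, so a later retraining run can check that it starts from the same data and dependencies.

```python
import hashlib
import json
from pathlib import Path


def file_sha256(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(params: dict, dataset: Path, requirements: Path) -> dict:
    """Collect what is needed to repeat a training run (hypothetical layout)."""
    return {
        "params": params,
        "dataset_sha256": file_sha256(dataset),
        "requirements_sha256": file_sha256(requirements),
    }


def write_manifest(manifest: dict, out: Path) -> None:
    """Write the manifest as sorted, indented JSON so git diffs stay readable."""
    out.write_text(json.dumps(manifest, indent=2, sort_keys=True) + "\n")
```

Because the keys are sorted and the file is plain JSON, a change in a hyperparameter or in the dataset shows up as a one-line diff in review.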

Mikhail will talk about:

  • processes and tools;
  • how to organize versioned storage of datasets with DVC;
  • how to organize rollouts through the repository;
  • the path of a model from a JIRA task to production and back;
  • how to organize automatic retraining.