Bojan Karlaš, Matteo Interlandi, Cedric Renggli, Wentao Wu, Ce Zhang, Deepak Mukunthu Iyappan Babu, Jordan Edwards, Chris Lauren, Andy Xu, Markus Weimer

Abstract

Continuous integration (CI) has been a de facto standard for building industrial-strength software. Yet, there is little attention towards applying CI to the development of machine learning (ML) applications until the very recent effort on the theoretical side. In this paper, we take a step forward to bring the theory into practice. We develop the first CI system for ML, to the best of our knowledge, that integrates seamlessly with existing ML development tools. We present its design and implementation details.

BibTeX

@inproceedings{10.1145/3394486.3403290,
    author = {Karla\v{s}, Bojan and Interlandi, Matteo and Renggli, Cedric and Wu, Wentao and Zhang, Ce and Mukunthu Iyappan Babu, Deepak and Edwards, Jordan and Lauren, Chris and Xu, Andy and Weimer, Markus},
    title = {Building Continuous Integration Services for Machine Learning},
    year = {2020},
    isbn = {9781450379984},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/3394486.3403290},
    doi = {10.1145/3394486.3403290},
    abstract = {Continuous integration (CI) has been a de facto standard for building industrial-strength software. Yet, there is little attention towards applying CI to the development of machine learning (ML) applications until the very recent effort on the theoretical side. In this paper, we take a step forward to bring the theory into practice. We develop the first CI system for ML, to the best of our knowledge, that integrates seamlessly with existing ML development tools. We present its design and implementation details.},
    booktitle = {Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery \& Data Mining},
    pages = {2407--2415},
    numpages = {9},
    keywords = {data management, machine learning, continuous integration, overfitting prevention, testing},
    location = {Virtual Event, CA, USA},
    series = {KDD '20}
}