Importance sampling for minibatches

Dominik Csiba, Peter Richtárik

Research output: Contribution to journal › Article › peer-review

48 Scopus citations

Abstract

Minibatching is a very well studied and highly popular technique in supervised learning, used by practitioners due to its ability to accelerate training through better utilization of parallel processing power and reduction of stochastic variance. Another popular technique is importance sampling: a strategy for preferential sampling of more important examples, also capable of accelerating the training process. However, despite considerable effort by the community in these areas, and due to the inherent technical difficulty of the problem, there is virtually no existing work combining the power of importance sampling with the strength of minibatching. In this paper we propose the first practical importance sampling for minibatches and give a simple and rigorous complexity analysis of its performance. We illustrate on synthetic problems that for training data with certain properties, our sampling can lead to several orders of magnitude improvement in training time. We then test the new sampling on several popular data sets, and show that the improvement can reach an order of magnitude.
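The general idea described in the abstract can be illustrated with a minimal sketch: draw a minibatch with probabilities proportional to per-example importance scores, then reweight each sampled example so the resulting stochastic gradient estimate stays unbiased. The scores and probabilities below are purely illustrative assumptions; the paper derives its own sampling probabilities and analysis, which this sketch does not reproduce.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-example importance scores (e.g. bounds on gradient
# norms); these are illustrative only, not the paper's construction.
n, batch_size = 1000, 32
scores = rng.uniform(0.1, 10.0, size=n)

# Sampling probabilities proportional to importance.
probs = scores / scores.sum()

# Draw a minibatch of distinct examples with these probabilities.
batch = rng.choice(n, size=batch_size, replace=False, p=probs)

# Reweight each sampled example by 1 / (n * p_i) so that the
# minibatch gradient estimate remains unbiased under this sampling.
weights = 1.0 / (n * probs[batch])
```

In a training loop, the per-example losses or gradients for the indices in `batch` would be multiplied by `weights` before averaging.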
Original language: English (US)
Pages (from-to): 1-21
Number of pages: 21
Journal: Journal of Machine Learning Research
Volume: 19
State: Published - Aug 1 2018
Externally published: Yes

ASJC Scopus subject areas

  • Artificial Intelligence
  • Software
  • Statistics and Probability
  • Control and Systems Engineering
