Scaling up machine learning: Introduction

Ron Bekkerman, Mikhail Bilenko, John Langford

Research output: Chapter in Book/Report/Conference proceeding; Chapter; peer-review


Distributed and parallel processing of very large datasets has been employed for decades in specialized, high-budget settings, such as financial and petroleum industry applications. Recent years have brought dramatic progress in usability, cost effectiveness, and diversity of parallel computing platforms, with their popularity growing for a broad set of data analysis and machine learning tasks. The current rise in interest in scaling up machine learning applications can be partially attributed to the evolution of hardware architectures and programming frameworks that make it easy to exploit the types of parallelism realizable in many learning algorithms. A number of platforms make it convenient to implement concurrent processing of data instances or their features. This allows fairly straightforward parallelization of many learning algorithms that view input as an unordered batch of examples and aggregate isolated computations over each of them. Increased attention to large-scale machine learning is also due to the spread of very large datasets across many modern applications. Such datasets are often accumulated on distributed storage platforms, motivating the development of learning algorithms that can be distributed appropriately. Finally, the proliferation of sensing devices that perform real-time inference based on high-dimensional, complex feature representations drives additional demand for utilizing parallelism in learning-centric applications. Examples of this trend include speech recognition and visual object detection becoming commonplace in autonomous robots and mobile devices.
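
As an illustration of the pattern described in the abstract (learning algorithms that treat the input as an unordered batch of examples and aggregate isolated per-example computations), the following is a minimal sketch, not code from the chapter. It assumes a linear model with squared loss; the names `shard_gradient` and `parallel_gradient` are hypothetical. A thread pool is used only to show the structure. In CPython it gives no real speedup for this arithmetic, and a process pool or a cluster framework would take its place in practice.

```python
from concurrent.futures import ThreadPoolExecutor


def shard_gradient(weights, shard):
    """Sum the squared-loss gradients of a linear model over one shard.

    Each example is a (features, label) pair. Every example is handled in
    isolation, so the order of examples inside the shard does not matter.
    """
    grad = [0.0] * len(weights)
    for features, label in shard:
        error = sum(w * x for w, x in zip(weights, features)) - label
        for j, x in enumerate(features):
            grad[j] += 2.0 * error * x
    return grad


def parallel_gradient(weights, examples, num_workers=4):
    """Split the batch into shards, process them concurrently, then aggregate."""
    # Round-robin sharding; any partition works because the batch is unordered.
    shards = [examples[i::num_workers] for i in range(num_workers)]
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partials = list(pool.map(lambda s: shard_gradient(weights, s), shards))
    # Aggregation step: element-wise sum of the per-shard partial gradients.
    return [sum(column) for column in zip(*partials)]
```

Because the total gradient is a plain sum over examples, the sharded result agrees with a single pass over the whole batch up to floating-point rounding. This is the property that makes this class of algorithms straightforward to parallelize.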

Original language: English
Title of host publication: Scaling up Machine Learning
Subtitle of host publication: Parallel and Distributed Approaches
Publisher: Cambridge University Press
Number of pages: 20
ISBN (Electronic): 9781139042918
ISBN (Print): 9780521192248
State: Published - 1 Jan 2011
Externally published: Yes

Bibliographical note

Publisher Copyright:
© Cambridge University Press 2012.

ASJC Scopus subject areas

  • General Computer Science

