Gradient Boosting Versus Mixed Integer Programming for Sparse Additive Modeling

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Gradient boosting is a widely used algorithm for fitting sparse additive models over flexible classes of basis functions. Despite its popularity, the performance of gradient boosting as an approximation algorithm to the empirical risk minimizing model with a specific number k of selected basis functions is poorly understood. We provide a theoretical lower bound of 1/2 − 1/(4k − 2) on the worst-case approximation ratio for the risk reduction that gradient boosting achieves relative to the optimal model when both are limited to k terms. This result reveals an inherent limitation in boosting's ability to approximate the best possible sparse additive model, raising the question of how tight and representative this bound is in practice. To answer this question empirically, we employ mixed integer programming (MIP) to approximate the optimal additive models on 21 real datasets. The experimental results show no gaps larger than the theoretical analysis predicts, indicating that the theoretical lower bound is tight. Moreover, for twelve datasets, the approximation gaps are of the same order of magnitude as the theoretical lower bound, which demonstrates the representativeness of the theoretical bound. Beyond its theoretical contribution, the study also has the practical implication that the presented MIP approach frequently offers notable improvements over gradient boosting.
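The comparison studied in the paper can be illustrated with a minimal sketch (not taken from the paper): greedy forward selection stands in for gradient boosting's term-by-term fitting, and exhaustive best-k-subset search stands in for the MIP-optimal k-term model; all data, names, and parameters below are hypothetical.

```python
# Illustrative sketch only: greedy term-by-term selection (boosting-like)
# versus exact best-k-subset search (what a MIP solves at larger scale).
import itertools
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 200, 8, 3
X = rng.normal(size=(n, d))                     # candidate basis functions as columns
y = X[:, 0] - 2 * X[:, 1] + 0.5 * rng.normal(size=n)  # sparse ground truth

def risk(cols):
    """Mean squared error of the least-squares fit on the selected columns."""
    cols = list(cols)
    if not cols:
        return float(np.mean(y ** 2))
    A = X[:, cols]
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return float(np.mean((y - A @ coef) ** 2))

# Greedy: repeatedly add the single column giving the largest risk reduction.
greedy = []
for _ in range(k):
    best = min((c for c in range(d) if c not in greedy),
               key=lambda c: risk(greedy + [c]))
    greedy.append(best)

# Exact: enumerate all k-subsets (tractable here only because d is tiny).
exact = min(itertools.combinations(range(d), k), key=risk)

base = risk([])
# Fraction of the optimal risk reduction that the greedy procedure achieves;
# the paper's worst-case lower bound on this ratio is 1/2 - 1/(4k - 2).
ratio = (base - risk(greedy)) / (base - risk(exact))
```

On benign data like this the greedy ratio is typically close to 1; the paper's contribution concerns how small it can get in the worst case and how large the gap is on real datasets.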

Original language: English
Title of host publication: Machine Learning and Knowledge Discovery in Databases. Research Track - European Conference, ECML PKDD 2025, Proceedings
Editors: Rita P. Ribeiro, Bernhard Pfahringer, Nathalie Japkowicz, Pedro Larrañaga, Alípio M. Jorge, Carlos Soares, Pedro H. Abreu, João Gama
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 453-470
Number of pages: 18
ISBN (Print): 9783032060778
DOIs
State: Published - 2026
Event: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD 2025 - Porto, Portugal
Duration: 15 Sep 2025 - 19 Sep 2025

Publication series

Name: Lecture Notes in Computer Science
Volume: 16016 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD 2025
Country/Territory: Portugal
City: Porto
Period: 15/09/25 - 19/09/25

Bibliographical note

Publisher Copyright:
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2026.

Keywords

  • Additive model
  • Approximation gap
  • Gradient boosting
  • Mixed integer programming
  • Rule learning

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
