Abstract
Detecting and understanding out-of-distribution (OOD) samples is crucial in machine learning (ML) to ensure reliable model performance. Current OOD studies focus primarily on extrapolatory (outside) OOD and neglect potential cases of interpolatory (inside) OOD. In this study, we introduce a novel perspective on OOD by dividing it into inside and outside cases. We examine the inside–outside OOD profiles of datasets and their impact on ML model performance, using normalized root mean squared error (RMSE) and F1 score as performance metrics on synthetically generated datasets containing both inside and outside OOD. Our analysis demonstrates that different inside–outside OOD profiles have distinct effects on ML model performance, with outside OOD causing greater performance degradation on average. These findings highlight the importance of distinguishing between inside and outside OOD when developing effective counter-OOD methods.
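The inside–outside distinction can be illustrated with a minimal 1-D sketch. This is not the paper's actual construction on its synthetic datasets; it simply assumes a range test for outside (extrapolatory) OOD and a nearest-neighbour distance threshold for inside (interpolatory) OOD. The function name `ood_profile` and the `dist_threshold` parameter are illustrative choices, not from the article.

```python
import numpy as np

# Deterministic training data: two 1-D clusters with a gap in between.
train = np.concatenate([
    np.linspace(-4.0, -2.0, 50),
    np.linspace(2.0, 4.0, 50),
])

lo, hi = train.min(), train.max()

def ood_profile(x, train, lo, hi, dist_threshold=1.0):
    """Classify a point as in-distribution, inside OOD, or outside OOD.

    Outside OOD: beyond the observed training range (extrapolation).
    Inside OOD: within the range but far from every training point,
    i.e. interpolation into a low-density gap.
    """
    if x < lo or x > hi:
        return "outside OOD"
    if np.min(np.abs(train - x)) > dist_threshold:
        return "inside OOD"
    return "in-distribution"

print(ood_profile(0.0, train, lo, hi))   # falls in the gap between clusters
print(ood_profile(5.0, train, lo, hi))   # beyond the training range
print(ood_profile(3.0, train, lo, hi))   # lies within a training cluster
```

The point of the sketch is that both kinds of test points are "unseen", yet only the second one would be flagged by range-based OOD checks; the first requires a density- or distance-based criterion, which is why the two profiles can affect model performance differently.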
| Original language | English |
|---|---|
| Article number | 43 |
| Journal | International Journal of Data Science and Analytics |
| Volume | 21 |
| Issue number | 1 |
| DOIs | |
| State | Published - Jun 2026 |
Bibliographical note
Publisher Copyright: © The Author(s) 2025.
Keywords
- High-dimensional analysis
- Machine learning robustness
- Out-of-distribution profile
- Performance evaluation
ASJC Scopus subject areas
- Information Systems
- Modeling and Simulation
- Computer Science Applications
- Computational Theory and Mathematics
- Applied Mathematics