Advanced Insurance Risk Modeling for Pseudo-New Customers Using Balanced Ensembles and Transformer Architectures

Solly, Finn L.; Soriano-González, Raquel; Juan, Angel A.; Guerrero, Antoni

doi:10.3390/risks14040091

Files

SollySoriano-GonzalezJuan - Advanced Insurance Risk Modeling for Pseudo-New Customers Using Balan....pdf (912.23 KB)

Date

2026-04-17

Authors

Solly, Finn L.

Soriano-González, Raquel

Juan, Angel A.

Guerrero, Antoni

09. Build resilient infrastructure, promote inclusive and sustainable industrialization, and foster innovation

Handle

https://riunet.upv.es/handle/10251/236401

Bibliographic citation

Solly, FL.; Soriano-González, Raquel; Juan, Angel A.; Guerrero, A. (2026). Advanced Insurance Risk Modeling for Pseudo-New Customers Using Balanced Ensembles and Transformer Architectures. Risks. 14(4). https://doi.org/10.3390/risks14040091

Abstract

In insurance portfolios, classifying customers without a prior history at a given company is particularly challenging due to the absence of historical behavior, extreme class imbalance, heavy-tailed loss distributions, and strict operational constraints. Traditional machine learning approaches, including the baseline methodology proposed in previous studies, typically optimize global predictive accuracy and therefore fail to capture business-critical outcomes, especially the identification of high-risk clients. This study extends the existing approach by evaluating two complementary business-aware classification strategies: (i) a balanced bagging ensemble specifically designed to handle class imbalance and maximize expected profit under explicit customer-omission constraints, and (ii) a lightweight Transformer-based architecture capable of learning richer feature representations. Both approaches incorporate the asymmetric financial cost structure of insurance and operate under operational selection limits. The empirical analysis is conducted on a proprietary large-scale auto insurance dataset comprising 51,618 customers and is complemented by validation on nine synthetic datasets to assess robustness. Model performance is evaluated using statistical tests (ANOVA, Friedman, and pairwise comparisons) together with business-oriented metrics. The results show that both proposed approaches consistently outperform the baseline methodology in terms of profit (p < 0.001). The balanced ensemble provides the most favourable trade-off between predictive performance, robustness, interpretability, and computational efficiency, making it suitable for deployment in regulated insurance environments, while the Transformer achieves competitive results and exhibits stronger generalization under data perturbations. The proposed approach aligns machine learning with actuarial portfolio optimization by explicitly integrating profit-driven objectives and operational constraints, offering two practical and scalable solutions for risk-based decision-making in real-world insurance settings.
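The idea of maximizing expected profit under a customer-omission limit can be illustrated with a minimal sketch. This is not the authors' method: the `Customer` fields, the cost model (premium minus probability-weighted loss), and the function names are all hypothetical, chosen only to show how an asymmetric cost structure and a cap on omitted customers might interact.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    premium: float        # hypothetical: premium collected if the customer is kept
    expected_loss: float  # hypothetical: claim cost if the customer is high-risk
    p_high_risk: float    # hypothetical: a classifier's predicted high-risk probability


def expected_profit(c: Customer) -> float:
    """Premium minus probability-weighted claim cost (illustrative cost model)."""
    return c.premium - c.p_high_risk * c.expected_loss


def select_omissions(customers, max_omit_fraction):
    """Return indices of customers to omit, never exceeding the allowed share."""
    budget = int(len(customers) * max_omit_fraction)
    # Rank customers from lowest to highest expected profit.
    ranked = sorted(range(len(customers)),
                    key=lambda i: expected_profit(customers[i]))
    # Omit only customers expected to lose money, up to the omission budget.
    return [i for i in ranked[:budget] if expected_profit(customers[i]) < 0]


def portfolio_profit(customers, omitted):
    """Sum of expected profit over the customers that are kept."""
    omitted = set(omitted)
    return sum(expected_profit(c)
               for i, c in enumerate(customers) if i not in omitted)


portfolio = [
    Customer(premium=500, expected_loss=4000, p_high_risk=0.05),
    Customer(premium=500, expected_loss=4000, p_high_risk=0.30),
    Customer(premium=600, expected_loss=5000, p_high_risk=0.20),
    Customer(premium=450, expected_loss=3000, p_high_risk=0.10),
]
omitted = select_omissions(portfolio, max_omit_fraction=0.25)
print(omitted, portfolio_profit(portfolio, omitted))
```

With a 25% omission limit only the single worst customer can be dropped, so the second loss-making customer stays in the portfolio. Raising the limit lets the selection remove both, which is why the constraint, and not only the classifier, shapes the achievable profit in this toy setting.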

Keywords

Insurance risk modeling, Customer classification, High-risk customer detection, Balanced ensembles, Transformer models

Source

Risks

DOI

10.3390/risks14040091

Publisher version

https://doi.org/10.3390/risks14040091

Collections

Articles, conferences, monographs
