Accurate, real-time estimation of core body temperature (CBT) during physical activity is essential for monitoring heat strain and mitigating the risk of heat-related illness under hot environmental conditions. Although numerous data-driven algorithms using wearable sensors have been proposed, their practical reliability remains unclear due to substantial methodological heterogeneity and the absence of standardized evaluation. This study combined a systematic review with a standardized quantitative benchmark. A total of 38 studies employing non-invasive inputs for CBT estimation were identified. Of these, 14 eligible models, including Kalman filter–based methods, statistical models, and machine-learning approaches, were re-implemented and evaluated under identical preprocessing and evaluation settings using two independent datasets: Dataset 1 (treadmill walking, n = 16) and Dataset 2 (cycling, n = 13). The benchmark revealed notable differences between originally reported performance and reproduced performance under standardized conditions. For the widely used heart-rate–based extended Kalman filter, the root mean square error (RMSE) increased from typically reported values of ~0.21–0.41 °C to 0.41 °C on Dataset 1 and 0.66 °C on Dataset 2. Incorporating skin temperature improved tracking accuracy in some configurations, but performance gains were highly dependent on measurement site and dataset. Sensitivity for detecting elevated CBT (≥38.0 °C) varied markedly across methods, particularly for the cycling protocol. In conclusion, no single CBT estimation approach consistently outperformed others across all settings. Heart-rate–only models provided a stable baseline under limited sensing conditions, whereas multimodal approaches offered conditional benefits in more controlled scenarios. This work establishes a standardized benchmark framework to support fair comparison, method selection, and future development of (wearable) CBT estimation technologies.
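The two headline metrics named in the abstract, RMSE and sensitivity for detecting elevated CBT (≥38.0 °C), can be illustrated with a minimal Python sketch. This is not the study's evaluation code: the function names, the sample values, and the sample-by-sample definition of sensitivity (share of reference readings at or above the threshold for which the estimate is also at or above it) are assumptions made for illustration, and the benchmark may define detection differently.

```python
import math


def rmse(estimated, reference):
    """Root mean square error between estimated and reference CBT (in °C)."""
    if len(estimated) != len(reference) or not estimated:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(e - r) ** 2 for e, r in zip(estimated, reference)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def sensitivity(estimated, reference, threshold=38.0):
    """Fraction of truly elevated samples that the estimate also flags.

    Assumption: a sample counts as elevated when the reference CBT is at or
    above the threshold, and as detected when the estimate is too.
    """
    elevated = [(e, r) for e, r in zip(estimated, reference) if r >= threshold]
    if not elevated:
        return float("nan")  # undefined when no elevated samples exist
    detected = sum(1 for e, _ in elevated if e >= threshold)
    return detected / len(elevated)


# Hypothetical values, not taken from either dataset in the study.
reference_cbt = [37.2, 37.6, 38.0, 38.4]
estimated_cbt = [37.0, 37.8, 37.9, 38.5]

print(f"RMSE: {rmse(estimated_cbt, reference_cbt):.3f} °C")
print(f"Sensitivity: {sensitivity(estimated_cbt, reference_cbt):.2f}")
```

In this made-up example the estimate tracks the reference closely in RMSE terms yet misses one of the two elevated readings, which shows why the two metrics can diverge across methods.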

DOI

10.1016/j.buildenv.2026.114591

Type

Journal article

Publication Date

2026-06-01

Volume

297