BACKGROUND: Recently, deep learning via convolutional neural networks (CNNs) has largely superseded conventional methods for proton (¹H)-MRI lung segmentation. However, previous deep learning studies have utilized single-center data and limited acquisition parameters.

PURPOSE: Develop a generalizable CNN for lung segmentation in ¹H-MRI, robust to pathology, acquisition protocol, vendor, and center.

STUDY TYPE: Retrospective.

POPULATION: A total of 809 ¹H-MRI scans from 258 participants with various pulmonary pathologies (median age (range): 57 (6-85); 42% females) and 31 healthy participants (median age (range): 34 (23-76); 34% females), split into training (593 scans (74%); 157 participants (55%)), testing (50 scans (6%); 50 participants (17%)), and external validation (164 scans (20%); 82 participants (28%)) sets.

FIELD STRENGTH/SEQUENCE: 1.5-T and 3-T/3D spoiled-gradient recalled and ultrashort echo-time ¹H-MRI.

ASSESSMENT: 2D and 3D CNNs, trained on single-center, multi-sequence data, and the conventional spatial fuzzy c-means (SFCM) method were compared to manually delineated expert segmentations. Each method was validated on external data originating from several centers. Dice similarity coefficient (DSC), average boundary Hausdorff distance (Average HD), and relative error (XOR) metrics were used to assess segmentation performance.

STATISTICAL TESTS: Kruskal-Wallis tests assessed the significance of differences between acquisitions in the testing set. Friedman tests with post hoc multiple comparisons assessed differences between the 2D CNN, 3D CNN, and SFCM. Bland-Altman analyses assessed agreement with manually derived lung volumes. A P value of <0.05 was considered statistically significant.

RESULTS: The 3D CNN significantly outperformed its 2D analog and SFCM, yielding a median (range) DSC of 0.961 (0.880-0.987), Average HD of 1.63 mm (0.65-5.45), and XOR of 0.079 (0.025-0.240) on the testing set and a DSC of 0.973 (0.866-0.987), Average HD of 1.11 mm (0.47-8.13), and XOR of 0.054 (0.026-0.255) on external validation data.

DATA CONCLUSION: The 3D CNN generated accurate ¹H-MRI lung segmentations on a heterogeneous dataset, demonstrating robustness to disease pathology, sequence, vendor, and center.

EVIDENCE LEVEL: 4.

TECHNICAL EFFICACY: Stage 1.
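For readers unfamiliar with the overlap metrics and the agreement analysis named in the abstract, the following is a minimal illustrative sketch in Python, not the authors' implementation. It covers only DSC, XOR, and Bland-Altman limits of agreement; Average HD is omitted. The function names are hypothetical, the XOR error is assumed here to be normalized by the number of voxels in the manual (reference) segmentation, and the 1.96 standard deviation limits are the conventional Bland-Altman choice rather than a detail stated in the abstract.

```python
import numpy as np


def dice_similarity(pred, ref):
    """Dice similarity coefficient between two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    ref = np.asarray(ref, dtype=bool)
    total = pred.sum() + ref.sum()
    if total == 0:
        # Both masks empty: treated as perfect agreement (an assumed convention).
        return 1.0
    return 2.0 * np.logical_and(pred, ref).sum() / total


def xor_relative_error(pred, ref):
    """Voxels labeled differently, divided by the reference volume.

    Normalizing by the reference (manual) mask is an assumption of this sketch.
    """
    pred = np.asarray(pred, dtype=bool)
    ref = np.asarray(ref, dtype=bool)
    ref_voxels = ref.sum()
    if ref_voxels == 0:
        raise ValueError("reference mask is empty")
    return np.logical_xor(pred, ref).sum() / ref_voxels


def bland_altman(method_a, method_b):
    """Return (bias, lower limit, upper limit) of agreement for paired values."""
    diff = np.asarray(method_a, dtype=float) - np.asarray(method_b, dtype=float)
    bias = diff.mean()
    sd = diff.std(ddof=1)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


if __name__ == "__main__":
    # Toy 3D masks: the prediction over-segments the reference by one slab.
    ref = np.zeros((4, 4, 4), dtype=bool)
    ref[1:3, 1:3, 1:3] = True   # 8 voxels
    pred = np.zeros((4, 4, 4), dtype=bool)
    pred[1:3, 1:3, 1:4] = True  # 12 voxels, 8 of them shared with ref
    print("DSC:", dice_similarity(pred, ref))
    print("XOR:", xor_relative_error(pred, ref))
```

In practice the masks would be the CNN or SFCM output and the expert delineation for the same scan, and the inputs to `bland_altman` would be the paired lung volumes derived from each.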

Original publication

Journal article

J Magn Reson Imaging

Pages: 1030 - 1044

Keywords: CNN, deep learning, lung, segmentation, Female, Humans, Male, Deep Learning, Protons, Retrospective Studies, Magnetic Resonance Imaging, Lung, Image Processing, Computer-Assisted