1 Veracyte, Inc., South San Francisco, California.
2 Department of Laboratory Medicine and Pathology, Mayo Clinic, Scottsdale, Arizona.
Ann Am Thorac Soc. 2017 Nov;14(11):1646-1654. doi: 10.1513/AnnalsATS.201612-947OC.
Usual interstitial pneumonia (UIP) is the histopathologic hallmark of idiopathic pulmonary fibrosis. Although UIP can be detected by high-resolution computed tomography of the chest, the results are frequently inconclusive, and pathology from transbronchial biopsy (TBB) has poor sensitivity. Surgical lung biopsy may be necessary for a definitive diagnosis.
To develop a genomic classifier in tissue obtained by TBB that distinguishes UIP from non-UIP, trained against central pathology as the reference standard.
Exome-enriched RNA sequencing was performed on 283 TBBs from 84 subjects. Machine learning was used to train an algorithm with high rule-in (specificity) performance using specimens from 53 subjects. Performance was evaluated by cross-validation and on an independent test set of specimens from 31 subjects. We explored the feasibility of a single molecular test per subject by combining multiple TBBs from upper and lower lobes. To address whether classifier accuracy depends on adequate alveolar sampling, we tested for correlation between classifier accuracy and expression of alveolar-specific genes.
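Because each subject contributes several TBBs, the training and test sets have to be separated at the subject level so that no subject's samples appear in both. The following is a minimal sketch of such a split, not the authors' pipeline: the function name `split_by_subject`, the seed, and the toy subject IDs are all hypothetical and chosen only for illustration.

```python
import random
from collections import defaultdict


def split_by_subject(sample_subjects, n_train_subjects, seed=0):
    """Partition sample indices so that no subject spans both sets.

    sample_subjects: list where element i is the subject ID of sample i.
    Returns (train_indices, test_indices), each sorted ascending.
    """
    # Group sample indices by the subject they came from.
    by_subject = defaultdict(list)
    for idx, subject in enumerate(sample_subjects):
        by_subject[subject].append(idx)

    # Shuffle subjects (not samples) reproducibly, then cut the list.
    subjects = sorted(by_subject)
    random.Random(seed).shuffle(subjects)
    train_subjects = set(subjects[:n_train_subjects])

    train = [i for s in subjects if s in train_subjects for i in by_subject[s]]
    test = [i for s in subjects if s not in train_subjects for i in by_subject[s]]
    return sorted(train), sorted(test)


# Hypothetical toy data: six samples drawn from three subjects.
toy_subjects = ["A", "A", "B", "B", "B", "C"]
train_idx, test_idx = split_by_subject(toy_subjects, n_train_subjects=2)
```

Splitting on subjects rather than on individual biopsies keeps correlated samples from the same lung out of the test set, which would otherwise inflate the apparent performance.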
The top-performing algorithm distinguishes UIP from non-UIP conditions in single TBB samples with an area under the receiver operating characteristic curve (AUC) of 0.86, with specificity of 86% (confidence interval = 71-95%) and sensitivity of 63% (confidence interval = 51-74%) (31 test subjects). Performance improves to an AUC of 0.92 when three to five TBB samples per subject are combined at the RNA level for testing. Although we observed a wide range of type I and II alveolar-specific gene expression in TBBs, expression of these transcripts did not correlate with classifier accuracy.
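For readers less familiar with these metrics, the sketch below shows how sensitivity, specificity, and AUC are computed from reference labels and classifier scores. It is illustrative only: the labels, scores, and the 0.5 decision threshold are hypothetical toy values, not data or settings from this study.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels (1 = UIP, 0 = non-UIP)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true, scores):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy example: three UIP and three non-UIP samples.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.6, 0.3, 0.7, 0.2, 0.1]
threshold = 0.5  # assumed cutoff, for illustration only
predictions = [1 if s >= threshold else 0 for s in scores]

sens, spec = sensitivity_specificity(labels, predictions)
area = auc(labels, scores)
```

Raising the threshold trades sensitivity for specificity, which is how a classifier can be tuned toward rule-in performance; the AUC summarizes performance across all thresholds.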
We demonstrate proof of principle that genomic analysis and machine learning improve the utility of TBB for the diagnosis of UIP, with greater sensitivity and specificity than TBB pathology alone. Combining multiple samples from an individual subject increases test accuracy over single-sample testing. This approach requires validation in an independent cohort of subjects before application in the clinic.