Wang Juliana
Polygence, São Paulo, Brazil.
Sci Rep. 2025 May 2;15(1):15408. doi: 10.1038/s41598-025-98935-8.
The search for exoplanets aims to identify planets with compositions similar to Earth's, providing insight into planetary formation and habitability. Efforts to make this search more efficient have produced a range of detection methods, including transit photometry. Although effective, these methods generate data that require detailed interpretation, such as identifying dips in light curves. Machine learning has since emerged as a powerful alternative, offering rapid image classification and the ability to analyze complex datasets quickly. This paper applies a convolutional neural network (CNN) to the Kepler dataset, which consists of time-series light curve data from the Kepler Space Telescope used to detect exoplanets through transit events. After evaluating multiple configurations, the CNN architecture with hyperparameters set to (300, 200, 200, 100, 100) was identified as the best-performing model. The results highlight both the model's strengths and its areas for improvement: it excels at identifying false positives (miss rate of 5%), but its higher miss rate for the 'CONFIRMED' class (40%) suggests a need for better detection of true exoplanets. An AUC score of 0.91 further underscores the model's strong overall performance.
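The per-class miss rate quoted in the abstract can be read as the fraction of samples of a given class that the model fails to assign to that class, FN / (FN + TP). The sketch below illustrates that calculation only. The function name `miss_rate` and the label counts are hypothetical, chosen so that the output matches the reported 5% and 40% figures; they are not the paper's data or code.

```python
def miss_rate(true_labels, predicted_labels, target_class):
    """Return FN / (FN + TP) for target_class.

    A sample counts as missed when its true label is target_class
    but the predicted label is anything else.
    """
    total = 0   # samples whose true label is target_class (TP + FN)
    missed = 0  # of those, samples predicted as another class (FN)
    for truth, pred in zip(true_labels, predicted_labels):
        if truth == target_class:
            total += 1
            if pred != target_class:
                missed += 1
    if total == 0:
        raise ValueError(f"no samples with true label {target_class!r}")
    return missed / total


# Hypothetical counts (assumption): 100 samples per class, arranged so the
# rates reproduce the figures quoted in the abstract.
true_labels = ["FALSE POSITIVE"] * 100 + ["CONFIRMED"] * 100
predicted_labels = (
    ["FALSE POSITIVE"] * 95 + ["CONFIRMED"] * 5      # true FALSE POSITIVE rows
    + ["CONFIRMED"] * 60 + ["FALSE POSITIVE"] * 40   # true CONFIRMED rows
)

fp_miss = miss_rate(true_labels, predicted_labels, "FALSE POSITIVE")
confirmed_miss = miss_rate(true_labels, predicted_labels, "CONFIRMED")
print(f"FALSE POSITIVE miss rate: {fp_miss:.2f}")
print(f"CONFIRMED miss rate: {confirmed_miss:.2f}")
```

Under this reading, a 40% miss rate for 'CONFIRMED' means that four in ten true exoplanets in the evaluation set would be labeled as something else, which is why the abstract flags it as the main area for improvement.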