Parser Combinators: a Practical Application for Generating Parsers for NMR Data.

Authors

Fenwick Matthew, Weatherby Gerard, Ellis Heidi JC, Gryk Michael R

Affiliations

Department of Microbial, Molecular and Structural Biology, University of Connecticut Health Center, 263 Farmington Avenue Farmington, Connecticut 06030.

Department of Computer Science / Information Technology, Western New England University, Springfield, Massachusetts.

Publication information

Proc Int Conf Inf Technol New Gener. 2013. doi: 10.1109/ITNG.2013.39.

DOI: 10.1109/ITNG.2013.39
PMID: 24352525
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3859343/
Abstract

Nuclear Magnetic Resonance (NMR) spectroscopy is a technique for acquiring protein data at atomic resolution and determining the three-dimensional structure of large protein molecules. A typical structure determination process results in the deposition of large data sets to the BMRB (Biological Magnetic Resonance Data Bank). These data are stored and shared in a file format called NMR-Star. This format is syntactically and semantically complex, making it challenging to parse. Nevertheless, parsing these files is crucial to applying the vast amounts of biological information stored in NMR-Star files, allowing researchers to harness the results of previous studies to direct and validate future work. One powerful approach for parsing files is to apply a Backus-Naur Form (BNF) grammar, which is a high-level model of a file format. The translation of the grammatical model into an executable parser can be accomplished automatically. This paper shows how we applied a model BNF grammar of the NMR-Star format to create a free, open-source parser, using a method that originated in the functional programming world known as "parser combinators". It demonstrates the effectiveness of a principled approach to file specification and parsing. It also builds upon our previous work [1], in that 1) it applies concepts from functional programming (which remain relevant even though the implementation language, Java, is more mainstream than functional programming languages), and 2) all work and accomplishments from this project will be made available under standard open source licenses to provide the community with the opportunity to learn from our techniques and methods.
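
To illustrate the general idea of translating a BNF-style grammar into an executable parser with parser combinators, here is a minimal sketch. It is not the authors' parser: it is written in Python rather than Java purely for brevity, and the grammar covers only a hypothetical, heavily simplified fragment (a run of tag/value pairs), not the real NMR-Star format. All function names (`regex`, `seq`, `alt`, `many`, `fmap`, `parse_items`) and the sample tags and values are assumptions made for illustration.

```python
import re
from typing import Any, Callable, List, Optional, Tuple

# A parser takes (text, position) and returns (value, new_position),
# or None when it does not match at that position.
Parser = Callable[[str, int], Optional[Tuple[Any, int]]]

def regex(pattern: str) -> Parser:
    """Primitive parser: match a regular expression at the current position."""
    compiled = re.compile(pattern)
    def parse(text: str, pos: int):
        m = compiled.match(text, pos)
        return (m.group(0), m.end()) if m else None
    return parse

def seq(*parsers: Parser) -> Parser:
    """Combinator for BNF sequencing: a b c."""
    def parse(text: str, pos: int):
        values = []
        for p in parsers:
            result = p(text, pos)
            if result is None:
                return None
            value, pos = result
            values.append(value)
        return values, pos
    return parse

def alt(*parsers: Parser) -> Parser:
    """Combinator for BNF alternation: a | b (first match wins)."""
    def parse(text: str, pos: int):
        for p in parsers:
            result = p(text, pos)
            if result is not None:
                return result
        return None
    return parse

def many(parser: Parser) -> Parser:
    """Combinator for BNF repetition: a* (zero or more)."""
    def parse(text: str, pos: int):
        values = []
        while True:
            result = parser(text, pos)
            # Stop on failure, or on a zero-width match to avoid looping forever.
            if result is None or result[1] == pos:
                return values, pos
            value, pos = result
            values.append(value)
    return parse

def fmap(fn: Callable[[Any], Any], parser: Parser) -> Parser:
    """Transform the value produced by a parser without changing what it matches."""
    def parse(text: str, pos: int):
        result = parser(text, pos)
        if result is None:
            return None
        value, new_pos = result
        return fn(value), new_pos
    return parse

# Hypothetical simplified grammar (NOT the real NMR-Star grammar):
#   items  ::= item*
#   item   ::= ws? tag ws value
#   tag    ::= "_" name-characters
#   value  ::= quoted | bare
ws = regex(r"\s+")
opt_ws = regex(r"\s*")
tag = regex(r"_[A-Za-z0-9_.]+")
quoted = fmap(lambda s: s[1:-1], regex(r"'[^']*'"))  # strip the quotes
bare = regex(r"[^\s']\S*")
value = alt(quoted, bare)
item = fmap(lambda v: (v[1], v[3]), seq(opt_ws, tag, ws, value))
items = many(item)

def parse_items(text: str) -> List[Tuple[str, str]]:
    """Parse a whole string; anything left over except whitespace is an error."""
    parsed, pos = items(text, 0)
    if text[pos:].strip():
        raise ValueError(f"unexpected input at offset {pos}")
    return parsed
```

Each grammar rule maps onto one line of combinator code, which is the property that makes the grammar-to-parser translation close to mechanical. For example, `parse_items("_Entry.ID 15000\n_Entry.Title 'Example protein'\n")` yields a list of two `(tag, value)` pairs, while an input with a tag but no value raises `ValueError`.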

Figures

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5a3e/3859343/d240cc47f88b/nihms499539f1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5a3e/3859343/389208c1dcf8/nihms499539f2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5a3e/3859343/08bf93e7168f/nihms499539f3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5a3e/3859343/667a8173cc2f/nihms499539f4.jpg

Similar articles

1
Parser Combinators: a Practical Application for Generating Parsers for NMR Data.
Proc Int Conf Inf Technol New Gener. 2013. doi: 10.1109/ITNG.2013.39.
2
A fast and efficient python library for interfacing with the Biological Magnetic Resonance Data Bank.
BMC Bioinformatics. 2017 Mar 17;18(1):175. doi: 10.1186/s12859-017-1580-5.
3
"gnparser": a powerful parser for scientific names based on Parsing Expression Grammar.
BMC Bioinformatics. 2017 May 26;18(1):279. doi: 10.1186/s12859-017-1663-3.
4
BioMagResBank (BMRB) as a Resource for Structural Biology.
Methods Mol Biol. 2020;2112:187-218. doi: 10.1007/978-1-0716-0270-6_14.
5
How to design a connectionist holistic parser.
Neural Comput. 1999 Nov 15;11(8):1995-2016. doi: 10.1162/089976699300016061.
6
Robust FCS Parsing: Exploring 211,359 Public Files.
Cytometry A. 2020 Nov;97(11):1180-1186. doi: 10.1002/cyto.a.24187. Epub 2020 Jul 15.
7
A Generalized Earley Parser for Human Activity Parsing and Prediction.
IEEE Trans Pattern Anal Mach Intell. 2021 Aug;43(8):2538-2554. doi: 10.1109/TPAMI.2020.2976971. Epub 2021 Jul 1.
8
Parsing clinical text: how good are the state-of-the-art parsers?
BMC Med Inform Decis Mak. 2015;15 Suppl 1(Suppl 1):S2. doi: 10.1186/1472-6947-15-S1-S2. Epub 2015 May 20.
9
: an error-correcting CIF parser for the Perl language.
J Appl Crystallogr. 2016 Feb 1;49(Pt 1):292-301. doi: 10.1107/S1600576715022396.
10
Domain adaption of parsing for operative notes.
J Biomed Inform. 2015 Apr;54:1-9. doi: 10.1016/j.jbi.2015.01.016. Epub 2015 Feb 7.

References cited in this article

1
An Open-Source Sandbox for Increasing the Accessibility of Functional Programming to the Bioinformatics and Scientific Communities.
Proc Int Conf Inf Technol New Gener. 2012;2012:89-94. doi: 10.1109/ITNG.2012.21.
2
Extensions to the STAR File syntax.
J Chem Inf Model. 2012 Aug 27;52(8):1901-6. doi: 10.1021/ci300074v. Epub 2012 Jul 31.
3
Iterative Development of an Application to Support Nuclear Magnetic Resonance Data Analysis of Proteins.
Proc Int Conf Inf Technol New Gener. 2011 Apr 11:1014-1020. doi: 10.1109/ITNG.2011.215.
4
BioMagResBank.
Nucleic Acids Res. 2008 Jan;36(Database issue):D402-8. doi: 10.1093/nar/gkm957. Epub 2007 Nov 4.