Ruminant Diseases and Immunology Research Unit, USDA Agriculture Research Service, National Animal Disease Center, Ames, IA 50010.
Ruminant Diseases and Immunology Research Unit, USDA Agriculture Research Service, National Animal Disease Center, Ames, IA 50010; Oak Ridge Institute for Science and Education, Oak Ridge Associated Universities, Oak Ridge, TN 37830.
J Dairy Sci. 2019 May;102(5):4741-4754. doi: 10.3168/jds.2018-15267. Epub 2018 Sep 27.
Sequencing the first genome took 15 yr and $3 billion to complete. Currently, a genome can be sequenced in a day for a few thousand dollars. Comparing the relative abundance of nearly every mRNA transcript and small RNA in cells and tissues from different experimental conditions has become so easy that it can take longer to transfer the data between computers than to perform the experiment. Nucleotide sequencing techniques have become so sensitive that the greatest concern is not failing to detect a gene or transcript but rather falsely identifying one. Better genome sequencing has led to more complete transcriptomic and proteomic databases and, combined with more sensitive instrumentation and separation techniques, is bringing us closer to detecting complete transcriptomes and proteomes. The promise of these powerful omics techniques is to lead us to new and unexpected connections between molecular processes in the context of animal health. This promise cannot be achieved without hypothesis-driven research that connects omics data with animal health experiments. Any researcher who wishes to invest time and resources in omics experiments should be aware of the common pitfalls and limitations of these techniques so that they can avoid these issues and make the most of these research tools. Several important questions must be asked: What is the quality of the databases, and how are they annotated? Are the annotations based on experimental results or computational predictions? What assumptions are made by the analysis algorithms, and how will these affect the result? Finally, how can the research community use the vast amount of data being generated by omics experiments in ways that achieve the goals of better animal health and production (which is the promise of omics technologies)? Until the observations in omics data sets are used to achieve those goals, the potential of omics technology will not be fully realized.