Discovering collectively informative descriptors from high-throughput experiments.

Affiliation

Eshelman School of Pharmacy, University of North Carolina at Chapel Hill, NC, USA.

Publication information

BMC Bioinformatics. 2009 Dec 18;10:431. doi: 10.1186/1471-2105-10-431.

DOI:10.1186/1471-2105-10-431
PMID:20021653
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2813853/
Abstract

BACKGROUND

Improvements in high-throughput technology and its increasing use have led to the generation of many highly complex datasets that often address similar biological questions. Combining information from these studies can increase the reliability and generalizability of results and also yield new insights that guide future research.

RESULTS

This paper describes a novel algorithm called BLANKET for symmetric analysis of two experiments that assess informativeness of descriptors. The experiments are required to be related only in that their descriptor sets intersect substantially and their definitions of case and control are consistent. From the resulting lists of n descriptors ranked by informativeness, BLANKET determines a shortlist of descriptors from each experiment, generally of different lengths p and q. For any pair of shortlists, four numbers are evident: the number of descriptors appearing in both shortlists, in only the first, in only the second, or in neither. From the associated contingency table, BLANKET computes Right Fisher Exact Test (RFET) values used as scores over a plane of possible pairs of shortlist lengths [1,2]. BLANKET then chooses a pair or pairs with RFET score less than a threshold; the threshold depends upon n and the shortlist length limits and represents a quality of intersection achieved by less than 5% of random lists.
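The scoring step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that both ranked lists contain the same n descriptors, that the right-tailed Fisher exact value is the hypergeometric probability of an overlap at least as large as the one observed, and that the threshold can be approximated by a Monte Carlo quantile over random list pairs. The function names (`contingency`, `rfet`, `score_plane`, `random_threshold`) and the parameters `max_len`, `trials`, and `quantile` are hypothetical.

```python
import math
import random


def contingency(n, p, q, k):
    """2x2 table for shortlists of lengths p and q sharing k of n descriptors.

    Returns (in both, only in first, only in second, in neither).
    """
    return (k, p - k, q - k, n - p - q + k)


def rfet(n, p, q, k):
    """Right-tailed Fisher exact value: P(overlap >= k) for independent lists.

    Computed from the hypergeometric distribution; math.comb returns 0 for
    impossible terms, so no explicit bounds check is needed.
    """
    total = math.comb(n, q)
    tail = sum(
        math.comb(p, i) * math.comb(n - p, q - i)
        for i in range(k, min(p, q) + 1)
    )
    return tail / total


def score_plane(rank_a, rank_b, max_len):
    """Score every (p, q) pair of shortlist lengths up to max_len.

    rank_a and rank_b list the same descriptors, most informative first
    (an assumption of this sketch).
    """
    n = len(rank_a)
    scores = {}
    for p in range(1, max_len + 1):
        top_a = set(rank_a[:p])
        for q in range(1, max_len + 1):
            k = len(top_a.intersection(rank_b[:q]))
            scores[(p, q)] = rfet(n, p, q, k)
    return scores


def random_threshold(n, max_len, trials=200, quantile=0.05, seed=0):
    """Monte Carlo estimate (assumed procedure) of the best score that only
    about `quantile` of random list pairs beat."""
    rng = random.Random(seed)
    best = []
    for _ in range(trials):
        a = list(range(n))
        b = list(range(n))
        rng.shuffle(a)
        rng.shuffle(b)
        best.append(min(score_plane(a, b, max_len).values()))
    best.sort()
    return best[int(quantile * trials)]


if __name__ == "__main__":
    # Toy rankings of 8 descriptors; the first three agree up to order.
    rank_a = ["d1", "d2", "d3", "d4", "d5", "d6", "d7", "d8"]
    rank_b = ["d2", "d1", "d3", "d8", "d7", "d6", "d5", "d4"]
    scores = score_plane(rank_a, rank_b, max_len=4)
    best_pair = min(scores, key=scores.get)
    print(best_pair, scores[best_pair], random_threshold(8, 4))
```

A pair (p, q) would be retained when its score falls below the estimated threshold. Using `math.comb` keeps the sketch dependency-free; a statistics library's Fisher exact test with a one-sided alternative could be substituted for `rfet`.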

CONCLUSIONS

Researchers seek within a universe of descriptors some minimal subset that collectively and efficiently predicts experimental outcomes. Ideally, any smaller subset should be insufficient for reliable prediction and any larger subset should have little additional accuracy. As a method, BLANKET is easy to conceptualize and presents only moderate computational complexity. Many existing databases could be mined using BLANKET to suggest optimal sets of predictive descriptors.

Figures

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ce1d/2813853/71a60864a17a/1471-2105-10-431-1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ce1d/2813853/2113d1e624ab/1471-2105-10-431-2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ce1d/2813853/095a113edf33/1471-2105-10-431-3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ce1d/2813853/03cb4dd70341/1471-2105-10-431-4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ce1d/2813853/f858f2007382/1471-2105-10-431-5.jpg

Similar articles

1
Discovering collectively informative descriptors from high-throughput experiments.
BMC Bioinformatics. 2009 Dec 18;10:431. doi: 10.1186/1471-2105-10-431.
2
Reoptimization of MDL keys for use in drug discovery.
J Chem Inf Comput Sci. 2002 Nov-Dec;42(6):1273-80. doi: 10.1021/ci010132r.
3
BINK: Biological binary keypoint descriptor.
Biosystems. 2017 Dec;162:147-156. doi: 10.1016/j.biosystems.2017.10.007. Epub 2017 Oct 13.
4
Translational Metabolomics of Head Injury: Exploring Dysfunctional Cerebral Metabolism with Ex Vivo NMR Spectroscopy-Based Metabolite Quantification
5
The development of PIPA: an integrated and automated pipeline for genome-wide protein function annotation.
BMC Bioinformatics. 2008 Jan 25;9:52. doi: 10.1186/1471-2105-9-52.
6
Selecting diversified compounds to build a tangible library for biological and biochemical assays.
Molecules. 2010 Jul 23;15(7):5031-44. doi: 10.3390/molecules15075031.
7
Using kernel alignment to select features of molecular descriptors in a QSAR study.
IEEE/ACM Trans Comput Biol Bioinform. 2011 Sep-Oct;8(5):1373-84. doi: 10.1109/TCBB.2011.31.
8
DescFold: a web server for protein fold recognition.
BMC Bioinformatics. 2009 Dec 14;10:416. doi: 10.1186/1471-2105-10-416.
9
[Research on the application of pattern selection algorithm based on bioinformatic data].
Sheng Wu Yi Xue Gong Cheng Xue Za Zhi. 2011 Oct;28(5):901-6.
10
A Bayesian network model for predicting aquatic toxicity mode of action using two dimensional theoretical molecular descriptors.
Aquat Toxicol. 2016 Nov;180:11-24. doi: 10.1016/j.aquatox.2016.09.006. Epub 2016 Sep 13.

Cited by

1
Lineage-Specific Genome Architecture Links Enhancers and Non-coding Disease Variants to Target Gene Promoters.
Cell. 2016 Nov 17;167(5):1369-1384.e19. doi: 10.1016/j.cell.2016.09.037.
2
CHiCAGO: robust detection of DNA looping interactions in Capture Hi-C data.
Genome Biol. 2016 Jun 15;17(1):127. doi: 10.1186/s13059-016-0992-2.
3
Exact statistical tests for the intersection of independent lists of genes.
Ann Appl Stat. 2012 Jun;6(2):521-541. doi: 10.1214/11-AOAS510.

References

1
Widespread changes in protein synthesis induced by microRNAs.
Nature. 2008 Sep 4;455(7209):58-63. doi: 10.1038/nature07228. Epub 2008 Jul 30.
2
Differences in gene expression levels between early and later stages of human lung development are opposite to those between normal lung tissue and non-small lung cell carcinoma.
Lung Cancer. 2008 Oct;62(1):23-34. doi: 10.1016/j.lungcan.2008.02.011. Epub 2008 Apr 3.
3
VEGF induces Tie2 shedding via a phosphoinositide 3-kinase/Akt dependent pathway to modulate Tie2 signaling.
Arterioscler Thromb Vasc Biol. 2007 Dec;27(12):2619-26. doi: 10.1161/ATVBAHA.107.150482. Epub 2007 Sep 27.
4
MicroRNAs in disease and potential therapeutic applications.
Mol Ther. 2007 Dec;15(12):2070-9. doi: 10.1038/sj.mt.6300311. Epub 2007 Sep 18.
5
Dysadherin: a new player in cancer progression.
Cancer Lett. 2007 Oct 8;255(2):161-9. doi: 10.1016/j.canlet.2007.02.018. Epub 2007 Apr 17.
6
Statistical tools for synthesizing lists of differentially expressed features in related experiments.
Genome Biol. 2007;8(4):R54. doi: 10.1186/gb-2007-8-4-r54.
7
Fisher's combined p-value for detecting differentially expressed genes using Affymetrix expression arrays.
BMC Genomics. 2007 Apr 9;8:96. doi: 10.1186/1471-2164-8-96.
8
Identification of functional cell adhesion molecules with a potential role in metastasis by a combination of in vivo phage display and in silico analysis.
OMICS. 2007 Spring;11(1):41-57. doi: 10.1089/omi.2006.0004.
9
Oncomine 3.0: genes, pathways, and networks in a collection of 18,000 cancer gene expression profiles.
Neoplasia. 2007 Feb;9(2):166-80. doi: 10.1593/neo.07112.
10
Genetic deletions in sputum as diagnostic markers for early detection of stage I non-small cell lung cancer.
Clin Cancer Res. 2007 Jan 15;13(2 Pt 1):482-7. doi: 10.1158/1078-0432.CCR-06-1593.