Lu Jun, Luo Liaofu
Laboratory of Theoretical Biophysics, Faculty of Science and Technology, Inner Mongolia University, Hohhot 010021, P.R. China.
Bioinformation. 2008 Apr 28;2(7):316-21. doi: 10.6026/97320630002316.
Accurate identification of promoter regions and transcription start sites (TSS) is a challenge in constructing human transcription regulation networks. An efficient prediction method with a sound theoretical formulation is therefore needed. We used increment of diversity with quadratic discriminant analysis (IDQD) to predict TSS. The method achieved a sensitivity and a positive predictive value both above 65% at a positive-to-negative ratio of 1:58. Performance evaluation using the Receiver Operating Characteristic (ROC) curve gave an auROC (area under the ROC curve) greater than 96%. Evaluation using the Precision-Recall Curve (PRC) gave an auPRC (area under the PRC) of about 26% at a positive-to-negative ratio of 1:679 and about 64% at a ratio of 1:113. These results are better than or comparable to those of other known methods.
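The abstract reports precision-based figures at several positive-to-negative ratios, and the dependence on that ratio can be made concrete with a small calculation. The Python sketch below uses the standard definitions of sensitivity, TP/(TP+FN), and positive predictive value, TP/(TP+FP). The function names and the example sensitivity and false positive rate are illustrative assumptions, not values or code from the paper.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Fraction of true TSS that are predicted as TSS: TP / (TP + FN)."""
    return tp / (tp + fn)


def positive_predictive_value(tp: int, fp: int) -> float:
    """Fraction of predicted TSS that are true TSS: TP / (TP + FP)."""
    return tp / (tp + fp)


def ppv_at_ratio(sn: float, fpr: float, negatives_per_positive: float) -> float:
    """Expected PPV for a classifier with sensitivity `sn` and false
    positive rate `fpr` when the data hold `negatives_per_positive`
    negatives for every positive (a class ratio of 1:N)."""
    true_pos = sn                             # expected TP per positive
    false_pos = fpr * negatives_per_positive  # expected FP per positive
    return true_pos / (true_pos + false_pos)


# Hypothetical operating point, chosen only for illustration.
example_sn = 0.65
example_fpr = 0.006

# The three class ratios quoted in the abstract.
for n in (58, 113, 679):
    ppv = ppv_at_ratio(example_sn, example_fpr, n)
    print(f"1:{n}  PPV = {ppv:.3f}")
```

With sensitivity and false positive rate held fixed, the PPV falls as negatives become more numerous. This is consistent with the lower auPRC reported at 1:679 than at 1:113, whereas the ROC curve is built only from the true and false positive rates and so does not change with the class ratio.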