Department of Biology, McGill University, Montreal, QC, H3A 1B1, Canada.
Proteomics. 2018 Nov;18(21-22):e1800069. doi: 10.1002/pmic.201800069. Epub 2018 Oct 29.
Compositionally biased regions (BRs) occur when a few amino-acid types are enriched in a protein segment. The known protein universe may contain BR types that have not been characterized experimentally. The UniProt protein database has been surveyed for evidence of such compositionally biased "dark matter". A "dark biased region" (DBR) is defined as a biased region with a low probability of being an individual structural domain or intrinsically disordered region. The bias annotation program fLPS is used to generate a list of >13 million BRs, which is then thoroughly filtered for structure and intrinsic disorder. About a third of BRs (31%) have both substantial intrinsic disorder and structure. After filtering, there are ≈0.9 million DBRs (≈7% of the original BRs, in ≈1.4% of proteins). These DBRs are strongly enriched in eukaryotes and strongly depleted in bacteria. They tend to be more hydrophobic than other protein regions, but are made of less extreme combinations of hydrophobic/hydrophilic residues. Under varying assumptions, the number of DBRs that might exist at the high bias levels examined (with p-values < 1 × 10 ) has been estimated, giving a reasonable range of 0.7-7.2% of proteins having such DBRs. Hypotheses are examined about what such DBRs might be, that is, whether they come from un- or undersampled domain/region categories or are unappreciated categories somewhat like existing ones.
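To make the notion of a bias p-value concrete, the following is a minimal sketch of scoring the enrichment of a single residue type in a segment with a binomial tail probability. It is not the fLPS algorithm or the authors' pipeline; the function names, the example segment, and the background frequency of 0.04 are all assumptions chosen for illustration.

```python
from math import comb


def binomial_tail(k: int, n: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def residue_bias_pvalue(segment: str, residue: str, background_freq: float) -> float:
    """Probability of seeing at least the observed count of `residue`
    in a segment of this length, if residues were drawn independently
    at the assumed background frequency."""
    count = segment.count(residue)
    return binomial_tail(count, len(segment), background_freq)


# Hypothetical glutamine-rich segment (16 Q out of 20 residues).
segment = "QQQQAQQQQPQQQQHQQQQL"
# 0.04 is an assumed, illustrative background frequency for Q.
p_value = residue_bias_pvalue(segment, "Q", 0.04)
print(f"Q-bias p-value: {p_value:.3e}")
```

A smaller p-value means the segment is more strongly biased toward that residue than the background composition would predict. A real annotation tool would also search over segment boundaries and combinations of residue types, which this sketch does not attempt.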