Mouse chromocenters DNA content: sequencing and in silico analysis.

Affiliations

Institute of Cytology RAS, St.-Petersburg, 194064, Russia.

Far Eastern Federal University, Vladivostok, 690922, Russia.

Publication information

BMC Genomics. 2018 Feb 20;19(1):151. doi: 10.1186/s12864-018-4534-z.

Abstract

BACKGROUND

Chromocenters are punctate, condensed blocks of chromatin found in the interphase nuclei of certain cell types; their biological significance is unknown. In recent years progress has been made in characterizing the protein content of chromocenters, but the details of the DNA content of constitutive heterochromatin remain unclear. These regions are known to be enriched in tandem repeats (TRs) and transposable elements. Rapid improvements in genome sequencing have not helped to assemble heterochromatic regions, owing to a lack of appropriate bioinformatics techniques.

RESULTS

Chromocenter DNA was isolated from mouse liver cell nuclei by a biochemical approach and sequenced on the Illumina MiSeq, yielding the ChrmC dataset. Analysis of the ChrmC dataset with the available bioinformatics tools revealed that the major component of chromocenter DNA is TRs: ~66% MaSat and ~4% MiSat. Other previously classified TR families constitute ~1% of the ChrmC dataset. About 6% of chromocenter DNA consists mostly of unannotated sequences. The contigs assembled with IDBA_UD contain many fragments of the heterochromatic Y chromosome, rDNA, other pseudogenes and non-coding DNA. A fragment of a protein-coding Sfi1 homolog gene was also found in the contigs. In the reference genome the Sfi1 homolog gene is located on chromosome 11, very close to the Golden Pass Gap (a ~3 Mb empty region reserved for the pericentromeric region), so its presence attests to the purity of the chromocenter isolation. The second major fraction is non-LTR retroposons (SINEs and LINEs), the overwhelming majority of which are LINEs, at ~11% of ChrmC. Most of the LINE fragments come from a ~2 kb region at the end of the second ORF and its flanking region. This precise ~2 kb LINE segment, together with TRs, is a necessary component of mouse constitutive heterochromatin. The third most abundant fraction is ERVs. The ERV distribution in chromocenters differs from that of the whole genome: IAP (ERV2 class) is the most numerous in ChrmC, whereas MaLR (ERV3 class) prevails in the reference genome. IAP and its LTR also prevail in the TR-containing contigs extracted from the WGS dataset. The in silico prediction of IAP and LINE fragments in chromocenters was confirmed by direct fluorescence in situ hybridization (FISH).
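As a rough check on how these figures fit together, the approximate percentages quoted above can be tallied. This is only an illustrative sketch: it assumes the listed categories do not overlap, and the dictionary keys and the function name `summarize` are labels chosen here, not names from the study. The abstract gives no percentage for SINEs or ERVs, so they fall into the remainder.

```python
# Approximate ChrmC fractions quoted in the Results, in percent.
# Assumption: the categories are non-overlapping.
REPORTED = {
    "MaSat": 66,
    "MiSat": 4,
    "other classified TR families": 1,
    "mostly unannotated sequences": 6,
    "LINE": 11,
}

# Entries counted as tandem repeats (TRs) in the Results.
TANDEM_REPEATS = ("MaSat", "MiSat", "other classified TR families")


def summarize(fractions, tr_keys=TANDEM_REPEATS):
    """Return (TR total, itemized total, remainder), all in percent."""
    tr_total = sum(fractions[key] for key in tr_keys)
    itemized = sum(fractions.values())
    remainder = 100 - itemized  # SINEs, ERVs and anything else not itemized
    return tr_total, itemized, remainder


if __name__ == "__main__":
    tr_total, itemized, remainder = summarize(REPORTED)
    print(f"Tandem repeats: ~{tr_total}%")
    print(f"Itemized in the abstract: ~{itemized}%")
    print(f"Not itemized (SINEs, ERVs, other): ~{remainder}%")
```

Under that assumption, tandem repeats account for about 71% of ChrmC, the itemized categories for about 88%, and roughly 12% is left for SINEs, ERVs and anything else the abstract does not quantify.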

CONCLUSION

Our sequencing data for chromocenter DNA (ChrmC) demonstrate that, along with TRs, IAP with its LTR and a precise ~2 kb LINE fragment make up a substantial fraction of mouse chromocenters (constitutive heterochromatin).

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3a9a/5819297/5f8e9e480356/12864_2018_4534_Fig1_HTML.jpg
