Department of Human Genetics, University of Utah, 15 S 2030 E, Salt Lake City, UT 84112, USA.
USTAR Center for Genetic Discovery, University of Utah, 15 S 2030 E, Salt Lake City, UT 84112, USA.
Gigascience. 2017 Nov 1;6(11):1-6. doi: 10.1093/gigascience/gix090.
The BAM and CRAM formats provide a supplementary linear index that facilitates rapid access to sequence alignments in arbitrary genomic regions. Comparing consecutive entries in a BAM or CRAM index allows one to infer the number of alignment records per genomic region, which serves as an effective proxy for sequence depth in that region. Based on these properties, we have developed indexcov, an efficient estimator of whole-genome sequencing coverage that rapidly identifies samples with aberrant coverage profiles, reveals large-scale chromosomal anomalies, recognizes potential batch effects, and infers the sex of a sample. Indexcov is available at https://github.com/brentp/goleft under the MIT license.
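The idea of comparing consecutive index entries can be illustrated with a minimal Python sketch. It is not the indexcov implementation. It assumes the BAI linear index has already been parsed into a list of 64-bit virtual file offsets, one per 16,384-base window, and it uses the difference between the compressed-file positions of neighboring entries as a rough stand-in for the amount of alignment data in each window. The function names (`compressed_offset`, `tile_sizes`, `scaled_depth`) and the median normalization are illustrative assumptions.

```python
from __future__ import annotations

from statistics import median

# Width in bases of one BAI linear-index window (2**14).
TILE_WIDTH = 16_384


def compressed_offset(virtual_offset: int) -> int:
    """Return the upper 48 bits of a virtual offset: the BGZF block start in the file."""
    return virtual_offset >> 16


def tile_sizes(linear_index: list[int]) -> list[int]:
    """Bytes of compressed data between consecutive linear-index entries.

    Each value is only a proxy for how many alignments fall in that window.
    Negative differences are clamped to zero.
    """
    offsets = [compressed_offset(v) for v in linear_index]
    return [max(b - a, 0) for a, b in zip(offsets, offsets[1:])]


def scaled_depth(sizes: list[int]) -> list[float]:
    """Scale window sizes by the median non-empty window (assumed normalization)."""
    nonzero = [s for s in sizes if s > 0]
    if not nonzero:
        return [0.0] * len(sizes)
    mid = median(nonzero)
    return [s / mid for s in sizes]


# Hypothetical linear index: five entries whose compressed offsets are
# 0, 1000, 2000, 4000 and 5000 bytes (the low 16 bits are left at zero).
example_index = [c << 16 for c in (0, 1000, 2000, 4000, 5000)]
example_sizes = tile_sizes(example_index)
example_depth = scaled_depth(example_sizes)
```

In this toy input the third window spans twice as many compressed bytes as the others, so its scaled value is 2.0 while the rest are 1.0. On real data, a run of such windows along a chromosome would be the kind of signal that suggests a large duplication.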