Schneider T D, Stephens R M
National Cancer Institute, Frederick Cancer Research and Development Center, MD 21701.
Nucleic Acids Res. 1990 Oct 25;18(20):6097-100. doi: 10.1093/nar/18.20.6097.
A graphical method is presented for displaying the patterns in a set of aligned sequences. The characters representing the sequence are stacked on top of each other for each position in the aligned sequences. The height of each letter is made proportional to its frequency, and the letters are sorted so the most common one is on top. The height of the entire stack is then adjusted to signify the information content of the sequences at that position. From these 'sequence logos', one can determine not only the consensus sequence but also the relative frequency of bases and the information content (measured in bits) at every position in a site or sequence. The logo displays both significant residues and subtle sequence patterns.
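The stacking rule described in the abstract can be illustrated with a short calculation. The following is a minimal sketch, not the authors' program. It assumes DNA sequences over a four-letter alphabet and takes the information content at a position to be log2(4) minus the Shannon entropy of the observed base frequencies. The abstract does not spell out this formula, and the sketch omits any small-sample correction. The function name `logo_heights` and the example sites are hypothetical.

```python
import math
from collections import Counter


def logo_heights(aligned, alphabet="ACGT"):
    """Compute logo stack heights for each column of an alignment.

    Returns one dict per column with:
      info_bits: information content of the column, in bits
      stack:     (letter, height_in_bits) pairs sorted from the bottom of
                 the stack to the top, so the most common letter is last
    Assumption: info = log2(len(alphabet)) - entropy, no sample-size correction.
    """
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to equal length")

    max_bits = math.log2(len(alphabet))  # 2 bits for a four-letter alphabet
    columns = []
    for pos in range(length):
        counts = Counter(seq[pos] for seq in aligned)
        # Count only symbols in the alphabet (gaps and ambiguity codes are ignored).
        total = sum(counts[ch] for ch in alphabet)
        if total == 0:
            columns.append({"info_bits": 0.0, "stack": []})
            continue
        freqs = {ch: counts[ch] / total for ch in alphabet if counts[ch] > 0}
        entropy = -sum(f * math.log2(f) for f in freqs.values())
        info = max_bits - entropy
        # Each letter's height is its frequency times the total stack height.
        stack = sorted(((ch, f * info) for ch, f in freqs.items()),
                       key=lambda pair: pair[1])
        columns.append({"info_bits": info, "stack": stack})
    return columns


if __name__ == "__main__":
    # Hypothetical aligned sites, for illustration only.
    sites = ["ACGT", "ACGA", "ACCT", "AGGT"]
    for i, col in enumerate(logo_heights(sites)):
        letters = ", ".join(f"{ch}={h:.3f}" for ch, h in col["stack"])
        print(f"position {i}: {col['info_bits']:.3f} bits ({letters})")
```

In this sketch a fully conserved position yields a single letter two bits tall, while a position where all four bases are equally frequent yields a stack of zero height, which is how a logo conveys both the consensus and the degree of conservation.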