Homma Keiichi, Anbo Hiroto, Ota Motonori, Fukuchi Satoshi
Program for Information Systems, Division of Informatics, Bioengineering and Bioscience, Maebashi Institute of Technology, Maebashi, Japan
Life Sci Alliance. 2025 Sep 23;8(12). doi: 10.26508/lsa.202403148. Print 2025 Dec.
Insertions and deletions (indels) are known to preferentially encode intrinsically disordered regions (IDRs), regions that by themselves do not form unique three-dimensional structures. Because we previously showed that long internal exons tend to encode IDRs, we analyzed how indels alter internal exons and affect the IDRs of the encoded proteins. Here, we examined eight eukaryotes and selected indels common to all variants ("fixed" indels). The fixed indels in internal exons mostly encode IDRs. On a per-residue basis, ∼50% of the indels are attributable to alterations in tandem repeats. Deletions are generally more prevalent in long internal exons, and in most species insertions show the same trend. Tandem repeats occur preferentially in long internal exons, indicating that their alterations partly account for the high frequency of indels in long internal exons. Moreover, because tandem repeats mostly encode IDRs, this finding partially explains the high incidence of IDRs in long internal exons. We propose that long internal exons were produced in early eukaryotes mainly by repeat expansion that added IDRs to the encoded proteins.