Funding
Shaanxi Province "Hundred Talents Program" (SXBR8025)
Origin and Evolution of Soybean Protein-coding Genes
Received date: 2018-08-14
Accepted date: 2018-12-10
Online published: 2018-12-10
Kang Tang, Ruolin Yang. Origin and Evolution of Soybean Protein-coding Genes[J]. Chinese Bulletin of Botany, 2019, 54(3): 316-327. DOI: 10.11983/CBB18176
The gene composition of a species evolves through a highly dynamic process, in which lineage- and species-specific genes of relatively recent origin are continuously integrated into the existing network of older genes. These young genes play important roles in shaping genome architecture and can improve the adaptation of organisms. Gene duplication and de novo origination are the two ways in which new genes arise, and both change the size of gene families. In soybean (Glycine max), how far and in what way the evolutionary pattern of a gene depends on the time of its origination remains largely unexplored. In this study, we selected 19 representative angiosperm genomes and analyzed the potential relationship between gene content dynamics and the origination of soybean genes. Using the gene emergence approach, we found that about 58.7% of soybean genes could be traced back to ~150 million years ago, whereas 21.7% were orphan genes of recent origin. As expected, older genes were subject to stronger purifying selection and were more conserved than young genes. Older genes also showed higher expression levels and were more likely to undergo alternative splicing. Furthermore, genes with different copy numbers differed markedly in these same features. These findings may help in understanding the evolutionary patterns of genes of different ages.
Key words: angiosperms; gene duplication; gene family; gene origin; soybean
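The gene emergence approach mentioned in the abstract dates each gene to the most distant clade in which a homolog can still be detected. The following is a minimal sketch of that idea, not the authors' pipeline: the clade labels, the species list, and the function names `gene_age` and `age_distribution` are all illustrative assumptions, and the homolog table is presumed to come from an upstream homology search that is not shown here.

```python
from collections import Counter

# Illustrative clade labels, ordered from youngest (index 0) to oldest.
# These are assumptions for the sketch, not the strata used in the study.
PHYLOSTRATA = [
    "Glycine max",   # no homolog outside soybean (orphan gene)
    "Phaseoleae",
    "Fabaceae",
    "Rosids",
    "Eudicots",
    "Angiosperms",
]

# Illustrative species set (not the 19 genomes of the study): each species
# maps to the index of the youngest clade it shares with soybean.
SPECIES_STRATUM = {
    "Glycine max": 0,
    "Phaseolus vulgaris": 1,
    "Medicago truncatula": 2,
    "Arabidopsis thaliana": 3,
    "Solanum lycopersicum": 4,
    "Amborella trichopoda": 5,
}


def gene_age(species_with_homologs):
    """Return the index of the oldest clade containing a detectable homolog."""
    strata = [
        SPECIES_STRATUM[s] for s in species_with_homologs if s in SPECIES_STRATUM
    ]
    # A gene with no homolog in any other listed species is dated to stratum 0.
    return max(strata, default=0)


def age_distribution(homolog_table):
    """Map each occupied clade label to the fraction of genes dated to it."""
    counts = Counter(gene_age(species) for species in homolog_table.values())
    total = sum(counts.values())
    return {PHYLOSTRATA[i]: counts[i] / total for i in sorted(counts)}


if __name__ == "__main__":
    # Hypothetical gene identifiers and homolog sets, for illustration only.
    table = {
        "geneA": {"Glycine max"},
        "geneB": {"Glycine max", "Amborella trichopoda"},
        "geneC": {"Glycine max", "Medicago truncatula"},
    }
    for clade, fraction in age_distribution(table).items():
        print(f"{clade}: {fraction:.2f}")
```

Taking the maximum stratum means a single distant homolog is enough to date a gene as old, so the result depends strongly on how sensitive the upstream homology search is.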