Chinese Bulletin of Botany, 2019, Vol. 54, Issue (3): 316-327. DOI: 10.11983/CBB18176

• RESEARCH ARTICLE •

Origin and Evolution of Soybean Protein-coding Genes

Kang Tang, Ruolin Yang

  1. College of Life Sciences, Northwest A&F University, Yangling 712100, China
  • Received: 2018-08-14 Accepted: 2018-12-10 Online: 2019-07-01 Published: 2019-11-24
  • Contact: Ruolin Yang


The gene composition of a species evolves dynamically: lineage- and species-specific genes that originated relatively recently are continuously integrated into the existing network of older genes. These young genes help shape genome architecture and can thereby improve the adaptation of organisms. Gene duplication and de novo origination are two routes by which new genes arise, producing gene families that differ in copy number. To what extent, and how, the evolutionary pattern of a gene depends on when it originated remains largely unexplored in soybean. In this study, we selected 19 representative angiosperms and analyzed how gene content dynamics relate to the origination of soybean (Glycine max) genes. Using the gene emergence approach, we found that 58.7% of soybean genes could be dated to ~150 million years ago and that 21.7% were orphan genes of recent origin. As expected, older genes tended to be under stronger purifying selection and were more conserved than young genes. Older genes also showed higher expression levels and were more likely to undergo alternative splicing. Furthermore, genes with different copy numbers differed in these respects. These findings may help clarify the evolutionary models of genes of different ages.
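As a rough illustration of how genes can be dated by their time of emergence, the sketch below assigns each gene to the oldest clade in which a homolog is detected and then reports the share of genes per age class. It is a minimal sketch under stated assumptions, not the authors' pipeline: the stratum names in `PHYLOSTRATA`, the function names `date_gene` and `age_distribution`, and the input format (a mapping from gene ID to the set of strata with a detected homolog) are all hypothetical.

```python
from collections import Counter

# Hypothetical age classes, ordered from oldest (index 0) to youngest.
# These labels are illustrative and are not taken from the article.
PHYLOSTRATA = ["angiosperms", "eudicots", "rosids", "legumes", "Glycine max"]


def date_gene(homolog_strata):
    """Return the oldest stratum in which a homolog was detected.

    A gene with no detected homolog in any stratum is treated as an
    orphan gene and assigned to the youngest (species-level) stratum.
    """
    for stratum in PHYLOSTRATA:  # scan from oldest to youngest
        if stratum in homolog_strata:
            return stratum
    return PHYLOSTRATA[-1]


def age_distribution(genes):
    """Return the fraction of genes assigned to each stratum.

    `genes` maps a gene ID to the set of strata with a detected homolog.
    """
    counts = Counter(date_gene(strata) for strata in genes.values())
    total = sum(counts.values())
    if total == 0:
        return {stratum: 0.0 for stratum in PHYLOSTRATA}
    return {stratum: counts[stratum] / total for stratum in PHYLOSTRATA}


if __name__ == "__main__":
    # Toy input with made-up gene IDs, for illustration only.
    toy_genes = {
        "gene_a": {"angiosperms", "rosids"},
        "gene_b": {"legumes"},
        "gene_c": set(),
        "gene_d": {"angiosperms"},
    }
    for stratum, fraction in age_distribution(toy_genes).items():
        print(f"{stratum}: {fraction:.2f}")
```

In practice the homolog sets would come from sequence similarity searches against the other genomes, and the reliability of the age assignment depends on how sensitive that search is for short or fast-evolving genes.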

Key words: angiosperms, gene duplication, gene family, gene origin, soybean