Gene discovery and tissue-specific transcriptome analysis in chickpea with massively parallel pyrosequencing and web resource development
Chickpea (Cicer arietinum) is an important food legume crop but lags in the availability of genomic resources. In this study, we generated about 2 million high-quality sequences with an average length of 372 bp using pyrosequencing technology. Optimization of de novo assembly clearly indicated that a hybrid assembly of the long-read and short-read primary assemblies gave better results. The hybrid assembly generated a set of 34,760 transcripts with an average length of 1,020 bp, representing about 4.8% (35.5 Mb) of the total chickpea genome. We identified more than 4,000 simple sequence repeats, which can be developed as functional molecular markers in chickpea. Putative functions and Gene Ontology terms were assigned to at least 73.2% and 71.0% of chickpea transcripts, respectively. We also identified several chickpea transcripts that showed tissue-specific expression and validated the results using real-time polymerase chain reaction analysis. Based on sequence comparison with other species within the plant kingdom, we identified two sets of lineage-specific genes: those conserved in the Fabaceae family (legume specific) and those lacking significant similarity with any non-chickpea species (chickpea specific). Finally, we developed a Web resource, the Chickpea Transcriptome Database, which provides public access to the data and results reported in this study. The strategy for optimization of de novo assembly presented here may further facilitate transcriptome sequencing and characterization in other organisms. Most importantly, the data and results reported in this study will help to accelerate research in various areas of genomics and the implementation of breeding programs in chickpea.
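The assembly figures quoted above can be cross-checked with simple arithmetic. The following is a minimal sketch in Python, using only the numbers given in the abstract; the variable names are illustrative, and the genome size it prints is back-calculated from the quoted 4.8% rather than stated in the text.

```python
# Figures quoted in the abstract.
n_transcripts = 34_760      # transcripts in the hybrid assembly
mean_length_bp = 1_020      # average transcript length (bp)
genome_fraction = 0.048     # "about 4.8%" of the chickpea genome

# Total assembled length = number of transcripts x average length.
total_bp = n_transcripts * mean_length_bp
total_mb = total_bp / 1e6

# Assumption: this genome size is inferred from the two quoted values
# (35.5 Mb and 4.8%); it is not a figure reported in the abstract.
implied_genome_mb = total_mb / genome_fraction

print(f"Total transcript length: {total_mb:.1f} Mb")
print(f"Implied genome size: about {implied_genome_mb:.0f} Mb")
```

The product of transcript count and average length reproduces the quoted 35.5 Mb, so the two summary figures are mutually consistent. Because both the average length and the percentage are rounded, the implied genome size should be read as approximate.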