Full-length de novo assembly of RNA-seq data in pea (Pisum sativum L.) provides a gene expression atlas and gives insights into root nodulation in this species
Publication Overview
Abstract
Next-generation sequencing technologies allow an almost exhaustive survey of the transcriptome even in species with no available genome sequence. To produce a Unigene set representing most of the expressed genes of pea, 20 cDNA libraries produced from various plant tissues harvested at different developmental stages on plants grown in contrasted N-nutritive conditions were sequenced. Around one billion reads and 100 Gb of sequence were de novo assembled. Following several steps of redundancy reduction, 46,099 contigs with N50 length of 1,667 nt were identified. They constitute the 'Cameor' pea Unigene set. The high depth of sequencing allowed the identification of rare transcripts and detected expression for ca. 80% of contigs in each library. The Unigene set is now available on a website (http://bios.dijon.inra.fr/FATAL/cgi/pscam.cgi) that allows (i) searching the pea orthologs of candidate genes based on gene sequences from other species, or based on annotation, (ii) determining transcript expression patterns using different metrics, (iii) identifying uncharacterized genes with interesting patterns of expression, and (iv) comparing gene ontology pathways between tissues. This resource has allowed identification of the pea orthologs of major nodulation genes characterized in recent years in model species, as a major step towards deciphering unresolved pea nodulation phenotypes. Besides a remarkable conservation of the early transcriptome nodulation apparatus between pea and Medicago truncatula, some specific features were highlighted. The resource provides a reference for the pea exome and will facilitate transcriptome and proteome approaches as well as SNP discovery in pea. This article is protected by copyright. All rights reserved.